In my experience, off-the-shelf LLMs (e.g. ChatGPT) do a pretty poor job with assembly; they cannot reason well about the stack or stack frames.
I think your job will be the same with or without AI: figuring out the data structures and data types a function is operating on, and naming variables.
What are you reverse engineering for? For example, getting a full compilable decompilation is a different goal than finding vulnerabilities or patching a bug.
On the commercial side, IDA / HexRays [2] is very strong for C-like decompilation. If you're looking at Go, Rust, or even C++, the output is going to be a bit messier. As other commenters have said, you'll work function-by-function. IDA is expensive, though the free version does have decompilation (F5) for x86 and x64 (IIRC).
Binary Ninja [3] (no affiliation) is the coolest IMO. It lifts the assembly through multiple intermediate representations, so you get assembly -> low level IL -> medium level IL -> high level IL. There are also SSA (static single assignment) forms that can aid in programmatic analyses. The high level IL is very readable but makes no effort to be compilable as a programming language. That being said, Binary Ninja has implemented different "views" on the HLIL so you can show it as pseudo-C, Rust, etc. There is a free online version, and the commercial version is cheaper than IDA but still expensive. Good Python API, good UI.
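To make the SSA idea concrete, here is a toy sketch in plain Python. It is not Binary Ninja's API or algorithm: the tuple instruction format, the `to_ssa` name, and the `name#version` notation are assumptions made up for illustration. It only handles straight-line code in a single basic block, so there are no phi nodes.

```python
from collections import defaultdict

def to_ssa(instructions):
    """Rename variables so that each one is assigned exactly once.

    `instructions` is a list of (dest, op, operands) tuples for ONE basic
    block. This tuple format is hypothetical, not any real tool's IL.
    """
    version = defaultdict(int)  # latest version number seen per variable
    renamed_block = []
    for dest, op, operands in instructions:
        # A use refers to the most recent definition; version 0 means
        # "value on entry to the block". Non-string operands are constants.
        uses = tuple(
            f"{v}#{version[v]}" if isinstance(v, str) else v
            for v in operands
        )
        # Every assignment creates a fresh version of the destination.
        version[dest] += 1
        renamed_block.append((f"{dest}#{version[dest]}", op, uses))
    return renamed_block

block = [
    ("eax", "mov", ("edi",)),
    ("eax", "add", ("eax", 1)),
    ("ebx", "mov", ("eax",)),
]
for line in to_ssa(block):
    print(line)
```

Because every name now has exactly one definition, an analysis can answer "where did this value come from?" with a simple lookup instead of tracking which write to `eax` was the last one.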
Ghidra [4] is the RE framework released by the NSA. It is free and open source, and it supports a ton of niche architectures. This is what most people use. I think the UI is awful, personally. It has a decompiler, and the results are OK. It has an intermediate representation (P-Code), and plugins are in Java (since it is written in Java). I haven't worked much with it.
Most online decompilations you see for old games are likely using Ghidra, some might be using IDA. This is largely a manual process of doing a function at a time and building up the mental map of the program and how things interact.
Also worth mentioning are lifters. There were a few projects that aimed to lift assembly to LLVM IR (the LLVM compiler framework's intermediate representation), with the idea being that all your analyses could then be written over LLVM IR as a lingua franca. Since the result is LLVM IR, it would also be recompilable and retargetable. [5][6]
1. https://reverseengineering.stackexchange.com/questions/2603/...
2. https://hex-rays.com/ida-free
I like to give it bomb executables (reverse engineering challenges) to test it.
For example, I have a minified, heavily obfuscated JavaScript file. I can paste the code and have it break down the initial structure. Then I tell it which parts to focus on and which parts to dig into deeper.
RevEng.ai, linked a few times already, discusses their approach here: https://blog.reveng.ai/training-an-llm-to-decompile-assembly...
Is it actually legal to decompile a game engine from executables/dll files, write new sources by making sense of the output and rewriting it such that it can be compiled targeting modern APIs?
I feel like that must be illegal
Game RE communities also have all sorts of neat utilities for decompiling large C++ binaries. Skyrim’s community is pretty active with Ghidra/IDA.
Guessing you’re not lucky enough to have a PDB?
If you are able to run the program and collect traces, that will help a ton.
An LLM won't help you much if you can't understand what it's talking about.
The manual way, given an ELF (Linux executable format) file somexe, is:
$ strings somexe
$ objdump -d somexe
$ objdump -s -j .rodata somexe
then look at and ponder over the results,
and/or run Ghidra (as a mouse-driven UI) over it, which may help somewhat but not 100%.
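If you want to script the first step rather than eyeball it, here is a rough Python approximation of what `strings` does: find runs of printable ASCII of at least four bytes. The function name `extract_strings` and the sample bytes are made up for illustration, and the real tool has more options (other encodings, section selection) that this sketch ignores.

```python
import re

def extract_strings(data: bytes, min_len: int = 4):
    """Return runs of printable ASCII (0x20-0x7e) at least `min_len` long.

    A simplified stand-in for `strings`; it scans the whole buffer and
    only understands single-byte ASCII.
    """
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]

# Hypothetical bytes standing in for a slice of an executable.
sample = b"\x7fELF\x02\x01\x00/lib64/ld-linux\x00\x01\x02abc\x00hello world\x00"
print(extract_strings(sample))

# On a real file (somexe is the placeholder name used above):
# with open("somexe", "rb") as f:
#     for s in extract_strings(f.read()):
#         print(s)
```

Short runs such as `ELF` and `abc` are dropped by the four-byte minimum, which is what keeps the output from drowning in noise.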
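Keep in mind that on x86, objdump and Ghidra show the operands of multi-operand instructions in opposite orders for the same code: objdump defaults to AT&T syntax (mov src,dest), while Ghidra uses Intel-style order (mov dest,src). Passing -M intel to objdump makes it print Intel syntax as well.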
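As a small illustration of the operand-order difference, here is a toy Python helper that rewrites a simple two-operand AT&T-style line into Intel-style order. The name `att_to_intel` is made up, and this is deliberately not a real syntax converter: it only swaps register and immediate operands, leaves size suffixes like `movq` alone, and returns anything with a memory operand unchanged.

```python
def att_to_intel(instruction: str) -> str:
    """Swap `op src,dest` (AT&T) into `op dest, src` (Intel-style order).

    Toy sketch: handles only register/immediate operands. Lines with
    memory operands, or without exactly two operands, come back as-is.
    """
    text = instruction.strip()
    mnemonic, _, rest = text.partition(" ")
    if "(" in rest:
        # Memory operands such as 8(%rbp) need real parsing; skip them.
        return text
    # Drop the AT&T sigils: % on registers, $ on immediates.
    operands = [op.strip().lstrip("%$") for op in rest.split(",")]
    if len(operands) != 2:
        return text
    src, dest = operands
    return f"{mnemonic} {dest}, {src}"

for line in ["mov    %rsp,%rbp", "add    $0x10,%rsp", "ret"]:
    print(f"{line!r:24} -> {att_to_intel(line)!r}")
```

For anything beyond a quick sanity check it is easier to have the disassembler emit the syntax you want than to convert after the fact.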
No idea about the (recent) Windows side. IDA?