Submitted by vintergroena t3_123asbg in MachineLearning
Smallpaul t1_jdu38fb wrote
Decompilers already exist though.
currentscurrents t1_jdu3vwq wrote
Yeah, but they're hand-crafted algorithms and produce code that's hard to read.
ultraminxx t1_jdu7uz8 wrote
That said, it might also be a good approach to preprocess the input with a classical algorithm and then train a model to refactor that decompiled code so it becomes more readable.
currentscurrents t1_jdvxga6 wrote
Possibly! But it also seems like a good sequence-to-sequence translation problem: just line up the two streams of tokens and let the model figure it out.
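To make "line up the two streams of tokens" concrete, here is a minimal sketch of how one training pair might be built. Everything in it is an assumption for illustration: the `tokenize` and `make_pair` helpers are hypothetical, the regex tokenizer is deliberately naive, and the assembly is hand-written x86-style text, not real compiler output.

```python
import re

def tokenize(text):
    # Naive tokenizer (an assumption, not a real lexer): identifiers,
    # runs of digits, or any single non-whitespace character.
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", text)

def make_pair(asm, src):
    # One seq2seq example: the model reads the assembly tokens
    # and is trained to emit the source tokens.
    return {
        "source_tokens": tokenize(asm),
        "target_tokens": tokenize(src),
    }

# Hand-written illustrative pair, not actual compiler output.
asm = "mov eax, edi\nadd eax, esi\nret"
src = "int add(int a, int b) { return a + b; }"

pair = make_pair(asm, src)
print(pair["source_tokens"])
print(pair["target_tokens"])
```

A real dataset would presumably come from compiling a large corpus of source files and pairing each function with its disassembly; the model itself is out of scope here.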
s0n0fagun t1_jdu975r wrote
That depends on the language/compiler used. Java and C# have decompilers that turn out great code.
currentscurrents t1_jdvxu6g wrote
Those languages don't compile to machine code; they compile to a special bytecode that runs in a VM.
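A rough illustration of why bytecode is friendlier to decompile, using CPython as a stand-in (an assumption on my part; the comment above is about Java and C#, whose formats differ in the details). The compiled code object still carries the local variable names, which stripped machine code does not. The `area` function is a made-up example.

```python
import dis

def area(width, height):
    # Made-up example function for illustration.
    scale = 2
    return width * height * scale

code = area.__code__

# Argument and local names are stored alongside the bytecode.
names = code.co_varnames
print(names)

# The instruction stream itself, as opcode names.
ops = [ins.opname for ins in dis.get_instructions(area)]
print(ops)
```

The exact opcodes vary between Python versions, so the second list is only meant to be eyeballed.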
bubudumbdumb t1_jdu90gu wrote
Friends working at rev.ng told me that it's very difficult to decompile to the original high-level structures actually used in the source code. C may have only a few ways to code a loop, but C++ has many, and figuring out the source code from assembly is very hard to achieve with rule-based systems.
Maykey t1_jdv3ft8 wrote
And there is already an OpenAI-based plugin for them.