Monday, June 25, 2007

Hacking through the night (Status Update)

It's 5:30 a.m. over here. I am somewhat tired after having spent the night hacking on the first decompiler prototype for SCUMM v5 (antipasto "Dirty D") and figured I might as well whip up a quick status update on my blog. These past few weeks I have been quite busy with university work and personal issues, so even though I started writing Dirty D earlier than planned, I find myself barely on schedule now. There are ~10 days left until the first prototype "milestone", and my slacker sense warns me of more sleepless nights ahead. I still have to do the first svk push to my tools branch in the ScummVM repository, but that will happen later today once I am on campus (unless I am run over by a bus or abducted by aliens). Anyway, not too long ago I talked to my mentor about the general design of antipasto. Let me give a quick summary:


The plan is for the decompiler to consist of roughly two parts.

The first part comprises general utilities for extracting and processing bytecode from game scripts, plus bytecode-dependent backends that use these utilities to produce an opcode listing in a uniform temporary format whose structure is independent of the bytecode variant. During processing, every opcode of a bytecode variant that manipulates control flow has to be replaced by a semantically identical counterpart in the temporary format.
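
To make the first part a bit more concrete, here is a minimal sketch in Python of what a backend could look like. Everything in it is made up for illustration: the `Instr` record, the `toy_backend` function and the three toy opcodes are assumptions of mine and are neither real SCUMM v5 opcodes nor actual antipasto code. The point is only that jumps get rewritten into variant-independent entries with absolute targets, while everything else is passed along as already-decompiled text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Instr:
    """One entry of the hypothetical temporary, variant-independent listing."""
    offset: int                    # byte offset of the instruction in the script
    kind: str                      # "jump", "cjump" or "opaque"
    target: Optional[int] = None   # absolute jump target (jumps only)
    text: str = ""                 # decompiled statement, or the jump condition


# Invented toy bytecode, NOT real SCUMM v5 opcodes.
TOY_JUMP = 0x01          # 0x01 <rel8>        unconditional relative jump
TOY_JUMP_IF_ZERO = 0x02  # 0x02 <var8> <rel8> jump if variable == 0
TOY_INC = 0x03           # 0x03 <var8>        increment variable


def signed8(byte: int) -> int:
    """Interpret an unsigned byte as a signed 8-bit value."""
    return byte - 256 if byte >= 128 else byte


def toy_backend(code: bytes) -> List[Instr]:
    """Translate the toy bytecode into the temporary listing format."""
    listing = []
    pc = 0
    while pc < len(code):
        op = code[pc]
        if op == TOY_JUMP:
            nxt = pc + 2
            # Relative jumps are resolved to absolute offsets here, so the
            # second part never needs to know the variant's encoding.
            listing.append(Instr(pc, "jump", target=nxt + signed8(code[pc + 1])))
        elif op == TOY_JUMP_IF_ZERO:
            nxt = pc + 3
            var = code[pc + 1]
            listing.append(Instr(pc, "cjump",
                                 target=nxt + signed8(code[pc + 2]),
                                 text=f"var{var} == 0"))
        elif op == TOY_INC:
            nxt = pc + 2
            listing.append(Instr(pc, "opaque", text=f"var{code[pc + 1]}++"))
        else:
            raise ValueError(f"unknown toy opcode {op:#04x} at offset {pc}")
        pc = nxt
    return listing
```

Feeding it `bytes([0x02, 0x00, 0x04, 0x03, 0x00, 0x01, 0xF9])` yields three entries: a conditional jump out of the script, an increment, and a jump back to offset 0. A backend for another bytecode variant would decode different bytes but emit the same three kinds of entries.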

The second part holds the control flow analysis functionality of the decompiler. It is responsible for transforming a partial decompilation in the temporary format, as produced by one of the bytecode variant backends, into a semantically identical "program" in which low-level control flow statements such as gotos are resolved into higher-level looping/branching constructs where possible.
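
As a rough illustration of what "resolving gotos" can mean, here is a small Python sketch that folds exactly one loop shape into a `while` statement: a conditional jump that leaves the loop, a body without jumps, and an unconditional jump back to the head. The `Instr` record and the function names are hypothetical, and real control flow analysis has to handle far more shapes (nested loops, if/else, breaks); anything this sketch does not recognize is simply left as a labelled goto.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Instr:
    """One entry of the hypothetical temporary listing format."""
    offset: int
    kind: str                      # "jump", "cjump" or "opaque"
    target: Optional[int] = None   # absolute jump target (jumps only)
    text: str = ""                 # decompiled statement, or the jump condition


def find_back_edge(listing: List[Instr], i: int) -> Optional[int]:
    """Return the index of the jump closing a simple loop headed at i, if any."""
    head = listing[i]
    if head.kind != "cjump":
        return None
    for j in range(i + 1, len(listing)):
        ins = listing[j]
        if ins.kind == "opaque":
            continue
        # The first jump after the head must go straight back to the head.
        if ins.kind != "jump" or ins.target != head.offset:
            return None
        # The head's exit must land directly behind the back edge.
        if j + 1 < len(listing):
            exits_behind = listing[j + 1].offset == head.target
        else:
            exits_behind = head.target > ins.offset
        if not exits_behind:
            return None
        # No other jump may enter the loop, or folding would change meaning.
        inside = {x.offset for x in listing[i:j + 1]}
        for k, other in enumerate(listing):
            if k != j and other.kind != "opaque" and other.target in inside:
                return None
        return j
    return None


def structure(listing: List[Instr]) -> List[str]:
    """Render the listing, using `while` where the simple loop shape matches."""
    out = []
    i = 0
    while i < len(listing):
        ins = listing[i]
        j = find_back_edge(listing, i)
        if j is not None:
            # The cjump leaves the loop when its condition holds,
            # so the loop runs while the condition does not hold.
            out.append(f"while (!({ins.text})) {{")
            out.extend(f"    {b.text};" for b in listing[i + 1:j])
            out.append("}")
            i = j + 1
            continue
        if ins.kind == "jump":
            stmt = f"goto L{ins.target}"
        elif ins.kind == "cjump":
            stmt = f"if ({ins.text}) goto L{ins.target}"
        else:
            stmt = ins.text
        out.append(f"L{ins.offset}: {stmt};")
        i += 1
    return out
```

For a listing consisting of `Instr(0, "cjump", 7, "var0 == 0")`, `Instr(3, "opaque", text="var0++")` and `Instr(5, "jump", 0)`, `structure` produces a three-line `while (!(var0 == 0)) { ... }` block instead of two gotos. Note that nothing in this code knows which bytecode variant the listing came from.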

Rationale:

By decoupling the control flow analysis part of the decompiler from the bytecode-dependent decompilation code, we'd go a long way toward making the decompiler simpler to extend with support for new bytecode formats. Apart from that, introducing a temporary decompilation format might also help reveal more general ways of specifying bytecode formats in part one of antipasto. Every bit of initial bytecode processing functionality that can be ripped out of a bytecode variant backend and pressed deep into the guts of Dirty D makes it easier to add support for new bytecode variants in the future.

And that's what I'd like.