The generic disassembler produces symbolic, relocatable disassembly of an object file in the Executable and Linking Format (ELF). It uses a processor model that contains the instruction set description of the processor, written in a language called Sim-nML. Sim-nML is a simple, elegant, and powerful language for expressing the behavior of processors at the instruction level. It uses synthesized attributes to represent timing information, instruction semantics, assembly language syntax, and the binary representation of instructions. The generic disassembler produces disassembly of programs in a GNU-compatible format. To identify instructions, it performs a depth-first search with backtracking on the tree-like structure of the Sim-nML instruction set description. Because the attributes of an instruction are scattered across various subtrees, the syntax for the instruction is collected from the subtrees selected during traversal; different parts of a single instruction may be matched against different subtrees. Symbolic and relocatable disassembly is achieved by using the relocation and symbol information in the object file and by analyzing the code to identify basic blocks.
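The instruction-identification step can be illustrated with a minimal sketch. It is not the thesis's implementation and does not parse Sim-nML: the node classes (`Leaf`, `Or`, `And`), the function names, and the toy bit encoding are all hypothetical, chosen only to show how a depth-first search with backtracking might walk a tree of alternatives and sequences while assembling the syntax string from the subtrees it selects.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional, Tuple, Union


@dataclass
class Leaf:
    bits: str    # fixed bit pattern this leaf must match, e.g. "01"
    syntax: str  # assembly text contributed by this leaf


@dataclass
class Or:
    alternatives: List["Node"]  # exactly one alternative is selected


@dataclass
class And:
    parts: List["Node"]  # all parts are matched, left to right
    template: str        # combines the parts' syntax, e.g. "{0} {1}"


Node = Union[Leaf, Or, And]


def matches(node: Node, word: str, pos: int) -> Iterator[Tuple[str, int]]:
    """Depth-first search: yield every (syntax, end_position) for node at pos."""
    if isinstance(node, Leaf):
        end = pos + len(node.bits)
        if word[pos:end] == node.bits:
            yield node.syntax, end
    elif isinstance(node, Or):
        # Try each alternative in order; later ones are reached on backtrack.
        for alt in node.alternatives:
            yield from matches(alt, word, pos)
    else:
        yield from _match_parts(node, word, pos, 0, [])


def _match_parts(node: And, word: str, pos: int, index: int,
                 pieces: List[str]) -> Iterator[Tuple[str, int]]:
    """Match the parts of an And node one by one, collecting their syntax."""
    if index == len(node.parts):
        yield node.template.format(*pieces), pos
        return
    for text, nxt in matches(node.parts[index], word, pos):
        yield from _match_parts(node, word, nxt, index + 1, pieces + [text])


def disassemble(root: Node, word: str) -> Optional[str]:
    """Return the syntax of the first match that consumes the whole word."""
    for text, end in matches(root, word, 0):
        if end == len(word):
            return text
    return None  # no instruction in the description matches


# Toy instruction set (hypothetical encoding, for illustration only).
REG = Or([Leaf("00", "r0"), Leaf("01", "r1"), Leaf("10", "r2"), Leaf("11", "r3")])
ROOT = Or([
    And([Leaf("00", "add"), REG], "{0} {1}"),
    And([Or([Leaf("1", "neg"), Leaf("11", "not")]), REG], "{0} {1}"),
])

print(disassemble(ROOT, "0010"))
print(disassemble(ROOT, "1101"))
```

For the word `1101`, the search first selects the one-bit `neg` opcode, finds that the remaining bits are not fully consumed, and backtracks to the two-bit `not` opcode. Generators keep the untried alternatives available, so the search can resume from the most recent choice point without extra bookkeeping.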
Full Thesis (PS-gzipped: 111KB)