A binary file may contain gaps between instructions because of alignment constraints. Program execution never reaches these gaps. The flow of a program is determined by its control transfer instructions, which we divide into six categories: unconditional and conditional jump instructions, unconditional and conditional call (to a procedure) instructions, and unconditional and conditional return (from a procedure) instructions. The disassembly process must take these instructions into account: each occurrence of such an instruction is used to identify the address ranges that contain code. Without this, the disassembled instruction sequence might be completely wrong. Therefore, we need information about all such instructions.
The instructions in each category can be found by a simple method, which assumes that instructions are described hierarchically in the processor specification. If the complete instruction specification tree is built, the instructions of a category can be grouped under a subtree; that is, an instruction belongs to a category if the root node of the corresponding subtree is traversed during flattening of the instruction. If a processor specification is not written in this manner, a small modification suffices: one can add an or-rule whose children are all the instructions of the category.
The disassembler takes an identifier name for each category from the user. Each identifier denotes the root node of the subtree associated with that category. As described earlier, the syntax and image records of an instruction hold dot-expressions that give the sequence of nodes traversed during flattening of the instruction. If an instruction belongs to one of these categories, the root node of the category subtree must appear in its dot-expression. Thus, if any of these nodes is found in the dot-expression, the instruction is placed in the corresponding category. Otherwise, the instruction is not a control transfer instruction and is termed a simple instruction.
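The lookup described above can be illustrated with a minimal sketch. It assumes that a dot-expression is available as a string of node names separated by dots; the function name `classify_instruction`, the node names, and the dictionary layout are hypothetical and serve only as an illustration, not as the disassembler's actual data structures.

```python
from typing import Dict

def classify_instruction(dot_expression: str, category_roots: Dict[str, str]) -> str:
    """Return the category whose root node occurs in the dot-expression.

    category_roots maps a category name to the user-supplied identifier of
    the root node of that category's subtree (hypothetical layout).
    """
    # Assumption: nodes traversed during flattening are separated by dots.
    nodes = dot_expression.split(".")
    for category, root in category_roots.items():
        if root in nodes:  # whole-node match, not a substring match
            return category
    # No category root was traversed: not a control transfer instruction.
    return "simple"

# Hypothetical identifiers, as a user might supply them.
roots = {
    "unconditional jump": "uncond_jump",
    "conditional call": "cond_call",
}
print(classify_instruction("instr.branch.uncond_jump.b", roots))
print(classify_instruction("instr.alu.add", roots))
```

The sketch returns the first matching category and therefore ignores the case in which an instruction matches more than one category root.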
An instruction may belong to two such subtrees. This happens when the subtree of one category is itself contained in the subtree of another category. For example, the PowerPC processor has no separate call instructions. It uses jump instructions to transfer control to a subroutine: the return address is stored in the link register, and certain bits are set so that the jump instruction is treated as a call instruction. In such cases, instructions are matched according to a priority rule. We assign priorities to unconditional jump, conditional jump, unconditional call, conditional call, unconditional return, and conditional return instructions, in that order. If an instruction matches two categories, it is placed in the higher-priority category.
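A minimal sketch of such a priority rule follows. It assumes that dot-expressions are strings of node names separated by dots and that the category list is ordered from highest to lowest priority; the text does not state the direction explicitly, so the order is passed as a parameter. All function and node names are hypothetical.

```python
from typing import Dict, Sequence

# Assumption: the order given in the text runs from highest to lowest priority.
DEFAULT_PRIORITY = (
    "unconditional jump",
    "conditional jump",
    "unconditional call",
    "conditional call",
    "unconditional return",
    "conditional return",
)

def classify_with_priority(
    dot_expression: str,
    category_roots: Dict[str, str],
    priority: Sequence[str] = DEFAULT_PRIORITY,
) -> str:
    """Return the highest-priority category whose root node was traversed."""
    nodes = set(dot_expression.split("."))
    for category in priority:  # earlier entries take precedence
        root = category_roots.get(category)
        if root is not None and root in nodes:
            return category
    return "simple"  # no category root found in the dot-expression

# Hypothetical specification in which the call subtree is nested
# inside the jump subtree, so one instruction matches both roots.
roots = {
    "unconditional jump": "branch",
    "unconditional call": "branch_link",
}
print(classify_with_priority("instr.branch.branch_link.bl", roots))
```

Passing a different `priority` sequence changes which of the matching categories is selected, so the same lookup can be reused if the order is meant to be read the other way.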