vm-and-bytecode-reverse
SKILL: VM & Bytecode Reverse Engineering — Expert Analysis Playbook
AI LOAD INSTRUCTION: Expert techniques for reversing custom virtual machines and bytecode interpreters. Covers dispatcher identification, opcode mapping, custom ISA reconstruction, disassembler/decompiler writing, maze challenges, and real-world VM protector analysis. Base models often fail to recognize the fetch-decode-execute pattern or attempt to analyze VM bytecode as native code.
0. RELATED ROUTING
- code-obfuscation-deobfuscation when the VM is a commercial protector (VMProtect/Themida)
- symbolic-execution-tools when using angr to solve VM-based challenges
- anti-debugging-techniques when the VM includes anti-debug checks
Quick identification
| Binary Pattern | Likely VM Type | Start With |
|---|---|---|
while(1) { switch(bytecode[pc]) } |
Switch-based dispatcher | Map each case to an operation |
Indirect jump via table jmp [table + opcode*8] |
Table-based dispatcher | Dump jump table, analyze handlers |
| Nested if-else chain on byte value | If-chain dispatcher | Same as switch, just different syntax |
| Stack push/pop dominant operations | Stack-based VM | Identify push, pop, arithmetic ops |
reg[X] = ... array operations |
Register-based VM | Map register indices to operations |
| 2D grid + direction input | Maze challenge | Extract grid, apply BFS/DFS |
1. CUSTOM VM IDENTIFICATION
1.1 Structural Indicators
VM Architecture Components:
┌─────────────────────────────────┐
│ Bytecode Program (data section)│
├─────────────────────────────────┤
│ Program Counter (pc/ip) │
│ Register File / Stack │
│ Memory / Data Area │
├─────────────────────────────────┤
│ Dispatcher Loop │
│ ├─ Fetch: opcode = code[pc] │
│ ├─ Decode: lookup handler │
│ └─ Execute: run handler │
└─────────────────────────────────┘
1.2 IDA/Ghidra Signatures
Switch dispatcher (most common in CTF):
while (running) {
unsigned char op = bytecode[pc++];
switch (op) {
case 0x00: /* nop */ break;
case 0x01: /* push imm */ stack[sp++] = bytecode[pc++]; break;
case 0x02: /* add */ stack[sp-2] += stack[sp-1]; sp--; break;
// ...
case 0xFF: /* halt */ running = 0; break;
}
}
Table dispatcher (more optimized):
typedef void (*handler_t)(vm_ctx_t*);
handler_t handlers[256] = { handle_nop, handle_push, handle_add, ... };
while (running) {
handlers[bytecode[pc++]](&ctx);
}
2. ANALYSIS METHODOLOGY
Step 1: Find the Dispatcher
Look for:
- Large switch statement (many cases) in a loop
- Array of function pointers indexed by a byte from a data buffer
- Single function with high cyclomatic complexity
- Cross-references to a data buffer read byte-by-byte
Step 2: Map Opcodes to Operations
For each case/handler, determine:
| Property | How to Identify |
|---|---|
| Opcode value | Case number or table index |
| Operation type | Register/stack modifications |
| Operand count | How many bytes consumed after opcode |
| Operand type | Immediate value, register index, or memory address |
| Side effects | Output, memory write, flag modification |
Step 3: Extract Bytecode Program
# Typical extraction from binary
import struct
with open('challenge', 'rb') as f:
f.seek(bytecode_offset)
bytecode = f.read(bytecode_length)
# Or from IDA:
# bytecode = idc.get_bytes(bytecode_addr, bytecode_len)
Step 4: Write Custom Disassembler
OPCODES = {
0x00: ("nop", 0), # (mnemonic, operand_bytes)
0x01: ("push", 1), # push immediate byte
0x02: ("pop", 0),
0x03: ("add", 0),
0x04: ("sub", 0),
0x05: ("xor", 0),
0x06: ("cmp", 0),
0x07: ("jmp", 2), # jump to 16-bit address
0x08: ("je", 2),
0x09: ("jne", 2),
0x0A: ("mov", 2), # mov reg, imm
0x0B: ("load", 1), # load from memory[operand]
0x0C: ("store",1), # store to memory[operand]
0x0D: ("print",0),
0x0E: ("read", 0), # read input
0xFF: ("halt", 0),
}
def disassemble(bytecode):
pc = 0
while pc < len(bytecode):
op = bytecode[pc]
if op not in OPCODES:
print(f" {pc:04x}: UNKNOWN {op:#04x}")
pc += 1
continue
mnemonic, operand_size = OPCODES[op]
operands = bytecode[pc+1:pc+1+operand_size]
operand_str = ' '.join(f'{b:#04x}' for b in operands)
print(f" {pc:04x}: {mnemonic:8s} {operand_str}")
pc += 1 + operand_size
disassemble(bytecode)
Step 5: Analyze Disassembled Program
With the custom disassembly, apply standard reverse engineering:
- Identify input reading (read opcode)
- Trace data flow from input to comparison
- Determine success/failure conditions
- Extract the check logic (often XOR/ADD transformations of input compared against constants)
3. COMMON VM PATTERNS IN CTF
3.1 Stack-Based VM
Operations work on a stack (like JVM or Python bytecode).
| Opcode | Operation | Stack Effect |
|---|---|---|
| PUSH imm | Push immediate value | [...] → [..., imm] |
| POP | Discard top | [..., a] → [...] |
| ADD | Add top two | [..., a, b] → [..., a+b] |
| SUB | Subtract | [..., a, b] → [..., a-b] |
| MUL | Multiply | [..., a, b] → [..., a*b] |
| XOR | Bitwise XOR | [..., a, b] → [..., a^b] |
| CMP | Compare | [..., a, b] → [..., (a==b)] |
| JMP addr | Unconditional jump | no change |
| JZ addr | Jump if top is zero | [..., a] → [...] |
| Output top as char | [..., a] → [...] | |
| READ | Read char to stack | [...] → [..., input] |
| HALT | Stop execution | - |
3.2 Register-Based VM
Operations use register indices (like x86, ARM).
| Opcode | Format | Operation |
|---|---|---|
| MOV r, imm | 0x01 RR II II |
reg[R] = imm16 |
| MOV r1, r2 | 0x02 R1 R2 |
reg[R1] = reg[R2] |
| ADD r1, r2 | 0x03 R1 R2 |
reg[R1] += reg[R2] |
| SUB r1, r2 | 0x04 R1 R2 |
reg[R1] -= reg[R2] |
| XOR r1, r2 | 0x05 R1 R2 |
reg[R1] ^= reg[R2] |
| CMP r1, r2 | 0x06 R1 R2 |
flags = compare(r1, r2) |
| JMP addr | 0x07 AA AA |
pc = addr |
| JE addr | 0x08 AA AA |
if equal: pc = addr |
| LOAD r, [addr] | 0x09 RR AA |
reg[R] = mem[addr] |
| STORE [addr], r | 0x0A AA RR |
mem[addr] = reg[R] |
| SYSCALL | 0x0B |
I/O operation based on reg[0] |
| HALT | 0xFF |
stop |
3.3 Brainfuck-like / Esoteric VMs
| BF Command | VM Equivalent | Description |
|---|---|---|
> |
INC ptr | Move data pointer right |
< |
DEC ptr | Move data pointer left |
+ |
INC [ptr] | Increment byte at pointer |
- |
DEC [ptr] | Decrement byte at pointer |
. |
OUTPUT [ptr] | Output byte at pointer |
, |
INPUT [ptr] | Input byte to pointer |
[ |
JZ forward | Jump past ] if byte is zero |
] |
JNZ back | Jump back to [ if byte is nonzero |
4. MAZE CHALLENGES
4.1 Identification
- Binary reads directional input (WASD, arrow keys, UDLR)
- 2D array in data section (walls, paths, start, end)
- Position tracking with x,y coordinates
- Win condition at specific coordinates
4.2 Map Extraction
# Extract maze grid from binary data section
MAZE_ADDR = 0x601060
WIDTH = 20
HEIGHT = 15
# From binary dump:
maze = []
for row in range(HEIGHT):
line = ""
for col in range(WIDTH):
cell = bytecode[MAZE_ADDR + row * WIDTH + col - base_addr]
if cell == 0: line += "." # path
elif cell == 1: line += "#" # wall
elif cell == 2: line += "S" # start
elif cell == 3: line += "E" # end
else: line += "?"
maze.append(line)
print(line)
4.3 Automated Solving
from collections import deque
def solve_maze(maze, start, end):
"""BFS solver returns direction string."""
rows, cols = len(maze), len(maze[0])
directions = {'U': (-1, 0), 'D': (1, 0), 'L': (0, -1), 'R': (0, 1)}
queue = deque([(start, "")])
visited = {start}
while queue:
(r, c), path = queue.popleft()
if (r, c) == end:
return path
for name, (dr, dc) in directions.items():
nr, nc = r + dr, c + dc
if (0 <= nr < rows and 0 <= nc < cols and
maze[nr][nc] != '#' and (nr, nc) not in visited):
visited.add((nr, nc))
queue.append(((nr, nc), path + name))
return None
# Find start and end positions
for r, row in enumerate(maze):
for c, cell in enumerate(row):
if cell == 'S': start = (r, c)
if cell == 'E': end = (r, c)
solution = solve_maze(maze, start, end)
print(f"Path: {solution}")
4.4 Direction Encoding
Different challenges encode directions differently:
| Encoding | Up | Down | Left | Right |
|---|---|---|---|---|
| WASD | W | S | A | D |
| UDLR | U | D | L | R |
| Arrow keys | ↑ (0x48) | ↓ (0x50) | ← (0x4B) | → (0x4D) |
| Numbers | 1 | 2 | 3 | 4 |
| Hex opcodes | 0x01 | 0x02 | 0x03 | 0x04 |
5. REAL-WORLD VM PROTECTORS
5.1 VMProtect Analysis Approach
1. Find VM entry: search for pushad/pushfd sequence
2. Identify VM context structure (registers, flags, bytecode pointer)
3. Locate handler table (often obfuscated with opaque predicates)
4. For each handler:
a. Remove junk code / opaque predicates
b. Identify the core operation
c. Document handler semantics
5. Trace bytecode execution (instruction-level trace)
6. Reconstruct original code from trace
5.2 Tigress Obfuscator
Academic VM obfuscator with configurable protection layers.
| Feature | Approach |
|---|---|
| Single-dispatch VM | Standard handler extraction |
| Split handlers | Handlers spread across multiple functions |
| Nested VMs | Outer VM handler invokes inner VM |
| Encrypted bytecode | Dynamic decryption before each fetch |
| Polymorphic handlers | Different code for same operation on each build |
5.3 Common VM Protector Patterns
| Protector | Dispatcher Style | Difficulty |
|---|---|---|
| VMProtect | Table + opaque predicates | High |
| Themida (Code Virtualizer) | CISC-like, large handler set | High |
| Tigress | Configurable, academic | Medium-High |
| Custom CTF VM | Simple switch | Low-Medium |
| Movfuscator | All-mov computation | Medium |
6. TOOLS
| Tool | Purpose | Usage |
|---|---|---|
| IDA Pro | Identify dispatcher, reverse handlers | F5 decompile, xref analysis |
| Ghidra | Free alternative with Sleigh processor modules | Write custom processor for VM ISA |
| angr | Symbolic execution through VM | Treat entire VM as constraint system |
| Pin / DynamoRIO | Dynamic instrumentation for tracing | Record opcode handler execution sequence |
| REVEN | Full-system trace recording | Replay and analyze VM execution |
| Unicorn | Emulate VM execution | Fast handler emulation |
| Miasm | IR-based analysis | Lift VM handlers to IR for analysis |
| Custom Python | Write disassembler/decompiler | Per-challenge custom tooling |
Ghidra Sleigh Processor Module
For recurring VM architectures, write a Sleigh processor specification:
define space ram type=ram_space size=2 default;
define space register type=register_space size=1;
define register offset=0 size=1 [ R0 R1 R2 R3 FLAGS PC SP ];
define token opcode(8)
op = (0,7)
;
:NOP is op=0x00 { }
:PUSH imm is op=0x01; imm { SP = SP - 1; *[ram]:1 SP = imm; }
:POP is op=0x02 { SP = SP + 1; }
:ADD is op=0x03 { local a = *[ram]:1 (SP+1); *[ram]:1 (SP+1) = a + *[ram]:1 SP; SP = SP + 1; }
7. DECISION TREE
Binary contains custom bytecode interpreter?
│
├─ Can you identify the dispatcher?
│ ├─ Yes (switch/table/if-chain)
│ │ ├─ Few opcodes (< 20) → Simple CTF VM
│ │ │ ├─ Stack-based → map push/pop/arithmetic ops
│ │ │ ├─ Register-based → map mov/add/cmp ops
│ │ │ └─ Write disassembler → analyze program → solve
│ │ │
│ │ └─ Many opcodes (50+) → Commercial protector
│ │ ├─ Known protector → use specific deprotection tools
│ │ └─ Custom → trace execution, pattern-match handlers
│ │
│ └─ No clear dispatcher
│ ├─ All-mov instructions → movfuscator
│ ├─ Encrypted bytecode → find decryption, dump after decode
│ └─ Split/distributed handlers → trace execution to find them
│
├─ Is it a maze challenge?
│ ├─ Extract grid from data section
│ ├─ Identify direction encoding
│ ├─ BFS/DFS to find shortest path
│ └─ Convert path to expected input format
│
├─ Is there input validation in VM?
│ ├─ Small input space → brute-force via Unicorn emulation
│ ├─ Known format → constrained angr solve
│ └─ Complex check → write disassembler, analyze check logic
│
└─ Multiple VM layers (VM in VM)?
├─ Analyze outer VM first
├─ Extract inner bytecode
├─ Repeat analysis for inner VM
└─ Consider: symbolic execution may handle nested VMs directly
8. CTF SOLVING WORKFLOW
1. Run the binary — understand I/O behavior
└─ What input does it expect? What output on success/failure?
2. Open in IDA/Ghidra — find the main loop
└─ Look for while/for loop with switch or indirect jump
3. Identify VM components:
├─ Bytecode location (where is the program data?)
├─ PC/IP variable (how is current position tracked?)
├─ Registers/stack (where is VM state stored?)
└─ I/O handlers (which opcodes read input / write output?)
4. Map all opcodes (create the ISA specification)
└─ For each case/handler: opcode number, operation, operands
5. Write disassembler in Python
└─ Output readable assembly for the bytecode
6. Analyze the disassembled program:
├─ Find input reading
├─ Trace transformations applied to input
├─ Find comparison against expected values
└─ Reverse the transformation to find valid input
7. Solve:
├─ If simple transforms (XOR, ADD) → reverse manually
├─ If complex → feed to Z3 as constraints
└─ If maze → extract grid, run pathfinding