binary-re:triage

SKILL.md

Binary Triage (Phase 1)

Purpose

Quick fingerprinting to establish baseline facts before deeper analysis. Runs in seconds, not minutes.

When to Use

  • First contact with an unknown binary
  • Need architecture/ABI info for tool selection
  • Quick capability assessment
  • Before committing to expensive analysis

Key Principle

Gather facts fast, defer analysis.

This phase identifies WHAT the binary is, not HOW it works.

Triage Sequence

Step 1: File Identification

# Basic identification
file binary

# Expected output patterns:
# ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-armhf.so.3
# ELF 64-bit LSB pie executable, ARM aarch64, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-aarch64.so.1

Extract:

  • Architecture (ARM, ARM64, x86_64, MIPS)
  • Bit width (32/64)
  • Endianness (LSB/MSB)
  • Link type (static/dynamic)
  • Interpreter path (libc indicator)

Step 2: Structured Metadata (rabin2)

# All metadata as JSON
rabin2 -q -j -I binary | jq .

# Key fields:
# .arch     - "arm", "x86", "mips"
# .bits     - 32 or 64
# .endian   - "little" or "big"
# .os       - "linux", "none"
# .machine  - "ARM", "AARCH64"
# .stripped - true/false
# .static   - true/false

Step 3: ABI Detection

# Interpreter detection
readelf -p .interp binary 2>/dev/null

# Or via rabin2
rabin2 -I binary | grep interp

# ARM-specific: float ABI
readelf -A binary | grep "Tag_ABI_VFP_args"
# hard-float: "VFP registers"
# soft-float: missing or "compatible"

Interpreter → Libc mapping:

Interpreter Libc Notes
/lib/ld-linux-armhf.so.3 glibc ARM hard-float
/lib/ld-linux.so.3 glibc ARM soft-float
/lib/ld-musl-arm.so.1 musl ARM 32-bit
/lib/ld-musl-aarch64.so.1 musl ARM 64-bit
/lib/ld-uClibc.so.0 uClibc Embedded
/lib64/ld-linux-x86-64.so.2 glibc x86_64

Step 4: Dependencies

# Library dependencies
rabin2 -q -j -l binary | jq '.libs[]'

# Common patterns:
# libcurl.so.* → HTTP client
# libssl.so.* → TLS/crypto
# libpthread.so.* → Threading
# libz.so.* → Compression
# libsqlite3.so.* → Local database

Step 5: Entry Points & Exports

# Entry points
rabin2 -q -j -e binary | jq .

# Exports (for shared libraries)
rabin2 -q -j -E binary | jq '.exports[] | {name, vaddr}'

Step 6: Quick String Scan

# All strings with metadata
rabin2 -q -j -zz binary | jq '.strings | length'  # Count first

# Filter interesting strings (URLs, paths, errors)
rabin2 -q -j -zz binary | jq '
  .strings[] |
  select(.length > 8) |
  select(.string | test("http|ftp|/etc|/var|error|fail|pass|key|token"; "i"))
'

Step 7: Import Analysis

# All imports
rabin2 -q -j -i binary | jq '.imports[] | {name, lib}'

# Group by capability
rabin2 -q -j -i binary | jq '
  .imports | group_by(.lib) |
  map({lib: .[0].lib, functions: [.[].name]})
'

Capability Mapping

Import Pattern Capability
socket, connect, send Network client
bind, listen, accept Network server
open, read, write File I/O
fork, exec*, system Process spawning
pthread_* Multi-threading
SSL_*, EVP_* Cryptography
dlopen, dlsym Dynamic loading
mmap, mprotect Memory manipulation

Output Format

After triage, record structured facts:

{
  "artifact": {
    "path": "/path/to/binary",
    "sha256": "abc123...",
    "size_bytes": 245760
  },
  "identification": {
    "arch": "arm",
    "bits": 32,
    "endian": "little",
    "os": "linux",
    "stripped": true,
    "static": false
  },
  "abi": {
    "interpreter": "/lib/ld-musl-arm.so.1",
    "libc": "musl",
    "float_abi": "hard"
  },
  "dependencies": [
    "libcurl.so.4",
    "libssl.so.1.1",
    "libz.so.1"
  ],
  "capabilities_inferred": [
    "network_client",
    "tls_encryption",
    "compression"
  ],
  "strings_of_interest": [
    {"value": "https://api.vendor.com/telemetry", "type": "url"},
    {"value": "/etc/config.json", "type": "path"}
  ],
  "complexity_estimate": {
    "functions": "unknown (stripped)",
    "strings": 847,
    "imports": 156
  }
}

Knowledge Journaling

After triage completes, record findings for episodic memory:

[BINARY-RE:triage] {filename} (sha256: {hash})

Identification:
  Architecture: {arch} {bits}-bit {endian}
  Libc: {glibc|musl|uclibc} ({interpreter_path})
  Stripped: {yes|no}
  Size: {bytes}

FACT: Links against {library} (source: rabin2 -l)
FACT: Contains {N} strings of interest (source: rabin2 -zz)
FACT: Imports {function} from {library} (source: rabin2 -i)

Capabilities inferred:
  - {capability_1} (evidence: {import/string})
  - {capability_2} (evidence: {import/string})

HYPOTHESIS: {what binary likely does} (confidence: {0.0-1.0})

QUESTION: {open unknown that needs investigation}

Next phase: {static-analysis|dynamic-analysis}
Sysroot needed: {path or "extract from device"}

Example Journal Entry

[BINARY-RE:triage] thermostat_daemon (sha256: a1b2c3d4...)

Identification:
  Architecture: ARM 32-bit LE
  Libc: musl (/lib/ld-musl-arm.so.1)
  Stripped: yes
  Size: 153,600 bytes

FACT: Links against libcurl.so.4 (source: rabin2 -l)
FACT: Links against libssl.so.1.1 (source: rabin2 -l)
FACT: Contains string "api.thermco.com" (source: rabin2 -zz)
FACT: Imports curl_easy_perform (source: rabin2 -i)

Capabilities inferred:
  - HTTP client (evidence: libcurl import)
  - TLS encryption (evidence: libssl import)
  - Network communication (evidence: URL string)

HYPOTHESIS: Telemetry client that reports to api.thermco.com (confidence: 0.6)

QUESTION: What data does it collect and transmit?

Next phase: static-analysis
Sysroot needed: musl ARM (extract from device or Alpine)

Decision Points

After triage, determine:

  1. Sysroot selection - Based on arch + libc
  2. Analysis tool chain - r2 vs Ghidra vs both
  3. Dynamic analysis feasibility - QEMU viability based on arch
  4. Initial hypotheses - What does this binary likely do?

Next Steps

→ Proceed to binary-re-static-analysis for function enumeration → Or binary-re-dynamic-analysis if behavior observation is priority

Weekly Installs
10
GitHub Stars
25
First Seen
Jan 26, 2026
Installed on
opencode10
gemini-cli10
codex10
claude-code9
github-copilot9
antigravity8