reverse-engineer-rpi
/reverse-engineer-rpi
Reverse-engineer a product into a mechanically verifiable feature inventory + registry + spec set, with optional security-audit artifacts and validation gates.
Hard Guardrails (MANDATORY)
- Only operate on code/binaries you own or have explicit written authorization to analyze.
- Do not provide steps to bypass protections/ToS or to extract proprietary source code/system prompts from third-party products.
- Do not output reconstructed proprietary source or embedded prompts from binaries (index only; redact in reports).
- Redact secrets/tokens/keys if encountered; run the secret-scan gate over outputs.
- Always keep three kinds of evidence separate: what the docs say, what the code proves, and what lives in the hosted/control plane.
One-Command Example
python3 skills/reverse-engineer-rpi/scripts/reverse_engineer_rpi.py ao \
--authorized \
--mode=binary \
--binary-path="$(command -v ao)" \
--output-dir=".agents/research/ao/"
If you do not have explicit written authorization to analyze that binary, do not run the above. Use the included demo fixture instead (see Self-Test below).
Repo-only example (no binary required):
python3 skills/reverse-engineer-rpi/scripts/reverse_engineer_rpi.py cc-sdd \
--mode=repo \
--upstream-repo="https://github.com/gotalab/cc-sdd.git" \
--output-dir=".agents/research/cc-sdd/"
Pinned clone (reproducible):
python3 skills/reverse-engineer-rpi/scripts/reverse_engineer_rpi.py cc-sdd \
--mode=repo \
--upstream-repo="https://github.com/gotalab/cc-sdd.git" \
--upstream-ref=v1.0.0 \
--output-dir=".agents/research/cc-sdd/"
Invocation Contract
Required:
- product_name
Optional:
- --docs-sitemap-url (recommended when available; supports https://... and file:///...)
- --docs-features-prefix (default: docs/features/)
- --upstream-repo (optional)
- --upstream-ref (pin clone to a specific commit, tag, or branch; records resolved SHA in clone-metadata.json)
- --local-clone-dir (default: .tmp/<product_name>)
- --output-dir (default: .agents/research/<product_name>/)
- --mode (default: repo; allowed: repo|binary|both)
- --binary-path (required if --mode includes binary)
- --no-materialize-archives (authorized-only; binary mode extracts embedded ZIPs by default; this disables extraction and keeps index-only)
Security audit flags (optional):
- --security-audit (enables security artifacts + gates)
- --sbom (generate SBOM + dependency risk report where possible; may no-op with a note)
- --fuzz (only if a safe harness exists; timeboxed)
Mandatory guardrail flag:
- --authorized (required for binary mode; refuses to run binary analysis without it)
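The flag rules above amount to a small precondition check. The sketch below is illustrative only and is not taken from reverse_engineer_rpi.py; the function name check_invocation and its message strings are hypothetical.

```python
from typing import List, Optional

ALLOWED_MODES = ("repo", "binary", "both")


def check_invocation(mode: str = "repo",
                     authorized: bool = False,
                     binary_path: Optional[str] = None) -> List[str]:
    """Return a list of contract violations (empty means the call is acceptable)."""
    problems: List[str] = []
    if mode not in ALLOWED_MODES:
        problems.append(f"--mode must be one of {'|'.join(ALLOWED_MODES)}")
        return problems
    # Both "binary" and "both" touch the binary, so both need the guardrail flag.
    uses_binary = mode in ("binary", "both")
    if uses_binary and not authorized:
        problems.append("--authorized is required for binary analysis")
    if uses_binary and not binary_path:
        problems.append("--binary-path is required when --mode includes binary")
    return problems
```

Under these assumptions, check_invocation(mode="binary") reports two problems, while the default repo-mode call reports none.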
Upstream Ref Pinning (--upstream-ref)
Use --upstream-ref to pin a repo-mode clone to a specific commit, tag, or branch. This makes analysis reproducible and allows golden fixtures to be diffed against a known baseline.
# Pin to a tag (reproducible)
python3 skills/reverse-engineer-rpi/scripts/reverse_engineer_rpi.py cc-sdd \
--mode=repo \
--upstream-repo="https://github.com/gotalab/cc-sdd.git" \
--upstream-ref=v1.0.0 \
--output-dir=".agents/research/cc-sdd/"
# Pin to a specific commit SHA
python3 skills/reverse-engineer-rpi/scripts/reverse_engineer_rpi.py cc-sdd \
--mode=repo \
--upstream-repo="https://github.com/gotalab/cc-sdd.git" \
--upstream-ref=abc1234 \
--output-dir=".agents/research/cc-sdd/"
When --upstream-ref is provided:
- The clone is fetched with git fetch --depth=1 origin <ref> and checked out to FETCH_HEAD.
- The resolved commit SHA is recorded in output_dir/clone-metadata.json for traceability.
- Without --upstream-ref, a --depth=1 shallow clone of the default branch HEAD is used instead.
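The two clone paths described above can be sketched as a function that only builds the git command lines, without running them. The fetch and checkout steps follow the text; the git init and git remote add steps, the final git rev-parse HEAD, and the name clone_commands are assumptions made for illustration, not the script's confirmed behavior.

```python
from typing import List, Optional


def clone_commands(repo_url: str, dest: str, ref: Optional[str] = None) -> List[List[str]]:
    """Build the git command lines for a shallow, optionally pinned clone."""
    if ref is None:
        # No pin: shallow clone of the default branch HEAD.
        cmds = [["git", "clone", "--depth=1", repo_url, dest]]
    else:
        # Pinned: fetch only the requested ref, then check out FETCH_HEAD.
        cmds = [
            ["git", "init", dest],  # assumed setup step
            ["git", "-C", dest, "remote", "add", "origin", repo_url],  # assumed setup step
            ["git", "-C", dest, "fetch", "--depth=1", "origin", ref],
            ["git", "-C", dest, "checkout", "FETCH_HEAD"],
        ]
    # Resolve the full commit SHA so it can be recorded for traceability.
    cmds.append(["git", "-C", dest, "rev-parse", "HEAD"])
    return cmds
```

Each inner list can be passed to subprocess.run(cmd, check=True). Fetching a raw commit this way generally needs the full SHA and a server that permits it, so tags are usually the more portable pin.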
clone-metadata.json schema:
{
"upstream_repo": "https://github.com/gotalab/cc-sdd.git",
"upstream_ref": "v1.0.0",
"resolved_commit": "<full SHA>",
"clone_date": "YYYY-MM-DD"
}
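A fixture's clone-metadata.json can be sanity-checked against the schema above before it is committed. This is a minimal sketch: treating all four keys as required, and the resolved commit as a 40-character lowercase hex SHA-1, are assumptions, and validate_clone_metadata is a hypothetical helper rather than part of the skill.

```python
import json
import re

REQUIRED_KEYS = ("upstream_repo", "upstream_ref", "resolved_commit", "clone_date")


def validate_clone_metadata(text: str) -> list:
    """Return a list of problems found in a clone-metadata.json document."""
    data = json.loads(text)
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in data]
    # Assumption: a full SHA-1 commit id, i.e. 40 lowercase hex characters.
    if "resolved_commit" in data and not re.fullmatch(r"[0-9a-f]{40}", str(data["resolved_commit"])):
        problems.append("resolved_commit is not a 40-character hex SHA")
    # The schema shows the date as YYYY-MM-DD; only the shape is checked here.
    if "clone_date" in data and not re.fullmatch(r"\d{4}-\d{2}-\d{2}", str(data["clone_date"])):
        problems.append("clone_date is not YYYY-MM-DD")
    return problems
```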
Contract JSON Outputs (output_dir/contracts/)
Repo-mode analysis writes machine-checkable contract JSON under output_dir/contracts/. These files use only relative paths, sorted lists, and stable keys (no absolute paths, no run-specific timestamps), so they can be committed as golden fixtures and diffed across runs.
Primary contract file: output_dir/contracts/repo-contract.json
This file captures the mechanically-extracted CLI, config/env, and artifact surface of the target product. Fields include:
| Field | Description |
|---|---|
| schema_version | Integer; increment when field semantics change |
| product_name | Product name as supplied to the script |
| upstream_commit | Git HEAD SHA of the analysis root (when a git repo) |
| cli | CLI surface: bin map, help_text, framework, language |
| config_env | Config file path and env vars with per-var file evidence (relative paths, sorted) |
| artifact_io | Manifest inventory and template file hashes (from artifact-registry.json) |
| schema_files | Discovered schema-like files (JSON Schema, OpenAPI, protobuf, etc.) |
Example repo-contract.json structure:
{
"schema_version": 1,
"product_name": "cc-sdd",
"upstream_commit": "abc1234...",
"cli": {
"language": "node",
"framework": null,
"bin": { "cc-sdd": "dist/cli.js" },
"help_text": "Usage: cc-sdd [options] ..."
},
"config_env": {
"config_file": ".cc-sdd/config.json",
"env_vars": [
{ "name": "CC_SDD_TOKEN", "evidence": ["src/config.ts"] }
]
},
"artifact_io": {
"manifests": ["templates/manifests/default.json"],
"template_files": 12
},
"schema_files": []
}
Note: output_dir/contracts/ is written by --mode=repo (or --mode=both). Binary-mode outputs (binary-analysis.md, binary-symbols.txt, etc.) remain directly under output_dir/.
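The diff-stability properties claimed for contract files (relative paths, sorted lists, stable keys) can be spot-checked mechanically. The sketch below is a heuristic under stated assumptions: it treats any string that parses as an absolute POSIX or Windows path as a violation, only checks ordering for lists made entirely of strings, and uses the hypothetical names contract_problems and canonical_json.

```python
import json
from pathlib import PurePosixPath, PureWindowsPath


def contract_problems(node, path="$"):
    """Walk a decoded contract and report absolute paths and unsorted string lists."""
    problems = []
    if isinstance(node, dict):
        for key, value in node.items():
            problems += contract_problems(value, f"{path}.{key}")
    elif isinstance(node, list):
        # Only lists of plain strings have an obvious sort order to verify.
        if all(isinstance(v, str) for v in node) and node != sorted(node):
            problems.append(f"{path}: list is not sorted")
        for i, value in enumerate(node):
            problems += contract_problems(value, f"{path}[{i}]")
    elif isinstance(node, str):
        if PurePosixPath(node).is_absolute() or PureWindowsPath(node).is_absolute():
            problems.append(f"{path}: absolute path {node!r}")
    return problems


def canonical_json(contract) -> str:
    """Serialize with sorted keys so equal contracts produce identical text."""
    return json.dumps(contract, indent=2, sort_keys=True) + "\n"
```

Run against the example structure above, contract_problems returns an empty list; a value such as "/home/user/src/config.ts" in an evidence list would be reported.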
Fixture Test Workflow
Golden fixtures allow regression detection: commit a known-good contracts/ snapshot alongside the pinned clone-metadata.json, then diff future runs against it.
Running Fixture Tests
bash skills/reverse-engineer-rpi/scripts/repo_fixture_test.sh
This script (implemented in ag-w77.3):
- Reads skills/reverse-engineer-rpi/fixtures/cc-sdd/clone-metadata.json to determine the pinned upstream ref.
- Runs reverse_engineer_rpi.py in repo mode with that ref into a temp output dir.
- Diffs the generated contracts/repo-contract.json against the committed golden fixture.
- Exits 0 if they match; exits non-zero with a unified diff if they drift.
The test requires network access to clone the upstream repo.
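The comparison step can be approximated in Python with difflib. This is a sketch of the idea, not the contents of repo_fixture_test.sh: re-serializing both sides before diffing is an added assumption (the real script may diff the files byte for byte), and contract_diff and the golden/ and generated/ labels are hypothetical.

```python
import difflib
import json


def contract_diff(golden_text: str, generated_text: str) -> str:
    """Return a unified diff between two contract JSON documents ('' if equal)."""

    def normalize(text):
        # Re-serialize so whitespace or key-order differences do not count as drift.
        rendered = json.dumps(json.loads(text), indent=2, sort_keys=True) + "\n"
        return rendered.splitlines(keepends=True)

    diff = difflib.unified_diff(
        normalize(golden_text),
        normalize(generated_text),
        fromfile="golden/repo-contract.json",
        tofile="generated/repo-contract.json",
    )
    return "".join(diff)
```

A caller would print the returned text and exit non-zero whenever it is non-empty, mirroring the exit-code behavior described above.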
Updating Fixtures
When contracts legitimately change (new flags, new env vars, schema bumps), update the golden fixtures:
# 1. Re-run with the pinned ref to generate fresh contracts
python3 skills/reverse-engineer-rpi/scripts/reverse_engineer_rpi.py cc-sdd \
--mode=repo \
--upstream-repo="https://github.com/gotalab/cc-sdd.git" \
--upstream-ref=<new-tag-or-sha> \
--output-dir=".tmp/cc-sdd-refresh/"
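# 2. Copy contracts into the fixture directory
#    (the trailing /. copies the directory contents instead of nesting contracts/contracts)
mkdir -p skills/reverse-engineer-rpi/fixtures/cc-sdd/contracts/
cp -r .tmp/cc-sdd-refresh/contracts/. \
skills/reverse-engineer-rpi/fixtures/cc-sdd/contracts/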
# 3. Update the pinned clone metadata
cp .tmp/cc-sdd-refresh/clone-metadata.json \
skills/reverse-engineer-rpi/fixtures/cc-sdd/clone-metadata.json
# 4. Commit the updated fixtures
git add skills/reverse-engineer-rpi/fixtures/cc-sdd/
git commit -m "fix(reverse-engineer-rpi): update cc-sdd golden fixtures to <new-tag-or-sha>"
Fixture files that must be committed for the test to pass:
- skills/reverse-engineer-rpi/fixtures/cc-sdd/clone-metadata.json
- skills/reverse-engineer-rpi/fixtures/cc-sdd/contracts/repo-contract.json
Script-Driven Workflow
Run:
python3 skills/reverse-engineer-rpi/scripts/reverse_engineer_rpi.py <product_name> --authorized [flags...]
This generates the required outputs under output_dir/ and (when applicable) .agents/council/ and .agents/learnings/.
Outputs (MUST be generated)
Core outputs under output_dir/:
- feature-inventory.md
- feature-registry.yaml
- validate-feature-registry.py
- feature-catalog.md
- spec-architecture.md
- spec-code-map.md
- spec-cli-surface.md (Node, Python, or Go CLI detected; otherwise a note is written to spec-code-map.md)
- spec-clone-vs-use.md
- spec-clone-mvp.md (original MVP spec; do not copy from target)
- clone-metadata.json (when --upstream-repo is used; records resolved commit SHA)
Binary-mode extras:
- binary-analysis.md (best-effort summary)
- binary-embedded-archives.md (index only; no dumps)
Repo-mode extras:
- spec-artifact-surface.md (best-effort; template/manifest driven install surface)
- artifact-registry.json (best-effort; hashed template inventory when manifests/templates exist)
If --security-audit, also create output_dir/security/:
- threat-model.md
- attack-surface.md
- dataflow.md
- crypto-review.md
- authn-authz.md
- findings.md
- reproducibility.md
- validate-security-audit.sh
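Because these outputs are mandatory, a run can be checked by listing which expected files are absent. The sketch below is a hypothetical helper (missing_outputs is not part of the skill). It covers only the unconditional core files plus the security set, and deliberately leaves out conditional outputs such as spec-cli-surface.md, clone-metadata.json, and the mode-specific extras.

```python
from pathlib import Path

# Unconditional core outputs, copied from the list above.
CORE_OUTPUTS = [
    "feature-inventory.md",
    "feature-registry.yaml",
    "validate-feature-registry.py",
    "feature-catalog.md",
    "spec-architecture.md",
    "spec-code-map.md",
    "spec-clone-vs-use.md",
    "spec-clone-mvp.md",
]
# Files expected under output_dir/security/ when --security-audit is used.
SECURITY_OUTPUTS = [
    "threat-model.md", "attack-surface.md", "dataflow.md", "crypto-review.md",
    "authn-authz.md", "findings.md", "reproducibility.md", "validate-security-audit.sh",
]


def missing_outputs(output_dir, security_audit=False):
    """Return the expected output files (relative paths) that do not exist."""
    root = Path(output_dir)
    expected = list(CORE_OUTPUTS)
    if security_audit:
        expected += [f"security/{name}" for name in SECURITY_OUTPUTS]
    return [rel for rel in expected if not (root / rel).is_file()]
```

For example, missing_outputs(".agents/research/cc-sdd/", security_audit=True) returns an empty list only when every listed file is present.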
Self-Test (Acceptance Criteria)
End-to-end fixture (safe, owned demo binary with embedded ZIP):
bash skills/reverse-engineer-rpi/scripts/self_test.sh
This must show:
- feature inventory generated
- registry generated
- registry validator exits 0
- in security mode: validate-security-audit.sh exits 0 and secret scan passes