gpt2-codegolf
Pass
Audited by Gen Agent Trust Hub on Feb 16, 2026
Risk Level: LOW (COMMAND_EXECUTION)
Full Analysis
- [COMMAND_EXECUTION] (LOW): The skill includes shell commands (hexdump, cat, xxd, head) to inspect local files such as model.ckpt.index and vocab.bpe. These are used for diagnostic purposes but involve executing subprocesses.
- [Indirect Prompt Injection] (LOW): The skill identifies external data sources (checkpoint files and BPE vocabularies) as ingestion points.
  - Ingestion points: model.ckpt.index, model.ckpt.data-00000-of-00001, and vocab.bpe.
  - Boundary markers: Absent; the skill does not suggest delimiters or warnings for embedded instructions within these data files.
  - Capability inventory: The skill uses shell diagnostics (hexdump, cat) and manual binary parsing in C.
  - Sanitization: Absent; the provided C snippets for weight summation and tokenization do not include explicit bounds checking or input validation.
  - Risk: Maliciously crafted binary files could potentially trigger logic errors, though the impact is limited to the agent's internal reasoning/state in this context.
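
To illustrate the kind of bounds checking the sanitization finding refers to, here is a minimal sketch in C of a weight summation that validates its inputs before reading. The function name checked_sum_f32, its signature, and the assumption that weights are stored as native-endian 32-bit floats in an in-memory buffer are hypothetical and are not taken from the skill's own snippets.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from the audited skill): sums `count` 32-bit
 * floats starting at byte `offset` inside a buffer of `buf_len` bytes.
 * Assumes the floats are stored in the host's native byte order.
 * Returns 0 on success and writes the sum to *out; returns -1 if the
 * requested range does not fit inside the buffer.
 */
int checked_sum_f32(const unsigned char *buf, size_t buf_len,
                    size_t offset, size_t count, double *out)
{
    if (buf == NULL || out == NULL)
        return -1;

    /* Reject offsets that point past the end of the buffer. */
    if (offset > buf_len)
        return -1;

    /* Overflow-safe range check: compare against the number of whole
     * floats that remain, instead of computing offset + count * 4. */
    if (count > (buf_len - offset) / sizeof(float))
        return -1;

    double sum = 0.0;
    for (size_t i = 0; i < count; i++) {
        float v;
        /* memcpy avoids unaligned reads and strict-aliasing problems. */
        memcpy(&v, buf + offset + i * sizeof(float), sizeof v);
        sum += v;
    }

    *out = sum;
    return 0;
}
```

A caller would load the checkpoint data into a buffer, pass the offset and element count read from the index, and treat a return value of -1 as a malformed or truncated file instead of proceeding with the read.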
Audit Metadata