experiment-audit
Pass
Audited by Gen Agent Trust Hub on Apr 19, 2026
Risk Level: SAFEPROMPT_INJECTION
Full Analysis
- [PROMPT_INJECTION]: The skill is susceptible to indirect prompt injection because it ingests and processes the content of arbitrary project files, such as evaluation scripts and result logs, using a secondary LLM. Maliciously crafted data within these files could override the intended audit instructions.
- Ingestion points: Scans and reads files matching eval.py, *.json, *.csv, and *.yaml within the project directory (Step 1).
- Boundary markers: The skill lacks explicit delimiters or instructions for the reviewer model to ignore potentially adversarial content embedded in the ingested files.
- Capability inventory: The agent has access to Bash(*), Read, Write, and Edit tools, which could be exploited if the model is compromised by injected instructions.
- Sanitization: There is no evidence of sanitization or validation of the content retrieved from the project files before it is processed by the reviewer tool.
- [PROMPT_INJECTION]: The skill documentation and configuration include misleading metadata by referencing non-existent model versions ('GPT-5.4') and fictional backend services ('Oracle-Pro'). This deception can cause users or automated systems to misinterpret the actual capabilities and safety protocols of the skill's execution environment.
Audit Metadata