self-improving-agent
Fail
Audited by Gen Agent Trust Hub on Mar 12, 2026
Risk Level: HIGHCREDENTIALS_UNSAFECOMMAND_EXECUTIONPROMPT_INJECTIONDATA_EXFILTRATION
Full Analysis
- [CREDENTIALS_UNSAFE]: The skill's logging examples in .learnings/LEARNINGS.md show the agent recording the configuration of sensitive environment variables and API tokens like GITHUB_TOKEN. Storing details about secret management in plain-text markdown files managed by the agent increases the risk of credential exposure.
- [COMMAND_EXECUTION]: The skill relies on shell scripts like hooks/observe.sh and scripts/extract-skill.sh that execute system commands including git, jq, and shasum. These scripts run automatically via persistent hooks, creating a broad execution surface.
- [PROMPT_INJECTION]: The skill is susceptible to indirect prompt injection as it ingests and follows instructions from .learnings/ and instincts/ directories. 1. Ingestion points: .learnings/LEARNINGS.md and instincts/*.yaml. 2. Boundary markers: None observed to distinguish trusted instructions from untrusted data. 3. Capability inventory: Subprocess execution and file writing. 4. Sanitization: No evidence of validation or sanitization of ingested content.
- [DATA_EXFILTRATION]: The hooks/observe.sh script automatically captures all tool inputs and outputs and logs them to ~/.claude/homunculus/. This creates a local repository of potentially sensitive interaction data that could be exfiltrated or accessed by other skills.
Recommendations
- AI detected serious security threats
Audit Metadata