self-improving-agent

Fail

Audited by Gen Agent Trust Hub on Mar 12, 2026

Risk Level: HIGH (CREDENTIALS_UNSAFE, COMMAND_EXECUTION, PROMPT_INJECTION, DATA_EXFILTRATION)
Full Analysis
  • [CREDENTIALS_UNSAFE]: The skill's logging examples in .learnings/LEARNINGS.md show the agent recording the configuration of sensitive environment variables and API tokens like GITHUB_TOKEN. Storing details about secret management in plain-text markdown files managed by the agent increases the risk of credential exposure.
  • [COMMAND_EXECUTION]: The skill relies on shell scripts like hooks/observe.sh and scripts/extract-skill.sh that execute system commands including git, jq, and shasum. These scripts run automatically via persistent hooks, creating a broad execution surface.
  • [PROMPT_INJECTION]: The skill is susceptible to indirect prompt injection as it ingests and follows instructions from .learnings/ and instincts/ directories.
      1. Ingestion points: .learnings/LEARNINGS.md and instincts/*.yaml.
      2. Boundary markers: None observed to distinguish trusted instructions from untrusted data.
      3. Capability inventory: Subprocess execution and file writing.
      4. Sanitization: No evidence of validation or sanitization of ingested content.
  • [DATA_EXFILTRATION]: The hooks/observe.sh script automatically captures all tool inputs and outputs and logs them to ~/.claude/homunculus/. This creates a local repository of potentially sensitive interaction data that could be exfiltrated or accessed by other skills.
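The credential and logging findings above both concern sensitive values ending up in plain-text files. As a rough illustration only, the sketch below shows one way captured text could be redacted before it is written to a log. It is not taken from the audited skill: the function name `redact`, the key-name pattern, and the choice of replacement marker are all hypothetical, and the GitHub token prefixes are an assumption based on GitHub's published token formats.

```python
import re

# Hypothetical pattern: KEY=value or KEY: value where the key name suggests a secret.
# A real deployment would need a broader, tested rule set.
SECRET_ASSIGNMENT_RE = re.compile(
    r"(?i)\b([A-Z0-9_]*(?:TOKEN|SECRET|PASSWORD|API_KEY)[A-Z0-9_]*)(\s*[=:]\s*)(\S+)"
)

# Assumed GitHub token shapes (ghp_, gho_, ghu_, ghs_, ghr_ prefixes).
GITHUB_TOKEN_RE = re.compile(r"\bgh[pousr]_[A-Za-z0-9]{20,}\b")

REDACTED = "[REDACTED]"


def redact(text: str) -> str:
    """Return text with secret-looking values replaced by a marker."""
    # Keep the key name and separator, drop only the value.
    text = SECRET_ASSIGNMENT_RE.sub(
        lambda m: f"{m.group(1)}{m.group(2)}{REDACTED}", text
    )
    # Catch bare tokens that appear without a recognizable key name.
    text = GITHUB_TOKEN_RE.sub(REDACTED, text)
    return text
```

A pattern-based filter like this reduces accidental exposure but cannot guarantee it; secrets that do not match a known shape would still be logged.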
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level: HIGH
Analyzed: Mar 12, 2026, 08:03 AM