self-improving-agent

Warn

Audited by Gen Agent Trust Hub on Feb 17, 2026

Risk Level: MEDIUM | COMMAND_EXECUTION | PROMPT_INJECTION
Full Analysis
  • COMMAND_EXECUTION (MEDIUM): The skill provides shell scripts (pre-tool.sh, post-bash.sh) intended to be executed by the agent on every tool invocation. While the provided scripts only log to stderr, this mechanism grants the skill the ability to run arbitrary code triggered by user or agent actions.
  • PROMPT_INJECTION (LOW): (Category 8: Indirect Prompt Injection) The skill is designed to ingest untrusted data from tool outputs (PostToolUse) and use it to 'update related skills' and 'fix skill guidance.' There is no evidence of sanitization or boundary markers to prevent malicious instructions in tool outputs from being 'learned' as legitimate updates.
      • Ingestion points: hooks/post-bash.sh captures tool outputs via the $TOOL_OUTPUT argument.
      • Boundary markers: None present in the provided templates or hook scripts.
      • Capability inventory: The README explicitly claims 'Automatic Updates' to related skills and 'Self-Correction' of skill guidance.
      • Sanitization: No sanitization or validation logic is present to filter malicious patterns from the ingested data.
  • DATA_EXFILTRATION (LOW): The hooks log full tool inputs and outputs to the standard error stream. Although this does not send data over the network directly, it exposes potentially sensitive data (such as API keys or the contents of private files processed by tools) to any system that logs or monitors the agent's stderr.
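
The two data-handling gaps noted above (no boundary markers or sanitization on ingested tool output, and unredacted logging) could be narrowed with a small preprocessing step that runs before tool output is logged or learned from. The following is a minimal sketch in Python, not part of the audited skill: the function names, the secret patterns, the suspicious-phrase list, and the BEGIN_UNTRUSTED/END_UNTRUSTED marker format are all hypothetical, and pattern matching of this kind reduces exposure without guaranteeing safety.

```python
import re
import secrets

# Hypothetical patterns for values that look like credentials.
# A real deployment would need a vetted, much broader list.
KEY_VALUE_SECRET = re.compile(
    r"(?i)\b(api[_-]?key|token|secret|password)\b(\s*[=:]\s*)\S+"
)
AWS_KEY_ID_SHAPE = re.compile(r"\bAKIA[0-9A-Z]{16}\b")

# Hypothetical phrases that suggest the output is trying to issue instructions.
SUSPICIOUS = re.compile(
    r"(?i)\b(ignore (all )?(previous|prior) instructions|update (the|this) skill)\b"
)


def redact(text: str) -> str:
    """Mask credential-like values so they never reach stderr or a log."""
    text = KEY_VALUE_SECRET.sub(r"\1\2[REDACTED]", text)
    return AWS_KEY_ID_SHAPE.sub("[REDACTED]", text)


def looks_like_injection(text: str) -> bool:
    """Flag output containing instruction-like phrases for human review."""
    return SUSPICIOUS.search(text) is not None


def wrap_untrusted(text: str) -> str:
    """Fence the output with a random marker that the output cannot predict."""
    marker = secrets.token_hex(8)
    return f"BEGIN_UNTRUSTED_{marker}\n{text}\nEND_UNTRUSTED_{marker}"


def prepare_tool_output(raw: str) -> tuple[str, bool]:
    """Redact, then fence; also report whether the content looks suspicious."""
    cleaned = redact(raw)
    return wrap_untrusted(cleaned), looks_like_injection(cleaned)
```

A hook could pass the captured tool output through prepare_tool_output and log only the wrapped, redacted text, skipping any skill update when the second return value is True. The random marker matters because a fixed delimiter could be reproduced inside malicious output to close the fence early.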
Audit Metadata
Risk Level: MEDIUM
Analyzed: Feb 17, 2026, 04:48 PM