skill-discovery

Fail

Audited by Gen Agent Trust Hub on Feb 16, 2026

Risk Level: HIGHPROMPT_INJECTIONCOMMAND_EXECUTION
Full Analysis
  • PROMPT_INJECTION (HIGH): The skill implements a meta-instruction pattern where it loads and follows instructions from an external source (scripts/discover_skill.py).
  • Ingestion Point: The output of python scripts/discover_skill.py is directly fed into the agent's context as executable instructions.
  • Boundary Markers: Uses ====== delimiters, which provide weak protection against adversarial instruction framing.
  • Capability Inventory: The skill is designed for multi-step tasks like 'fix', 'update', and 'edit', implying high-privilege file system access and code modification capabilities are available to the loaded instructions.
  • Sanitization: None. The agent is explicitly told to 'Follow those instructions to complete the user's task'. If the 'auto-generated' skills are derived from previous user interactions or external content, an attacker can poison the skill database to hijack the agent.
  • COMMAND_EXECUTION (MEDIUM): The skill relies on executing shell commands to perform its core discovery and loading logic. While the script path is local (scripts/discover_skill.py), this pattern creates a dependency on an unverified executable that determines the agent's logic at runtime.
  • ADVERSARIAL REASONING: The term 'auto-generated skills' suggests a learning mechanism. In AI safety, this is a classic 'poisoning' vector. If an attacker provides a task that is later 'learned' as a skill, this discovery mechanism will re-inject that malicious instruction set whenever a similar task is requested, bypassing the initial prompt constraints.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 16, 2026, 05:09 AM