self-improvement

Fail

Audited by Gen Agent Trust Hub on Feb 16, 2026

Risk Level: HIGHPROMPT_INJECTIONCOMMAND_EXECUTION
Full Analysis
  • [PROMPT_INJECTION] (HIGH): Vulnerable to Indirect Prompt Injection through untrusted data ingestion.
  • Ingestion Points: GitHub PR descriptions, review comments, and issue reports (via learn-from-pr.md and self-improve.md).
  • Boundary Markers: Absent. The instructions do not provide delimiters or warnings to ignore embedded instructions within processed data.
  • Capability Inventory: The skill is explicitly designed to modify or create files in .cursor/rules/, which serve as the agent's core instructions.
  • Sanitization: Absent. There is no logic mentioned for filtering or sanitizing malicious instructions before they are converted into rules.
  • [COMMAND_EXECUTION] (MEDIUM): Automated modification of persistence files.
  • Evidence: The skill automates the creation and updating of .mdc files in .cursor/rules/. Modifying instruction files is a high-privilege action that persists behavior changes across all future agent sessions.
  • [EXTERNAL_DOWNLOADS] (LOW): Use of external fetch tools.
  • Evidence: Uses mcp_web_fetch and GitHub API to retrieve external content. While necessary for the stated purpose, it establishes a data ingestion path from potentially attacker-controlled sources.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 16, 2026, 05:01 AM