Retrospective Validation

Fail

Audited by Gen Agent Trust Hub on Feb 16, 2026

Risk Level: HIGHPROMPT_INJECTIONCOMMAND_EXECUTIONNO_CODE
Full Analysis
  • [PROMPT_INJECTION] (HIGH): The skill is susceptible to Indirect Prompt Injection. It is designed to ingest and process large volumes of untrusted data, specifically CI logs, error logs, and session histories (e.g., ci-logs/*.txt, errors.log, session.jsonl). An attacker could inject malicious instructions into these logs via a failing test case or error message.
  • Ingestion points: ci-logs/*.txt, errors.log, and session.jsonl (referenced in SKILL.md and reference/process.md).
  • Boundary markers: None present. The agent is instructed to process raw log content without delimiters or instruction to ignore embedded commands.
  • Capability inventory: Bash, Read, Grep, Glob, and jq (referenced in SKILL.md).
  • Sanitization: Absent. The agent is directed to apply pattern matching and classification rules directly to the raw content of the logs.
  • [NO_CODE] (MEDIUM): The skill documentation and examples reference several scripts that are missing from the package, including validate-path.sh, check-file-size.sh, check-read-before-write.sh, calculate-confidence.sh, and validate-methodology.sh. This creates a dependency on unverified external code and may cause the agent to attempt to execute non-existent or untrusted local files.
  • [COMMAND_EXECUTION] (LOW): The skill frequently utilizes Bash for data processing and statistical analysis. While the provided examples (e.g., grep, jq, wc) are standard, the use of Bash increases the potential impact if the agent is compromised via indirect prompt injection.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 16, 2026, 05:14 AM