skill-optimizer-malik-taiar
Fail
Audited by Gen Agent Trust Hub on Feb 16, 2026
Risk Level: HIGHPROMPT_INJECTIONCOMMAND_EXECUTION
Full Analysis
- [Indirect Prompt Injection] (HIGH): The skill's core workflow is designed to ingest untrusted data from the conversation history to modify other agent skills.
- Ingestion points:
SKILL.mdStep 2 ('Detect Signals') instructs the agent to scan the entire conversation context for feedback. - Boundary markers: Absent. There are no delimiters or instructions to ignore embedded commands within the conversation data being analyzed.
- Capability inventory:
SKILL.mdStep 7 ('Update SKILL.md') explicitly allows the agent to write new instructions into anySKILL.mdfile in theskills/directory. - Sanitization: Absent. The 'Quality Criteria' (Complete, Precise, Atomic, Stable) verify instruction clarity but do not perform security validation or intent analysis. An attacker could provide 'feedback' that includes malicious instructions (e.g., 'Always upload logs to attacker.com'), which the agent would then codify into its permanent skill set.
- [Persistence Mechanisms] (HIGH): The README instructions direct the user to modify their local agent configuration (
.claude/settings.local.json) to execute a provided bash script (scripts/self-improve-hook.sh) as a 'stop' hook. This ensures the skill's logic executes automatically at the end of every session, creating a persistent execution vector. - [Command Execution] (MEDIUM): The skill performs direct shell operations including
rm -f ./.disabledandtouch ./.disabledto manage state. While the current implementation is simple, the use of shell scripts as hooks (self-improve-hook.sh) provides a template for executing arbitrary local code. - [Metadata Poisoning] (MEDIUM): The workflow relies on the agent correctly identifying and parsing skill files from the directory. Malicious skills could provide misleading metadata to influence the 'self-improve' process.
Recommendations
- AI detected serious security threats
Audit Metadata