prompt-classifier
Fail
Audited by Gen Agent Trust Hub on Feb 16, 2026
Risk Level: HIGHPROMPT_INJECTIONCOMMAND_EXECUTION
Full Analysis
- Indirect Prompt Injection (HIGH): The skill's primary function is to ingest untrusted data (prompts) and perform file-write operations. Ingestion points: User-provided prompts during the classification and saving workflow. Boundary markers: None identified; untrusted content is directly placed into markdown templates. Capability inventory: File creation, directory management, and index updating within the prompts/ directory. Sanitization: None specified. This allows an attacker to include instructions within a prompt that the agent might mistakenly execute instead of simply saving.
- Command Execution (MEDIUM): The skill includes shell-style command blocks for batch operations (e.g., 'glm prompts import'). If the agent environment supports this CLI, it provides a mechanism for system-level interaction using user-provided paths that lack validation.
- Path Traversal (MEDIUM): The 'Automatic File Naming' rule generates filenames using a 'functional description' provided by the user. If this input is not sanitized, an attacker could use directory traversal sequences like '../' to attempt to write or overwrite files outside the intended storage directory.
Recommendations
- AI detected serious security threats
Audit Metadata