The Agent Skills Directory

PROMPT_INJECTION (LOW): The skill is susceptible to indirect prompt injection through the test scenario files it processes.
Ingestion points: The skill ingests untrusted data from tests/scenarios.md located within target skill directories.
Boundary markers: The prompt template used in Step 2 interpolates {evaluation_query} directly, with no delimiters around it and no instruction to ignore embedded commands.
Capability inventory: The skill has the capability to spawn sub-agents that execute code, read files via Glob/Read, and modify local README files.
Sanitization: There is no evidence that the queries or expected behaviors parsed from the scenario files are sanitized or validated.
SAFE (SAFE): No instances of data exfiltration, hardcoded credentials, obfuscation, or persistence mechanisms were detected. The skill's functionality is consistent with its stated purpose of evaluating AI model performance.

evaluating-skills-with-models