rigorous-reasoning
Pass
Audited by Gen Agent Trust Hub on Mar 5, 2026
Risk Level: SAFEPROMPT_INJECTIONNO_CODE
Full Analysis
- [PROMPT_INJECTION]: The skill uses directive language to explicitly override default agent behaviors. Evidence includes phrases such as 'CRITICAL: This skill overrides conversational tendencies toward politeness' and 'priority signals... these principles override default conversational tendencies' in SKILL.md. While these instructions are intended to enforce the skill's anti-sycophancy reasoning model, they utilize patterns commonly used to bypass behavioral constraints.
- [PROMPT_INJECTION]: The skill establishes an indirect prompt injection surface by processing untrusted user data through its reasoning modules. Evidence chain: (1) Ingestion points: The conflict-analysis.md and debate-methodology.md modules process user-provided descriptions of conflicts and arguments. (2) Boundary markers: No explicit delimiters or instructions to ignore embedded commands are defined for the input data. (3) Capability inventory: The skill is restricted to text generation and reasoning; it does not utilize subprocesses, network tools, or file-writing capabilities. (4) Sanitization: No validation or escaping of external content is implemented.
- [NO_CODE]: The skill does not include any executable scripts, binaries, or configuration files that trigger code execution. All logic and protocols are contained within Markdown files.
Audit Metadata