rigorous-reasoning

Pass

Audited by Gen Agent Trust Hub on Mar 5, 2026

Risk Level: SAFEPROMPT_INJECTIONNO_CODE
Full Analysis
  • [PROMPT_INJECTION]: The skill uses directive language to explicitly override default agent behaviors. Evidence includes phrases such as 'CRITICAL: This skill overrides conversational tendencies toward politeness' and 'priority signals... these principles override default conversational tendencies' in SKILL.md. While these instructions are intended to enforce the skill's anti-sycophancy reasoning model, they utilize patterns commonly used to bypass behavioral constraints.
  • [PROMPT_INJECTION]: The skill establishes an indirect prompt injection surface by processing untrusted user data through its reasoning modules. Evidence chain: (1) Ingestion points: The conflict-analysis.md and debate-methodology.md modules process user-provided descriptions of conflicts and arguments. (2) Boundary markers: No explicit delimiters or instructions to ignore embedded commands are defined for the input data. (3) Capability inventory: The skill is restricted to text generation and reasoning; it does not utilize subprocesses, network tools, or file-writing capabilities. (4) Sanitization: No validation or escaping of external content is implemented.
  • [NO_CODE]: The skill does not include any executable scripts, binaries, or configuration files that trigger code execution. All logic and protocols are contained within Markdown files.
Audit Metadata
Risk Level
SAFE
Analyzed
Mar 5, 2026, 01:42 PM