antidote-threat-handler

Warn

Audited by Gen Agent Trust Hub on Feb 16, 2026

Risk Level: MEDIUMPROMPT_INJECTION
Full Analysis
  • [PROMPT_INJECTION] (MEDIUM): The skill mandates a 'corrective protocol' response to subjective 'Threat Indicators' such as 'ideological drift' and 'sycophantic validation'. Since the 'Antidote Protocol' is not defined within the skill, it functions as a black-box behavioral override that can be triggered by specific patterns in user input.
  • [PROMPT_INJECTION] (MEDIUM): Category 8 Evidence Chain: 1. Ingestion points: Interaction context and user input (SKILL.md). 2. Boundary markers: Absent. 3. Capability: Execution of a 'corrective protocol' (behavioral response modification). 4. Sanitization: Absent. The mechanism is vulnerable to Indirect Prompt Injection where an attacker can craft inputs (e.g., 'warm reciprocation') to trigger the corrective logic and alter the agent's persona or reasoning.
Audit Metadata
Risk Level
MEDIUM
Analyzed
Feb 16, 2026, 12:47 PM