claude-realignment

Pass

Audited by Gen Agent Trust Hub on Mar 2, 2026

Risk Level: SAFE
Full Analysis
  • [PROMPT_INJECTION]: The skill metadata contains activation triggers based on user sentiment (e.g., shouting, frustration). These are functional directives for the agent to initiate the realignment procedure and do not constitute a bypass of safety guidelines or an attempt to override core instructions.
  • [DATA_EXPOSURE]: The skill operates on the provided conversation history to perform its diagnosis. It does not access sensitive system paths, environment variables, or credentials. It suggests modifications to local configuration files (CLAUDE.md) as a remediation step, which is a standard method for agent behavior refinement.
  • [COMMAND_EXECUTION]: There are no instances of shell command execution, subprocess spawning, or arbitrary code evaluation. The analysis procedure relies on internal reasoning ("ultrathink") rather than external scripts.
  • [REMOTE_CODE_EXECUTION]: No remote downloads or network-based script executions are present in the skill definition.
Audit Metadata
Risk Level
SAFE
Analyzed
Mar 2, 2026, 10:13 PM