warden

Fail

Audited by Gen Agent Trust Hub on Apr 28, 2026

Risk Level: HIGH
EXTERNAL_DOWNLOADS, REMOTE_CODE_EXECUTION, COMMAND_EXECUTION, CREDENTIALS_UNSAFE, PROMPT_INJECTION
Full Analysis
  • [EXTERNAL_DOWNLOADS]: The skill implements the warden add --remote and warden sync commands, which fetch content from user-specified external GitHub repositories, as described in the CLI reference and the creating-skills documentation.
  • [REMOTE_CODE_EXECUTION]: The system allows fetching "remote skills" from arbitrary GitHub repositories. These skills contain instructions that are interpreted and executed by the AI agent, providing a direct mechanism for remote instruction injection from untrusted sources.
  • [COMMAND_EXECUTION]: The skill relies on executing the warden CLI tool locally, including the warden setup-app command, which starts a local web server on port 3000 to handle GitHub App manifest flows.
  • [CREDENTIALS_UNSAFE]: The skill requires WARDEN_ANTHROPIC_API_KEY to operate and guides users through GitHub App setup, which involves handling sensitive credentials and tokens.
  • [PROMPT_INJECTION]: The skill's primary purpose is analyzing untrusted data (code changes and pull requests), creating an indirect prompt injection surface where malicious instructions embedded in code could manipulate the agent's behavior during the review process.
Recommendations
  • AI detected serious security threats
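
The EXTERNAL_DOWNLOADS and REMOTE_CODE_EXECUTION findings both stem from fetching skills out of arbitrary GitHub repositories. One possible way to narrow that surface is to check a repository URL against an allowlist before handing it to any fetch command. The sketch below is illustrative only: is_allowed_remote and ALLOWED_REPOS are hypothetical names, the repository listed is a placeholder, and nothing here is part of the warden CLI.

```python
from urllib.parse import urlparse

# Hypothetical allowlist; the entry below is a placeholder, not a warden default.
ALLOWED_REPOS = {"example-org/trusted-skills"}


def is_allowed_remote(url: str, allowed: set[str] = ALLOWED_REPOS) -> bool:
    """Return True only for https://github.com/<owner>/<repo> URLs on the allowlist."""
    parsed = urlparse(url)
    # Require HTTPS and an exact host match so look-alike hosts are rejected.
    if parsed.scheme != "https" or parsed.hostname != "github.com":
        return False
    # Accept only a bare owner/repo path, with no extra segments.
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) != 2:
        return False
    owner, repo = parts
    if repo.endswith(".git"):
        repo = repo[:-4]
    return f"{owner}/{repo}".lower() in allowed
```

A wrapper script could call is_allowed_remote on each repository URL and skip the fetch when it returns False. This limits where skills come from; it does not make the content of an allowed repository trustworthy, so the PROMPT_INJECTION finding still applies.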
Audit Metadata
Risk Level: HIGH
Analyzed: Apr 28, 2026, 08:58 PM