debug

Fail

Audited by Gen Agent Trust Hub on Feb 16, 2026

Risk Level: HIGHCOMMAND_EXECUTIONREMOTE_CODE_EXECUTIONPROMPT_INJECTION
Full Analysis
  • COMMAND_EXECUTION (HIGH): The skill workflow explicitly instructs the agent to execute a string variable CURL_COMMAND using eval. This creates a direct path for local command injection if the content of the variable is malicious.
  • REMOTE_CODE_EXECUTION (HIGH): The command being executed is generated by a remote service (testany_log_sign). This makes the security of the local system dependent on the integrity of an external API.
  • PROMPT_INJECTION (MEDIUM): The security model relies on the LLM to perform its own validation (regex checks for domains and parameters). LLM-based validation is unreliable and can be bypassed by adversarial content within the tool output or test logs.
  • INDIRECT_PROMPT_INJECTION (HIGH):
  • Ingestion points: Test execution details and logs retrieved via testany_get_execution and the generated curl command.
  • Boundary markers: None identified in the prompt to separate untrusted data from instructions.
  • Capability inventory: Ability to execute shell commands via eval and curl.
  • Sanitization: The skill suggests a bash-based check, but the execution still happens within the agent's context, and the validation logic itself can be subverted by malicious input.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 16, 2026, 09:27 AM