benchmark-worker

Warn

Audited by Snyk on Mar 25, 2026

Risk Level: MEDIUM
Full Analysis

MEDIUM W011: Third-party content exposure detected (indirect prompt injection risk).

  • Third-party content exposure detected (high risk: 0.90). The worker polls the public Benchmark API (https://tapis1.awp.sh) and GET /api/v1/benchmark-sets and /api/v1/poll responses (and even a remote raw GitHub URL for updates) are ingested at runtime and server-provided fields such as assignment["prompt"] and benchmark set descriptions are directly prepended into the LLM prompts (see build_answer_prompt / build_question_prompt and the signed_request/_handle_ask/_handle_answer flows), so untrusted third-party content can carry instructions that materially influence agent behavior.

MEDIUM W012: Unverifiable external dependency detected (runtime URL that controls agent).

  • Potentially malicious external URL detected (high risk: 0.90). The skill calls the Benchmark API at https://tapis1.awp.sh at runtime (e.g., /api/v1/poll) and the code explicitly prepends any server-provided "prompt" from the API into LLM prompts, so remote content from that URL can directly control agent instructions and is a required runtime dependency.

MEDIUM W009: Direct money access capability detected (payment gateways, crypto, banking).

  • Direct money access detected (high risk: 1.00). The skill explicitly interacts with crypto wallets and signing utilities: it auto-detects and uses a wallet address, calls awp-wallet (awp-wallet receive / init / unlock), references WALLET_PASSWORD, AWP_SESSION_TOKEN, private keys/mnemonics, targets BSC network (testnet), and uses a signing helper ({baseDir}/scripts/benchmark-sign.sh) for API calls. It is specifically designed to earn token rewards on a blockchain subnet (AWP) and performs wallet-related operations and request signing. Although it says it's not for wallet transfers, the presence of wallet init/unlock and signing utilities constitutes explicit crypto/wallet capabilities (wallets/signing), which match the Direct Financial Execution criteria.

Issues (3)

W011
MEDIUM

Third-party content exposure detected (indirect prompt injection risk).

W012
MEDIUM

Unverifiable external dependency detected (runtime URL that controls agent).

W009
MEDIUM

Direct money access capability detected (payment gateways, crypto, banking).

Audit Metadata
Risk Level
MEDIUM
Analyzed
Mar 25, 2026, 07:53 PM
Issues
3