benchmark-worker
Warn
Audited by Snyk on Mar 25, 2026
Risk Level: MEDIUM
Full Analysis
MEDIUM W011: Third-party content exposure detected (indirect prompt injection risk).
- Third-party content exposure detected (high risk: 0.90). The worker polls the public Benchmark API (https://tapis1.awp.sh) and GET /api/v1/benchmark-sets and /api/v1/poll responses (and even a remote raw GitHub URL for updates) are ingested at runtime and server-provided fields such as assignment["prompt"] and benchmark set descriptions are directly prepended into the LLM prompts (see build_answer_prompt / build_question_prompt and the signed_request/_handle_ask/_handle_answer flows), so untrusted third-party content can carry instructions that materially influence agent behavior.
MEDIUM W012: Unverifiable external dependency detected (runtime URL that controls agent).
- Potentially malicious external URL detected (high risk: 0.90). The skill calls the Benchmark API at https://tapis1.awp.sh at runtime (e.g., /api/v1/poll) and the code explicitly prepends any server-provided "prompt" from the API into LLM prompts, so remote content from that URL can directly control agent instructions and is a required runtime dependency.
MEDIUM W009: Direct money access capability detected (payment gateways, crypto, banking).
- Direct money access detected (high risk: 1.00). The skill explicitly interacts with crypto wallets and signing utilities: it auto-detects and uses a wallet address, calls awp-wallet (awp-wallet receive / init / unlock), references WALLET_PASSWORD, AWP_SESSION_TOKEN, private keys/mnemonics, targets BSC network (testnet), and uses a signing helper ({baseDir}/scripts/benchmark-sign.sh) for API calls. It is specifically designed to earn token rewards on a blockchain subnet (AWP) and performs wallet-related operations and request signing. Although it says it's not for wallet transfers, the presence of wallet init/unlock and signing utilities constitutes explicit crypto/wallet capabilities (wallets/signing), which match the Direct Financial Execution criteria.
Issues (3)
W011
MEDIUMThird-party content exposure detected (indirect prompt injection risk).
W012
MEDIUMUnverifiable external dependency detected (runtime URL that controls agent).
W009
MEDIUMDirect money access capability detected (payment gateways, crypto, banking).
Audit Metadata