llm-api-benchmark

Pass

Audited by Gen Agent Trust Hub on Mar 6, 2026

Risk Level: SAFE
[DATA_EXFILTRATION] [COMMAND_EXECUTION] [EXTERNAL_DOWNLOADS]
Full Analysis
  • [DATA_EXFILTRATION]: The script scripts/parse-claude-logs.py reads sensitive data from ~/.claude/logs, which contains the agent's local session history, debug information, and potentially sensitive request/response data.
  • [DATA_EXFILTRATION]: The script scripts/benchmark.py retrieves sensitive credentials (e.g., ANTHROPIC_API_KEY, OPENAI_API_KEY) from environment variables and transmits them in cleartext headers to LLM provider endpoints to facilitate performance benchmarking.
  • [EXTERNAL_DOWNLOADS]: The skill performs outbound network requests to various external LLM provider domains (e.g., api.anthropic.com, api.openai.com) to measure latency, TTFT, and throughput metrics.
  • [COMMAND_EXECUTION]: The workflow runs local Python scripts and invokes subagents through the Agent tool to perform timing tasks and compare results.
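For context on the metrics named in the findings above, the following is a minimal sketch of how TTFT and throughput could be derived from the arrival times of streamed chunks. It is not taken from scripts/benchmark.py; the function name `measure_stream`, its return keys, and the injectable `clock` parameter are all hypothetical, and the sketch makes no network requests and reads no credentials.

```python
import time
from typing import Callable, Iterable, Optional


def measure_stream(
    chunks: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> dict:
    """Time an iterable of streamed chunks (hypothetical helper).

    ttft_s       -- seconds from start until the first chunk arrives
    total_s      -- seconds from start until the last chunk arrives
    chunks_per_s -- chunk count divided by total_s (0.0 if nothing arrived)
    """
    start = clock()
    first_at: Optional[float] = None
    last_at = start
    count = 0

    for _ in chunks:
        # Record the arrival time of each chunk as it is consumed.
        last_at = clock()
        if first_at is None:
            first_at = last_at
        count += 1

    total = last_at - start
    return {
        "ttft_s": None if first_at is None else first_at - start,
        "total_s": total,
        "chunks": count,
        "chunks_per_s": count / total if total > 0 else 0.0,
    }


if __name__ == "__main__":
    # A fake clock makes the example deterministic: start at 0.0 s,
    # then chunks arrive at 0.5 s, 1.0 s, and 2.0 s.
    ticks = iter([0.0, 0.5, 1.0, 2.0])
    print(measure_stream(["a", "b", "c"], clock=lambda: next(ticks)))
```

In a real benchmark the `chunks` argument would be the provider's streaming response iterator, and throughput would more likely be counted in tokens than in chunks; both are assumptions outside what the audit describes.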
Audit Metadata
  • Risk Level: SAFE
  • Analyzed: Mar 6, 2026, 03:13 AM