llm-api-benchmark
Pass
Audited by Gen Agent Trust Hub on Mar 6, 2026
Risk Level: SAFE (DATA_EXFILTRATION, COMMAND_EXECUTION, EXTERNAL_DOWNLOADS)
Full Analysis
- [DATA_EXFILTRATION]: The script `scripts/parse-claude-logs.py` accesses and reads sensitive data from `~/.claude/logs`, which contains the agent's local session history, debug information, and potentially sensitive request/response data.
- [DATA_EXFILTRATION]: The script `scripts/benchmark.py` retrieves sensitive credentials (e.g., `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`) from environment variables and transmits them in cleartext headers to LLM provider endpoints to facilitate performance benchmarking.
- [EXTERNAL_DOWNLOADS]: The skill performs outbound network requests to various external LLM provider domains (e.g., api.anthropic.com, api.openai.com) to measure latency, TTFT, and throughput metrics.
- [COMMAND_EXECUTION]: The workflow involves the execution of local Python scripts and the invocation of subagents using the `Agent` tool to perform timing tasks and result comparisons.
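
For readers unfamiliar with the metrics named in the findings, the sketch below shows one common way to derive latency, TTFT (time to first token), and throughput from a streamed response. It is an illustration only, not the code in `scripts/benchmark.py`: the function name `measure_stream`, the injectable `clock` parameter, and the choice to count each streamed chunk as one token are all assumptions made for this example. It makes no network requests and reads no credentials.

```python
import time
from typing import Callable, Iterable, Optional


def measure_stream(
    chunks: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> dict:
    """Consume a stream of chunks and report basic timing metrics.

    Hypothetical helper: each chunk is counted as one token, which is a
    simplification of how real providers stream output.
    """
    start = clock()
    first_chunk_at: Optional[float] = None
    count = 0

    for _ in chunks:
        now = clock()
        if first_chunk_at is None:
            first_chunk_at = now  # arrival time of the first chunk
        count += 1

    end = clock()
    total = end - start

    return {
        # TTFT: delay between sending the request and the first chunk.
        "ttft_s": None if first_chunk_at is None else first_chunk_at - start,
        # Latency: wall-clock time for the whole response.
        "latency_s": total,
        # Throughput: chunks per second over the whole response.
        "throughput_per_s": count / total if total > 0 else 0.0,
        "chunks": count,
    }


# Deterministic demo with a fake clock instead of a real provider stream.
_ticks = iter([0.0, 0.5, 0.7, 0.9, 1.0])
print(measure_stream(["Hel", "lo", "!"], clock=lambda: next(_ticks)))
```

Passing the clock in as a parameter keeps the timing logic testable without real network calls. A real benchmark would pass the provider's streaming iterator as `chunks` and leave `clock` at its default.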
Audit Metadata