# gcp-agent-eval-engine-runner
This skill provides the "engine" for your automated evaluation pipeline. Grounded in evaluation_blog.md, it handles the complexity of running hundreds of parallel requests against a shadow revision while capturing the full reasoning trace (the agent's "Thinking Process").
## Usage
Ask Antigravity to:
- "Create an evaluation runner script for my agent"
- "Implement parallel inference for my golden dataset"
- "Capture SSE traces for tool trajectory evaluation"
## Engine Pattern
- Parallel Inference: Uses `asyncio.Semaphore` to throttle requests (preventing a DDoS of the shadow service).
- SSE Capture: Connects to the ADK `POST /run_sse` endpoint to stream intermediate events.
- Dataset Enrichment: Appends `response` and `intermediate_events` to the input dataset.
- Vertex AI Integration: Submits the enriched dataset to the `create_evaluation_run` API.
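The first three steps of the pattern can be sketched as follows. This is a minimal, runnable illustration, not the skill's actual boilerplate: `run_sse` is a stub standing in for a real streaming client against `POST /run_sse`, and the concurrency limit is an assumed value.

```python
import asyncio
import json

MAX_CONCURRENCY = 8  # assumed cap on in-flight requests to the shadow revision

async def run_sse(prompt: str) -> dict:
    # Stub for the real SSE client: a full implementation would stream
    # POST /run_sse and collect intermediate events (the reasoning trace)
    # alongside the final response.
    await asyncio.sleep(0)  # simulate network I/O
    return {"response": f"echo:{prompt}", "intermediate_events": []}

async def evaluate_row(row: dict, sem: asyncio.Semaphore) -> dict:
    async with sem:  # throttle so the shadow service is not overwhelmed
        result = await run_sse(row["prompt"])
    # Dataset enrichment: attach response and intermediate_events to the row.
    return {**row, **result}

async def run_dataset(rows: list[dict]) -> list[dict]:
    sem = asyncio.Semaphore(MAX_CONCURRENCY)
    return await asyncio.gather(*(evaluate_row(r, sem) for r in rows))

if __name__ == "__main__":
    golden = [{"prompt": f"q{i}"} for i in range(3)]
    enriched = asyncio.run(run_dataset(golden))
    print(json.dumps(enriched, indent=2))
```

The semaphore is the key design choice: `asyncio.gather` launches every coroutine at once, so without the `async with sem` guard all requests would hit the shadow revision simultaneously.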
## Python Boilerplate
Refer to `scripts/evaluate_agent_boilerplate.py` for the core implementation.
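For orientation, the SSE-capture piece amounts to splitting the event stream into `data:` payloads. The sketch below assumes the endpoint emits JSON-encoded events separated by blank lines, per the SSE wire format; field names beyond that are illustrative.

```python
import json

def parse_sse(raw: str) -> list[dict]:
    """Split a raw SSE body into a list of decoded JSON event payloads."""
    events = []
    # SSE events are separated by a blank line.
    for block in raw.strip().split("\n\n"):
        # Each event may spread its payload across multiple "data:" lines.
        data_lines = [line[len("data:"):].strip()
                      for line in block.splitlines()
                      if line.startswith("data:")]
        if data_lines:
            events.append(json.loads("\n".join(data_lines)))
    return events

stream = (
    'data: {"type": "tool_call", "name": "search"}\n'
    '\n'
    'data: {"type": "final", "text": "done"}\n'
)
print(parse_sse(stream))
# → [{'type': 'tool_call', 'name': 'search'}, {'type': 'final', 'text': 'done'}]
```

The resulting list is what gets stored as `intermediate_events` during dataset enrichment, so tool-trajectory evaluation can inspect every step the agent took.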