gcp-agent-tool-trajectory-evaluator
SKILL.md
gcp-agent-tool-trajectory-evaluator
This skill provides the specialized Python logic needed to evaluate how an agent uses its tools. Grounded in evaluation_blog.md, it moves beyond "Did the tool run?" to "Were the tools used correctly and efficiently?"
Usage
Ask Antigravity to:
- "Implement Trajectory Precision and Recall metrics"
- "Set up an Order Match metric for my multi-step agent"
- "Add a custom trajectory scorer to my Vertex AI evaluation"
Metric Definitions
- Trajectory Precision: Measures what percentage of called tools were actually specified in the reference.
- Trajectory Recall: Measures what percentage of required tools were successfully called by the agent.
- In-Order Match: Checks if the required tools were called in the correct sequence (even if other non-essential tools were called in between).
implementation Pattern
Refer to scripts/trajectory_metrics.py. These functions are designed to be serialized and passed to Vertex AI via CustomCodeExecutionSpec.
Weekly Installs
1
Repository
googlecloudplat…el-demosGitHub Stars
251
First Seen
4 days ago
Security Audits
Installed on
mcpjam1
claude-code1
junie1
windsurf1
zencoder1
crush1