gcp-agent-tool-trajectory-evaluator

This skill provides the specialized Python logic needed to evaluate how an agent uses its tools. Grounded in evaluation_blog.md, it moves beyond "Did the tool run?" to "Were the tools used correctly and efficiently?"

Usage

Ask Antigravity to:

"Implement Trajectory Precision and Recall metrics"
"Set up an Order Match metric for my multi-step agent"
"Add a custom trajectory scorer to my Vertex AI evaluation"

Metric Definitions

Trajectory Precision: The fraction of the agent's tool calls that also appear in the reference trajectory. A low score means the agent made unnecessary calls.
Trajectory Recall: The fraction of tools in the reference trajectory that the agent actually called. A low score means the agent skipped required tools.
In-Order Match: Checks whether the required tools were called in the reference order, even if other, non-essential tools were called in between.
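The three definitions above can be sketched in plain Python as follows. This is a minimal illustration, not the contents of scripts/trajectory_metrics.py: the function names, the representation of a trajectory as a list of tool-name strings, the handling of empty inputs, and the example tool names are all assumptions.

```python
from typing import Sequence


def trajectory_precision(predicted: Sequence[str], reference: Sequence[str]) -> float:
    """Fraction of the agent's tool calls that also appear in the reference."""
    if not predicted:
        # Assumption: with no calls made, score 1.0 only if none were expected.
        return 1.0 if not reference else 0.0
    expected = set(reference)
    return sum(1 for tool in predicted if tool in expected) / len(predicted)


def trajectory_recall(predicted: Sequence[str], reference: Sequence[str]) -> float:
    """Fraction of reference tools that the agent actually called."""
    if not reference:
        # Assumption: nothing was required, so nothing was missed.
        return 1.0
    called = set(predicted)
    return sum(1 for tool in reference if tool in called) / len(reference)


def in_order_match(predicted: Sequence[str], reference: Sequence[str]) -> float:
    """1.0 if the reference is a subsequence of the predicted calls, else 0.0."""
    remaining = iter(predicted)
    # `tool in remaining` advances the iterator past the match, so each
    # reference tool must be found after the previous one.
    return 1.0 if all(tool in remaining for tool in reference) else 0.0


# Hypothetical tool names, for illustration only.
reference = ["lookup_order", "issue_refund"]
predicted = ["lookup_order", "get_weather", "issue_refund", "send_email"]

precision = trajectory_precision(predicted, reference)  # 2 of 4 calls were expected
recall = trajectory_recall(predicted, reference)        # both required tools were called
ordered = in_order_match(predicted, reference)          # order preserved despite extras
```

In this example the agent called every required tool in the right order, so recall and in-order match are both 1.0, while the two extra calls pull precision down to 0.5.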

Implementation Pattern

Refer to scripts/trajectory_metrics.py. These functions are designed to be serialized and passed to Vertex AI via CustomCodeExecutionSpec.
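Because the function is shipped to the service as code, it likely helps to keep each metric self-contained, with its imports and helpers inside the function body. The sketch below shows that shape for trajectory recall. The entry-point name `evaluate`, the instance keys `predicted_trajectory` and `reference_trajectory`, and the `tool_name` field are assumptions made for illustration; check scripts/trajectory_metrics.py and the Vertex AI documentation for the contract CustomCodeExecutionSpec actually expects.

```python
def evaluate(instance: dict) -> float:
    """Trajectory recall for one evaluation instance (hypothetical schema)."""
    import json  # imported here so the function carries its own dependencies

    def tool_names(raw):
        # Assumption: a trajectory is a list of {"tool_name": ...} steps,
        # possibly delivered as a JSON-encoded string.
        steps = json.loads(raw) if isinstance(raw, str) else raw
        return [step["tool_name"] for step in (steps or [])]

    predicted = tool_names(instance.get("predicted_trajectory"))
    reference = tool_names(instance.get("reference_trajectory"))
    if not reference:
        return 1.0  # assumption: nothing required means nothing missed
    called = set(predicted)
    return sum(1 for tool in reference if tool in called) / len(reference)


# Local smoke test with hypothetical tool names.
sample = {
    "predicted_trajectory": '[{"tool_name": "lookup_order"}]',
    "reference_trajectory": [
        {"tool_name": "lookup_order"},
        {"tool_name": "issue_refund"},
    ],
}
score = evaluate(sample)
```

Running the function locally on a few hand-built instances like `sample` is a cheap way to catch schema mistakes before submitting an evaluation job.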

Repository

googlecloudplat…el-demos

GitHub Stars

281

First Seen

Mar 11, 2026

Security Audits

Gen Agent Trust Hub: Pass

Socket: Pass

Snyk: Pass

More from googlecloudplatform/devrel-demos

go-backend-dev

go-reviewer

go-architect

go-test-expert

latest-software-version

go-project-setup