
adk-eval-guide

Pass

Audited by Gen Agent Trust Hub on Mar 9, 2026

Risk Level: SAFE
Categories: COMMAND_EXECUTION, REMOTE_CODE_EXECUTION
Full Analysis
  • Command Execution: The skill provides instructions for using the adk CLI and make commands to execute evaluation suites. This is the intended interface for triggering the testing workflows described in the guide.
  • Dynamic Metric Loading: The guide demonstrates how to extend the evaluation framework with custom metrics. This involves specifying a Python module path in the configuration, which the framework then loads and executes at runtime. This capability is used to implement specialized evaluation logic, such as multimodal validation using vision models.
  • Evaluation Data Ingestion: The framework processes JSON-based 'eval sets' which contain simulated user interactions and expected outcomes. Because this data is fed into the agent during evaluation, eval set files are an input channel into the agent. The guide outlines structured schemas for this data to ensure consistent processing.
  • Cloud Service Interaction: The instructions include patterns for connecting to Google Cloud services like Vertex AI and Google Cloud Storage. These integrations use standard environment variables for project configuration and follow established practices for cloud-based AI development.
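
To make the Dynamic Metric Loading finding concrete, the following is a minimal sketch of how a framework can resolve a configured Python module path to a callable at runtime. It is not the ADK implementation: `load_metric` and `ALLOWED_MODULE_PREFIXES` are hypothetical names introduced here, and `statistics.mean` from the standard library stands in for a custom metric function. The allowlist check is an assumed hardening step, not something the audit states the framework performs.

```python
import importlib
from typing import Callable

# Hypothetical allowlist: only modules under these prefixes may be loaded.
ALLOWED_MODULE_PREFIXES = ("statistics",)


def load_metric(dotted_path: str) -> Callable:
    """Resolve a 'module.function' string to a callable (illustrative only)."""
    module_name, _, attr_name = dotted_path.rpartition(".")
    if not module_name:
        raise ValueError(f"expected 'module.function', got {dotted_path!r}")

    # Refuse modules outside the allowlist before importing anything.
    allowed = any(
        module_name == prefix or module_name.startswith(prefix + ".")
        for prefix in ALLOWED_MODULE_PREFIXES
    )
    if not allowed:
        raise PermissionError(f"module {module_name!r} is not on the allowlist")

    # Importing a module executes its top-level code, which is why a
    # config-supplied module path amounts to code execution.
    module = importlib.import_module(module_name)
    metric = getattr(module, attr_name)
    if not callable(metric):
        raise TypeError(f"{dotted_path!r} is not callable")
    return metric


# Stand-in usage: treat statistics.mean as the configured "metric".
metric = load_metric("statistics.mean")
score = metric([0.5, 1.0])
```

Because the import step runs whatever code the named module contains, the module path in an evaluation config deserves the same review as any other executable code in the repository.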
Audit Metadata
Risk Level: SAFE
Analyzed: Mar 9, 2026, 08:46 PM