langsmith-code-eval
Pass
Audited by Gen Agent Trust Hub on Feb 18, 2026
Risk Level: SAFECOMMAND_EXECUTION
Full Analysis
- [Command Execution] (LOW): The skill instructs the agent to execute internal Python scripts (
inspect_trace.pyandinspect_dataset.py) and encourages the creation and execution of custom evaluation code. This is expected behavior for a developer-focused skill. - [Indirect Prompt Injection] (LOW): The skill ingests data from LangSmith datasets which could contain untrusted content.
- Ingestion points:
scripts/inspect_dataset.pyreads data from the LangSmith API. - Boundary markers: None; external data is printed directly to the terminal.
- Capability inventory: The skill allows for local script execution and network access to LangSmith APIs.
- Sanitization: No sanitization or escaping is performed on the data fetched from the dataset before it is displayed or used.
Audit Metadata