datasets
Pass
Audited by Gen Agent Trust Hub on Apr 28, 2026
Risk Level: SAFE
COMMAND_EXECUTION, DATA_EXFILTRATION, EXTERNAL_DOWNLOADS, PROMPT_INJECTION
Full Analysis
- [COMMAND_EXECUTION]: The skill utilizes several CLI commands to perform its discovery and delivery phases.
  - It runs `langwatch prompt list`, `langwatch trace search`, and `langwatch trace get` to gather context about the AI system being evaluated.
  - It executes `git log` to analyze commit history for potential failure modes.
  - It uses `langwatch dataset create` and `langwatch dataset upload` to manage the resulting data.
  - It also instructs the agent to write and execute a local Python or Node.js script to properly format CSV data, which is a standard procedure for ensuring data integrity.
- [DATA_EXFILTRATION]: The skill is designed to collect local information (codebase structure, prompt definitions) and production data (traces) to generate synthetic datasets. These datasets are then uploaded to the LangWatch platform using the `langwatch dataset upload` command. As the skill and the CLI tool are provided by the same vendor (LangWatch), this represents the core intended functionality of the service.
- [EXTERNAL_DOWNLOADS]: The skill fetches data from the LangWatch platform, including production traces and existing dataset information, via the `langwatch` CLI during the discovery phase.
- [PROMPT_INJECTION]: The skill exhibits an indirect prompt injection surface as it processes data from untrusted external sources.
  - Ingestion points: The skill reads production traces via `langwatch trace get`, local codebase files, and user-provided documents like PDFs.
  - Boundary markers: The instructions do not specify the use of delimiters or 'ignore' instructions when processing the content of these external files or traces.
  - Capability inventory: The skill has the ability to execute shell commands (via `langwatch` and `git`), write local files (CSV/scripts), and perform network uploads (via the CLI).
  - Sanitization: There are no explicit instructions to sanitize or escape the content retrieved from traces or external documents before it is used to generate the dataset. However, per security guidelines, this finding is classified as low risk.
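The boundary-marker and sanitization gaps noted above could be narrowed with a small helper in the local formatting script. The following is a minimal sketch, not something the audited skill ships: the marker strings and the function names `wrap_untrusted` and `rows_to_csv` are hypothetical, and it assumes trace content has already been retrieved as plain text. It fences untrusted text between delimiters so a downstream prompt can treat it as data, and relies on Python's standard `csv` module for escaping.

```python
import csv
import io

# Hypothetical marker strings; the audited skill does not define any.
BEGIN_MARK = "<<<UNTRUSTED_CONTENT>>>"
END_MARK = "<<<END_UNTRUSTED_CONTENT>>>"


def wrap_untrusted(text: str) -> str:
    """Fence untrusted text so a downstream prompt can treat it as data only."""
    # Neutralize any copy of the markers inside the payload so the
    # content cannot close the fence early.
    cleaned = text.replace(BEGIN_MARK, "[marker removed]")
    cleaned = cleaned.replace(END_MARK, "[marker removed]")
    # Drop non-printable characters, keeping newlines and tabs.
    cleaned = "".join(ch for ch in cleaned if ch.isprintable() or ch in "\n\t")
    return f"{BEGIN_MARK}\n{cleaned}\n{END_MARK}"


def rows_to_csv(rows: list[dict[str, str]], fieldnames: list[str]) -> str:
    """Serialize rows via the csv module so commas, quotes, and newlines are escaped."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(
        buffer,
        fieldnames=fieldnames,
        quoting=csv.QUOTE_ALL,  # quote every field, including empty ones
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Delimiters like these reduce the injection surface but do not remove it; whatever reads the fenced text still has to be told to treat it as data rather than instructions.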
Audit Metadata