The Agent Skills Directory

REMOTE_CODE_EXECUTION (HIGH): The Node.js scripts compare_agents.js, evaluate_with_langsmith.js, generate_test_cases.js, and run_trajectory_eval.js use dynamic import() to load and execute code from absolute file paths supplied as command-line arguments. If an attacker can control the path argument through prompt injection, this allows arbitrary code execution.
PROMPT_INJECTION (HIGH): Per Category 8 (Indirect Prompt Injection), the skill ingests untrusted content from local JSON datasets or remote LangSmith repositories. These inputs are passed directly to the agent functions being evaluated without sanitization or boundary markers, allowing malicious entries in a dataset to override agent instructions.
COMMAND_EXECUTION (MEDIUM): The skill relies on and encourages running shell commands (uv run, node) that take variable file paths and module names as input. These inputs can be exploited for argument injection or unauthorized file execution.
DATA_EXFILTRATION (LOW): The skill is designed to interact with LangSmith, an external service, to upload datasets and experiment results. Although this is the intended behavior, it creates an outbound network data flow that, if misconfigured, could be abused to exfiltrate sensitive information.
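
The findings above could be mitigated along the following lines. This is a minimal sketch in Python, not code from the skill: the allowlisted directory "agents", the helper names resolve_agent_path, wrap_untrusted, and build_node_command, and the <untrusted_data> marker are all hypothetical and chosen only for illustration. It shows (1) confining a user-supplied module path to one directory before it is loaded, (2) wrapping a dataset record in boundary markers before it reaches an agent, and (3) building a command as an argument list rather than a shell string. The same path check would apply before the import() call in the Node.js scripts.

```python
import json
from pathlib import Path

# Hypothetical allowlisted directory; the audited skill defines no such root.
ALLOWED_ROOT = Path("agents").resolve()
# Assumption: only JavaScript modules should ever be loaded.
ALLOWED_SUFFIXES = {".js", ".mjs"}


def resolve_agent_path(raw: str, root: Path = ALLOWED_ROOT) -> Path:
    """Resolve a user-supplied path and reject anything outside root."""
    # Joining with an absolute path discards root, so absolute inputs
    # are caught by the containment check below.
    candidate = (root / raw).resolve()
    if not candidate.is_relative_to(root):
        raise ValueError(f"path escapes allowed directory: {raw!r}")
    if candidate.suffix not in ALLOWED_SUFFIXES:
        raise ValueError(f"unexpected file type: {raw!r}")
    return candidate


def wrap_untrusted(record: dict) -> str:
    """Serialize a dataset record and enclose it in boundary markers."""
    payload = json.dumps(record)
    # Escape angle brackets so the data cannot close the marker itself.
    # The result is still valid JSON: \u003c decodes back to "<".
    payload = payload.replace("<", "\\u003c").replace(">", "\\u003e")
    return "<untrusted_data>\n" + payload + "\n</untrusted_data>"


def build_node_command(script: Path, extra_args: list[str]) -> list[str]:
    """Build an argv list suitable for subprocess.run without shell=True."""
    for arg in extra_args:
        # Reject values that node or the script could parse as options.
        if arg.startswith("-"):
            raise ValueError(f"option-like argument rejected: {arg!r}")
    return ["node", str(script), *extra_args]
```

Markers of this kind only help if the agent's instructions also tell it to treat the enclosed text as data. The path check narrows what can be loaded but does not make the loaded module trustworthy.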

langgraph-testing-evaluation