skill-test

Pass

Audited by Gen Agent Trust Hub on Apr 14, 2026

Risk Level: SAFE
COMMAND_EXECUTION · REMOTE_CODE_EXECUTION · PROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The framework uses subprocess.run to interact with system utilities for legitimate administrative and developer tasks. Specifically, scripts/add_example.py uses pbpaste and xclip for clipboard access, and scripts/run_app_eval.py uses the Databricks CLI for testing application deployments.
  • [REMOTE_CODE_EXECUTION]: The skill is designed to execute code both locally and remotely. It uses mcp__databricks__execute_sql and mcp__databricks__execute_command to verify code blocks on Databricks compute resources. Additionally, src/skill_test/grp/executor.py uses __import__ to dynamically verify the presence of Python libraries.
  • [PROMPT_INJECTION]: The skill processes external data (prompts and agent responses) through LLM-based judges and optimization loops. This constitutes an indirect prompt injection surface where adversarial content in the test data could attempt to influence the evaluation or optimization process. The framework uses template-based boundary markers to mitigate these risks.
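The dynamic library check noted under REMOTE_CODE_EXECUTION can be illustrated with a minimal sketch. The helper name `missing_libraries` and the module names passed to it are hypothetical and are not taken from src/skill_test/grp/executor.py. The sketch only shows the general pattern of calling `__import__` on a name and treating an `ImportError` as "not installed".

```python
from typing import Iterable, List


def missing_libraries(names: Iterable[str]) -> List[str]:
    """Return the subset of module names that cannot be imported.

    Hypothetical helper for illustration; the audited code may differ.
    """
    missing = []
    for name in names:
        try:
            # __import__ runs the module's top-level code, which is why an
            # auditor flags it: importing an untrusted name executes code.
            __import__(name)
        except ImportError:
            # ModuleNotFoundError is a subclass of ImportError.
            missing.append(name)
    return missing


if __name__ == "__main__":
    # "json" is in the standard library; the second name is assumed absent.
    print(missing_libraries(["json", "surely_not_installed_pkg_xyz"]))
```

Because importing a module executes its top-level code, a check like this is low risk only when the names come from a fixed, trusted list rather than from test data.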
Audit Metadata
Risk Level: SAFE
Analyzed: Apr 14, 2026, 01:59 PM