skills/bmsuisse/skills/autoresearch/Gen Agent Trust Hub

autoresearch

Pass

Audited by Gen Agent Trust Hub on Apr 26, 2026

Risk Level: SAFECOMMAND_EXECUTIONEXTERNAL_DOWNLOADSPROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The skill is designed to run arbitrary shell commands provided by the user (METRIC_COMMAND) in an autonomous loop to measure performance metrics. While powerful, this behavior is central to the skill's functionality and is preceded by an interactive setup phase where the user must explicitly confirm the command.
  • [EXTERNAL_DOWNLOADS]: The instructions suggest that the user or agent install well-known benchmarking and testing utilities such as 'hyperfine', 'memory_profiler', 'pytest', and 'databricks-connect' using standard package managers (brew, pip). These are standard tools within the specified domains (Python, ML, Spark).
  • [PROMPT_INJECTION]: The skill is susceptible to indirect prompt injection as it ingests untrusted data from the repository being optimized and from web search results.
  • Ingestion points: The agent reads local project files (Python, SQL, ML models) and external documentation or techniques via web search to generate optimization hypotheses.
  • Boundary markers: The skill does not implement specific boundary markers or XML tags to isolate file content or external search results when they are interpolated into the THINK/THINKER steps.
  • Capability inventory: The skill can execute shell commands, perform git operations (commit, restore, worktree), and modify project source code.
  • Sanitization: There is no specific sanitization or filtering logic applied to external data; however, the skill includes an 'Inspector' phase that requires the agent to validate if the change is understandable and maintainable before keeping it.
Audit Metadata
Risk Level
SAFE
Analyzed
Apr 26, 2026, 09:33 AM