modeio-guardrail

Pass

Audited by Gen Agent Trust Hub on Mar 6, 2026

Risk Level: SAFEEXTERNAL_DOWNLOADSCOMMAND_EXECUTION
Full Analysis
  • [SAFE]: The high-risk strings flagged by automated scanners (e.g., curl | sh) are part of the tool's security detection engine and test suites. They are used for signature matching against target repositories and for verifying scanner performance, rather than for execution by the skill itself. These patterns are found in detection files such as modeio_guardrail/skill_safety/constants.py and test cases in tests/test_skill_safety_assessment.py.
  • [EXTERNAL_DOWNLOADS]: The skill makes outbound network requests to https://safety-cf.modeio.ai/api/cf/safety for instruction safety checks and to the GitHub API (api.github.com) to retrieve repository metadata, README files, and issue details. These interactions are required for the tool's core functionality.
  • [COMMAND_EXECUTION]: The skill utilizes subprocess.run to execute local git commands and Python scripts. This is used solely to retrieve technical metadata about the repository being audited (such as commit SHAs or remote URLs) or to perform the local scan analysis.
  • [PROMPT_INJECTION]: The skill includes defensive instructions and detection patterns for prompt injection. Test cases in tests/test_skill_safety_assessment.py contain injection payloads like 'Ignore previous instructions' used for verification of the scanner's efficacy. A robust defensive prompt contract is provided in prompts/static_repo_scan.md to prevent the agent from obeying instructions embedded in audited data.
  • [DATA_EXFILTRATION]: User-provided instructions are transmitted to the vendor's backend for security classification. This behavior is documented and represents the primary purpose of the 'guardrail' feature. The tool also includes a scanner specifically designed to detect and prevent data exfiltration in other repositories.
Audit Metadata
Risk Level
SAFE
Analyzed
Mar 6, 2026, 04:24 AM