nemo-guardrails

Pass

Audited by Gen Agent Trust Hub on Mar 28, 2026

Risk Level: SAFE
Full Analysis
  • [PROMPT_INJECTION]: The skill documentation includes strings commonly used in prompt injection attacks, such as "Ignore previous instructions", "You are now in developer mode", and "Pretend you are DAN". These strings are used strictly as example patterns for configuring the guardrail to detect and block such attempts. They do not constitute an instruction to the agent.
  • [EXTERNAL_DOWNLOADS]: The skill references the installation of the nemoguardrails Python package and provides links to official documentation and repositories from NVIDIA. These sources are well-known and trusted technology providers.
  • [COMMAND_EXECUTION]: The documentation includes standard package installation commands (pip install). This is typical for technical guides and does not involve arbitrary or malicious command execution.
  • [SAFE]: The skill contains no evidence of data exfiltration, obfuscation, or persistence mechanisms. Its logic is entirely focused on implementing safety filters for LLM applications.
Audit Metadata
Risk Level
SAFE
Analyzed
Mar 28, 2026, 06:06 PM