nnsight-remote-interpretability

Pass

Audited by Gen Agent Trust Hub on Mar 28, 2026

Risk Level: SAFE
Full Analysis
  • [SAFE]: The skill provides instructional content and code examples for using nnsight, an established tool for mechanistic interpretability research.
  • [SAFE]: Guidance for handling API keys (e.g., NDIF_API_KEY) follows security best practices by recommending environment variables or configuration objects instead of hardcoding secrets.
  • [EXTERNAL_DOWNLOADS]: The skill references the NDIF (National Deep Inference Fabric) remote execution service at ndif.us. Remote execution is an intended, documented feature of the nnsight library for analyzing large-scale models.
  • [COMMAND_EXECUTION]: The provided Python code snippets use the nnsight tracing API to access and modify neural network activations, which is standard behavior for the intended research use case.
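
The key-handling practice noted in the findings above can be sketched as follows. This is a minimal illustration, not code from the audited skill: `load_ndif_api_key` and `redact` are hypothetical helper names, and the only detail taken from the audit is the `NDIF_API_KEY` environment variable. It uses the Python standard library only.

```python
import os


def load_ndif_api_key(env=None, var_name="NDIF_API_KEY"):
    """Return the API key from an environment mapping, or raise if it is missing.

    Hypothetical helper for illustration; it is not part of nnsight.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"{var_name} is not set; export it instead of hardcoding the key"
        )
    return key


def redact(secret, visible=4):
    """Mask a secret for logging, keeping only the last few characters."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]


if __name__ == "__main__":
    # Demo with an explicit mapping so the sketch runs without a real key.
    demo_env = {"NDIF_API_KEY": "example-key-1234"}
    key = load_ndif_api_key(demo_env)
    print(redact(key))
    # With nnsight installed, the key would then be handed to the library's
    # configuration object (assumed here to be CONFIG.set_default_api_key(key);
    # check the installed version). That call is deliberately not made here.
```

Passing the mapping explicitly keeps the helper testable without touching the real environment, and logging only the redacted form avoids leaking the key into notebooks or CI output.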
Audit Metadata
Risk Level: SAFE
Analyzed: Mar 28, 2026, 06:06 PM