nnsight-remote-interpretability
Pass
Audited by Gen Agent Trust Hub on Mar 28, 2026
Risk Level: SAFE
Full Analysis
- [SAFE]: The skill provides instructional content and code examples for using nnsight, an established tool for mechanistic interpretability research.
- [SAFE]: Guidance for handling API keys (e.g., NDIF_API_KEY) follows security best practices by recommending environment variables or configuration objects instead of hardcoding secrets.
- [EXTERNAL_DOWNLOADS]: The skill references the NDIF (National Deep Inference Fabric) remote execution service at ndif.us, which is an intended and documented capability of the nnsight library for analyzing large-scale models.
- [COMMAND_EXECUTION]: The provided Python code snippets use the nnsight tracing API to access and modify neural network activations, which is standard behavior for the intended research use case.
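As a minimal sketch of the key-handling practice the second finding describes, a helper can read `NDIF_API_KEY` from the environment instead of embedding the secret in source. The function name here is illustrative, not part of the audited skill or the nnsight API:

```python
import os


def load_ndif_api_key() -> str:
    """Return the NDIF API key from the environment.

    Reading the key at runtime keeps the secret out of source control,
    matching the practice the audit finding recommends.
    """
    key = os.environ.get("NDIF_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "NDIF_API_KEY is not set; export it before running remote traces."
        )
    return key
```

Per the same finding, nnsight alternatively accepts the key through its configuration object, so either route avoids hardcoding.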
Audit Metadata