nnsight-remote-interpretability

Fail

Audited by Gen Agent Trust Hub on Feb 16, 2026

Risk Level: HIGH (EXTERNAL_DOWNLOADS, REMOTE_CODE_EXECUTION, COMMAND_EXECUTION, PROMPT_INJECTION)
Full Analysis
  • EXTERNAL_DOWNLOADS (MEDIUM): The documentation instructs users to install the 'nnsight' package via pip. This is an unverified dependency from a non-whitelisted source (ndif-team/nnsight).
  • REMOTE_CODE_EXECUTION (HIGH): The library includes a remote execution feature ('remote=True') that serializes model traces and interventions, sending them to external servers at 'ndif.us' for execution.
  • COMMAND_EXECUTION (MEDIUM): The 'trace()' context and 'Proxy' objects facilitate a deferred execution environment where model internals and tensor operations are dynamically modified.
  • PROMPT_INJECTION (HIGH): The skill exposes a critical vulnerability surface for indirect prompt injection by combining untrusted data ingestion with powerful side-effect capabilities.
      1. Ingestion points: 'LanguageModel()' loads external model weights and 'model.trace(prompt)' processes untrusted text input.
      2. Boundary markers: Absent; no delimiters are suggested to separate instructions from data.
      3. Capability inventory: Includes remote code execution on NDIF servers and the ability to modify model activations in-place.
      4. Sanitization: Absent; no validation or filtering of input content is implemented before interpolation into the trace context.
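
The missing boundary markers, the missing sanitization, and the remote execution path noted above could be addressed with a thin wrapper around the calls the skill documents. The following is a minimal sketch, not part of nnsight or of the audited skill: the helper names, the length limit, and the marker strings are all hypothetical, and only 'model.trace(prompt)' and 'remote=True' come from the audit itself.

```python
import re

# Hypothetical limit and markers, chosen only for illustration.
MAX_PROMPT_CHARS = 2000
BEGIN_MARK = "<<<UNTRUSTED_INPUT"
END_MARK = "UNTRUSTED_INPUT>>>"

# ASCII control characters other than tab, newline, and carriage return.
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


def validate_prompt(raw: str) -> str:
    """Reject oversized or marker-bearing text and strip control characters."""
    if len(raw) > MAX_PROMPT_CHARS:
        raise ValueError("prompt exceeds the allowed length")
    if BEGIN_MARK in raw or END_MARK in raw:
        raise ValueError("prompt contains a boundary marker")
    return _CONTROL_CHARS.sub("", raw)


def delimit_for_agent(raw: str) -> str:
    """Wrap validated text in explicit markers before it enters an agent's context."""
    return f"{BEGIN_MARK}\n{validate_prompt(raw)}\n{END_MARK}"


def trace_kwargs(allow_remote: bool = False) -> dict:
    """Keyword arguments for model.trace(); remote execution is opt-in only."""
    return {"remote": bool(allow_remote)}


# Intended use (assumes an nnsight model object already exists):
#   with model.trace(validate_prompt(text), **trace_kwargs()):
#       ...
```

Keeping 'remote' off by default means a trace is sent to NDIF servers only when the caller asks for it explicitly. The delimiters are meant for text that passes through an agent's context; the model itself receives only the validated prompt, so its activations are not altered by the markers.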
Recommendations
  • AI detected serious security threats
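
For the EXTERNAL_DOWNLOADS finding, one possible mitigation is to check the installed 'nnsight' version against a reviewed allowlist before the skill runs. This is a sketch under stated assumptions: the allowlist and the version string in it are placeholders, not vetted releases, and the helper names are hypothetical.

```python
from importlib import metadata

# Hypothetical allowlist; "0.0.0" is a placeholder, not a reviewed release.
ALLOWED_VERSIONS = {"nnsight": {"0.0.0"}}


def is_allowed(package: str, version: str, allowed=ALLOWED_VERSIONS) -> bool:
    """Return True only if this exact version of the package was reviewed."""
    return version in allowed.get(package, set())


def installed_version(package: str):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_dependency(package: str) -> bool:
    """Combine both steps: the package must be installed and allowlisted."""
    version = installed_version(package)
    return version is not None and is_allowed(package, version)
```

A caller would refuse to load the skill when 'check_dependency("nnsight")' returns False.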
Audit Metadata
Risk Level: HIGH
Analyzed: Feb 16, 2026, 06:54 AM