sparse-autoencoder-training

Warn

Audited by Snyk on Mar 28, 2026

Risk Level: MEDIUM
Full Analysis

MEDIUM W011: Third-party content exposure detected (indirect prompt injection risk).

  • Third-party content exposure detected (high risk: 1.00). This skill explicitly loads pre-trained SAEs from public third-party sources (e.g., SAE.from_pretrained with Hugging Face "username/repo-name" releases, and Neuronpedia feature pages). It then uses the loaded SAEs to encode activations and to steer or alter model generation, so untrusted, user-provided model artifacts can directly influence tool behavior.
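
One way to narrow the exposure described in this finding is to gate release identifiers through a reviewed allowlist before any pre-trained SAE is loaded. The following is a minimal sketch under stated assumptions, not part of the audited skill: the names TRUSTED_RELEASES, LoadDecision, and check_release are hypothetical, and the release identifiers are placeholders. The actual loading call (for example SAE.from_pretrained) is deliberately left out.

```python
from dataclasses import dataclass

# Hypothetical allowlist of release identifiers that a reviewer has vetted.
# The entries are placeholders, not real recommendations.
TRUSTED_RELEASES = frozenset({"example-org/vetted-sae"})


@dataclass(frozen=True)
class LoadDecision:
    release: str
    allowed: bool
    reason: str


def check_release(release: str, trusted=TRUSTED_RELEASES) -> LoadDecision:
    """Decide whether a pre-trained SAE release may be loaded.

    Only exact matches against the allowlist pass; everything else is
    rejected so that arbitrary third-party artifacts are never fetched.
    """
    cleaned = release.strip()
    if not cleaned:
        return LoadDecision(release, False, "empty release identifier")
    if cleaned in trusted:
        return LoadDecision(cleaned, True, "release is on the allowlist")
    return LoadDecision(cleaned, False, "release is not on the allowlist")


if __name__ == "__main__":
    for candidate in ("example-org/vetted-sae", "someone/unknown-sae"):
        decision = check_release(candidate)
        # Only an allowed decision should be followed by the real load call.
        print(decision.release, decision.allowed, decision.reason)
```

In this sketch the caller would proceed to load the SAE only when allowed is True. An allowlist does not prove an artifact is safe; it only limits loading to releases someone has already reviewed.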

Issues (1)

MEDIUM W011: Third-party content exposure detected (indirect prompt injection risk).

Audit Metadata

Risk Level: MEDIUM
Analyzed: Mar 28, 2026, 06:07 PM
Issues: 1