NYC

model-pruning

Warn

Audited by Gen Agent Trust Hub on Feb 17, 2026

Risk Level: MEDIUM (EXTERNAL_DOWNLOADS, COMMAND_EXECUTION)
Full Analysis
  • EXTERNAL_DOWNLOADS (HIGH): The skill provides instructions to download code from an untrusted GitHub organization.
    • Evidence: git clone https://github.com/locuslab/wanda in references/wanda.md.
  • COMMAND_EXECUTION (HIGH): The skill provides commands that execute code from the downloaded repository, creating a 'download then execute' pattern.
    • Evidence: python main.py --model meta-llama/Llama-2-7b-hf ... in references/wanda.md.
  • INDIRECT_PROMPT_INJECTION (LOW): The skill ingests untrusted data from an external dataset (C4), which is used to calibrate model weights and could allow adversarial influence on the model state.
    • Ingestion points: load_dataset("allenai/c4", ...) in references/wanda.md.
    • Boundary markers: Absent.
    • Capability inventory: Modifies model weights via W.data *= mask.float().
    • Sanitization: Absent.
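The 'download then execute' pattern flagged above can be illustrated with a small text scan. This is a minimal sketch, not the method the audit tool actually uses: the regular expressions, the function name find_download_then_execute, and the finding format are all hypothetical, and a real scanner would likely need a much richer rule set.

```python
import re

# Hypothetical patterns for illustration only (not the auditor's real rules).
CLONE_RE = re.compile(r"git\s+clone\s+(https?://\S+)")
EXEC_RE = re.compile(r"^\s*(?:python3?|bash|sh)\s+\S+", re.MULTILINE)


def find_download_then_execute(text: str) -> list[dict]:
    """Return one finding per clone URL that is later followed by a run command."""
    findings = []
    for clone in CLONE_RE.finditer(text):
        # Only commands appearing after the clone count as "then execute".
        run = EXEC_RE.search(text, clone.end())
        if run:
            findings.append(
                {"url": clone.group(1), "command": run.group(0).strip()}
            )
    return findings


# Sample text built from the evidence lines quoted in the findings.
sample = """\
git clone https://github.com/locuslab/wanda
cd wanda
python main.py --model meta-llama/Llama-2-7b-hf ...
"""

findings = find_download_then_execute(sample)
for finding in findings:
    print(f"clone of {finding['url']} is followed by: {finding['command']}")
```

A scan like this only shows that a clone and a later run command appear in the same text. It cannot tell whether the repository is trustworthy, so a match is a prompt for manual review, for example checking whether the clone is pinned to a known commit.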
Audit Metadata
  • Risk Level: MEDIUM
  • Analyzed: Feb 17, 2026, 05:57 PM