fine-tuning-with-trl

Warn

Audited by Snyk on Apr 27, 2026

Risk Level: MEDIUM
Full Analysis

MEDIUM W011: Third-party content exposure detected (indirect prompt injection risk).

  • Third-party content exposure detected (high risk: 0.90). The skill explicitly loads public, potentially user-generated datasets (e.g., datasets.load_dataset calls such as "trl-lib/Capybara", "openai/gsm8k", "trl-lib/ultrafeedback_binarized", "trl-lib/tldr" and CLI dataset flags) in SKILL.md and templates/basic_grpo_training.py and then ingests those datasets into trainers and reward functions, so untrusted third-party content is read and can directly influence training decisions and subsequent tool actions.

MEDIUM W012: Unverifiable external dependency detected (runtime URL that controls agent).

  • Potentially malicious external URL detected (high risk: 0.70). The skill includes runtime calls that fetch Hugging Face datasets which directly supply training prompts (e.g., load_dataset("openai/gsm8k") fetches remote dataset content used as prompts at runtime: https://huggingface.co/datasets/openai/gsm8k).

Issues (2)

W011
MEDIUM

Third-party content exposure detected (indirect prompt injection risk).

W012
MEDIUM

Unverifiable external dependency detected (runtime URL that controls agent).

Audit Metadata
Risk Level
MEDIUM
Analyzed
Apr 27, 2026, 07:08 AM
Issues
2