eval-harness

Warn

Audited by Snyk on Mar 13, 2026

Risk Level: MEDIUM
Full Analysis

MEDIUM W013: Attempt to modify system services in skill instructions.

  • Attempt to modify system services in skill instructions detected (high risk: 0.80). Flagged because the eval definitions explicitly include "可以创建新用户账户" (can create new user accounts), which could instruct the agent to create OS-level accounts and thus modify machine state requiring elevated privileges.

Issues (1)

W013
MEDIUM

Attempt to modify system services in skill instructions.

Audit Metadata
Risk Level
MEDIUM
Analyzed
Mar 13, 2026, 12:38 AM
Issues
1