evaluating-llms
Status: Warn
Audited by Socket on Mar 18, 2026
1 alert found:
Anomaly (LOW) in examples/python/unit_evaluation.py
The code is a non-malicious test harness that exercises OpenAI API capabilities for several NLP tasks. The primary concerns are privacy/data leakage from external API calls, reliance on live API responses in tests, and potential API parameter compatibility issues. Recommended mitigations: replace live API calls with mocks/stubs for unit tests (see the sketch below the audit metrics), add input redaction/minimization, validate API parameter usage against the current library version, and document data handling policies. Overall security risk remains moderate because data is transmitted to an external service in a testing context.
Confidence: 77% · Severity: 60%
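
A minimal sketch of the mock/stub recommendation above, assuming a helper that wraps a chat-completion call. The function classify_sentiment, its prompt, and the model name are hypothetical stand-ins for the tasks exercised in unit_evaluation.py, not code taken from this repository; the point is that the test injects a stub client, so no data leaves the test environment and no live API response is needed.

# Hypothetical example: unit-testing an OpenAI-backed helper without network access.
import unittest
from unittest.mock import MagicMock


def classify_sentiment(client, text: str) -> str:
    # Hypothetical helper: asks the model to label `text` as positive or negative.
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": f"Label the sentiment of: {text}"}],
    )
    return response.choices[0].message.content.strip().lower()


class TestClassifySentiment(unittest.TestCase):
    def test_positive_label_without_network(self):
        # Stub client whose nested attributes mimic the OpenAI client shape,
        # so the test never transmits data to the external service.
        stub_client = MagicMock()
        stub_client.chat.completions.create.return_value = MagicMock(
            choices=[MagicMock(message=MagicMock(content="Positive"))]
        )

        self.assertEqual(classify_sentiment(stub_client, "Great product!"), "positive")
        # Assert on the call shape instead of relying on a live response.
        stub_client.chat.completions.create.assert_called_once()


if __name__ == "__main__":
    unittest.main()

Injecting the client as a parameter (rather than constructing it inside the helper) keeps the stub swap trivial; the same effect can be achieved with unittest.mock.patch if the production code instantiates the client internally.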
Audit Metadata