perf-test-flagos
Pass
Audited by Gen Agent Trust Hub on Mar 26, 2026
Risk Level: SAFE
COMMAND_EXECUTION
Full Analysis
- [COMMAND_EXECUTION]: The skill uses `docker exec` to manage processes within containers, such as starting servers, checking health, and running benchmark scripts.
- [COMMAND_EXECUTION]: Python scripts utilize `subprocess.run()` with argument lists to invoke the `vllm bench serve` CLI tool. This is a secure implementation practice that avoids shell injection vulnerabilities.
- [SAFE]: The skill communicates with `localhost:8000` to verify server status and retrieve model IDs. These operations are local and do not involve untrusted external domains or data exfiltration.
- [SAFE]: A detection regarding piping curl output to python was determined to be a false positive; the implementation uses `python3 -c` with a hardcoded script to parse JSON data from the local benchmark service.
- [SAFE]: No sensitive information disclosure, credential harvesting, or persistence attempts were identified in the codebase.
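
The argument-list pattern credited in the findings above can be illustrated with a minimal sketch. This is not the skill's actual code: the helper names `run_with_arg_list` and `extract_model_ids` are hypothetical, a stand-in child process replaces the real `vllm bench serve` call so the example runs anywhere, and the JSON shape (`{"data": [{"id": ...}]}`) is an assumption about what the local service returns.

```python
import json
import subprocess
import sys


def run_with_arg_list(args: list[str]) -> subprocess.CompletedProcess:
    """Run a command from an argument list (hypothetical helper).

    Because shell=True is not used, no shell ever parses the arguments,
    so characters like ';' or '|' have no special meaning.
    """
    return subprocess.run(args, capture_output=True, text=True, check=True)


def extract_model_ids(payload: str) -> list[str]:
    """Pull model IDs from a JSON body (hypothetical helper).

    Assumes the body looks like {"data": [{"id": "..."}]}; the real
    response shape of the local service is not confirmed by the audit.
    """
    return [entry["id"] for entry in json.loads(payload).get("data", [])]


# Stand-in for the real CLI: a child Python process that echoes the
# arguments it received as JSON, so the effect is observable.
ECHO_ARGV = "import json, sys; print(json.dumps(sys.argv[1:]))"

# A value that would run a second command if a shell interpreted it.
hostile = "my-model; echo injected"

result = run_with_arg_list([sys.executable, "-c", ECHO_ARGV, "--model", hostile])
received = json.loads(result.stdout)

# The hostile string arrives as one literal argument, unsplit and unexecuted.
print(received)

# Parsing a sample body in the assumed shape.
sample_body = '{"data": [{"id": "model-a"}, {"id": "model-b"}]}'
print(extract_model_ids(sample_body))
```

Passing each argument as its own list element is what keeps the semicolon inert here; the same string interpolated into a single command line and run with `shell=True` would be split by the shell into two commands.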
Audit Metadata