pentest-network-internal
Fail
Audited by Gen Agent Trust Hub on Feb 18, 2026
Risk Level: HIGHCOMMAND_EXECUTIONREMOTE_CODE_EXECUTIONDATA_EXFILTRATIONPROMPT_INJECTION
Full Analysis
- [COMMAND_EXECUTION] (HIGH): The
execute_commandfunction inreferences/tools.mdallows the agent to run arbitrary shell commands. This provides a direct path for an attacker or a hallucinating agent to perform unauthorized actions beyond the intended scope of the pentest. - [REMOTE_CODE_EXECUTION] (HIGH): The
execute_python_scriptandinstall_python_packagetools allow for the execution of arbitrary Python code and the installation of third-party libraries. This could be used to download and execute malicious payloads or persistence mechanisms. - [DATA_EXFILTRATION] (HIGH): The skill includes tools for credential harvesting (
responder_credential_harvest,hydra_attack,john_crack) and provides the functional capability (via shell and Python) to transmit stolen credentials or sensitive files to external systems. - [DATA_EXPOSURE] (HIGH): The workspace utilities (
create_file,modify_file,delete_file,list_files) grant the agent broad access to the file system, allowing it to read or modify sensitive configurations and data. - [PROMPT_INJECTION] (LOW): The skill is vulnerable to Indirect Prompt Injection (Category 8). It ingests untrusted data from network services (e.g., via
nmap_scan,httpx_probe, andresponder). - Ingestion points: Network service banners, HTTP response headers, and NetBIOS names.
- Boundary markers: None identified in the workflow or tool definitions to prevent the LLM from following instructions embedded in scan results.
- Capability inventory: Full shell access (
execute_command), Python execution, and file system modification. - Sanitization: No evidence of output sanitization or validation before the data is returned to the agent context.
Recommendations
- AI detected serious security threats
Audit Metadata