pentest-network-internal

Fail

Audited by Gen Agent Trust Hub on Feb 18, 2026

Risk Level: HIGHCOMMAND_EXECUTIONREMOTE_CODE_EXECUTIONDATA_EXFILTRATIONPROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION] (HIGH): The execute_command function in references/tools.md allows the agent to run arbitrary shell commands. This provides a direct path for an attacker or a hallucinating agent to perform unauthorized actions beyond the intended scope of the pentest.
  • [REMOTE_CODE_EXECUTION] (HIGH): The execute_python_script and install_python_package tools allow for the execution of arbitrary Python code and the installation of third-party libraries. This could be used to download and execute malicious payloads or persistence mechanisms.
  • [DATA_EXFILTRATION] (HIGH): The skill includes tools for credential harvesting (responder_credential_harvest, hydra_attack, john_crack) and provides the functional capability (via shell and Python) to transmit stolen credentials or sensitive files to external systems.
  • [DATA_EXPOSURE] (HIGH): The workspace utilities (create_file, modify_file, delete_file, list_files) grant the agent broad access to the file system, allowing it to read or modify sensitive configurations and data.
  • [PROMPT_INJECTION] (LOW): The skill is vulnerable to Indirect Prompt Injection (Category 8). It ingests untrusted data from network services (e.g., via nmap_scan, httpx_probe, and responder).
  • Ingestion points: Network service banners, HTTP response headers, and NetBIOS names.
  • Boundary markers: None identified in the workflow or tool definitions to prevent the LLM from following instructions embedded in scan results.
  • Capability inventory: Full shell access (execute_command), Python execution, and file system modification.
  • Sanitization: No evidence of output sanitization or validation before the data is returned to the agent context.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 18, 2026, 05:57 PM