pw-pair-programming

Fail

Audited by Gen Agent Trust Hub on Feb 23, 2026

Risk Level: HIGH (CREDENTIALS_UNSAFE, DATA_EXFILTRATION, COMMAND_EXECUTION, REMOTE_CODE_EXECUTION)
Full Analysis
  • [CREDENTIALS_UNSAFE] (HIGH): The file scripts/pp/download.nu contains logic to programmatically extract the user's ChatGPT session token. It executes JavaScript via pw eval to call /api/auth/session and retrieve the accessToken. This token is then stored in a variable and used for subsequent authenticated API requests.
  • Evidence: let token_js = "(() => { const xhr = new XMLHttpRequest(); xhr.open('GET', '/api/auth/session', false); ... return { token: session.accessToken }; })()" in scripts/pp/download.nu.
  • [DATA_EXFILTRATION] (HIGH): The skill is designed to read arbitrary files from the local filesystem and transmit them to a remote browser session. scripts/pp/attachments.nu encodes local files into Base64 and injects them into the browser's DOM, while scripts/pp/compose.nu reads file contents into prompt strings.
  • Evidence: let bytes = (open --raw $file_path | into binary) and base64: ($bytes | encode base64 ...) in scripts/pp/attachments.nu.
  • [COMMAND_EXECUTION] (MEDIUM): The skill relies heavily on pw eval and pw eval-js to execute arbitrary JavaScript within the context of the automated browser. This allows the skill to manipulate the browser session, extract data, and trigger UI actions like clicking buttons.
  • Evidence: Extensive use of pw eval across attachments.nu, download.nu, messaging.nu, and session.nu.
  • [REMOTE_CODE_EXECUTION] (MEDIUM): The skill downloads content from the remote assistant via scripts/pp/download.nu and saves it locally. If an attacker-controlled assistant supplies a malicious file and the user is then prompted to execute it (a common pattern in pair-programming workflows), the result is remote code execution on the user's machine.
  • Evidence: let download_result = (pw eval-js $download_js) and subsequent file saving logic in scripts/pp/download.nu.
  • [PROMPT_INJECTION] (LOW): The skill facilitates indirect prompt injection by reading local source code and assistant responses into the agent's context. If either source contains hidden instructions, they could manipulate the behavior of the 'driver' or 'navigator' roles.
  • Ingestion points: pp brief, pp compose, and pp get-response.
  • Capability inventory: File system read/write, browser automation, network requests via browser.
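The evidence lines above cite recurring strings such as pw eval, /api/auth/session, and encode base64. As a rough illustration of how a reviewer might locate those strings in a script, here is a minimal Python sketch. It is not the auditor's actual tooling: the function name scan_script, the EVIDENCE_PATTERNS table, and the mapping of patterns to categories are all assumptions made for this example.

```python
import re

# Patterns copied from the evidence lines of this report. Their grouping
# under finding categories is an illustrative assumption, not the rule set
# used by the audit.
EVIDENCE_PATTERNS = {
    "CREDENTIALS_UNSAFE": re.compile(r"/api/auth/session|accessToken"),
    "DATA_EXFILTRATION": re.compile(r"open --raw|encode base64"),
    "COMMAND_EXECUTION": re.compile(r"\bpw eval(-js)?\b"),
}


def scan_script(source: str) -> dict[str, list[int]]:
    """Return {category: [1-based line numbers]} for lines matching a pattern."""
    hits: dict[str, list[int]] = {}
    for lineno, line in enumerate(source.splitlines(), start=1):
        for category, pattern in EVIDENCE_PATTERNS.items():
            if pattern.search(line):
                hits.setdefault(category, []).append(lineno)
    return hits
```

A match only shows that a string is present. Deciding whether the surrounding code is actually unsafe still requires reading it, as the findings above do.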
Recommendations
  • AI detected serious security threats
  • Contains 1 malicious URL(s) - DO NOT USE
Audit Metadata
  • Risk Level: HIGH
  • Analyzed: Feb 23, 2026, 06:29 AM