scrape-webpage
Pass
Audited by Gen Agent Trust Hub on Feb 17, 2026
Risk Level: SAFEEXTERNAL_DOWNLOADSCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
- [Indirect Prompt Injection] (LOW): The skill ingests untrusted content from external websites, creating an attack surface if downstream skills process the scraped data without proper sanitization.
- Ingestion points: Webpage HTML, metadata, and images are retrieved via Playwright as described in
SKILL.mdand implemented inscripts/image-capture.js. - Boundary markers: Absent. There are no delimiters or instructions to ignore embedded commands in the saved
metadata.jsonorcleaned.html. - Capability inventory: The skill uses
fs.writeFile,fs.mkdir, andfs.mkdirSyncacrossscripts/image-capture.jsandscripts/generate-path.js. - Sanitization:
scripts/generate-path.jsprovides robust path sanitization using regex to replace non-alphanumeric characters, but no content sanitization is implemented for the scraped data itself. - [External Downloads] (LOW): The skill requires external dependencies and downloads content from the web during operation.
- Evidence:
SKILL.mdinstructs users to runnpm install playwrightandsharp.scripts/image-capture.jsdownloads image buffers from remote URLs during page navigation. Per [TRUST-SCOPE-RULE], these are considered low risk as they are necessary for the primary purpose and use trusted libraries (Playwright is maintained by Microsoft). - [Command Execution] (LOW): The skill requires the user to execute shell commands for setup and execution.
- Evidence:
SKILL.mdreferences the use ofnpx playwright installand running the scraping logic vianode. - [Missing Logic] (LOW): The primary orchestration script
scripts/analyze-webpage.jsis documented as the main entry point but is not included in the provided file list, which limits the ability to perform a full audit of the execution flow.
Audit Metadata