scrape-webpage

Pass

Audited by Gen Agent Trust Hub on Feb 17, 2026

Risk Level: SAFEEXTERNAL_DOWNLOADSCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [Indirect Prompt Injection] (LOW): The skill ingests untrusted content from external websites, creating an attack surface if downstream skills process the scraped data without proper sanitization.
  • Ingestion points: Webpage HTML, metadata, and images are retrieved via Playwright as described in SKILL.md and implemented in scripts/image-capture.js.
  • Boundary markers: Absent. There are no delimiters or instructions to ignore embedded commands in the saved metadata.json or cleaned.html.
  • Capability inventory: The skill uses fs.writeFile, fs.mkdir, and fs.mkdirSync across scripts/image-capture.js and scripts/generate-path.js.
  • Sanitization: scripts/generate-path.js provides robust path sanitization using regex to replace non-alphanumeric characters, but no content sanitization is implemented for the scraped data itself.
  • [External Downloads] (LOW): The skill requires external dependencies and downloads content from the web during operation.
  • Evidence: SKILL.md instructs users to run npm install playwright and sharp. scripts/image-capture.js downloads image buffers from remote URLs during page navigation. Per [TRUST-SCOPE-RULE], these are considered low risk as they are necessary for the primary purpose and use trusted libraries (Playwright is maintained by Microsoft).
  • [Command Execution] (LOW): The skill requires the user to execute shell commands for setup and execution.
  • Evidence: SKILL.md references the use of npx playwright install and running the scraping logic via node.
  • [Missing Logic] (LOW): The primary orchestration script scripts/analyze-webpage.js is documented as the main entry point but is not included in the provided file list, which limits the ability to perform a full audit of the execution flow.
Audit Metadata
Risk Level
SAFE
Analyzed
Feb 17, 2026, 06:35 PM