web-scraping

Pass

Audited by Gen Agent Trust Hub on Mar 17, 2026

Risk Level: SAFE
Full Analysis
  • [EXTERNAL_DOWNLOADS]: The skill instructs the user to install well-known, industry-standard packages such as apify-cli, crawlee, playwright, and got-scraping via npm. These libraries are standard choices for web scraping and Apify Actor development.
  • [COMMAND_EXECUTION]: The workflow involves using the apify-cli for project initialization (apify create), local testing (apify run), and deployment (apify push). These commands are standard for the Apify platform and are executed by the user to manage their own projects.
  • [DATA_EXFILTRATION]: While the skill involves fetching data from external websites and saving it to Apify datasets (Dataset.pushData), this is the core functional purpose of a web scraping tool. No evidence of unauthorized harvesting or sending sensitive local files (like SSH keys or credentials) was found.
  • [INDIRECT_PROMPT_INJECTION]: As a web scraping tool, this skill inherently processes untrusted data from external websites.
  • Ingestion points: Data is ingested via PlaywrightCrawler, CheerioCrawler, and got-scraping from user-provided URLs.
  • Boundary markers: The skill does not define explicit boundary markers to prevent the LLM from following instructions embedded in scraped HTML or JSON content, leaving a residual indirect-prompt-injection surface.
  • Capability inventory: The skill uses playwright_evaluate and gotScraping to interact with external web content and save results to local or platform-based storage.
  • Sanitization: The skill recommends using structured JSON and regex patterns for URL filtering, which provides some validation of input sources.
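The two mitigations discussed above can be sketched in plain Node.js. The marker format and the `wrapUntrusted`/`filterUrls` helpers below are illustrative assumptions for this audit, not part of the skill itself:

```javascript
// Sketch of the two mitigations the findings above discuss: delimiting
// untrusted scraped content with boundary markers before it reaches an LLM,
// and validating crawl targets against an allow-list regex. All names here
// are hypothetical.

// Wrap scraped text so downstream prompts can distinguish trusted
// instructions from untrusted page content.
function wrapUntrusted(scrapedText, sourceUrl) {
  const BEGIN = '<<<UNTRUSTED_WEB_CONTENT source="' + sourceUrl + '">>>';
  const END = '<<<END_UNTRUSTED_WEB_CONTENT>>>';
  // Strip marker look-alikes the page itself may contain, so scraped
  // content cannot forge or close the boundary.
  const sanitized = scrapedText.replaceAll('<<<', '').replaceAll('>>>', '');
  return `${BEGIN}\n${sanitized}\n${END}`;
}

// Only keep URLs matching an explicit allow-list pattern, mirroring the
// regex-based URL filtering the skill recommends.
const ALLOWED_URL = /^https:\/\/(www\.)?example\.com\/products\//;

function filterUrls(urls) {
  return urls.filter((u) => ALLOWED_URL.test(u));
}
```

A crawler would apply `wrapUntrusted` to page text before handing it to any LLM step, and `filterUrls` to candidate links before enqueueing them.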
Audit Metadata
  • Risk Level: SAFE
  • Analyzed: Mar 17, 2026, 05:09 PM