web-archive-scraper
Pass
Audited by Gen Agent Trust Hub on Mar 14, 2026
Risk Level: SAFEEXTERNAL_DOWNLOADSPROMPT_INJECTION
Full Analysis
- [EXTERNAL_DOWNLOADS]: Fetches archived snapshots and web content from the Internet Archive (web.archive.org) via the CDX API.
- [PROMPT_INJECTION]: Potential for indirect prompt injection when processing archived web content, as the skill lacks explicit boundary markers to differentiate between data and instructions.
- Ingestion points: Fetches content from external URLs via
fetch_archived_contentinscripts/search_archive.py. - Boundary markers: Absent; extracted text is printed directly to the output.
- Capability inventory: Limited to text extraction and display; no secondary dangerous actions (like
execor file writes) are performed on the fetched content. - Sanitization: Uses regex to strip HTML tags from the retrieved content.
Audit Metadata