The Agent Skills Directory

[PROMPT_INJECTION] (MEDIUM): The skill processes untrusted data from external websites via fetch_page.py and extract_text.py. This creates an indirect prompt injection surface where malicious instructions embedded in scraped HTML or text could influence the agent's subsequent reasoning or actions.
Ingestion points: URL content fetched by fetch_page.py, extract_links.py, and extract_text.py.
Boundary markers: Absent. There are no instructions to the agent to treat scraped content as data rather than instructions.
Capability inventory: Network access (requests), file writing (--output flag), and potential for the agent to use extracted text for decision making.
Sanitization: Absent. The skill extracts raw HTML or text without filtering for potential injection patterns.
[DATA_EXFILTRATION] (LOW): The skill performs network operations to arbitrary URLs. While this is the intended purpose of a web scraper, network access is a prerequisite for data exfiltration.
[EXTERNAL_DOWNLOADS] (LOW): The skill requires the installation of external Python packages (requests, beautifulsoup4). These are standard and trusted libraries, though no versions are pinned in the requirements.

web-scraper