The Agent Skills Directory

[EXTERNAL_DOWNLOADS]: Automated dependency installation. The ensure_dependencies function in scripts/scrape.py uses subprocess.run to call pip install for the trafilatura and requests libraries if they are not detected in the environment.
[COMMAND_EXECUTION]: Unrestricted file system write access. The --output parameter in scripts/scrape.py allows the user to specify any destination path. The script uses os.path.abspath and os.path.expanduser before writing content to that path, which could be used to overwrite sensitive files or system configurations if the agent has appropriate permissions.
[DATA_EXFILTRATION]: Network access to external domains. The skill uses the requests library to fetch data from arbitrary URLs provided in the input. While this is the primary function of the skill, it establishes network connections to non-whitelisted external servers.
[PROMPT_INJECTION]: Indirect prompt injection surface.
Ingestion points: Remote web content fetched via fetch_url in scripts/scrape.py.
Boundary markers: Absent. No delimiters or warnings are provided to the agent to treat the fetched content as untrusted data.
Capability inventory: The skill can write to the local file system and execute shell commands via subprocess.run (for pip).
Sanitization: While the trafilatura library extracts and cleans the text from HTML, it does not sanitize the semantic content for malicious natural language instructions.

web-scraper