The Agent Skills Directory

[PROMPT_INJECTION]: The skill is susceptible to indirect prompt injection because it instructs the agent to systematically fetch and parse HTML content from external websites.
Ingestion points: Web page content fetched during the crawling process (Phase 2, Step 4 in SKILL.md).
Boundary markers: Absent. There are no instructions provided to the agent to treat extracted content strictly as data or to ignore embedded natural language commands found within the HTML.
Capability inventory: The skill utilizes network access (requests.get), database persistence (PostgreSQL), and shared state management (Redis).
Sanitization: While the skill includes robust URL normalization and trap detection (found in references/url-normalization.md), it lacks specific sanitization or filtering for the natural language content extracted from pages.
[DATA_EXFILTRATION]: The skill performs automated network requests to fetch external resources.
Fetches robots.txt from remote domains to ensure compliance with crawling policies.
Fetches full HTML content from discovered URLs within the defined scope.
These operations are documented as part of the core functionality and do not target sensitive local files or credentials.
[EXTERNAL_DOWNLOADS]: The skill references several external libraries for implementation.
Includes code snippets utilizing the requests library for HTTP communication and idna for internationalized domain name handling.
References standard database and caching clients (PostgreSQL, Redis) for state management.
[COMMAND_EXECUTION]: The skill provides Python implementations for crawler logic that interact with external services.
Includes Redis atomic operations (SADD, RPUSH, HSET, BRPOPLPUSH) for URL frontier management and rate limiting.
Includes SQL schema definitions and execution patterns for storing results in a persistent database.

algo-seo-crawl