The Agent Skills Directory

[SAFE]: The skill provides a robust implementation of web crawling capabilities using the Crawl4AI SDK. All analyzed files, including documentation and scripts, serve the stated purpose of building web data pipelines. No suspicious code or malicious intent was observed.
[PROMPT_INJECTION]: The skill possesses a data ingestion surface for indirect prompt injection as it crawls and extracts content from arbitrary external URLs. 1. Ingestion points: AsyncWebCrawler.arun() calls in scripts/basic_crawler.py, scripts/batch_crawler.py, and scripts/extraction_pipeline.py. 2. Boundary markers: No explicit security delimiters or 'ignore' warnings are used when processing crawled content. 3. Capability inventory: Includes network requests via Crawl4AI, local file writing (e.g., output.md, batch_results.json), and subprocess execution for local testing. 4. Sanitization: While the skill employs quality-based filtering strategies (BM25, Pruning), it does not explicitly sanitize extracted markdown or HTML for adversarial prompt content. This is a common characteristic of crawling tools and is documented here as an inherent risk factor.
[COMMAND_EXECUTION]: The tests/run_all_tests.py script uses subprocess.run to execute local Python test files (test_basic_crawling.py, etc.). This is a standard implementation for a test runner and does not involve untrusted input or shell injection risks.
[EXTERNAL_DOWNLOADS]: The skill performs network operations to crawl external websites provided by the user. These requests are handled by the Crawl4AI library for the purpose of data extraction. There is no evidence of the skill downloading and executing remote scripts or binaries.

web-crawl4ai