skills/tao3k/omni-dev-fusion/crawl4ai/Gen Agent Trust Hub

crawl4ai

Pass

Audited by Gen Agent Trust Hub on Mar 29, 2026

Risk Level: SAFEDATA_EXFILTRATIONPROMPT_INJECTIONCOMMAND_EXECUTION
Full Analysis
  • [PROMPT_INJECTION]: Indirect prompt injection surface identified in the LLM-based chunk planning workflow.
  • Ingestion points: External web content retrieved from the user-provided URL in 'scripts/engine.py' is parsed into a skeleton and passed to an LLM in 'scripts/crawl_url.py'.
  • Boundary markers: Structural (headers); no explicit 'ignore embedded instructions' delimiters are used when passing the document skeleton to the LLM planner.
  • Capability inventory: The skill uses the 'run_skill_command' API to execute subprocesses in an isolated environment.
  • Sanitization: No sanitization is performed on headers or content retrieved from the web before being processed by the LLM.
  • [DATA_EXFILTRATION]: Local file access capability via the 'file://' protocol support in 'scripts/engine.py'.
  • The '_try_local_file_fast_path' function allows the crawler to read local files if provided with a 'file://' URL. While this is a documented feature intended for local document processing and testing, it represents a potential data exposure surface if an agent is coerced into accessing sensitive system files.
  • [COMMAND_EXECUTION]: Employs the 'Foundation Isolation Pattern' to execute crawling tasks.
  • The 'crawl_url' command in 'scripts/crawl_url.py' uses 'run_skill_command' to invoke 'scripts/engine.py' within a separate 'uv' environment. This provides a layer of isolation for heavy dependencies like Playwright and crawl4ai.
  • [SAFE]: Dynamic module loading in 'scripts/crawl_url.py' is used for internal architectural purposes.
  • The use of 'importlib.import_module' in the '_resolve_engine_helpers' function is limited to loading the local 'engine.py' module within the skill's own package, which is a standard Python practice.
Audit Metadata
Risk Level
SAFE
Analyzed
Mar 29, 2026, 11:42 PM