sustainability-fulltext-fetch

Fail

Audited by Gen Agent Trust Hub on Feb 16, 2026

Risk Level: HIGH (PROMPT_INJECTION, COMMAND_EXECUTION)
Full Analysis
  • Indirect Prompt Injection (HIGH): The skill is designed to scrape full-text content from arbitrary URLs using trafilatura and store it in a persistent SQLite database (entry_content.content_text).
      • Ingestion points: The webpage fallback path and API metadata (OpenAlex/Semantic Scholar) ingest external, attacker-controllable strings.
      • Boundary markers: There are no defined boundary markers or instructions to downstream agents to ignore embedded commands within the fetched content.
      • Capability inventory: The skill has write access to a local database and network access to fetch data. The resulting data is explicitly intended for downstream 'content enrichment' for an agent, meaning malicious instructions in a scraped webpage could hijack the agent's logic.
      • Sanitization: No sanitization or filtering of content_text is mentioned or implemented in the provided configuration/rules.
  • Command Execution & Missing Code (MEDIUM): The skill relies entirely on scripts/fulltext_fetch.py for its logic (database interaction, API calls, and scraping). This script is not provided in the analyzed files.
      • Risk: Without the script source, it is impossible to verify whether the SQLite operations are vulnerable to SQL injection or whether the URL-fetching logic is vulnerable to Server-Side Request Forgery (SSRF).
  • Credentials Handling (LOW): The skill accepts sensitive inputs such as S2_API_KEY and OPENALEX_EMAIL via environment variables or CLI arguments. Passing credentials this way is standard practice, but here they are handled by an unaudited script.
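The findings above point at three checks that could not be verified: URL validation against SSRF, parameterized SQLite writes, and boundary markers around fetched text. The following is a minimal Python sketch of what such checks might look like. It is not taken from scripts/fulltext_fetch.py, which was not available to the audit. The function names, the marker format, and the entry_id column are hypothetical; only the entry_content table and its content_text column come from the report.

```python
import ipaddress
import secrets
import sqlite3
from urllib.parse import urlparse


def is_public_http_url(url: str) -> bool:
    """Hypothetical SSRF pre-check: allow only http(s) URLs whose host is
    not localhost and not a non-global IP literal."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    host = parsed.hostname.lower()
    if host == "localhost":
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Host is a DNS name, not an IP literal. This sketch does NOT
        # resolve it, so a name pointing at an internal address still passes.
        return True
    return ip.is_global


def wrap_untrusted(text: str) -> str:
    """Hypothetical boundary marker: enclose fetched text in delimiters that
    carry a random nonce, so the content cannot predict and forge them."""
    nonce = secrets.token_hex(8)
    return f"<untrusted id={nonce}>\n{text}\n</untrusted id={nonce}>"


def store_content(conn: sqlite3.Connection, entry_id: int, content_text: str) -> None:
    """Write scraped text with bound parameters instead of string formatting.
    The entry_id column is an assumed part of the schema."""
    conn.execute(
        "INSERT INTO entry_content (entry_id, content_text) VALUES (?, ?)",
        (entry_id, content_text),
    )
    conn.commit()
```

Bound parameters keep attacker-controlled text out of the SQL statement itself, and the nonce-based markers only help if the downstream agent is also instructed to treat everything between them as data rather than instructions. The URL check is deliberately incomplete: it does not cover DNS rebinding or redirects to internal hosts.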
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level: HIGH
Analyzed: Feb 16, 2026, 11:08 AM