news-extractor

Fail

Audited by Gen Agent Trust Hub on Feb 16, 2026

Risk Level: HIGHCREDENTIALS_UNSAFE
Full Analysis
  • [CREDENTIALS_UNSAFE] (HIGH): Hardcoded session cookie detected in scripts/crawlers/toutiao.py.
  • Evidence: The variable FIXED_COOKIE contains a large, valid-looking session string including passport_auth_status_ss, ssid_ucp_sso_v1, ttwid, and toutiao_sso_user_ss. Hardcoding credentials in source code exposes them to anyone with access to the skill files.
  • [DATA_EXPOSURE_AND_EXFILTRATION] (MEDIUM): The skill performs network requests to various Chinese news platforms using user-provided URLs.
  • Evidence: Crawlers in scripts/crawlers/ use RequestsFetcher and CurlCffiFetcher to fetch HTML from external domains. While the domains are consistent with the skill's purpose, there is a risk of SSRF or data leakage if the agent is coerced into fetching internal or sensitive URLs that match the platform regex patterns.
  • [INDIRECT_PROMPT_INJECTION] (MEDIUM): The skill ingests untrusted content from the web and formats it for agent consumption.
  • Evidence:
  • Ingestion points: extract_news.py and crawlers in scripts/crawlers/ ingest HTML content from external news URLs.
  • Boundary markers: No explicit boundary markers or instructions to the agent to ignore embedded commands are present in the output generation logic (formatter.py).
  • Capability inventory: The skill can write files to the local file system (save_as_json in base.py and extract_news.py).
  • Sanitization: Content is parsed for text and media but not sanitized for instruction-like patterns. If an attacker controls a news article, they could embed malicious instructions that the agent might execute when processing the extracted Markdown/JSON.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 16, 2026, 08:26 AM