skills/fadeloo/skills/ai-tech-rss-fetch

ai-tech-rss-fetch

Originally fromtiangong-ai/skills
SKILL.md

AI Tech RSS Fetch

Core Goal

  • Subscribe to RSS/Atom sources.
  • Persist feed and entry metadata to SQLite.
  • Deduplicate entries with layered identity keys plus content fingerprints.
  • Keep only metadata; do not fetch full article bodies and do not summarize.

Triggering Conditions

  • Receive a request to subscribe RSS feeds from URLs or OPML.
  • Receive a request to run incremental RSS sync reliably.
  • Need stable metadata persistence for downstream processing.
  • Need dedupe-safe storage of feed items over repeated runs.

Workflow

  1. Prepare runtime and database.
  • Ensure dependency is installed: python3 -m pip install feedparser.
  • In multi-agent runtimes, pin DB to an absolute path before any command:
export AI_RSS_DB_PATH="/absolute/path/to/workspace-rss-bot/ai_rss.db"
  • Initialize SQLite schema once:
python3 scripts/rss_subscribe.py init-db --db "$AI_RSS_DB_PATH"
  1. Add feed subscriptions.
  • Add one feed URL:
python3 scripts/rss_subscribe.py add-feed --db "$AI_RSS_DB_PATH" --url "https://example.com/feed.xml"
  • Import feeds from OPML:
python3 scripts/rss_subscribe.py import-opml --db "$AI_RSS_DB_PATH" --opml assets/hn-popular-blogs-2025.opml
  1. Run incremental sync.
  • Fetch active feeds and store metadata:
python3 scripts/rss_subscribe.py sync --db "$AI_RSS_DB_PATH" --max-feeds 20 --max-items-per-feed 100
  • Optional one-feed sync:
python3 scripts/rss_subscribe.py sync --db "$AI_RSS_DB_PATH" --feed-url "https://example.com/feed.xml"
  1. Query persisted metadata.
  • List feeds:
python3 scripts/rss_subscribe.py list-feeds --db "$AI_RSS_DB_PATH" --limit 50
  • List recent entries:
python3 scripts/rss_subscribe.py list-entries --db "$AI_RSS_DB_PATH" --limit 100

Input Requirements

  • Supported inputs:
    • RSS XML feed URLs.
    • OPML feed list files.

Output Contract (Metadata Only)

  • Persist feeds metadata to SQLite:
    • feed_url, feed_title, site_url, etag, last_modified, status fields.
  • Persist entries metadata to SQLite:
    • id, dedupe_key (compat primary identity snapshot), guid, url, canonical_url, title, author, published_at, updated_at, summary, categories, content_hash, match_confidence, timestamps.
  • Persist entry_identities mapping table to SQLite:
    • entry_id, key_type, key_value, created_at.
    • Supported key types: guid, canonical_url, legacy_guid, fallback_hash.
  • Do not store generated summaries and do not create archive markdown files.

Configurable Parameters

  • db_path
  • AI_RSS_DB_PATH (recommended absolute path in multi-agent runtime)
  • opml_path
  • feed_urls
  • max_feeds_per_run
  • max_items_per_feed
  • user_agent
  • seen_ttl_days
  • enable_conditional_get
  • Example config: assets/config.example.json

Error and Boundary Handling

  • Feed HTTP/network failure: keep syncing other feeds and record last_error.
  • Feed 304 Not Modified: skip entry parsing and keep state.
  • Missing guid and link: use hashed fallback identity and set match_confidence=low.
  • Dependency missing (feedparser): return install guidance.

Final Output Checklist (Required)

  • core goal
  • trigger conditions
  • input requirements
  • metadata schema
  • dedupe and sync rules
  • command workflow
  • configurable parameters
  • error handling

Use the following simplified checklist verbatim when the user requests it:

核心目标
输入需求
触发条件
元数据模型
去重与同步规则
命令流程
可配置参数
错误处理

References

  • references/input-model.md
  • references/output-rules.md
  • references/time-range-rules.md

Assets

  • assets/hn-popular-blogs-2025.opml (candidate feed pool)
  • assets/config.example.json

Scripts

  • scripts/rss_subscribe.py
Weekly Installs
7
Repository
fadeloo/skills
First Seen
Feb 24, 2026
Installed on
opencode7
github-copilot7
codex7
amp7
kimi-cli7
openclaw7