ai-tech-rss-fetch
AI Tech RSS Fetch
Core Goal
- Subscribe to RSS/Atom sources.
- Persist feed and entry metadata to SQLite.
- Deduplicate entries with layered identity keys plus content fingerprints.
- Keep only metadata; do not fetch full article bodies and do not summarize.
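The content fingerprint mentioned above could be computed along these lines. This is a minimal sketch, not the code in scripts/rss_subscribe.py: the function name `content_fingerprint`, the choice of title plus summary as input, and the normalization steps are all assumptions made for illustration.

```python
import hashlib
import re
import unicodedata


def _normalize(text):
    """Unicode-normalize, collapse whitespace, and lowercase a text field."""
    text = unicodedata.normalize("NFKC", text or "")
    return re.sub(r"\s+", " ", text).strip().lower()


def content_fingerprint(title, summary):
    """Return a SHA-256 hex digest over the normalized title and summary.

    Hypothetical helper: which fields feed content_hash is an assumption.
    """
    payload = _normalize(title) + "\n" + _normalize(summary)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Normalizing before hashing means cosmetic differences (extra spaces, letter case) between two fetches of the same item do not produce a new fingerprint.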
Triggering Conditions
- Receive a request to subscribe to RSS feeds from URLs or OPML.
- Receive a request to run incremental RSS sync reliably.
- Need stable metadata persistence for downstream processing.
- Need dedupe-safe storage of feed items over repeated runs.
Workflow
- Prepare runtime and database.
- Ensure dependency is installed:
python3 -m pip install feedparser
- In multi-agent runtimes, pin DB to an absolute path before any command:
export AI_RSS_DB_PATH="/absolute/path/to/workspace-rss-bot/ai_rss.db"
- Initialize SQLite schema once:
python3 scripts/rss_subscribe.py init-db --db "$AI_RSS_DB_PATH"
- Add feed subscriptions.
- Add one feed URL:
python3 scripts/rss_subscribe.py add-feed --db "$AI_RSS_DB_PATH" --url "https://example.com/feed.xml"
- Import feeds from OPML:
python3 scripts/rss_subscribe.py import-opml --db "$AI_RSS_DB_PATH" --opml assets/hn-popular-blogs-2025.opml
- Run incremental sync.
- Fetch active feeds and store metadata:
python3 scripts/rss_subscribe.py sync --db "$AI_RSS_DB_PATH" --max-feeds 20 --max-items-per-feed 100
- Optional one-feed sync:
python3 scripts/rss_subscribe.py sync --db "$AI_RSS_DB_PATH" --feed-url "https://example.com/feed.xml"
- Query persisted metadata.
- List feeds:
python3 scripts/rss_subscribe.py list-feeds --db "$AI_RSS_DB_PATH" --limit 50
- List recent entries:
python3 scripts/rss_subscribe.py list-entries --db "$AI_RSS_DB_PATH" --limit 100
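To illustrate what the OPML import step in the workflow above has to do, here is a minimal sketch that pulls feed URLs out of an OPML document using only the standard library. The function name `extract_feed_urls` and the sample document are hypothetical; the actual import-opml command may behave differently.

```python
import xml.etree.ElementTree as ET
from typing import List


def extract_feed_urls(opml_text: str) -> List[str]:
    """Return unique xmlUrl values from an OPML document, in document order."""
    root = ET.fromstring(opml_text)
    seen = set()
    urls = []
    # OPML nests <outline> elements arbitrarily deep, so walk the whole tree.
    for outline in root.iter("outline"):
        url = (outline.get("xmlUrl") or "").strip()
        if url and url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


# Illustrative input only; not the contents of assets/hn-popular-blogs-2025.opml.
SAMPLE_OPML = """<opml version="2.0">
  <body>
    <outline text="Blogs">
      <outline type="rss" text="Example" xmlUrl="https://example.com/feed.xml"/>
      <outline type="rss" text="Example again" xmlUrl="https://example.com/feed.xml"/>
      <outline type="rss" text="Other" xmlUrl="https://example.org/atom.xml"/>
    </outline>
  </body>
</opml>"""

if __name__ == "__main__":
    for feed_url in extract_feed_urls(SAMPLE_OPML):
        print(feed_url)
```

Folder outlines without an xmlUrl attribute are skipped, and repeated URLs are collapsed so that importing the same OPML twice would not add duplicate subscriptions.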
Input Requirements
- Supported inputs:
- RSS XML feed URLs.
- OPML feed list files.
Output Contract (Metadata Only)
- Persist feeds metadata to SQLite: feed_url, feed_title, site_url, etag, last_modified, status fields.
- Persist entries metadata to SQLite: id, dedupe_key (compat primary identity snapshot), guid, url, canonical_url, title, author, published_at, updated_at, summary, categories, content_hash, match_confidence, timestamps.
- Persist entry_identities mapping table to SQLite: entry_id, key_type, key_value, created_at.
- Supported key types: guid, canonical_url, legacy_guid, fallback_hash.
- Do not store generated summaries and do not create archive markdown files.
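The contract above can be pictured as a small SQLite schema plus a layered identity lookup. The sketch below is an assumption-heavy illustration: the column names come from the lists above, but the column types, constraints, the `last_error` stand-in for the unspecified status fields, and the helpers `open_db` and `upsert_entry` are hypothetical and may not match what init-db actually creates.

```python
import sqlite3

# Column names follow the output contract; types and constraints are assumed.
SCHEMA = """
CREATE TABLE IF NOT EXISTS feeds (
    feed_url TEXT PRIMARY KEY,
    feed_title TEXT, site_url TEXT, etag TEXT, last_modified TEXT,
    last_error TEXT  -- stand-in for the unspecified status fields
);
CREATE TABLE IF NOT EXISTS entries (
    id INTEGER PRIMARY KEY,
    dedupe_key TEXT NOT NULL UNIQUE,
    guid TEXT, url TEXT, canonical_url TEXT, title TEXT, author TEXT,
    published_at TEXT, updated_at TEXT, summary TEXT, categories TEXT,
    content_hash TEXT, match_confidence TEXT
);
CREATE TABLE IF NOT EXISTS entry_identities (
    entry_id INTEGER NOT NULL REFERENCES entries(id),
    key_type TEXT NOT NULL
        CHECK (key_type IN ('guid', 'canonical_url', 'legacy_guid', 'fallback_hash')),
    key_value TEXT NOT NULL,
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    UNIQUE (key_type, key_value)
);
"""


def open_db(path=":memory:"):
    """Open a database and create the sketch schema if it is missing."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def upsert_entry(conn, title, identities):
    """Insert an entry unless one of its identity keys is already known.

    identities is an ordered list of (key_type, key_value), strongest first.
    Returns (entry_id, created).
    """
    insert_identity = (
        "INSERT OR IGNORE INTO entry_identities (entry_id, key_type, key_value) "
        "VALUES (?, ?, ?)"
    )
    for key_type, key_value in identities:
        row = conn.execute(
            "SELECT entry_id FROM entry_identities WHERE key_type = ? AND key_value = ?",
            (key_type, key_value),
        ).fetchone()
        if row:
            # Known entry: attach any identities we had not seen before.
            conn.executemany(insert_identity, [(row[0], t, v) for t, v in identities])
            conn.commit()
            return row[0], False
    key_type, key_value = identities[0]
    cur = conn.execute(
        "INSERT INTO entries (dedupe_key, title) VALUES (?, ?)",
        (f"{key_type}:{key_value}", title),
    )
    conn.executemany(insert_identity, [(cur.lastrowid, t, v) for t, v in identities])
    conn.commit()
    return cur.lastrowid, True
```

Because every identity is stored in entry_identities, a later run that sees the same item with only its canonical_url (for example after a feed drops its guid) still resolves to the existing row instead of creating a duplicate.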
Configurable Parameters
- db_path
- AI_RSS_DB_PATH (recommended absolute path in multi-agent runtime)
- opml_path
- feed_urls
- max_feeds_per_run
- max_items_per_feed
- user_agent
- seen_ttl_days
- enable_conditional_get
- Example config: assets/config.example.json
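As an illustration of how user_agent and enable_conditional_get might shape a request, the sketch below builds a conditional GET with the standard library. The function name `build_feed_request`, the config dict shape, and the default User-Agent string are assumptions; the real script may rely on feedparser's own HTTP handling instead.

```python
import urllib.request


def build_feed_request(feed_url, config, etag=None, last_modified=None):
    """Build a GET request, adding validators when conditional GET is enabled.

    config is a plain dict using the parameter names listed above; the
    fallback values here are hypothetical.
    """
    headers = {"User-Agent": config.get("user_agent", "rss-fetch-sketch/0.1")}
    if config.get("enable_conditional_get", True):
        if etag:
            headers["If-None-Match"] = etag
        if last_modified:
            headers["If-Modified-Since"] = last_modified
    return urllib.request.Request(feed_url, headers=headers)
```

The etag and last_modified values would come from the feeds table after a previous successful fetch. A server that honors these validators can answer 304 Not Modified, which lets the sync skip parsing entirely. feedparser's parse() function also accepts etag and modified arguments for the same purpose.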
Error and Boundary Handling
- Feed HTTP/network failure: keep syncing other feeds and record last_error.
- Feed 304 Not Modified: skip entry parsing and keep state.
- Missing guid and link: use hashed fallback identity and set match_confidence=low.
- Dependency missing (feedparser): return install guidance.
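The fallback rule for entries without guid and link might look like the following sketch. The function name `identity_candidates`, the fields hashed for the fallback, the use of the raw link as canonical_url, and the "high" confidence label are all assumptions; the source only specifies that the fallback case sets match_confidence=low.

```python
import hashlib


def identity_candidates(entry):
    """Return (identities, match_confidence) for a parsed entry dict.

    identities is an ordered list of (key_type, key_value), strongest first.
    """
    guid = (entry.get("guid") or "").strip()
    link = (entry.get("link") or "").strip()
    keys = []
    if guid:
        keys.append(("guid", guid))
    if link:
        # Assumption: no URL canonicalization is applied in this sketch.
        keys.append(("canonical_url", link))
    if keys:
        return keys, "high"  # "high" is a hypothetical label
    # Neither guid nor link: hash whatever stable fields remain.
    payload = "\n".join(
        (entry.get(field) or "").strip()
        for field in ("feed_url", "title", "published_at")
    )
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return [("fallback_hash", digest)], "low"
```

Marking fallback matches as low confidence lets downstream consumers treat them with more suspicion, since two distinct posts with the same title and timestamp would collide.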
Final Output Checklist (Required)
- core goal
- trigger conditions
- input requirements
- metadata schema
- dedupe and sync rules
- command workflow
- configurable parameters
- error handling
Use the following simplified checklist verbatim when the user requests it:
核心目标
输入需求
触发条件
元数据模型
去重与同步规则
命令流程
可配置参数
错误处理
References
- references/input-model.md
- references/output-rules.md
- references/time-range-rules.md
Assets
- assets/hn-popular-blogs-2025.opml (candidate feed pool)
- assets/config.example.json
Scripts
scripts/rss_subscribe.py