dump-channel

Dump Telegram Channel History

Archive a complete Telegram channel/group/chat to NDJSON + downloaded media files.

Self-Evolving Skill: This skill improves through use. If instructions are wrong, parameters drifted, or a workaround was needed — fix this file immediately, don't defer. Only update for real, reproducible issues.

Preflight

  1. Session must exist: ~/.local/share/telethon/<profile>.session
    • If missing, run /tlg:setup first
  2. User must be subscribed to (or a member of) the target channel/chat
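
The first preflight item can be checked programmatically. The sketch below is only an illustration: `session_path` and `session_exists` are hypothetical helper names, not part of tg-cli.py, and the directory is taken from the path shown in step 1.

```python
from pathlib import Path

# Session directory as given in the Preflight step; adjust if your setup differs.
SESSION_DIR = Path.home() / ".local" / "share" / "telethon"


def session_path(profile: str, session_dir: Path = SESSION_DIR) -> Path:
    """Return the expected Telethon session file for a profile name."""
    return session_dir / f"{profile}.session"


def session_exists(profile: str, session_dir: Path = SESSION_DIR) -> bool:
    """True if the session file is present; if False, run /tlg:setup first."""
    return session_path(profile, session_dir).is_file()
```

Membership in the target chat (item 2) cannot be verified from the filesystem and still has to be confirmed in Telegram itself.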

Usage

/usr/bin/env bash << 'EOF'
SCRIPT="${CLAUDE_PLUGIN_ROOT:-$HOME/.claude/plugins/marketplaces/cc-skills/plugins/tlg}/scripts/tg-cli.py"

# Full dump: NDJSON + all media (photos, videos, documents)
uv run --python 3.13 "$SCRIPT" dump @ChannelName ./output/ChannelName

# NDJSON only (skip media downloads — much faster)
uv run --python 3.13 "$SCRIPT" dump @ChannelName ./output/ChannelName --no-media

# Dump by numeric chat ID
uv run --python 3.13 "$SCRIPT" dump -1001234567890 ./output/MyChannel

# Use a different profile
uv run --python 3.13 "$SCRIPT" -p missterryli dump @ChannelName ./output/ChannelName
EOF

Parameters

Parameter     Type         Description
chat          string/int   Channel username (@name) or numeric chat ID
output        path         Output directory (messages.ndjson + media/ created inside)
--no-media    flag         Skip media downloads, produce NDJSON only

Output Structure

output/ChannelName/
├── messages.ndjson   ← one JSON object per line, chronological (oldest first)
└── media/
    ├── 6.jpg         ← named by message ID for cross-referencing
    ├── 12.png
    ├── 45.mp4
    └── ...
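
Because media files are named by message ID and each record carries a `media_file` field, the two halves of the output can be cross-checked. The following is a minimal sketch, assuming the layout shown above; `missing_media` is a hypothetical helper, not something tg-cli.py provides.

```python
import json
from pathlib import Path


def missing_media(dump_dir: Path) -> list[str]:
    """List media_file names referenced in messages.ndjson but absent from media/."""
    missing = []
    with (dump_dir / "messages.ndjson").open(encoding="utf-8") as fh:
        for line in fh:
            if not line.strip():
                continue  # tolerate blank lines
            record = json.loads(line)
            name = record.get("media_file")
            # Records without media are assumed to have media_file set to null.
            if name and not (dump_dir / "media" / name).exists():
                missing.append(name)
    return missing
```

For example, `missing_media(Path("./output/ChannelName"))` returning an empty list would suggest every referenced file is on disk.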

NDJSON Record Schema

Each line is a JSON object with these fields:

Field              Type          Description
id                 int           Telegram message ID
date               string        ISO 8601 timestamp with timezone
text               string/null   Full message text (no truncation)
has_media          bool          Whether message contains media
media_type         string/null   Telethon class name (MessageMediaPhoto, etc.)
media_file         string/null   Filename in media/ dir (e.g., "6.jpg")
views              int/null      View count (channels only)
forwards           int/null      Forward count
reply_to_msg_id    int/null      Parent message ID if reply
grouped_id         int/null      Album group ID (shared across album messages)
edit_date          string/null   ISO 8601 timestamp of last edit
sender.id          int           Sender's Telegram user/channel ID
sender.name        string        Display name (channel title or user first name)
sender.username    string/null   @username if set
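
As an illustration of how the schema can be used, the sketch below groups album messages by their shared `grouped_id`. It relies only on the `id` and `grouped_id` fields listed above; `group_albums` is a hypothetical helper name.

```python
from collections import defaultdict
from typing import Iterable


def group_albums(records: Iterable[dict]) -> dict[int, list[int]]:
    """Map each grouped_id to the message IDs that share it, in input order."""
    albums: dict[int, list[int]] = defaultdict(list)
    for record in records:
        gid = record.get("grouped_id")
        if gid is not None:  # null means the message is not part of an album
            albums[gid].append(record["id"])
    return dict(albums)
```

Records can be fed in by parsing each line of messages.ndjson with `json.loads`. Since the file is chronological, the IDs within each album come out oldest first.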

Resume Support

Re-running the same command skips media files that already exist on disk (the script checks dest.exists() before each download), so an interrupted dump can be resumed safely. messages.ndjson is rewritten in full on every run.
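
The skip-if-present behaviour described above can be sketched as follows. This is not the code from tg-cli.py, only an illustration of the `dest.exists()` pattern; `download_if_missing` and the `fetch` callback are hypothetical names.

```python
from pathlib import Path
from typing import Callable


def download_if_missing(dest: Path, fetch: Callable[[Path], None]) -> bool:
    """Call fetch(dest) only when dest is absent; return True if a download ran."""
    if dest.exists():
        return False  # already on disk from an earlier run
    dest.parent.mkdir(parents=True, exist_ok=True)
    fetch(dest)
    return True
```

An existence check alone cannot tell a complete file from one cut off mid-write, so if a file looks truncated after an interrupted run, deleting it before re-running should force a fresh download.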

Querying the Output

# jq: find all GOLD BUY signals with chart screenshots
jq 'select(.text != null and (.text | test("GOLD.*BUY")) and .media_file != null)' messages.ndjson

# DuckDB: aggregate by date
duckdb -c "SELECT date::DATE as day, count(*) FROM read_ndjson('messages.ndjson') GROUP BY day ORDER BY day"

# Python/Polars
import polars as pl
df = pl.read_ndjson("messages.ndjson")
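
Where neither jq nor DuckDB is available, the same filter as the jq example can be approximated with the Python standard library. This is a sketch; `signals_with_media` is a hypothetical helper, and the regular expression is copied from the jq query above.

```python
import json
import re
from typing import Iterable, Iterator

# Same pattern as the jq example; re.search is unanchored, like jq's test().
PATTERN = re.compile(r"GOLD.*BUY")


def signals_with_media(lines: Iterable[str]) -> Iterator[dict]:
    """Yield records whose text matches PATTERN and that have a media file."""
    for line in lines:
        if not line.strip():
            continue  # tolerate blank lines
        record = json.loads(line)
        text = record.get("text")
        if text is not None and PATTERN.search(text) and record.get("media_file") is not None:
            yield record
```

Passing an open file handle works because a file iterates line by line, for example `list(signals_with_media(open("messages.ndjson", encoding="utf-8")))`.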

Performance Notes

  • A dump of ~3,000 messages and ~1,700 media files takes roughly 3-5 minutes
  • Telegram may briefly disconnect mid-download (Server closed the connection) — Telethon auto-reconnects
  • For very large channels (10k+ messages), expect 10-15 minutes with media

Recommended Storage Pattern

For git-tracked projects, gitignore the media folder:

# data/telegram/.gitignore
*/media/

This keeps the NDJSON (metadata) in version control while keeping large media files local-only.

Anti-Patterns

  • Don't dump channels you're not subscribed to — Telethon needs access via your account
  • Don't run multiple dumps concurrently on the same profile — session file contention

Post-Execution Reflection

After this skill completes, check before closing:

  1. Did the command succeed? — If not, fix the instruction or error table that caused the failure.
  2. Did parameters or output change? — If tg-cli.py's interface drifted, update Usage examples and Parameters table to match.
  3. Was a workaround needed? — If you had to improvise (different flags, extra steps), update this SKILL.md so the next invocation doesn't need the same workaround.

Only update if the issue is real and reproducible — not speculative.
