backfilling-atproto
Backfilling ATProto Records
Guide for retrieving historical records from ATProto for custom collections.
Key Insight: PDS vs Public API
Public API (public.api.bsky.app):
- Works for app.bsky.* collections
- Returns 404 for custom collections (e.g., network.comind.*)
PDS Direct (e.g., comind.network):
- Works for ALL collections including custom ones
- Required for network.comind.*, stream.thought.*, etc.
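The rule above can be written as a small routing helper. This is a minimal sketch, not part of the original guide: the name `base_url_for` and the `PUBLIC_API` constant are hypothetical, and the prefix check simply mirrors the two bullet lists.

```python
# Hypothetical helper: choose which host to query for a collection NSID.
PUBLIC_API = "https://public.api.bsky.app"


def base_url_for(collection: str, pds: str) -> str:
    """Return the public API for app.bsky.* collections, else the PDS."""
    # app.bsky.* is served by the public API; custom namespaces such as
    # network.comind.* are only available from the account's own PDS.
    if collection.startswith("app.bsky."):
        return PUBLIC_API
    return pds
```

Since the PDS works for all collections, always querying the PDS is the simpler choice when only one code path is wanted.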
Finding the PDS
- Resolve handle to DID:
import httpx

resp = httpx.get(
    "https://public.api.bsky.app/xrpc/app.bsky.actor.getProfile",
    params={"actor": "central.comind.network"},
)
resp.raise_for_status()
did = resp.json()["did"]
- Get DID document to find PDS:
# plc.directory only resolves did:plc identifiers
resp = httpx.get(f"https://plc.directory/{did}")
resp.raise_for_status()
# Assumes the first service entry is the PDS
pds = resp.json()["service"][0]["serviceEndpoint"]
# Returns: https://comind.network
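Taking `service[0]` assumes the PDS is the first entry in the DID document. A slightly more defensive sketch selects the entry by its `#atproto_pds` id instead. The function name `pds_from_did_doc` is hypothetical, and the sample document in the usage lines is made up for illustration.

```python
def pds_from_did_doc(did_doc: dict) -> str:
    """Return the PDS endpoint from a DID document, or raise if absent."""
    for service in did_doc.get("service", []):
        # atproto labels the PDS entry "#atproto_pds"; endswith also
        # accepts ids written with the DID prefixed to the fragment.
        if service.get("id", "").endswith("#atproto_pds"):
            return service["serviceEndpoint"].rstrip("/")
    raise ValueError("no #atproto_pds service entry in DID document")


# Usage with an illustrative (made-up) DID document:
example_doc = {
    "id": "did:plc:example",
    "service": [
        {
            "id": "#atproto_pds",
            "type": "AtprotoPersonalDataServer",
            "serviceEndpoint": "https://comind.network",
        }
    ],
}
pds = pds_from_did_doc(example_doc)
```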
Listing Records
def list_records(pds: str, did: str, collection: str):
    """List all records in a collection with pagination."""
    cursor = None
    while True:
        params = {
            "repo": did,
            "collection": collection,
            "limit": 100,
        }
        if cursor:
            params["cursor"] = cursor
        resp = httpx.get(
            f"{pds}/xrpc/com.atproto.repo.listRecords",
            params=params,
            timeout=30,
        )
        resp.raise_for_status()
        data = resp.json()
        for record in data.get("records", []):
            yield record
        cursor = data.get("cursor")
        if not cursor:
            break
Record Structure
Each record from listRecords:
{
    "uri": "at://did:plc:.../network.comind.thought/3md...",
    "cid": "bafyrei...",
    "value": {
        "$type": "network.comind.thought",
        "thought": "The actual content...",
        "createdAt": "2026-01-24T02:03:05.714Z",
        # ... other fields
    }
}
Access fields:
uri = record["uri"]
rkey = uri.split("/")[-1]
content = record["value"]
created = content.get("createdAt")
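The `uri` field has the shape `at://<repo>/<collection>/<rkey>`, so all three parts can be pulled out at once rather than only the record key. The following is a minimal sketch; `AtUri` and `parse_at_uri` are hypothetical names, and the URI in the usage line is a made-up example.

```python
from typing import NamedTuple


class AtUri(NamedTuple):
    repo: str        # DID (or handle) of the repository
    collection: str  # collection NSID, e.g. network.comind.thought
    rkey: str        # record key


def parse_at_uri(uri: str) -> AtUri:
    """Split a record-level AT URI into repo, collection, and rkey."""
    prefix = "at://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an AT URI: {uri!r}")
    parts = uri[len(prefix):].split("/")
    # A record URI has exactly three non-empty path segments.
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected at://repo/collection/rkey, got {uri!r}")
    return AtUri(*parts)


# Usage with an illustrative URI:
parsed = parse_at_uri("at://did:plc:abc123/network.comind.thought/3mdxyz")
```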
Backfill Pattern
import httpx

PDS = "https://comind.network"
COLLECTIONS = [
    "network.comind.concept",
    "network.comind.thought",
    "network.comind.memory",
]

def backfill_account(did: str):
    for collection in COLLECTIONS:
        print(f"Backfilling {collection}...")
        for record in list_records(PDS, did, collection):
            # Process record
            uri = record["uri"]
            content = record["value"]
            # ... store, index, etc.
Common Custom Collections
- network.comind.* - comind collective (PDS: comind.network)
- stream.thought.* - void's cognition (PDS: bsky.social or custom)
Notes
- Always check whether a record already exists before processing it (idempotency)
- Use cursor for pagination (don't re-fetch all records)
- Rate limit: ~100 requests/minute is safe for most PDSs
- For large backfills, add delays between requests
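The idempotency and rate-limit notes can be sketched as two small helpers. Everything here is an assumption for illustration: the names `unseen` and `Throttle` are hypothetical, the in-memory `seen` set stands in for whatever store the backfill writes to, and the default of 100 requests per minute is just the figure from the note above.

```python
import time
from typing import Callable, Iterable, Iterator, Optional


def unseen(records: Iterable[dict], seen: set) -> Iterator[dict]:
    """Yield only records whose (uri, cid) pair has not been seen yet."""
    for record in records:
        # Keying on uri plus cid lets an updated record (new cid) through.
        key = f'{record["uri"]}@{record["cid"]}'
        if key in seen:
            continue
        seen.add(key)
        yield record


class Throttle:
    """Enforce a minimum interval between successive wait() calls."""

    def __init__(
        self,
        per_minute: float = 100,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = 60.0 / per_minute
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None

    def wait(self) -> None:
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

One way to use these is to call `Throttle.wait()` immediately before each HTTP request and to wrap the record stream in `unseen(...)` before storing or indexing.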