enrich
Enrich Skill
Enrich person and company pages from external sources. Scale effort to importance.
Contract
This skill guarantees:
- Every enriched page has compiled truth (State section) with inline citations
- Every enriched page has a timeline with dated entries
- Back-links are created bidirectionally
- Tiered enrichment: Tier 1 (full), Tier 2 (medium), Tier 3 (minimal) based on notability
- No stubs: every new page has meaningful content from web search or existing brain context
Filing rule: Read skills/_brain-filing-rules.md before creating any new page.
Convention: See skills/conventions/quality.md for Iron Law back-linking.
Every mention of a person or company with a brain page MUST create a back-link
FROM that entity's page TO the page mentioning them. An unlinked mention is a
broken brain. See skills/_brain-filing-rules.md for format.
Philosophy
A brain page should read like an intelligence dossier, not a LinkedIn scrape. Facts are table stakes. Texture is the value -- what do they believe, what are they building, what makes them tick, where are they headed.
Citation Requirements (MANDATORY)
Convention: see skills/conventions/quality.md for citation formats and source precedence.
When sources conflict, note the contradiction with both citations.
When To Enrich
Primary triggers
- User mentions an entity in conversation
- Entity appears in a meeting transcript or email
- New contact appears with significant context
- Entity makes news or has a major event
- Any ingest pipeline encounters a notable entity
Do NOT enrich
- Random mentions with no relationship signal
- Bot/spam accounts
- Entities with no substantive connection to the user's work
- Same page enriched within the past week (unless new signal warrants it)
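The skip rules above can be sketched as a small gate. This is a minimal illustration, not part of gbrain: the function name and its boolean inputs are hypothetical, and "within the past week" is assumed to mean fewer than 7 days since the last enrichment.

```python
from datetime import date, timedelta
from typing import Optional


def should_enrich(
    has_relationship_signal: bool,
    is_bot_or_spam: bool,
    last_enriched: Optional[date],
    today: date,
    new_signal: bool = False,
) -> bool:
    """Return True when none of the 'Do NOT enrich' rules apply (hypothetical helper)."""
    # Bots, spam, and mentions with no relationship signal are never enriched.
    if is_bot_or_spam or not has_relationship_signal:
        return False
    # A page enriched less than 7 days ago is only revisited on new signal.
    if last_enriched is not None and today - last_enriched < timedelta(days=7):
        return new_signal
    return True
```

How "relationship signal" and "new signal" get decided is left to the caller; the sketch only encodes the order of the checks.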
Enrichment Tiers
Scale enrichment to importance. Don't waste API calls on low-value entities.
| Tier | Who | Effort | Sources |
|---|---|---|---|
| 1 (key) | Inner circle, close collaborators, key contacts | Full pipeline | All available APIs + deep web research |
| 2 (notable) | Occasional interactions, industry figures | Moderate | Web research + social + brain cross-ref |
| 3 (minor) | Worth tracking, not critical | Light | Brain cross-ref + social lookup if handle known |
The Enrichment Protocol (7 Steps)
Step 1: Identify entities
Extract people, companies, concepts from the incoming signal.
Step 2: Check brain state
For each entity:
- gbrain search "name" -- does a page already exist?
- If yes: UPDATE path (add new signal, update compiled truth if material)
- If no: CREATE path (check notability gate first, then create)
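The branch in Step 2 can be sketched as follows. Both callables are assumptions for illustration: `search` stands in for whatever wraps gbrain search and returns matching pages, and `is_notable` stands in for the notability gate defined in the filing rules.

```python
from typing import Callable, Sequence


def choose_path(
    name: str,
    search: Callable[[str], Sequence[str]],
    is_notable: Callable[[str], bool],
) -> str:
    """Return 'update', 'create', or 'skip' for one entity (hypothetical helper)."""
    if search(name):
        # A page already exists: add new signal to it.
        return "update"
    if is_notable(name):
        # No page yet, and the notability gate passes.
        return "create"
    # No page and not notable: do not create a stub.
    return "skip"
```

Injecting the two callables keeps the decision logic testable without a live brain.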
Step 3: Extract signal from source
Don't just capture facts. Capture texture:
| Signal Type | What to Extract |
|---|---|
| Opinions, beliefs | What They Believe section |
| Current projects, features shipped | What They're Building section |
| Ambition, career arc, motivation | What Motivates Them section |
| Topics they return to obsessively | Hobby Horses section |
| Who they amplify, argue with, respect | Network / Relationships |
| Ascending, plateauing, pivoting? | Trajectory section |
| Role, company, funding, location | State section (hard facts) |
Step 4: External data source lookups
Priority order -- stop when you have enough signal for the entity's tier.
4a. Brain cross-reference (always, all tiers)
- gbrain search "name" and gbrain query "what do we know about name"
- Check related pages: company pages for person enrichment and vice versa
- This is free and often the richest source
4b. Web research (Tier 1 and 2)
- Use Perplexity, Brave Search, Exa, or equivalent web research tool
- Key pattern: Send existing brain knowledge as context so the search returns DELTA (what's new vs what you already know), not a rehash
- Opus-class models for Tier 1 deep research, lighter models for Tier 2
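One way to apply the delta pattern is to prepend what the brain already holds to the research request. The prompt wording and the function name below are illustrative assumptions, not a format any particular search tool requires.

```python
from typing import List


def build_delta_prompt(name: str, known_facts: List[str]) -> str:
    """Build a research prompt that asks only for what is new (hypothetical helper)."""
    # List existing brain knowledge so the tool can skip it.
    known = "\n".join(f"- {fact}" for fact in known_facts) or "- (nothing yet)"
    return (
        f"Research {name}.\n"
        f"We already know:\n{known}\n"
        "Report only what is new or contradicts the list above, with sources."
    )
```

The returned string would be passed to whichever web research tool is in use.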
4c. Social media lookup (all tiers when handle known)
- Pull recent posts/tweets for tone, interests, current focus
- Social media is the highest-texture signal for what someone actually thinks
4d. People enrichment APIs (Tier 1)
- LinkedIn data, career history, connections, education
4e. Company enrichment APIs (Tier 1)
- Company data, financials, headcount, key hires, recent news
| Data Need | Example Sources | Tier |
|---|---|---|
| Web research | Perplexity, Brave, Exa | 1-2 |
| LinkedIn / career | Crustdata, Proxycurl, People Data Labs | 1 |
| Career history | Happenstance, LinkedIn | 1 |
| Funding / company data | Crunchbase, PitchBook, Clearbit | 1 |
| Social media | Platform APIs, web scraping | 1-3 |
| Meeting history | Calendar/meeting transcript tools | 1-2 |
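The priority order and the "stop when you have enough signal" rule can be sketched as a loop over steps 4a to 4e. The step labels, `fetch`, and `enough` are hypothetical names for illustration; what counts as "enough" for a tier is left to the caller.

```python
from typing import Callable, List

# (step label, tiers it applies to), in the priority order of Step 4.
LOOKUP_STEPS = [
    ("brain_cross_reference", {1, 2, 3}),  # 4a: always, all tiers
    ("web_research", {1, 2}),              # 4b
    ("social_media", {1, 2, 3}),           # 4c: only when a handle is known
    ("people_api", {1}),                   # 4d
    ("company_api", {1}),                  # 4e
]


def run_lookups(
    tier: int,
    fetch: Callable[[str], List[str]],
    enough: Callable[[int, List[str]], bool],
    handle_known: bool = False,
) -> List[str]:
    """Run lookups in priority order and return the steps actually used."""
    used: List[str] = []
    collected: List[str] = []
    for step, tiers in LOOKUP_STEPS:
        if tier not in tiers:
            continue
        if step == "social_media" and not handle_known:
            continue
        collected.extend(fetch(step))
        used.append(step)
        # Stop as soon as the caller judges the signal sufficient for this tier.
        if enough(tier, collected):
            break
    return used
```

Returning the list of steps used also gives the sweep report a record of which data sources were called.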
Step 5: Save raw data (preserves provenance)
Store raw API responses via put_raw_data in gbrain:
{
  "source": "crustdata",
  "fetched_at": "2026-04-11T...",
  "query": "jane doe",
  "data": { ... }
}
Raw data preserves provenance. If the compiled truth is ever questioned, the raw data shows exactly what the API returned.
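A record with the four fields shown above could be assembled like this. The helper name is hypothetical, and the sketch stops short of calling put_raw_data because its exact parameters are not given here; it also assumes an ISO 8601 UTC timestamp is acceptable for fetched_at.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def build_raw_record(
    source: str,
    query: str,
    data: Dict[str, Any],
    fetched_at: Optional[datetime] = None,
) -> Dict[str, Any]:
    """Wrap an untouched API response with provenance fields (hypothetical helper)."""
    # Default to the current time in UTC when no timestamp is supplied.
    fetched_at = fetched_at or datetime.now(timezone.utc)
    return {
        "source": source,
        "fetched_at": fetched_at.isoformat(),
        "query": query,
        "data": data,  # stored as returned, with no cleanup or summarizing
    }
```

Keeping `data` unmodified is the point: the compiled truth can then always be checked against exactly what the API returned.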
Step 6: Write to brain
CREATE path
- Check notability gate (see skills/_brain-filing-rules.md)
- Check filing rules -- where does this entity go?
- Create page with the appropriate template (below)
- Fill compiled truth with citations
- Add first timeline entry
- Leave empty sections as [No data yet] (don't fill with boilerplate)
UPDATE path
- Add new timeline entries in reverse-chronological order (newest first); the timeline is append-only, so never rewrite or delete existing entries
- Update compiled truth ONLY if the new signal materially changes the picture
- Update State section with new facts
- Flag contradictions between new signal and existing compiled truth
- Don't overwrite user-written assessments with API boilerplate
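The timeline rule in the UPDATE path can be sketched as below, using the entry shape from the page templates (a bold date, a pipe, the event, then a [Source: ...] citation). Both function names are hypothetical, and the source text is treated as an opaque string because the citation format lives in skills/conventions/quality.md.

```python
from datetime import date
from typing import List


def format_entry(day: date, event: str, source: str) -> str:
    """Render one timeline line in the template's shape."""
    return f"- **{day.isoformat()}** | {event} [Source: {source}]"


def add_entry(timeline: List[str], day: date, event: str, source: str) -> List[str]:
    """Return a new timeline, newest first; existing entries are left untouched."""
    entry = format_entry(day, event, source)
    # Characters 4..13 of each line hold the YYYY-MM-DD date, which sorts lexically.
    return sorted(timeline + [entry], key=lambda line: line[4:14], reverse=True)
```

This only handles the in-page text; it does not replace the explicit timeline calls the brain requires.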
Person page template
---
title: Full Name
type: person
created: YYYY-MM-DD
updated: YYYY-MM-DD
tags: []
company: Current Company
relationship: How the user knows them
email:
linkedin:
twitter:
location:
---
# Full Name
> 1-paragraph executive summary: HOW do you know them, WHY do they matter,
> what's the current state of the relationship.
## State
Role, company, key context. Hard facts only.
## What They Believe
Ideology, first principles, worldview. What hills do they die on?
## What They're Building
Current projects, recent launches, what they're focused on.
## What Motivates Them
Ambition, career arc, what drives them.
## Hobby Horses
Topics they return to obsessively. Recurring themes in their work/posts.
## Assessment
Your read on this person. Strengths, gaps, trajectory.
## Trajectory
Ascending, plateauing, pivoting, declining? Where are they headed?
## Relationship
History of interactions, shared context, relationship quality.
## Contact
Email, social handles, preferred communication channel.
## Network
Key connections, mutual contacts, organizational relationships.
## Open Threads
Active conversations, pending items, things to follow up on.
---
## Timeline
Reverse chronological. Every entry has a date and [Source: ...] citation.
- **YYYY-MM-DD** | Event description [Source: ...]
Company page template
---
title: Company Name
type: company
created: YYYY-MM-DD
updated: YYYY-MM-DD
tags: []
---
# Company Name
> 1-paragraph executive summary.
## State
What they do, stage, key people, key metrics, your connection.
## Open Threads
Active items, pending decisions, things to track.
---
## Timeline
- **YYYY-MM-DD** | Event description [Source: ...]
Step 7: Cross-reference
- Update company pages from person enrichment (and vice versa)
- Update related project/deal pages if relevant context surfaced
- Check index files if the brain uses them
Note (v0.10.1): Links between brain pages are auto-created on every
put_page call (auto-link post-hook). Step 7 focuses on content
cross-references (updating related pages' compiled truth with new signal
from this enrichment), not on creating links. Verify via the auto_links
field in the put_page response ({ created, removed, errors }).
Timeline entries still need explicit gbrain timeline-add calls.
Bulk Enrichment Rules
- Test on 3-5 entities first. Read actual output. Check quality.
- Only proceed to bulk after test shots pass your quality bar.
- 3+ entities from one source -> batch process or spawn sub-agent
- Throttle API calls. Respect rate limits.
- Commit every 5-10 entities during bulk runs.
- Save a report after bulk enrichment (see Report Storage below).
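The test-first and commit-cadence rules above can be sketched as a batch planner. The function name and the "test"/"commit" labels are illustrative assumptions; the defaults pick the low end of the 3-5 and 5-10 ranges.

```python
from typing import Iterator, List, Sequence, Tuple


def plan_bulk_run(
    entities: Sequence[str],
    test_size: int = 3,
    commit_every: int = 5,
) -> Iterator[Tuple[str, List[str]]]:
    """Yield a test batch first, then commit-sized batches (hypothetical helper)."""
    # Test shots: enrich these, read the output, and only continue if quality holds.
    yield ("test", list(entities[:test_size]))
    rest = entities[test_size:]
    for start in range(0, len(rest), commit_every):
        # Commit after each of these batches during the bulk phase.
        yield ("commit", list(rest[start:start + commit_every]))
```

Because this is a generator, the caller can stop after the test batch if the output fails the quality bar, and can add throttling between batches.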
Validation Rules
- Connection count < 20 on LinkedIn = likely wrong person, skip
- Name mismatch between brain and API = skip, flag for review
- Joke profiles or obviously wrong data = save to raw, don't update page
- Don't overwrite user-written assessments with API boilerplate
- When in doubt: save raw data but don't update brain page
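These rules can be read as a decision function. The sketch below is an assumption-laden illustration: the function name and return labels are hypothetical, name matching is reduced to a case-insensitive comparison, and detecting a joke profile is left to the caller via `looks_bogus`.

```python
from typing import Optional


def validate(
    connection_count: Optional[int],
    brain_name: str,
    api_name: str,
    looks_bogus: bool = False,
) -> str:
    """Return 'skip', 'flag_for_review', 'raw_only', or 'update_page'."""
    # Fewer than 20 LinkedIn connections: likely the wrong person.
    if connection_count is not None and connection_count < 20:
        return "skip"
    # Name mismatch between brain and API: skip the update and flag it.
    if brain_name.strip().lower() != api_name.strip().lower():
        return "flag_for_review"
    # Joke profiles or obviously wrong data: keep the raw record, leave the page alone.
    if looks_bogus:
        return "raw_only"
    return "update_page"
```

A real matcher would likely need to tolerate nicknames and middle names, which is why a mismatch is flagged for review rather than silently dropped.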
Report Storage
After enrichment sweeps, save a report:
- Number of entities processed
- New pages created vs existing updated
- Data sources called and results quality
- Notable discoveries or contradictions
- Validation flags or API failures
This creates an audit trail for brain enrichment over time.
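The report fields listed above map naturally onto a small record type. The class and field names below are hypothetical; where and in what format the report is saved is not specified here.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class EnrichmentReport:
    """One sweep's audit record (hypothetical structure)."""
    entities_processed: int = 0
    pages_created: int = 0
    pages_updated: int = 0
    sources: Dict[str, str] = field(default_factory=dict)  # source -> quality note
    discoveries: List[str] = field(default_factory=list)   # notable finds, contradictions
    flags: List[str] = field(default_factory=list)         # validation flags, API failures

    def summary(self) -> str:
        """One-line overview for the top of the saved report."""
        return (
            f"{self.entities_processed} processed, {self.pages_created} created, "
            f"{self.pages_updated} updated, {len(self.flags)} flags"
        )
```

For example, `EnrichmentReport(entities_processed=12, pages_created=4, pages_updated=8).summary()` gives a one-line overview of the sweep.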
Anti-Patterns
- Creating stub pages with no content
- Enriching without checking brain first
- Overwriting user's direct statements with API data
- Creating pages for non-notable entities
Output Format
An enriched person page contains:
- Frontmatter with type, tags, company, relationship, and contact fields
- Executive summary (1 paragraph: how you know them, why they matter, relationship state)
- State section with hard facts and inline [Source: ...] citations
- Texture sections (What They Believe, What They're Building, What Motivates Them, Hobby Horses)
- Assessment with trajectory read
- Relationship history and contact info
- Network connections and mutual contacts
- Timeline in reverse chronological order, every entry dated with source citation
An enriched company page contains:
- Frontmatter with type and tags
- Executive summary (1 paragraph)
- State section (what they do, stage, key people, metrics, your connection)
- Open Threads (active items, pending decisions)
- Timeline in reverse chronological order with dated, cited entries
Both page types have bidirectional back-links to every entity they mention.
Tools Used
- Read a page from gbrain (get_page)
- Store/update a page in gbrain (put_page)
- Add a timeline entry in gbrain (add_timeline_entry)
- List pages in gbrain by type (list_pages)
- Store raw API data in gbrain (put_raw_data)
- Retrieve raw data from gbrain (get_raw_data)
- Link entities in gbrain (add_link)
- Check backlinks in gbrain (get_backlinks)