list-building
List Building
Build company lists using Extruct API methods, guided by a decision tree. Reads from the company context file for ICP and seed companies.
Official API Reference
Decision Tree
Before running any queries, determine the right approach:
Have a seed company from win cases or context file?
YES → Method 1: Lookalike Search (pass seed domain)
NO ↓
New vertical, need broad exploration?
YES → Method 2: Semantic Search (3-5 queries from different angles)
NO ↓
Need qualification against specific criteria?
YES → Method 3: Discovery API (criteria-scored async research)
NO ↓
Need maximum coverage?
YES → Combine Search + Discovery (~15% overlap expected)
Before You Start
Read the company context file if it exists:
claude-code-gtm/context/{company}_context.md
Extract:
- ICP profiles — for query design and filters
- Win cases — for seed companies in lookalike mode
- DNC list — domains to exclude from results. If no DNC list exists in the context file, ask the user to choose one of: (1) run an Extruct search for competitors to auto-populate it, (2) provide a CSV of existing customers/partners, or (3) skip for now
Also check for a hypothesis set at claude-code-gtm/context/{vertical-slug}/hypothesis_set.md. If it exists, use the Search angle field from each hypothesis to design search queries — these are pre-defined query suggestions tailored to each pain point.
Environment
| Variable | Service |
|---|---|
| `EXTRUCT_API_TOKEN` | Extruct API |
Before making API calls, check that `EXTRUCT_API_TOKEN` is set by running `test -n "$EXTRUCT_API_TOKEN" && echo "set" || echo "missing"`. If missing, ask the user to provide their Extruct API token and set it via `export EXTRUCT_API_TOKEN=<value>`. Do not proceed until confirmed.
Base URL: https://api.extruct.ai/v1
Method 1: Lookalike Search
Use when you have a seed company (from win cases, existing customers, or user input).
Endpoint: GET /companies/{identifier}/similar where identifier is a domain or company UUID.
Key params:
- `filters` — JSON with `include` (size, country) and `range` (founded)
- `limit` — max results (default 100, up to 200)
- `offset` — for pagination
Response fields: name, domain, short_description, founding_year, employee_count, hq_country, hq_city, relevance_score
When to use lookalike:
- You have a happy customer and want more like them
- Context file has win cases with domains
- User says "find companies similar to X"
Tips:
- Run multiple similar-company searches with different seed companies for broader coverage
- Combine with filters to constrain geography or size
- Deduplicate across runs by domain
- Default to `limit=100`; increase up to `200` when broader coverage is needed
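The lookalike call can be sketched as URL construction over the params above. The filter shape mirrors the listed params but exact accepted values should be checked against references/search-filters.md, and the seed domains here are hypothetical:

```python
import json
from urllib.parse import urlencode

BASE_URL = "https://api.extruct.ai/v1"

def build_similar_url(seed_domain, *, countries=None, limit=100, offset=0):
    """Build the GET /companies/{identifier}/similar URL for a seed domain.

    `countries` feeds the `include` block of the filters JSON; verify the
    accepted values against references/search-filters.md.
    """
    params = {"limit": limit, "offset": offset}
    if countries:
        # filters is passed as a JSON-encoded string
        params["filters"] = json.dumps({"include": {"country": countries}})
    return f"{BASE_URL}/companies/{seed_domain}/similar?{urlencode(params)}"

# One URL per seed company; merge and dedupe the results by domain afterwards
urls = [build_similar_url(seed, countries=["US"]) for seed in ("acme.com", "globex.com")]
```

The actual request would carry the `EXTRUCT_API_TOKEN` in an auth header; the exact header scheme is an assumption to confirm against the official docs.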
Method 2: Semantic Search — Fast, Broad
Endpoint: GET /companies/search
Key params:
- `q` — natural language query describing the target companies
- `filters` — JSON with `include` (size, country) and `range` (founded)
- `limit` — max results (default 100, up to 200)
Response fields: name, domain, short_description, founding_year, employee_count, hq_country, hq_city, relevance_score
Query strategy:
- Write 3-5 queries per campaign, each from a different angle on the same ICP
- Describe the product/use case, not the company type
- Deduplicate across queries by domain — overlap is expected
- Default to `limit=100` per query; increase up to `200` when needed
- Target 200-800 companies total across all queries
Filters: See references/search-filters.md
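The multi-query strategy above can be sketched as follows — the queries and size buckets are hypothetical placeholders, and the dedupe step assumes each result carries the documented `domain` field:

```python
import json
from urllib.parse import urlencode

BASE_URL = "https://api.extruct.ai/v1"

# Hypothetical queries: three angles on one ICP, for illustration only
QUERIES = [
    "platforms that automate invoice processing for mid-market finance teams",
    "AP automation software with native ERP integrations",
    "OCR-based document capture tools for accounting workflows",
]

def build_search_url(q, *, limit=100):
    """Build a GET /companies/search URL; size buckets are placeholders."""
    filters = json.dumps({"include": {"size": ["11-50", "51-200"]}})
    return f"{BASE_URL}/companies/search?{urlencode({'q': q, 'filters': filters, 'limit': limit})}"

def dedupe_by_domain(batches):
    """Merge per-query result batches, keeping the first hit per domain."""
    seen = {}
    for batch in batches:
        for company in batch:
            seen.setdefault(company["domain"], company)
    return list(seen.values())
```

Overlap between queries is expected, which is why dedup happens after all queries run rather than per query.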
Method 3: Discovery API — Deep, Qualified
Endpoint: POST /discovery_tasks
Key params:
- `query` — 2-3 sentence description of the ideal company (like a job description)
- `desired_num_results` — target result count (default 50 for first pass)
- `criteria` — list of `{ key, name, criterion }` objects for auto-grading (up to 5)
Poll: GET /discovery_tasks/{task_id} — status: created | in_progress | done | failed. Poll every 60 seconds.
Fetch results: GET /discovery_tasks/{task_id}/results with limit and offset params.
Response fields: company_name, company_website, company_description, relevance (0-100), scores (per-criteria grade 1-5 with explanation), founding_year
Query strategy:
- Write queries like a job description — 2-3 sentences describing the ideal company
- Use criteria to auto-qualify — each company gets graded 1-5 per criterion
- Default `desired_num_results=50` for first pass; expand after quality review
- Use up to 5 criteria per task; keep criteria focused and non-overlapping
- Run separate tasks for different ICP segments
- Scans many candidates to find qualified matches — runtime depends on query scope
- Up to 250 results per task
See references/discovery-api.md for full parameters.
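The create-then-poll flow can be sketched like this. The payload keys follow the params listed above; the status-fetching function is injected so the loop stays testable, and the example criterion is hypothetical:

```python
import time

def make_discovery_payload(query, criteria, desired_num_results=50):
    """Assemble the POST /discovery_tasks body from the params above."""
    if len(criteria) > 5:
        raise ValueError("Discovery supports up to 5 criteria per task")
    return {
        "query": query,
        "desired_num_results": desired_num_results,
        "criteria": criteria,
    }

def poll_until_done(fetch_status, task_id, interval=60, max_polls=120):
    """Poll the task every `interval` seconds until done or failed.

    `fetch_status` would wrap the authenticated
    GET /discovery_tasks/{task_id} call and return the status string.
    """
    status = "created"
    for _ in range(max_polls):
        status = fetch_status(task_id)
        if status in ("done", "failed"):
            return status
        time.sleep(interval)
    raise TimeoutError(f"task {task_id} still {status!r} after {max_polls} polls")
```

Once the status is `done`, page through GET /discovery_tasks/{task_id}/results with `limit` and `offset` until fewer than `limit` rows come back.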
Upload to Table
Create a company kind table via POST /tables with a single input column (kind: "input", key: "input"). Extruct auto-enriches each domain with a Company Profile.
Upload domains in batches of 50 via POST /tables/{table_id}/rows. Each row: { "data": { "input": "domain.com" } }. Add 0.5s delay between batches.
Pass "run": true in the rows payload to trigger agent columns on upload.
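The batched upload above can be sketched as follows. Each row's shape matches the documented `{ "data": { "input": "domain.com" } }`; the top-level `rows` key in the payload is an assumption to verify, and the HTTP call is injected so the batching logic stands alone:

```python
import time

def chunked(items, size=50):
    """Yield consecutive batches of `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def build_rows_payload(domains, run=True):
    """Body for POST /tables/{table_id}/rows.

    NOTE: the top-level "rows" key is an assumption; each row matches the
    documented { "data": { "input": "domain.com" } } shape, and "run": true
    triggers agent columns on upload.
    """
    return {"rows": [{"data": {"input": d}} for d in domains], "run": run}

def upload_all(post, table_id, domains):
    """Upload in batches of 50 with the 0.5s delay between batches."""
    for batch in chunked(domains, 50):
        post(f"/tables/{table_id}/rows", build_rows_payload(batch))
        time.sleep(0.5)
```

`post` would be a thin wrapper that adds the base URL and the API token to each request.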
Re-run After Enrichment
After the list-enrichment skill adds data points to this list, consider re-running list building using enrichment insights as Discovery criteria. For example:
- If enrichment reveals that "companies using legacy ERP" are the best fit, create a Discovery task with that as a criterion
- If enrichment shows a geographic cluster, run a Search with tighter geo filters
- This creates a feedback loop: list → enrich → learn → refine list
Result Size Guidance
| Campaign stage | Target list size | Method |
|---|---|---|
| Exploration | 50-100 | Search (2-3 queries) |
| First campaign | 200-500 | Search (5 queries) + Discovery |
| Scaling | 500-2000 | Discovery (high desired_num_results) + multiple Search |
Workflow
Step 0: Verify API reference
- Read local references: references/discovery-api.md, references/search-filters.md
- Fetch live docs: https://www.extruct.ai/docs
- Compare endpoints, params, and response fields
- If discrepancies found:
- Update the local reference file(s)
- Flag changes to the user before proceeding
- Proceed with the skill workflow
Steps
- Read context file for ICP, seed companies, and DNC list
- Follow the decision tree to pick the right method
- Draft queries (3-5 for Search, 1-2 for Discovery)
- Run queries and collect results
- Deduplicate across all results by domain
- Remove DNC domains
- Upload to Extruct company table for auto-enrichment
- Add agent columns if user needs custom research
- Ask user for preferred output: Extruct table link, local CSV, or both
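The dedupe and DNC-removal steps above can be sketched as a single pass, assuming each result carries a `domain` field as in the Search and Lookalike responses:

```python
def finalize_list(results, dnc_domains):
    """Dedupe merged results by domain, then drop DNC domains.

    Comparison is case-insensitive so "Acme.com" and "acme.com" collapse
    to one entry; the first occurrence of each domain wins.
    """
    dnc = {d.lower() for d in dnc_domains}
    seen, kept = set(), []
    for company in results:
        domain = company["domain"].lower()
        if domain in seen or domain in dnc:
            continue
        seen.add(domain)
        kept.append(company)
    return kept
```

Run this once over the combined output of all Search and Discovery runs, before uploading to the company table.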