findall-api

SKILL.md

FindAll API - Complete Reference

The FindAll API discovers and evaluates entities that match complex criteria from natural language objectives. Submit a high-level goal and the service automatically generates structured match conditions, discovers relevant candidates, and evaluates each against the criteria. Returns comprehensive results with detailed reasoning, citations, and confidence scores for each match decision.

Table of Contents


Quickstart

Basic FindAll Run

from parallel import Parallel

client = Parallel(api_key="your_api_key")

# Create a FindAll run
run = client.findall.runs.create(
    objective="Find all AI companies that raised Series A funding in 2024",
    entity_type="companies",
    match_conditions=[
        {
            "name": "developing_ai_products_check",
            "description": "Company must be developing artificial intelligence (AI) products"
        },
        {
            "name": "raised_series_a_2024_check",
            "description": "Company must have raised Series A funding in 2024"
        }
    ],
    generator="core",
    match_limit=50
)

# Poll for results
while run.status.is_active:
    run = client.findall.runs.retrieve(run.findall_id)
    time.sleep(5)

# Get final results
result = client.findall.runs.result(run.findall_id)
for candidate in result.candidates:
    if candidate.match_status == "matched":
        print(f"Match: {candidate.name} - {candidate.url}")

Using Ingest for Auto-Generation

# Let the API generate match conditions from your objective
schema = client.findall.ingest.create(
    objective="Find all AI companies that raised Series A funding in 2024"
)

# Review and customize the generated schema
print(f"Entity type: {schema.entity_type}")
print(f"Match conditions: {schema.match_conditions}")

# Create run with the generated schema
run = client.findall.runs.create(
    objective=schema.objective,
    entity_type=schema.entity_type,
    match_conditions=schema.match_conditions,
    generator="core",
    match_limit=50
)

Core Concepts

Candidates

A candidate represents a potential match for your FindAll objective. Candidates progress through different states during evaluation:

Match Statuses

  • generated: Candidate has been discovered but not yet evaluated
  • matched: Candidate satisfies all match conditions
  • unmatched: Candidate fails to satisfy one or more match conditions
  • discarded: Candidate was determined to be irrelevant or duplicate

Candidate Structure

{
  "candidate_id": "candidate_7594eb7c-4f4a-487f-9d0c-9d1e63ec240c",
  "name": "Cognition AI",
  "url": "cognition.ai",
  "description": "AI software engineering company",
  "match_status": "matched",
  "output": {
    "developing_ai_products_check": "yes",
    "raised_series_a_2024_check": "yes"
  },
  "basis": [
    {
      "field": "developing_ai_products_check",
      "citations": [
        {
          "title": "Cognition - Devin and Cognition AI",
          "url": "https://cognition.ai/",
          "excerpts": ["We're the makers of Devin..."]
        }
      ],
      "reasoning": "The search results repeatedly state that Cognition AI is an 'applied AI lab building the future of software engineering'...",
      "confidence": "high"
    }
  ]
}

Output Field

The output object contains the evaluation results for each match condition. Each field in output corresponds to a match condition name and contains the evaluation result (typically "yes" or "no").

Basis Field

The basis array provides evidence supporting each output field:

  • field: Name of the output field being supported
  • citations: Web sources with URLs, titles, and excerpts
  • reasoning: Explanation of how the evidence supports the conclusion
  • confidence: Confidence level (low/medium/high) when available

Lifecycle

FindAll runs progress through several states:

queued → running → completed
         action_required (if needed)
         cancelling → cancelled
           failed

Status Descriptions

  • queued: Run is waiting to start
  • action_required: Run needs user input (rare)
  • running: Actively discovering and evaluating candidates
  • completed: Run finished successfully
  • failed: Run encountered an error
  • cancelling: Cancellation in progress
  • cancelled: Run was cancelled by user

Run Object

{
  "findall_id": "findall_56ccc4d188fb41a0803a935cf485c774",
  "status": {
    "status": "running",
    "is_active": true,
    "metrics": {
      "generated_candidates_count": 10,
      "matched_candidates_count": 3
    }
  },
  "generator": "core",
  "metadata": {},
  "created_at": "2025-09-10T21:02:08.626446Z",
  "modified_at": "2025-09-10T21:02:08.627376Z"
}

Termination Reasons

When a run reaches a terminal state, termination_reason explains why:

  • match_limit_met: Found the requested number of matches
  • candidates_exhausted: No more candidates to evaluate
  • low_match_rate: Too few matches found relative to candidates evaluated
  • user_cancelled: User cancelled the run
  • error_occurred: System error
  • timeout: Run exceeded time limit

Generators & Pricing

Generators control the quality, speed, and cost of FindAll runs:

Generator Quality Speed Price per Match
base Good Fast $0.30
core Better Moderate $1.50
pro Best Slower $3.00
preview Experimental Varies $1.50

Selecting a Generator

# Fast and economical
run = client.findall.runs.create(
    objective="...",
    generator="base",
    match_limit=100
)

# Balanced quality and cost (recommended)
run = client.findall.runs.create(
    objective="...",
    generator="core",
    match_limit=50
)

# Maximum quality
run = client.findall.runs.create(
    objective="...",
    generator="pro",
    match_limit=20
)

Cost Calculation

Cost is calculated per matched candidate only. Unmatched, discarded, or generated candidates don't incur charges.

Example:

  • Generator: core ($1.50 per match)
  • Matched candidates: 15
  • Total cost: 15 × $1.50 = $22.50

API Operations

Create & Ingest

Ingest FindAll Run

Transforms a natural language objective into a structured FindAll specification.

Note: Requires parallel-beta header.

Endpoint: POST /v1beta/findall/ingest

Request:

{
  "objective": "Find all AI companies that raised Series A funding in 2024"
}

Response:

{
  "objective": "Find all AI companies that raised Series A funding in 2024",
  "entity_type": "companies",
  "match_conditions": [
    {
      "name": "developing_ai_products_check",
      "description": "Company must be developing artificial intelligence (AI) products"
    },
    {
      "name": "raised_series_a_2024_check",
      "description": "Company must have raised Series A funding in 2024"
    }
  ],
  "generator": "core"
}

Python SDK:

schema = client.findall.ingest.create(
    objective="Find all AI companies that raised Series A funding in 2024"
)

Error Responses:

  • 422: Validation error

Create FindAll Run

Starts a FindAll run that discovers and evaluates entities.

Endpoint: POST /v1beta/findall/runs

Request:

{
  "objective": "Find all AI companies that raised Series A funding in 2024",
  "entity_type": "companies",
  "match_conditions": [
    {
      "name": "developing_ai_products_check",
      "description": "Company must be developing artificial intelligence (AI) products"
    },
    {
      "name": "raised_series_a_2024_check",
      "description": "Company must have raised Series A funding in 2024"
    }
  ],
  "generator": "core",
  "match_limit": 50,
  "exclude_list": [
    {
      "name": "OpenAI",
      "url": "openai.com"
    }
  ],
  "metadata": {
    "project": "Q1 research"
  },
  "webhook": {
    "url": "https://example.com/webhook",
    "event_types": ["task_run.status"]
  }
}

Response:

{
  "findall_id": "findall_56ccc4d188fb41a0803a935cf485c774",
  "status": {
    "status": "queued",
    "is_active": true,
    "metrics": {
      "generated_candidates_count": 0,
      "matched_candidates_count": 0
    }
  },
  "generator": "core",
  "metadata": {
    "project": "Q1 research"
  },
  "created_at": "2025-09-10T21:02:08.626446Z",
  "modified_at": "2025-09-10T21:02:08.627376Z"
}

Python SDK:

run = client.findall.runs.create(
    objective="Find all AI companies that raised Series A funding in 2024",
    entity_type="companies",
    match_conditions=[
        {
            "name": "developing_ai_products_check",
            "description": "Company must be developing artificial intelligence (AI) products"
        }
    ],
    generator="core",
    match_limit=50,
    exclude_list=[
        {"name": "OpenAI", "url": "openai.com"}
    ],
    metadata={"project": "Q1 research"}
)

Parameters:

Parameter Type Required Description
objective string Yes Natural language description of what to find
entity_type string Yes Type of entity (e.g., "companies", "people")
match_conditions array Yes List of conditions entities must satisfy
generator string Yes One of: base, core, pro, preview
match_limit integer Yes Max matches to find (5-1000)
exclude_list array No Entities to exclude from results
metadata object No Custom metadata (string, int, float, bool values)
webhook object No Webhook configuration for notifications

Error Responses:

  • 402: Insufficient credit
  • 422: Validation error (invalid parameters)
  • 429: Rate limit exceeded

Retrieve & Monitor

Retrieve FindAll Run Status

Get the current status of a FindAll run.

Endpoint: GET /v1beta/findall/runs/{findall_id}

Python SDK:

run = client.findall.runs.retrieve("findall_56ccc4d188fb41a0803a935cf485c774")
print(f"Status: {run.status.status}")
print(f"Matches: {run.status.metrics.matched_candidates_count}")

Get FindAll Run Result

Retrieve the complete result snapshot including all evaluated candidates.

Endpoint: GET /v1beta/findall/runs/{findall_id}/result

Response:

{
  "run": {
    "findall_id": "findall_56ccc4d188fb41a0803a935cf485c774",
    "status": {
      "status": "running",
      "is_active": true,
      "metrics": {
        "generated_candidates_count": 1,
        "matched_candidates_count": 1
      }
    },
    "generator": "core",
    "metadata": {},
    "created_at": "2025-09-10T21:02:08.626446Z",
    "modified_at": "2025-09-10T21:02:08.627376Z"
  },
  "candidates": [
    {
      "candidate_id": "candidate_7594eb7c-4f4a-487f-9d0c-9d1e63ec240c",
      "name": "Cognition AI",
      "url": "cognition.ai",
      "match_status": "matched",
      "output": {
        "developing_ai_products_check": "yes",
        "raised_series_a_2024_check": "yes"
      },
      "basis": [...]
    }
  ],
  "last_event_id": "56cee734dbc84172bfc491327f2a0183"
}

Python SDK:

result = client.findall.runs.result("findall_56ccc4d188fb41a0803a935cf485c774")

# Access run metadata
print(f"Status: {result.run.status.status}")

# Iterate through candidates
for candidate in result.candidates:
    if candidate.match_status == "matched":
        print(f"{candidate.name}: {candidate.url}")

# Resume streaming from last event
last_event = result.last_event_id

Get FindAll Run Schema

Retrieve the schema (objective, entity type, match conditions) for a run.

Endpoint: GET /v1beta/findall/runs/{findall_id}/schema

Response:

{
  "objective": "Find all AI companies that raised Series A funding in 2024",
  "entity_type": "companies",
  "match_conditions": [
    {
      "name": "developing_ai_products_check",
      "description": "Company must be developing artificial intelligence (AI) products"
    }
  ],
  "enrichments": [
    {
      "processor": "core",
      "output_schema": {
        "json_schema": {
          "type": "object",
          "properties": {
            "ceo_name": {
              "type": "string",
              "description": "Name of the current CEO"
            }
          }
        },
        "type": "json"
      }
    }
  ],
  "generator": "core",
  "match_limit": 50
}

Python SDK:

schema = client.findall.runs.schema("findall_56ccc4d188fb41a0803a935cf485c774")

Modify & Control

Extend FindAll Run

Add more matches to an existing run by increasing the match limit.

Endpoint: POST /v1beta/findall/runs/{findall_id}/extend

Request:

{
  "additional_match_limit": 25
}

Response: Returns updated FindAll schema with new match limit.

Python SDK:

# Original run had match_limit=50
# This increases it to 75
schema = client.findall.runs.extend(
    findall_id="findall_56ccc4d188fb41a0803a935cf485c774",
    additional_match_limit=25
)

Use Cases:

  • Initial results were promising, want more matches
  • Market research needs expanded
  • Competitive analysis requires deeper coverage

Error Responses:

  • 404: FindAll run not found
  • 422: Additional match limit must be greater than 0

Add Enrichment to FindAll Run

Add structured data extraction to matched candidates.

Endpoint: POST /v1beta/findall/runs/{findall_id}/enrich

Request:

{
  "processor": "core",
  "output_schema": {
    "json_schema": {
      "type": "object",
      "properties": {
        "ceo_name": {
          "type": "string",
          "description": "Name of the current CEO of the company"
        },
        "funding_amount": {
          "type": "string",
          "description": "Total funding amount in USD"
        }
      },
      "required": ["ceo_name"]
    },
    "type": "json"
  },
  "mcp_servers": [
    {
      "type": "url",
      "url": "https://api.example.com/mcp",
      "name": "company_data",
      "headers": {
        "Authorization": "Bearer token"
      },
      "allowed_tools": ["get_company_info"]
    }
  ]
}

Response: Returns updated FindAll schema with enrichments.

Python SDK:

schema = client.findall.runs.enrich(
    findall_id="findall_56ccc4d188fb41a0803a935cf485c774",
    processor="core",
    output_schema={
        "json_schema": {
            "type": "object",
            "properties": {
                "ceo_name": {"type": "string"},
                "employee_count": {"type": "integer"}
            }
        },
        "type": "json"
    }
)

Enriched Candidate Structure:

{
  "candidate_id": "candidate_123",
  "name": "Cognition AI",
  "match_status": "matched",
  "output": {
    "developing_ai_products_check": "yes",
    "raised_series_a_2024_check": "yes",
    "ceo_name": "Scott Wu",
    "employee_count": 50
  },
  "basis": [
    {
      "field": "ceo_name",
      "citations": [...],
      "reasoning": "...",
      "confidence": "high"
    }
  ]
}

Error Responses:

  • 404: FindAll run not found
  • 422: Validation error (invalid schema)

Cancel FindAll Run

Stop an active FindAll run.

Endpoint: POST /v1beta/findall/runs/{findall_id}/cancel

Python SDK:

client.findall.runs.cancel("findall_56ccc4d188fb41a0803a935cf485c774")

Notes:

  • Run status transitions to cancelling, then cancelled
  • Partial results remain accessible
  • Cannot cancel runs in terminal states (completed, failed, cancelled)

Error Responses:

  • 404: FindAll run not found
  • 409: Cannot cancel a terminated run

Advanced Features

Streaming Events

Monitor FindAll runs in real-time using Server-Sent Events (SSE).

Endpoint: GET /v1beta/findall/runs/{findall_id}/events

Query Parameters:

  • last_event_id (optional): Resume from specific event
  • timeout (optional): Connection timeout in seconds

Event Types:

  1. findall.schema.updated: Schema was modified
  2. findall.status: Run status changed
  3. findall.candidate.generated: New candidate discovered
  4. findall.candidate.matched: Candidate matched all conditions
  5. findall.candidate.unmatched: Candidate failed conditions
  6. findall.candidate.discarded: Candidate was discarded
  7. findall.candidate.enriched: Candidate enrichment completed
  8. error: Error occurred

Python SDK:

# Stream all events
for event in client.findall.runs.events("findall_56ccc4d188fb41a0803a935cf485c774"):
    if event.type == "findall.candidate.matched":
        candidate = event.data
        print(f"New match: {candidate.name}")
    elif event.type == "findall.status":
        run = event.data
        print(f"Status: {run.status.status}")

# Resume from last event
for event in client.findall.runs.events(
    findall_id="findall_123",
    last_event_id="56cee734dbc84172bfc491327f2a0183"
):
    process_event(event)

# With timeout
for event in client.findall.runs.events(
    findall_id="findall_123",
    timeout=60  # Close after 60 seconds
):
    process_event(event)

Example Event:

{
  "type": "findall.candidate.matched",
  "timestamp": "2025-09-10T21:02:08.626446Z",
  "event_id": "56cee734dbc84172bfc491327f2a0183",
  "data": {
    "candidate_id": "candidate_52e1e30b-4e0a-49d8-82eb-79e64e0ed015",
    "name": "Pika",
    "url": "pika.art",
    "match_status": "matched",
    "output": {...},
    "basis": [...]
  }
}

Real-time Dashboard Example:

def create_live_dashboard(findall_id):
    matched = []

    for event in client.findall.runs.events(findall_id):
        if event.type == "findall.candidate.matched":
            matched.append(event.data.name)
            print(f"\rMatches: {len(matched)}", end="")

        elif event.type == "findall.status":
            if not event.data.status.is_active:
                print(f"\n\nFinal count: {len(matched)}")
                break

Enrichments

Add structured data fields to matched candidates after initial matching.

When to Use Enrichments:

  • Extract additional details from matched entities
  • Gather data not needed for matching criteria
  • Add custom fields using MCP servers

Example Flow:

# 1. Create run without enrichments
run = client.findall.runs.create(
    objective="Find AI companies",
    entity_type="companies",
    match_conditions=[...],
    generator="core",
    match_limit=50
)

# 2. Wait for some matches
while run.status.metrics.matched_candidates_count < 10:
    time.sleep(5)
    run = client.findall.runs.retrieve(run.findall_id)

# 3. Add enrichment for matched candidates
schema = client.findall.runs.enrich(
    findall_id=run.findall_id,
    processor="core",
    output_schema={
        "json_schema": {
            "type": "object",
            "properties": {
                "ceo_name": {"type": "string"},
                "headquarters": {"type": "string"},
                "employee_count": {"type": "integer"}
            }
        },
        "type": "json"
    }
)

# 4. Get enriched results
result = client.findall.runs.result(run.findall_id)
for candidate in result.candidates:
    if candidate.match_status == "matched":
        print(f"{candidate.name}: CEO = {candidate.output.get('ceo_name')}")

Using MCP Servers:

schema = client.findall.runs.enrich(
    findall_id=run.findall_id,
    processor="core",
    output_schema={
        "json_schema": {
            "type": "object",
            "properties": {
                "stock_price": {"type": "number"},
                "market_cap": {"type": "string"}
            }
        },
        "type": "json"
    },
    mcp_servers=[
        {
            "type": "url",
            "url": "https://api.stockdata.com/mcp",
            "name": "stock_data",
            "headers": {"API-Key": "secret"},
            "allowed_tools": ["get_stock_price", "get_market_cap"]
        }
    ]
)

Webhooks

Receive HTTP notifications when FindAll events occur.

Configuration:

run = client.findall.runs.create(
    objective="...",
    match_conditions=[...],
    generator="core",
    match_limit=50,
    webhook={
        "url": "https://your-app.com/webhook",
        "event_types": ["task_run.status"]
    }
)

Webhook Payload:

{
  "type": "findall.candidate.matched",
  "timestamp": "2025-09-10T21:02:08.626446Z",
  "event_id": "56cee734dbc84172bfc491327f2a0183",
  "findall_id": "findall_56ccc4d188fb41a0803a935cf485c774",
  "data": {
    "candidate_id": "candidate_123",
    "name": "Company Name",
    "url": "company.com",
    "match_status": "matched"
  }
}

Webhook Handler Example:

from flask import Flask, request

app = Flask(__name__)

@app.route('/webhook', methods=['POST'])
def handle_webhook():
    event = request.json

    if event['type'] == 'findall.candidate.matched':
        candidate = event['data']
        # Store in database
        db.save_match(candidate)

    elif event['type'] == 'findall.status':
        run = event['data']
        if run['status']['status'] == 'completed':
            # Send notification
            notify_team(f"FindAll {event['findall_id']} completed")

    return '', 200

Preview & Refresh

Preview Mode

Test your FindAll configuration before committing to a full run.

Use Cases:

  • Validate match conditions
  • Test different generators
  • Estimate costs
  • Debug criteria

How It Works:

  1. Use generator="preview"
  2. Run evaluates a small sample of candidates
  3. Review results to refine conditions
  4. Create full run with optimized config

Example:

# Preview run
preview = client.findall.runs.create(
    objective="Find AI companies with SOC2 certification",
    entity_type="companies",
    match_conditions=[
        {
            "name": "soc2_certified",
            "description": "Company has SOC2 Type II certification"
        }
    ],
    generator="preview",
    match_limit=10  # Small sample
)

# Review results
result = client.findall.runs.result(preview.findall_id)
match_rate = result.run.status.metrics.matched_candidates_count / max(1, result.run.status.metrics.generated_candidates_count)

print(f"Match rate: {match_rate:.1%}")

# Adjust conditions if needed
if match_rate < 0.1:
    print("Match conditions may be too strict")
elif match_rate > 0.5:
    print("Match conditions may be too loose")

# Run full search with validated config
full_run = client.findall.runs.create(
    objective=preview.objective,
    entity_type=preview.entity_type,
    match_conditions=preview.match_conditions,
    generator="core",
    match_limit=100
)

Refresh Results

Re-evaluate candidates with updated web data or revised match conditions.

When to Refresh:

  • Source websites have updated
  • Need fresher data
  • Want to recheck with stricter/looser criteria

Example:

# Original run from 3 months ago
old_run = client.findall.runs.retrieve("findall_old_123")

# Create new run with same config but fresh data
fresh_run = client.findall.runs.create(
    objective=old_run.objective,
    entity_type=old_run.entity_type,
    match_conditions=old_run.match_conditions,
    generator=old_run.generator,
    match_limit=old_run.match_limit
)

# Compare results
old_result = client.findall.runs.result(old_run.findall_id)
new_result = client.findall.runs.result(fresh_run.findall_id)

old_matches = {c.name for c in old_result.candidates if c.match_status == "matched"}
new_matches = {c.name for c in new_result.candidates if c.match_status == "matched"}

print(f"Newly matched: {new_matches - old_matches}")
print(f"No longer match: {old_matches - new_matches}")

Migration Guide

From Tasks API to FindAll API

The FindAll API is purpose-built for discovering multiple entities, replacing the pattern of running many parallel Task API calls.

Before: Tasks API Pattern

# Old approach: Multiple task runs for discovery
companies = ["Company A", "Company B", "Company C", ...]

results = []
for company in companies:
    task = client.tasks.create(
        objective=f"Check if {company} has SOC2 certification",
        processor="core"
    )
    results.append(task)

# Wait and aggregate
matches = [r for r in results if r.output.get("has_soc2") == "yes"]

Problems:

  • Manual candidate list required
  • High cost (every task charged)
  • Slow (sequential or complex parallel code)
  • Limited discovery (only checks provided list)

After: FindAll API

# New approach: Single FindAll run
run = client.findall.runs.create(
    objective="Find companies with SOC2 Type II certification",
    entity_type="companies",
    match_conditions=[
        {
            "name": "soc2_type_ii_check",
            "description": "Company must have SOC2 Type II certification"
        }
    ],
    generator="core",
    match_limit=50
)

# Automatically discovers and evaluates candidates
result = client.findall.runs.result(run.findall_id)
matches = [c for c in result.candidates if c.match_status == "matched"]

Benefits:

  • Automatic candidate discovery
  • Only pay for matches
  • Parallel evaluation at scale
  • Comprehensive coverage

Migration Checklist

  1. Replace manual lists: Let FindAll discover candidates
  2. Combine match logic: Use match_conditions instead of separate tasks
  3. Use streaming: Replace polling with SSE for real-time updates
  4. Add enrichments: Extract additional data only for matches
  5. Leverage generators: Choose appropriate quality/cost tradeoff

Feature Comparison

Feature Tasks API FindAll API
Discovery Manual list Automatic
Parallelization Custom code Built-in
Pricing Per task Per match
Monitoring Poll individual tasks Single stream
Enrichment Separate tasks Native support
Excluding duplicates Manual Built-in
Cost control Hard to predict match_limit cap

Code Examples

Task pattern → FindAll equivalent:

# OLD: Check multiple URLs
for url in urls:
    client.tasks.create(
        objective=f"Extract CEO name from {url}",
        processor="core"
    )

# NEW: Single FindAll with enrichment
run = client.findall.runs.create(
    objective="Find all tech companies",
    match_conditions=[...],
    generator="core",
    match_limit=100
)

client.findall.runs.enrich(
    findall_id=run.findall_id,
    output_schema={
        "json_schema": {
            "properties": {
                "ceo_name": {"type": "string"}
            }
        }
    }
)

Task group → FindAll:

# OLD: Task group for batch processing
group = client.tasks.groups.create([
    {"objective": "Check company A..."},
    {"objective": "Check company B..."},
    {"objective": "Check company C..."}
])

# NEW: FindAll discovers automatically
run = client.findall.runs.create(
    objective="Find companies matching criteria",
    match_conditions=[...],
    generator="core",
    match_limit=50
)

Best Practices

Writing Match Conditions

Be Specific:

# ❌ Too vague
"Company must be successful"

# ✅ Specific and verifiable
"Company must have raised Series A funding of at least $10M in 2024"

Include Evidence Hints:

{
    "name": "soc2_certified",
    "description": """
    Company must have SOC2 Type II certification (not Type I).
    Look for evidence in:
    - Trust centers
    - Security/compliance pages
    - Audit reports
    - Press releases specifically mentioning 'SOC2 Type II'

    If no explicit SOC2 Type II mention is found, consider requirement not satisfied.
    """
}

Use Negative Examples:

{
    "name": "series_a_only",
    "description": """
    Company must be at Series A stage (not seed, Series B, C, or later).
    Confirm they have closed Series A and have not announced Series B.
    """
}

Excluding Entities

Use the exclude_list parameter to prevent known entities from being evaluated:

run = client.findall.runs.create(
    objective="Find AI companies",
    match_conditions=[...],
    generator="core",
    match_limit=50,
    exclude_list=[
        {"name": "OpenAI", "url": "openai.com"},
        {"name": "Anthropic", "url": "anthropic.com"}
    ]
)

Cost Optimization

  1. Start with preview: Test with generator="preview" first
  2. Use base for scale: Use base generator for large searches (100+ matches)
  3. Set appropriate limits: Don't request more matches than you need
  4. Leverage extend: Start small, extend if needed rather than over-requesting
  5. Enrich selectively: Only add enrichments for final matched entities

Monitoring Best Practices

Use SSE for Real-time Needs:

# Real-time monitoring
for event in client.findall.runs.events(run.findall_id):
    if event.type == "findall.candidate.matched":
        process_immediately(event.data)

Use Polling for Async Workflows:

# Background job
while True:
    run = client.findall.runs.retrieve(run.findall_id)
    if not run.status.is_active:
        break
    time.sleep(30)

Use Webhooks for Integration:

# System-to-system integration
run = client.findall.runs.create(
    objective="...",
    webhook={
        "url": "https://your-system.com/findall-webhook",
        "event_types": ["task_run.status"]
    }
)

Error Handling

Common Errors

402 Payment Required:

try:
    run = client.findall.runs.create(...)
except Exception as e:
    if "insufficient credit" in str(e).lower():
        print("Add credits to your account")

422 Validation Error:

try:
    run = client.findall.runs.create(
        match_limit=5000  # Over max of 1000
    )
except Exception as e:
    print(f"Invalid parameters: {e}")

429 Rate Limit:

import time

def create_with_retry(max_retries=3):
    for attempt in range(max_retries):
        try:
            return client.findall.runs.create(...)
        except Exception as e:
            if "rate limit" in str(e).lower() and attempt < max_retries - 1:
                time.sleep(2 ** attempt)  # Exponential backoff
            else:
                raise

Event Stream Error Handling

def robust_stream(findall_id, max_reconnects=5):
    reconnects = 0
    last_event_id = None

    while reconnects < max_reconnects:
        try:
            for event in client.findall.runs.events(
                findall_id=findall_id,
                last_event_id=last_event_id
            ):
                if event.type == "error":
                    print(f"Error event: {event.error.message}")
                    continue

                last_event_id = event.event_id
                yield event

            break  # Stream ended normally

        except Exception as e:
            print(f"Stream interrupted: {e}")
            reconnects += 1
            time.sleep(min(30, 2 ** reconnects))

Complete Example: Company Research Pipeline

from parallel import Parallel
import time

client = Parallel(api_key="your_api_key")

# Step 1: Ingest objective
print("Generating FindAll schema...")
schema = client.findall.ingest.create(
    objective="Find YC-backed AI companies founded in 2023 with active products"
)

print(f"Generated {len(schema.match_conditions)} match conditions:")
for condition in schema.match_conditions:
    print(f"  - {condition['name']}")

# Step 2: Start FindAll run
print("\nStarting FindAll run...")
run = client.findall.runs.create(
    objective=schema.objective,
    entity_type=schema.entity_type,
    match_conditions=schema.match_conditions,
    generator="core",
    match_limit=50,
    metadata={"project": "Q1_2025_research"}
)

# Step 3: Monitor with SSE
print(f"\nMonitoring run {run.findall_id}...")
matched_count = 0

for event in client.findall.runs.events(run.findall_id):
    if event.type == "findall.candidate.matched":
        matched_count += 1
        candidate = event.data
        print(f"✓ Match #{matched_count}: {candidate.name}")

    elif event.type == "findall.status":
        run_status = event.data.status
        print(f"Status: {run_status.status} | "
              f"Generated: {run_status.metrics.generated_candidates_count} | "
              f"Matched: {run_status.metrics.matched_candidates_count}")

        if not event.data.status.is_active:
            break

# Step 4: Add enrichments for matched companies
print("\nAdding enrichments...")
client.findall.runs.enrich(
    findall_id=run.findall_id,
    processor="core",
    output_schema={
        "json_schema": {
            "type": "object",
            "properties": {
                "ceo_name": {"type": "string"},
                "employee_count": {"type": "integer"},
                "total_funding": {"type": "string"}
            }
        },
        "type": "json"
    }
)

# Wait for enrichments to complete
while True:
    run = client.findall.runs.retrieve(run.findall_id)
    if not run.status.is_active:
        break
    time.sleep(5)

# Step 5: Get final results
print("\nFinal results:")
result = client.findall.runs.result(run.findall_id)

for candidate in result.candidates:
    if candidate.match_status == "matched":
        output = candidate.output
        print(f"\n{candidate.name} ({candidate.url})")
        print(f"  CEO: {output.get('ceo_name', 'N/A')}")
        print(f"  Employees: {output.get('employee_count', 'N/A')}")
        print(f"  Funding: {output.get('total_funding', 'N/A')}")

print(f"\n✓ Found {matched_count} companies matching criteria")

API Reference Summary

Endpoint Method Purpose
/v1beta/findall/ingest POST Generate FindAll schema from objective
/v1beta/findall/runs POST Create FindAll run
/v1beta/findall/runs/{id} GET Get run status
/v1beta/findall/runs/{id}/result GET Get complete results
/v1beta/findall/runs/{id}/schema GET Get run schema
/v1beta/findall/runs/{id}/events GET Stream events (SSE)
/v1beta/findall/runs/{id}/extend POST Increase match limit
/v1beta/findall/runs/{id}/enrich POST Add enrichments
/v1beta/findall/runs/{id}/cancel POST Cancel run

Additional Resources


Last Updated: January 2025 API Version: 0.1.2

Weekly Installs
4
First Seen
2 days ago
Installed on
claude-code3
windsurf2
trae2
opencode2
codex2
antigravity2