Causal Inference with LLMs

Leverage reasoning LLMs to discover causal relationships and generate directed acyclic graphs (DAGs). This skill enables Pearl's Ladder of Causation "Why" level - inferring underlying data generation processes by combining observational patterns with domain expertise.

Core Capabilities

1. Pairwise Causal Relationships (A → B)

Assess potential causal links between treatment and outcome variables using LLM reasoning and domain knowledge.

from causal import ModelSuggester

modeler = ModelSuggester('gpt-5.1-mini')
experts = modeler.suggest_domain_expertises(event_labels, n_experts=5)

modeler.suggest_pairwise_relationship(
    experts,
    'Treatment event',
    'Outcome event'
)

2. Latent Confounder Identification (A ← C → B)

Identify unobserved variables that influence both treatment and outcome.

_, latent_confounders = modeler.suggest_latent_confounders(
    treatment_variable,
    outcome_variable,
    experts
)

3. Complete Causal Graph Construction

Build full DAGs with directional relationships, reasoning, and confidence estimates.

all_factors = ["factor1", "factor2", "factor3"]
pairwise_edges = await modeler._suggest_pairwise_relationships(
    expert=str(domain_expertises),
    all_factors=all_factors,
    analysis_context="causal model"
)

4. Interactive Graph Visualization

Generate interactive HTML visualizations of causal graphs.

from causal import plot_interactive_graph
import networkx as nx

G = nx.MultiDiGraph()
for idx, label in enumerate(all_factors):
    G.add_node(idx, desc=label)

for edge, (confidence, thinking) in pairwise_edges.items():
    G.add_edge(edge[0], edge[1], relationship=confidence, thinking=thinking)

plot_interactive_graph(G, filename="output/causal_graph.html")

Workflow

Basic Analysis Flow

Define Variables: Specify treatment, outcome, and known factors
Identify Domain Expertise: Generate relevant expert perspectives
Discover Confounders: Find latent variables affecting both treatment and outcome
Build Pairwise Relationships: Assess all potential causal links
Construct DAG: Assemble complete causal graph
Visualize: Generate interactive graph representation

Example: Medical Causality

import asyncio
from causal import ModelSuggester, plot_interactive_graph
import networkx as nx

async def analyze_medical_causality():
    modeler = ModelSuggester('o3-mini')

    all_factors = ["smoking", "lung cancer", "air pollution", "genetics"]
    treatment = "smoking"
    outcome = "lung cancer"

    # Get domain expertise
    experts = modeler.suggest_domain_expertises(all_factors, n_experts=3)

    # Find confounders
    _, latent_confounders = modeler.suggest_latent_confounders(
        treatment, outcome, experts
    )

    # Build complete factor list
    all_factors += latent_confounders
    all_factors = list(set([f.lower() for f in all_factors]))

    # Discover pairwise relationships
    pairwise_edges = await modeler._suggest_pairwise_relationships(
        expert=str(experts),
        all_factors=all_factors,
        analysis_context="medical causal model"
    )

    # Build and visualize graph
    G = nx.MultiDiGraph()
    for idx, label in enumerate(all_factors):
        G.add_node(idx, desc=label)

    for edge, (confidence, thinking) in pairwise_edges.items():
        G.add_edge(edge[0], edge[1],
                   relationship=confidence, thinking=thinking)

    plot_interactive_graph(G, filename="medical_causal_graph.html")

asyncio.run(analyze_medical_causality())

Configuration

The package uses config.yaml for LLM settings. Key parameters:

model: LLM model name (e.g., 'gpt-5.1-mini', 'o3-mini')
temperature: Reasoning randomness
max_tokens: Response length limit
domain_expert_count: Number of expert perspectives (default: 3-5)

Critical Limitations

AI reasoning is based on learned patterns and may contain biases or incorrect associations. Always:

Validate proposed relationships using additional causal inference techniques
Cross-check with domain experts and empirical evidence
Test causal hypotheses through experimentation when possible
Document confidence levels and reasoning chains
Never treat LLM suggestions as ground truth

Advanced Features

Relationship Strategies

Multiple strategies available via RelationshipStrategy enum:

parent: Parent-child relationships
child: Child-parent relationships
confounder: Confounding variables
iv: Instrumental variables
mediator: Mediating variables
pairwise: All pairwise relationships

Async Processing

Use async methods for efficient batch processing of multiple causal pairs.

Graph Export

Export graphs in multiple formats:

Interactive HTML (pyvis)
NetworkX graph objects
JSON/GraphML for external tools

References

For theoretical background, see references/causal-theory.md.

For complete API documentation, see references/api-reference.md.

causal-inference-llm