Binder Design Tool Selection

Decision tree

De novo binder design?
│
├─ Standard target → BoltzGen (recommended)
│   All-atom output (no separate ProteinMPNN step needed)
│   Better for ligand/small molecule binding
│   Single-step design (backbone + sequence + side chains)
│
├─ Need diversity/exploration → RFdiffusion + ProteinMPNN
│   Maximum backbone diversity
│   Two-step: backbone then sequence
│
├─ Integrated validation → BindCraft
│   Built-in AF2 validation
│   End-to-end pipeline
│
├─ Ligand binding → BoltzGen ✓
│   All-atom diffusion handles ligand context
│
└─ Antibody/nanobody (VHH) → Germinal
    VHH/nanobody design
    Germline-aware optimization
    See germinal skill

Tool comparison

Tool	Strengths	Weaknesses	Best For
BoltzGen	All-atom, single-step, ligand-aware	Higher GPU requirement	Standard (recommended)
BindCraft	End-to-end, built-in AF2 validation	Less diverse	Production campaigns
RFdiffusion	High diversity, fast	Requires ProteinMPNN	Exploration, diversity
Germinal	Nanobody/VHH design	Specialized	Antibody optimization

Recommended Pipeline: BoltzGen → Chai → QC

BoltzGen provides all-atom design with built-in side-chain packing:

Target → BoltzGen → Validate → Filter
 (pdb)  (all-atom)   (chai)     (qc)

1. Target preparation

# Fetch structure from PDB
# Use pdb skill for guidance

Trim to binding region + 10 Å buffer
Remove waters and ligands
Renumber chains if needed
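
The trimming and cleanup steps above can be sketched in plain Python. This is an illustration only, not part of any tool named here: `parse_atoms`, `trim_target`, and the idea of defining the binding region by a set of residue numbers are assumptions made for the example. It reads fixed-column PDB ATOM/HETATM records and ignores insertion codes and alternate locations.

```python
import math


def parse_atoms(pdb_text):
    """Yield (record, chain, resseq, xyz, line) for each ATOM/HETATM line."""
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record not in ("ATOM", "HETATM"):
            continue
        chain = line[21]
        resseq = int(line[22:26])
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        yield record, chain, resseq, xyz, line


def trim_target(pdb_text, chain, site_residues, buffer=10.0):
    """Keep whole residues of `chain` with any atom within `buffer` Å of any
    atom in `site_residues`. HETATM records (waters, ligands) are dropped,
    which also drops modified residues stored as HETATM."""
    atoms = [a for a in parse_atoms(pdb_text) if a[0] == "ATOM" and a[1] == chain]
    site_xyz = [a[3] for a in atoms if a[2] in site_residues]
    keep = {
        resseq
        for _, _, resseq, xyz, _ in atoms
        if any(math.dist(xyz, s) <= buffer for s in site_xyz)
    }
    return [a[4] for a in atoms if a[2] in keep]
```

Calling `trim_target(text, "A", {45, 67, 89})` returns the retained ATOM lines in their original order. Chain renumbering is left out, so hotspot numbers stay valid against the source file.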

2. Hotspot selection

Choose 3-6 exposed residues
Prefer charged/aromatic residues
Cluster spatially (within 10-15 Å)
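
The count and clustering rules above are easy to check before launching a run. The sketch below is hypothetical: `check_hotspots` is a name invented for this example, and it takes CA coordinates you have already extracted. It does not test solvent exposure, which needs a separate surface-area calculation.

```python
import math
from itertools import combinations


def check_hotspots(ca_coords, min_n=3, max_n=6, max_dist=15.0):
    """Return a list of human-readable problems; empty means the set passes.

    ca_coords maps a residue label (e.g. "A45") to its CA (x, y, z) in Å.
    """
    problems = []
    if not min_n <= len(ca_coords) <= max_n:
        problems.append(f"{len(ca_coords)} hotspots; expected {min_n}-{max_n}")
    for (r1, p1), (r2, p2) in combinations(ca_coords.items(), 2):
        d = math.dist(p1, p2)
        if d > max_dist:
            problems.append(f"{r1} and {r2} are {d:.1f} Å apart (> {max_dist} Å)")
    return problems
```

The default `max_dist` uses the upper end of the 10-15 Å range; tighten it to 10.0 for a more compact epitope.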

3. Design with BoltzGen (Recommended)

First, create a YAML config file (e.g., binder.yaml):

entities:
  - protein:
      id: B
      sequence: 70..100

  - file:
      path: target.cif
      include:
        - chain:
            id: A
      binding_types:
        - chain:
            id: A
            binding: 45,67,89

Then run:

modal run modal_boltzgen.py \
  --input-yaml binder.yaml \
  --protocol protein-anything \
  --num-designs 50
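
When several targets or hotspot sets are screened, the config can be generated rather than edited by hand. The helper below is a sketch that reproduces the layout of the binder.yaml example above and nothing more. `binder_yaml` and its parameters are names invented here, and no claim is made about other keys the tool may accept.

```python
def binder_yaml(target_path, target_chain, hotspots, min_len, max_len, binder_id="B"):
    """Render a config with the same layout as the binder.yaml example."""
    binding = ",".join(str(r) for r in hotspots)
    return (
        "entities:\n"
        "  - protein:\n"
        f"      id: {binder_id}\n"
        f"      sequence: {min_len}..{max_len}\n"
        "\n"
        "  - file:\n"
        f"      path: {target_path}\n"
        "      include:\n"
        "        - chain:\n"
        f"            id: {target_chain}\n"
        "      binding_types:\n"
        "        - chain:\n"
        f"            id: {target_chain}\n"
        f"            binding: {binding}\n"
    )


if __name__ == "__main__":
    # Writes the same content as the hand-written example.
    with open("binder.yaml", "w") as fh:
        fh.write(binder_yaml("target.cif", "A", [45, 67, 89], 70, 100))
```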

Why BoltzGen?

All-atom output (no separate ProteinMPNN step needed)
Better for ligand/small molecule binding
Single-step design (backbone + sequence + side chains)

4. Alternative: RFdiffusion Pipeline

For maximum backbone diversity, or when backbone-only output is preferred:

# Step 1: Backbone generation
modal run modal_rfdiffusion.py \
  --pdb target.pdb \
  --contigs "A1-150/0 70-100" \
  --hotspot "A45,A67,A89" \
  --num-designs 500

# Step 2: Sequence design (ProteinMPNN-style, run here via the LigandMPNN script)
modal run modal_ligandmpnn.py \
  --pdb-path backbone.pdb \
  --num-seq-per-target 16 \
  --sampling-temp 0.1
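
The `--contigs` and `--hotspot` strings in Step 1 encode the same target range, binder length, and hotspots used in the BoltzGen config, so it can help to derive them from one set of values. This is a minimal sketch: `rfdiffusion_args` is a hypothetical helper that only formats strings in the pattern shown in the command above.

```python
def rfdiffusion_args(chain, first_res, last_res, hotspots, min_len, max_len):
    """Format contig and hotspot strings in the pattern used in Step 1.

    The contig reads: target residues on `chain`, a chain break ("/0"),
    then a binder whose length is sampled from min_len to max_len.
    """
    contigs = f"{chain}{first_res}-{last_res}/0 {min_len}-{max_len}"
    hotspot = ",".join(f"{chain}{r}" for r in hotspots)
    return contigs, hotspot


contigs, hotspot = rfdiffusion_args("A", 1, 150, [45, 67, 89], 70, 100)
print(contigs)  # A1-150/0 70-100
print(hotspot)  # A45,A67,A89
```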

5. Validation

modal run modal_chai1.py \
  --input-faa sequences.fasta \
  --out-dir predictions/
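
The validation step needs a FASTA file holding the target and the designed binder. The sketch below writes one complex per file and assumes the `>protein|name=...` header convention used in the chai-lab examples. Both `complex_fasta` and that header choice are assumptions here, so confirm the format that modal_chai1.py expects before relying on it.

```python
def complex_fasta(target_seq, binder_seq, target_name="target", binder_name="binder"):
    """Return FASTA text for a two-chain complex (assumed header format)."""
    records = [
        ("protein", target_name, target_seq),
        ("protein", binder_name, binder_seq),
    ]
    return "".join(f">{kind}|name={name}\n{seq}\n" for kind, name, seq in records)


if __name__ == "__main__":
    # Placeholder sequences for illustration only.
    with open("sequences.fasta", "w") as fh:
        fh.write(complex_fasta("MKTAYIAKQR", "GSHMEELLKK"))
```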

6. Filtering

Apply standard thresholds:

pLDDT > 0.80
ipTM > 0.50
PAE_interface < 10
scRMSD < 2.0 Å

See protein-qc skill for details.
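
The thresholds above translate directly into a filter. This is a sketch, not the protein-qc implementation: the metric key names (`plddt`, `iptm`, `pae_interface`, `sc_rmsd`) are assumptions about how you have collected scores, and pLDDT is taken to be on a 0-1 scale (divide by 100 first if your predictor reports 0-100).

```python
# (comparison, cutoff) per metric, copied from the thresholds listed above.
THRESHOLDS = {
    "plddt": (">", 0.80),
    "iptm": (">", 0.50),
    "pae_interface": ("<", 10.0),
    "sc_rmsd": ("<", 2.0),  # Å
}


def passes(metrics, thresholds=THRESHOLDS):
    """True only if every metric satisfies its strict inequality."""
    for key, (op, cutoff) in thresholds.items():
        value = metrics[key]
        if op == ">" and not value > cutoff:
            return False
        if op == "<" and not value < cutoff:
            return False
    return True


def filter_designs(designs):
    """Keep the designs (dicts of metrics) that pass all thresholds."""
    return [d for d in designs if passes(d)]
```

A design with pLDDT exactly 0.80 is rejected because the comparisons are strict, matching the `>` and `<` signs in the list.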

Number of designs

Stage	Count	Purpose
Backbone generation	500-1000	Diversity
Sequences per backbone	8-16	Sequence space
Structure predictions (Chai/AF2)	All	Validation
After filtering	50-200	Candidates
Experimental testing	10-50	Final selection
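
The table implies a funnel whose size is set by two multipliers. The short calculation below makes the arithmetic explicit. `prediction_count` and `pass_rate` are illustrative names, and the numbers come from the ranges in the table.

```python
def prediction_count(backbones, seqs_per_backbone):
    """Sequences sent to structure prediction when every sequence is validated."""
    return backbones * seqs_per_backbone


def pass_rate(candidates, predictions):
    """Fraction of predicted designs that survive filtering."""
    return candidates / predictions


low = prediction_count(500, 8)      # smallest campaign in the table: 4000
high = prediction_count(1000, 16)   # largest campaign in the table: 16000
typical = prediction_count(500, 16) # 8000

print(low, typical, high)
# Keeping 50-200 candidates out of 8000 predictions is a pass rate of
# roughly 0.6% to 2.5%.
print(pass_rate(50, typical), pass_rate(200, typical))
```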

Common mistakes

Wrong hotspots

Using buried residues instead of exposed ones
Too many hotspots (over-constrains the design)
Wrong chain IDs or residue numbers

Insufficient diversity

Too few designs generated
Sampling temperature too low in ProteinMPNN (limits sequence diversity)
Not exploring multiple backbones

Poor target preparation

Including the full protein instead of the binding region
Missing important structural features
Wrong protonation states

Timeline guide

Step	Compute Time
RFdiffusion (500 designs)	2-4 hours
ProteinMPNN (8000 sequences)	1-2 hours
AF2 prediction (8000 sequences)	12-24 hours
Filtering and analysis	1-2 hours

Total: 1-2 days of compute
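
The total can be checked by summing the per-step ranges, assuming the steps run one after another. `STEPS_HOURS` and `total_hours` below are illustrative names; the hour ranges are copied from the table.

```python
# (low, high) compute time in hours for each step.
STEPS_HOURS = {
    "RFdiffusion (500 designs)": (2, 4),
    "ProteinMPNN (8000 sequences)": (1, 2),
    "AF2 prediction (8000 sequences)": (12, 24),
    "Filtering and analysis": (1, 2),
}


def total_hours(steps):
    """Sum the low and high ends of each step's range."""
    low = sum(lo for lo, _ in steps.values())
    high = sum(hi for _, hi in steps.values())
    return low, high


print(total_hours(STEPS_HOURS))  # (16, 32)
```

Sixteen to 32 hours is in line with the 1-2 day figure once rounded to whole days. Structure prediction accounts for three quarters of the total at both ends of the range, so it is the step to parallelize first.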
