ipSAE Binder Ranking

Prerequisites

| Requirement | Minimum | Recommended |
|-------------|---------|-------------|
| Python      | 3.8+    | 3.10        |
| NumPy       | 1.20+   | Latest      |
| RAM         | 8 GB    | 16 GB       |

Overview

ipSAE (interaction prediction Score from Aligned Errors) is a scoring function for ranking protein-protein interactions predicted by AlphaFold2, AlphaFold3, and Boltz1. For ranking designed binders it is reported to outperform ipTM and iPAE, with 1.4x higher precision in identifying true binders.

Paper: What's wrong with AlphaFold's ipTM score

How to run

Installation

git clone https://github.com/DunbrackLab/IPSAE.git
cd IPSAE
pip install numpy

AlphaFold2

python ipsae.py scores_rank_001.json unrelaxed_rank_001.pdb 15 15

AlphaFold3

python ipsae.py fold_model_full_data_0.json fold_model_0.cif 10 10

Boltz1

python ipsae.py pae_model_0.npz model_0.cif 10 10
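
The three invocations above differ only in the input file types and in the cutoffs used in each example. As a minimal sketch, a small helper can keep those choices in one place. The helper name `build_ipsae_command`, the predictor keys, and the `EXAMPLE_CUTOFFS` table are hypothetical and not part of ipsae.py; the cutoff pairs are simply the ones from the example commands, not tuned recommendations.

```python
# Hypothetical helper, not part of ipsae.py.
# Cutoff pairs are copied from the example commands above.
EXAMPLE_CUTOFFS = {
    "alphafold2": ("15", "15"),
    "alphafold3": ("10", "10"),
    "boltz1": ("10", "10"),
}


def build_ipsae_command(predictor, pae_file, struct_file, script="ipsae.py"):
    """Return the argument list for one ipsae.py run (usable with subprocess.run)."""
    try:
        pae_cutoff, dist_cutoff = EXAMPLE_CUTOFFS[predictor.lower()]
    except KeyError:
        raise ValueError(f"Unknown predictor: {predictor!r}") from None
    # Positional order follows the examples: PAE file, structure file, PAE cutoff, distance cutoff
    return ["python", script, str(pae_file), str(struct_file), pae_cutoff, dist_cutoff]
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting problems with unusual file names.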

Key parameters

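| Parameter       | Description                    | Recommended     |
|-----------------|--------------------------------|-----------------|
| PAE file        | JSON (AF2/AF3) or NPZ (Boltz)  | Match predictor |
| Structure file  | PDB or CIF structure           | Match PAE       |
| PAE cutoff      | Threshold for contacts         | 10-15           |
| Distance cutoff | Max CA-CA distance (Å)         | 10-15           |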
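
To make the two cutoffs concrete, the sketch below counts inter-chain residue pairs that fall under both a PAE cutoff and a CA-CA distance cutoff. It is one plausible reading of the parameter descriptions, written for illustration only; it is not taken from ipsae.py, and the function name and inputs are assumptions.

```python
import math


def count_interface_pairs(pae, ca_coords, chain_ids, pae_cutoff=10.0, dist_cutoff=10.0):
    """Count ordered inter-chain residue pairs (i, j) under both cutoffs.

    pae        -- square matrix (list of lists), pae[i][j] in Angstroms
    ca_coords  -- list of (x, y, z) CA coordinates, one per residue
    chain_ids  -- list of chain labels, one per residue
    Illustrative only: ipsae.py may apply the cutoffs differently.
    """
    n = len(chain_ids)
    count = 0
    for i in range(n):
        for j in range(n):
            if chain_ids[i] == chain_ids[j]:
                continue  # only inter-chain pairs matter for an interface
            close = math.dist(ca_coords[i], ca_coords[j]) <= dist_cutoff
            confident = pae[i][j] < pae_cutoff
            if close and confident:
                count += 1
    return count


# Toy input: two residues in chain A, one in chain B
toy_pae = [[0, 1, 5], [1, 0, 12], [6, 11, 0]]
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (8.0, 0.0, 0.0)]
toy_chains = ["A", "A", "B"]
n_pairs = count_interface_pairs(toy_pae, toy_coords, toy_chains)
```

Because PAE is not symmetric, the pairs (i, j) and (j, i) are counted separately here. Raising either cutoff can only keep or increase the count, which is why the same cutoffs should be used when comparing designs.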

Output format

Two output files are generated:

Chain-pair scores (_chains.csv):

chain_A,chain_B,ipSAE_min,pDockQ,pDockQ2,LIS,n_contacts,interface_dist
A,B,0.72,0.65,0.58,0.45,42,8.5

Residue-level scores (_residues.csv):

chain,resnum,pSAE,pLDDT
A,45,0.85,92.3
A,67,0.78,88.1
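
A minimal sketch for reading the chain-pair file into Python, assuming the column layout shown above. The function name `load_chain_scores` is hypothetical, and only the standard-library `csv` module is used.

```python
import csv

# Columns that hold chain labels rather than numbers (per the header shown above)
LABEL_COLUMNS = ("chain_A", "chain_B")


def load_chain_scores(path):
    """Read a *_chains.csv file into a list of dicts with numeric fields as floats."""
    rows = []
    with open(path, newline="") as handle:
        for row in csv.DictReader(handle):
            parsed = {}
            for key, value in row.items():
                parsed[key] = value if key in LABEL_COLUMNS else float(value)
            rows.append(parsed)
    return rows
```

Each returned dict can then be compared against score cutoffs or sorted by `ipSAE_min`.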

Sample output

Successful run

$ python ipsae.py scores_rank_001.json design_0.pdb 10 10
Processing design_0...
Found 2 chains: A, B
Computing ipSAE scores...

Results written to:
  design_0_chains.csv
  design_0_residues.csv

Summary:
  ipSAE_min: 0.72
  pDockQ: 0.65
  LIS: 0.45
  Interface contacts: 42

What good output looks like:

  • ipSAE_min > 0.61 (primary filter)
  • pDockQ > 0.5 (supporting metric)
  • Reasonable number of interface contacts (20-100)

Decision tree

Should I use ipSAE?
├─ What are you ranking?
│  ├─ Designed binders → ipSAE ✓
│  ├─ Natural complexes → ipTM is fine
│  └─ Single proteins → Not applicable
├─ What predictor did you use?
│  ├─ AlphaFold2 → ipSAE ✓
│  ├─ AlphaFold3 → ipSAE ✓
│  ├─ Boltz1 → ipSAE ✓
│  ├─ Chai → ipSAE (use PAE output)
│  └─ ESMFold → Not applicable (no PAE)
└─ Why ipSAE over ipTM?
   ├─ Different length constructs → ipSAE ✓
   ├─ Designs with disordered regions → ipSAE ✓
   └─ Standard complexes → Either works

Recommended thresholds

| Metric    | Standard | Stringent | Use case          |
|-----------|----------|-----------|-------------------|
| ipSAE_min | > 0.61   | > 0.70    | Primary filter    |
| LIS       | > 0.35   | > 0.45    | Interface quality |
| pDockQ    | > 0.5    | > 0.6     | Supporting        |
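
The thresholds can be applied programmatically, as in the following sketch. Requiring all three metrics to pass at once is an assumption made here for simplicity; the table only marks ipSAE_min as the primary filter. The names `THRESHOLDS` and `passes_thresholds` are hypothetical.

```python
# Cutoffs copied from the table above
THRESHOLDS = {
    "standard": {"ipSAE_min": 0.61, "LIS": 0.35, "pDockQ": 0.5},
    "stringent": {"ipSAE_min": 0.70, "LIS": 0.45, "pDockQ": 0.6},
}


def passes_thresholds(scores, level="standard"):
    """Return True if every metric in `scores` is strictly above its cutoff.

    Assumption: all three metrics must pass together (logical AND).
    """
    cutoffs = THRESHOLDS[level]
    return all(scores[metric] > cutoff for metric, cutoff in cutoffs.items())


# Example row using the sample values from the output section
example = {"ipSAE_min": 0.72, "pDockQ": 0.65, "LIS": 0.45}
standard_ok = passes_thresholds(example, "standard")
stringent_ok = passes_thresholds(example, "stringent")
```

With the sample values, the design passes the standard level but not the stringent one, because its LIS of 0.45 is not strictly greater than 0.45.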

Batch processing

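import subprocess
import sys
from pathlib import Path

def score_designs(pae_dir, struct_dir, output_dir):
    """Score all designs in a directory."""
    # NOTE: output_dir is only created here; this function does not
    # move or redirect ipsae.py result files into it.
    Path(output_dir).mkdir(parents=True, exist_ok=True)

    for pae_file in sorted(Path(pae_dir).glob("*_scores*.json")):
        # e.g. "design_0_scores_rank_001.json" -> "design_0"
        name = pae_file.stem.replace("_scores_rank_001", "")
        struct_file = Path(struct_dir) / f"{name}.pdb"

        if not struct_file.exists():
            print(f"Skipping {pae_file.name}: {struct_file} not found")
            continue

        # Same cutoffs for every design so the scores stay comparable
        result = subprocess.run([
            sys.executable, "ipsae.py",
            str(pae_file),
            str(struct_file),
            "10", "10",
        ])
        if result.returncode != 0:
            print(f"ipsae.py failed for {name} (exit code {result.returncode})")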

Verify

ls *_chains.csv | wc -l  # Should match number of predictions
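
The count check above shows that something is missing but not which design. A sketch of a per-design check follows, assuming outputs are named `<structure stem>_chains.csv` as in the sample run; the function name `missing_outputs` and its parameters are hypothetical.

```python
from pathlib import Path


def missing_outputs(struct_dir, result_dir, pattern="*.pdb"):
    """Return the names of structures that have no matching <name>_chains.csv."""
    result_dir = Path(result_dir)
    missing = []
    for struct_file in sorted(Path(struct_dir).glob(pattern)):
        expected = result_dir / f"{struct_file.stem}_chains.csv"
        if not expected.exists():
            missing.append(struct_file.stem)
    return missing
```

An empty list means every structure has a chain-pair score file; any names returned are the designs to re-run or inspect.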

Troubleshooting

  • Low scores for good designs: Check PAE/distance cutoffs
  • Missing output: Verify PAE file format matches predictor
  • Inconsistent scores: Use same cutoffs across all designs

Error interpretation

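| Error                   | Cause                 | Fix                                        |
|-------------------------|-----------------------|--------------------------------------------|
| KeyError: 'pae'         | Wrong PAE format      | Check whether the file is AF2, AF3, or Boltz format |
| FileNotFoundError       | Structure not found   | Verify file paths                          |
| ValueError: no contacts | No interface detected | Check chain IDs, relax (increase) cutoffs  |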
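
In a batch run it can help to translate captured error text into the hints from this table. The sketch below matches on substrings of the error output, which is an assumption about how the messages appear; the names `KNOWN_ERRORS` and `diagnose` are hypothetical.

```python
# (substring to look for, hint) pairs based on the error table above
KNOWN_ERRORS = [
    ("KeyError: 'pae'", "Wrong PAE format: check whether the file is AF2, AF3, or Boltz output."),
    ("FileNotFoundError", "Structure not found: verify file paths."),
    ("ValueError: no contacts", "No interface detected: check chain IDs and revisit the cutoffs."),
]


def diagnose(stderr_text):
    """Return a hint for the first known error found in stderr_text, else None."""
    for needle, hint in KNOWN_ERRORS:
        if needle in stderr_text:
            return hint
    return None
```

The error text could come from `subprocess.run(..., capture_output=True, text=True).stderr`; unrecognized errors return `None` and should be read in full.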

Next: Select top designs (ipSAE_min > 0.61) → experimental validation.
