Modelling
Overview
This module provides utilities to post-process AlphaFold-Multimer predictions,
extract interface features, and rescore complexes with pDockQ/pDockQ2.
Typical workflows:
- single model analysis (single_analysis)
- per-entry processing across ranked models (all_analysis)
- batch processing across many entry folders (batch_analysis)
Examples
Single (direct) analysis
Analyze one predicted model (JSON + PDB):
from pepkit.modelling.af.post.analysis import Analysis
a = Analysis(
    json_path="data/examples/7QWV_A_7QWV_B/7QWV_A_7QWV_B_scores_rank_001_alphafold2_multimer_v3_model_3_seed_000.json",
    pdb_path="data/examples/7QWV_A_7QWV_B/7QWV_A_7QWV_B_relaxed_rank_001_alphafold2_multimer_v3_model_3_seed_000.pdb",
    peptide_chain_position="last",
    distance_cutoff=8.0,
)
result = a.single_analysis()
subset = {
    "composite_ptm": result.get("composite_ptm"),
    "pdockq": result.get("pdockq"),
    "pdockq2": result.get("pdockq2"),
}
print(subset)
Example output:
{'composite_ptm': 0.82, 'pdockq': 0.16, 'pdockq2': 0.36}
Per-entry (all ranked models)
Process every ranked model in a folder:
a = Analysis(peptide_chain_position="last")
entry_result = a.all_analysis("data/examples/7QWV_A_7QWV_B/")
The returned entry_result is a dict with per-rank keys (rank001, rank002 …)
and meta keys such as length and processing_time.
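Because rank records and meta keys share one dict, it can help to separate them before comparing models. The sketch below is not part of pepkit: split_entry_result and best_rank are hypothetical helper names, and the sample dict only mimics the layout described above (the rank002 values are made up for illustration).

```python
from typing import Any, Dict, Optional, Tuple


def split_entry_result(
    entry_result: Dict[str, Any],
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Separate per-rank records from meta keys (hypothetical helper)."""
    ranks = {k: v for k, v in entry_result.items() if k.startswith("rank")}
    meta = {k: v for k, v in entry_result.items() if not k.startswith("rank")}
    return ranks, meta


def best_rank(ranks: Dict[str, Any], metric: str = "pdockq2") -> Optional[str]:
    """Return the rank key with the highest value of `metric`, or None."""
    scored = {
        key: record[metric]
        for key, record in ranks.items()
        if isinstance(record, dict) and record.get(metric) is not None
    }
    if not scored:
        return None
    return max(scored, key=scored.get)


# Illustrative data only; a real dict would come from Analysis.all_analysis().
sample = {
    "rank001": {"pdockq": 0.16, "pdockq2": 0.36},
    "rank002": {"pdockq": 0.21, "pdockq2": 0.29},  # made-up values
    "length": 5,
    "processing_time": 12.3,
}

ranks, meta = split_entry_result(sample)
print(sorted(ranks))
print(meta)
print(best_rank(ranks))
```

Ranks that lack the chosen metric are skipped rather than treated as zero, so an incomplete record cannot win by accident.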
Command line usage
Single entry
Module invocation:
python -m pepkit.modelling.af.post.analysis --entry_dir data/examples/7QWV_A_7QWV_B
Installed CLI:
pepkit postprocess --af-out data/examples/7QWV_A_7QWV_B --single-entry
The command writes result.json inside the entry directory (af-out/result.json).
Example af-out/result.json (valid JSON; truncated):
{
  "rank001": {
    "mean_plddt": 83.091,
    "median_plddt": 95.0,
    "peptide_plddt": 78.062,
    "protein_interface_plddt": 87.986,
    "peptide_interface_plddt": 78.062,
    "interface_plddt": 83.024,
    "mean_pae": 8.683,
    "max_pae": 31.312,
    "peptide_pae": 8.837,
    "protein_interface_pae": 6.613,
    "peptide_interface_pae": 8.837,
    "mean_interface_pae": 4.408,
    "pdockq": 0.16,
    "pdockq2": 0.36,
    "composite_ptm": 0.82
  },
  "rank002": {},
  "length": 5,
  "processing_time": 12.3
}
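Before relying on a result.json, it may be worth checking that every rank record carries the metrics you plan to use. The following is a minimal sketch under stated assumptions: missing_metrics and the REQUIRED tuple are hypothetical names chosen here, and the JSON text is a shortened copy of the example above.

```python
import json

# Metrics this check looks for; adjust to whatever your analysis needs.
REQUIRED = ("pdockq", "pdockq2", "composite_ptm")


def missing_metrics(result, required=REQUIRED):
    """Map each rank key to the required metrics absent from its record."""
    report = {}
    for key, record in result.items():
        if not key.startswith("rank"):
            continue  # skip meta keys such as length and processing_time
        absent = [
            name
            for name in required
            if not isinstance(record, dict) or record.get(name) is None
        ]
        if absent:
            report[key] = absent
    return report


text = """
{
  "rank001": {"pdockq": 0.16, "pdockq2": 0.36, "composite_ptm": 0.82},
  "rank002": {},
  "length": 5,
  "processing_time": 12.3
}
"""

result = json.loads(text)
print(missing_metrics(result))
# → {'rank002': ['pdockq', 'pdockq2', 'composite_ptm']}
```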
Batch (multiple entries)
Module invocation:
python -m pepkit.modelling.af.post.analysis --batch_dir data/examples
Installed CLI:
pepkit postprocess --af-out data/examples
The batch run processes all entry folders under the provided directory.
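After a batch run, a quick way to spot entries that were not processed is to look for folders without a result.json. This sketch assumes each processed entry folder ends up with its own result.json, as in the single-entry case; entries_without_result is a hypothetical helper, and the demo uses a throwaway directory rather than real AlphaFold output.

```python
import tempfile
from pathlib import Path


def entries_without_result(batch_dir):
    """Return names of immediate subfolders that lack a result.json file."""
    root = Path(batch_dir)
    return sorted(
        d.name
        for d in root.iterdir()
        if d.is_dir() and not (d / "result.json").is_file()
    )


# Demo on a temporary directory with one finished and one pending entry.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "entry_done").mkdir()
    (root / "entry_done" / "result.json").write_text("{}")
    (root / "entry_pending").mkdir()
    pending = entries_without_result(root)
    print(pending)
```

On real data, the call would be entries_without_result("data/examples").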
Turning results into a summary table
Collect pDockQ / pDockQ2 across ranks and build a DataFrame:
import json
import pandas as pd
from pathlib import Path
rows = []
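columns = ["entry", "rank", "pdockq", "pdockq2", "composite_ptm"]
for entry_dir in sorted(Path("data/examples").iterdir()):
    p = entry_dir / "result.json"
    if not p.exists():
        continue  # not an entry folder, or not processed yet
    data = json.loads(p.read_text())
    for rank_key, record in data.items():
        # Skip meta keys such as length and processing_time.
        if not rank_key.startswith("rank"):
            continue
        rows.append({
            "entry": entry_dir.name,
            "rank": rank_key,
            "pdockq": record.get("pdockq"),
            "pdockq2": record.get("pdockq2"),
            "composite_ptm": record.get("composite_ptm"),
        })
# Passing columns keeps sort_values from failing when no rows were collected.
df = pd.DataFrame(rows, columns=columns).sort_values(["entry", "rank"])
print(df.head())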
Quick notes & tips
- peptide_chain_position: use "last" if the peptide is the last chain in the PDB.
- distance_cutoff: interface radius in Å (example uses 8.0 Å).
- Output is plain JSON dicts; save result.json per entry for reproducibility.
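To make the role of distance_cutoff concrete, the sketch below shows how a distance threshold can define an interface between two chains. It is an illustration only and not pepkit's implementation: interface_residues is a hypothetical function, each residue is reduced to a single toy coordinate, and the page does not say which atoms pepkit actually measures between.

```python
import math


def interface_residues(chain_a, chain_b, cutoff=8.0):
    """Return (ids_a, ids_b): residues with a partner within `cutoff` Å.

    Each chain is a dict mapping a residue id to one (x, y, z) coordinate.
    """
    hits_a, hits_b = set(), set()
    for id_a, xyz_a in chain_a.items():
        for id_b, xyz_b in chain_b.items():
            if math.dist(xyz_a, xyz_b) <= cutoff:
                hits_a.add(id_a)
                hits_b.add(id_b)
    return sorted(hits_a), sorted(hits_b)


# Toy coordinates in Å, placed along the x axis for easy checking.
protein = {1: (0.0, 0.0, 0.0), 2: (20.0, 0.0, 0.0)}
peptide = {101: (5.0, 0.0, 0.0), 102: (30.0, 0.0, 0.0)}

print(interface_residues(protein, peptide))               # → ([1], [101])
print(interface_residues(protein, peptide, cutoff=10.0))  # → ([1, 2], [101, 102])
```

Raising the cutoff pulls more residues into the interface, which in turn changes any interface-averaged metric computed over them.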
API quick reference
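Analysis(...) | Constructor. Common args: json_path, pdb_path, peptide_chain_position, distance_cutoff.
single_analysis() | Analyze a single JSON+PDB pair → metrics dict.
all_analysis(...) | Process ranked models in an entry folder.
batch_analysis(...) | Process multiple entries under a batch directory.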