Protein structure prediction has been revolutionized by AI, but validation remains critical. A predicted structure is only as useful as it is accurate—and knowing how to assess that accuracy separates publishable science from wishful thinking. This comprehensive guide provides a systematic checklist for validating AlphaFold2 and ESMFold predictions, from basic confidence metrics to advanced structural quality assessment.
Why Validation Matters
Tier 1: Model Confidence Metrics
Start with the metrics provided by the prediction model itself. These are your first line of defense and can immediately flag problematic predictions.
1.1 pLDDT Scores (Per-Residue Confidence)
The predicted Local Distance Difference Test (pLDDT) is AlphaFold2's per-residue confidence metric, ranging from 0-100. It estimates how accurately the position of each residue's Cα atom is predicted.
Interpreting pLDDT
import json
import numpy as np

def analyze_plddt_distribution(pdb_file):
    """Extract and analyze per-residue pLDDT scores from an AlphaFold2 PDB file."""
    plddt_scores = []
    with open(pdb_file, 'r') as f:
        for line in f:
            # pLDDT is stored in the B-factor column (columns 61-66).
            # Read it from CA atoms only so each residue is counted once;
            # counting every atom would weight large residues more heavily.
            if line.startswith('ATOM') and line[12:16].strip() == 'CA':
                plddt = float(line[60:66].strip())
                plddt_scores.append(plddt)
    scores = np.array(plddt_scores)
    analysis = {
        'mean_plddt': np.mean(scores),
        'median_plddt': np.median(scores),
        'high_confidence': np.sum(scores > 90) / len(scores) * 100,
        'medium_confidence': np.sum((scores >= 70) & (scores <= 90)) / len(scores) * 100,
        'low_confidence': np.sum(scores < 70) / len(scores) * 100,
        'min_plddt': np.min(scores),
        'std_plddt': np.std(scores)
    }
    return analysis

# Example usage
results = analyze_plddt_distribution('alphafold_prediction.pdb')
print(f"Overall confidence: {results['mean_plddt']:.1f}")
print(f"High confidence regions: {results['high_confidence']:.1f}%")
print(f"Low confidence regions: {results['low_confidence']:.1f}%")

# Flag for review if mean pLDDT < 70
if results['mean_plddt'] < 70:
    print("⚠️ WARNING: Low overall confidence - prediction may be unreliable")
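A mean score can hide a poorly predicted loop or terminus, so it helps to list where the low-confidence residues actually sit. The following is a minimal sketch, assuming a plain list of per-residue pLDDT values; the function name `low_confidence_segments` and the `min_length` parameter are illustrative choices, not part of any AlphaFold2 tooling.

```python
def low_confidence_segments(plddt, threshold=70.0, min_length=5):
    """Return 1-based inclusive (start, end) ranges where pLDDT < threshold.

    Segments shorter than min_length residues are ignored.
    """
    segments = []
    start = None
    for i, score in enumerate(plddt, start=1):
        if score < threshold:
            if start is None:
                start = i  # open a new low-confidence segment
        elif start is not None:
            if i - start >= min_length:
                segments.append((start, i - 1))
            start = None
    # Close a segment that runs to the C-terminus
    if start is not None and len(plddt) - start + 1 >= min_length:
        segments.append((start, len(plddt)))
    return segments


example = [95, 92, 60, 55, 50, 91, 40, 45]
print(low_confidence_segments(example, threshold=70.0, min_length=2))
```

Reporting ranges rather than a percentage makes it easier to see whether low confidence is confined to termini and linkers or cuts through a domain of interest.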
1.2 PAE (Predicted Aligned Error)
While pLDDT tells you about local accuracy, PAE reveals relative positioning confidence between residue pairs. This is essential for understanding domain organization and inter-domain relationships.
- Dark blue diagonal: Each residue is confident about its immediate neighbors
- Off-diagonal blue blocks: Confident domain-domain or subdomain-subdomain relationships
- Light blue/yellow regions: Uncertain relative positioning - domains may be incorrectly oriented
- Check PAE values at functional sites: Low PAE (<5Å) needed for reliable active site geometry
def validate_domain_confidence(pae_json, plddt_scores, domain_ranges):
    """
    Validate that each domain has high internal confidence (pLDDT)
    and low internal PAE.
    domain_ranges uses 1-based, inclusive residue numbers.
    """
    with open(pae_json, 'r') as f:
        pae_data = json.load(f)
    # AlphaFold DB files wrap the record in a one-element list
    if isinstance(pae_data, list):
        pae_data = pae_data[0]
    pae_matrix = np.array(pae_data['predicted_aligned_error'])
    validation_report = []
    for domain_name, (start, end) in domain_ranges.items():
        # Convert 1-based inclusive residue numbers to a 0-based slice
        idx = slice(start - 1, end)
        # Check internal domain confidence (pLDDT)
        domain_plddt = np.mean(plddt_scores[idx])
        # Check internal PAE (should be low = confident)
        internal_pae = pae_matrix[idx, idx]
        mean_internal_pae = np.mean(internal_pae)
        status = 'PASS' if domain_plddt > 70 and mean_internal_pae < 10 else 'FAIL'
        validation_report.append({
            'domain': domain_name,
            'plddt': domain_plddt,
            'internal_pae': mean_internal_pae,
            'status': status
        })
    return validation_report

# Example: Validate a multi-domain protein
domains = {
    'Kinase_domain': (1, 280),
    'SH3_domain': (300, 360),
    'SH2_domain': (380, 470)
}

# Per-residue pLDDT array (one value per residue, read from the CA B-factors)
with open('alphafold_prediction.pdb') as f:
    plddt_scores = np.array([float(line[60:66]) for line in f
                             if line.startswith('ATOM') and line[12:16].strip() == 'CA'])

report = validate_domain_confidence('pae.json', plddt_scores, domains)
for entry in report:
    print(f"{entry['domain']}: {entry['status']} "
          f"(pLDDT={entry['plddt']:.1f}, PAE={entry['internal_pae']:.1f}Å)")
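The check above looks only inside each domain. Whether two domains are placed confidently relative to each other is read from the off-diagonal block of the PAE matrix. Here is a minimal sketch under stated assumptions: the matrix is a square nested list (or array) indexed `[i][j]`, domains are 1-based inclusive ranges, and the helper name `mean_interdomain_pae` is hypothetical.

```python
def mean_interdomain_pae(pae_matrix, domain_a, domain_b):
    """Mean PAE (Å) between two domains given as 1-based inclusive (start, end)."""
    (a_start, a_end), (b_start, b_end) = domain_a, domain_b
    values = []
    for i in range(a_start - 1, a_end):
        for j in range(b_start - 1, b_end):
            # PAE is not symmetric, so include both directions
            values.append(pae_matrix[i][j])
            values.append(pae_matrix[j][i])
    return sum(values) / len(values)


# Toy 4-residue matrix: residues 1-2 and 3-4 form two rigid blocks
toy_pae = [
    [0, 1, 20, 22],
    [1, 0, 18, 24],
    [19, 21, 0, 2],
    [23, 25, 2, 0],
]
print(mean_interdomain_pae(toy_pae, (1, 2), (3, 4)))  # high: orientation uncertain
print(mean_interdomain_pae(toy_pae, (1, 2), (1, 2)))  # low: block is internally rigid
```

A domain pair with low internal PAE but high inter-domain PAE is the typical signature of two well-folded domains whose relative orientation should not be interpreted.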
1.3 Multiple Model Consistency
AlphaFold2 generates multiple predictions (typically 5 models). Consistency across models indicates robust prediction; variation suggests uncertainty.
from Bio.PDB import PDBParser, Superimposer
import numpy as np

def calculate_model_agreement(model_files):
    """Calculate RMSD between all model pairs."""
    parser = PDBParser(QUIET=True)
    structures = [parser.get_structure(f'model_{i}', f) for i, f in enumerate(model_files)]
    # Get CA atoms from each model
    ca_atoms_list = []
    for structure in structures:
        ca_atoms = [atom for atom in structure.get_atoms() if atom.get_name() == 'CA']
        ca_atoms_list.append(ca_atoms)
    # Calculate pairwise RMSD
    rmsds = []
    sup = Superimposer()
    for i in range(len(ca_atoms_list)):
        for j in range(i+1, len(ca_atoms_list)):
            sup.set_atoms(ca_atoms_list[i], ca_atoms_list[j])
            rmsds.append(sup.rms)
    mean_rmsd = np.mean(rmsds)
    max_rmsd = np.max(rmsds)
    # Interpret results
    if mean_rmsd < 1.0:
        confidence = "VERY HIGH - Models are highly consistent"
    elif mean_rmsd < 3.0:
        confidence = "MODERATE - Some structural variation"
    else:
        confidence = "LOW - Significant disagreement between models"
    return {
        'mean_rmsd': mean_rmsd,
        'max_rmsd': max_rmsd,
        'confidence': confidence,
        'rmsds': rmsds
    }

# Example
model_files = [f'ranked_{i}.pdb' for i in range(5)]
agreement = calculate_model_agreement(model_files)
print(f"Model agreement: {agreement['mean_rmsd']:.2f}Å (±{np.std(agreement['rmsds']):.2f}Å)")
print(f"Assessment: {agreement['confidence']}")
Tier 2: Structural Geometry Validation
Even with high model confidence, check that the structure obeys basic physical and chemical constraints. These validations catch errors that confidence metrics might miss.
2.1 Ramachandran Plot Analysis
The Ramachandran plot shows the distribution of backbone dihedral angles (φ, ψ). Most residues should fall in energetically favorable regions.
Expected Quality Standards
from Bio.PDB import PDBParser, PPBuilder
import numpy as np

def ramachandran_validation(pdb_file):
    """
    Calculate phi/psi angles and check Ramachandran distribution.
    Uses simplified rectangular boundaries for the favored regions;
    glycine and proline are not treated separately.
    """
    parser = PDBParser(QUIET=True)
    structure = parser.get_structure('protein', pdb_file)
    ppb = PPBuilder()
    angles = []
    outliers = []
    for pp in ppb.build_peptides(structure):
        phi_psi = pp.get_phi_psi_list()
        for i, (phi, psi) in enumerate(phi_psi):
            # Chain termini have no phi or no psi and are skipped
            if phi is None or psi is None:
                continue
            phi_deg = np.degrees(phi)
            psi_deg = np.degrees(psi)
            angles.append((phi_deg, psi_deg))
            # Simplified favored region check (approximate boxes, in degrees)
            # (Real validation uses more sophisticated boundaries)
            in_favored = (
                (-180 <= phi_deg <= -45 and (psi_deg >= 90 or psi_deg <= -150)) or  # Beta sheet
                (-160 <= phi_deg <= -20 and -120 <= psi_deg <= 50) or  # Right-handed alpha helix
                (30 <= phi_deg <= 90 and -30 <= psi_deg <= 90)  # Left-handed alpha helix
            )
            if not in_favored:
                residue = pp[i]
                outliers.append({
                    'residue': residue.get_id()[1],
                    'phi': phi_deg,
                    'psi': psi_deg
                })
    if not angles:
        raise ValueError("No phi/psi angles could be calculated for this structure")
    favored_percent = ((len(angles) - len(outliers)) / len(angles)) * 100
    return {
        'total_residues': len(angles),
        'outliers': len(outliers),
        'favored_percent': favored_percent,
        'outlier_details': outliers[:10]  # First 10 outliers
    }

rama = ramachandran_validation('prediction.pdb')
print(f"Ramachandran validation: {rama['favored_percent']:.1f}% in favored regions")
if rama['favored_percent'] < 95:
    print(f"⚠️ WARNING: {rama['outliers']} outliers detected")
    print("Review these residues manually - may indicate local geometry errors")
2.2 Steric Clashes
Check for atoms that are too close together. Serious clashes (<2.0Å between non-bonded atoms) indicate physically impossible geometries.
from Bio.PDB import PDBParser, NeighborSearch

def detect_steric_clashes(pdb_file, clash_distance=2.0):
    """
    Detect non-bonded heavy atoms that are unrealistically close together.
    """
    parser = PDBParser(QUIET=True)
    structure = parser.get_structure('protein', pdb_file)
    # Heavy atoms only: hydrogens legitimately sit within 2 Å of
    # hydrogen-bond acceptors and would be reported as false clashes
    atoms = [atom for atom in structure.get_atoms() if atom.element != 'H']
    ns = NeighborSearch(atoms)
    clashes = []
    # search_all returns each close pair once, so no de-duplication is needed
    for atom, neighbor in ns.search_all(clash_distance, level='A'):
        res1, res2 = atom.get_parent(), neighbor.get_parent()
        # Skip atoms within the same residue
        if res1 is res2:
            continue
        # Skip the peptide bond (C-N, ~1.33 Å) between consecutive residues
        same_chain = res1.get_parent() is res2.get_parent()
        if (same_chain
                and abs(res1.get_id()[1] - res2.get_id()[1]) == 1
                and {atom.get_name(), neighbor.get_name()} == {'C', 'N'}):
            continue
        clashes.append({
            'atom1': f"{res1.get_id()[1]}{res1.get_resname()}:{atom.get_name()}",
            'atom2': f"{res2.get_id()[1]}{res2.get_resname()}:{neighbor.get_name()}",
            'distance': atom - neighbor
        })
    return clashes

clashes = detect_steric_clashes('prediction.pdb')
print(f"Found {len(clashes)} steric clashes")
serious_clashes = [c for c in clashes if c['distance'] < 1.5]
if serious_clashes:
    print(f"⚠️ {len(serious_clashes)} SERIOUS clashes (<1.5Å)")
    for clash in serious_clashes[:5]:
        print(f"  {clash['atom1']} ↔ {clash['atom2']}: {clash['distance']:.2f}Å")
2.3 Bond Length and Angle Validation
Check that covalent bonds have reasonable lengths and angles. AlphaFold2 typically produces good geometry, but post-processing or file format issues can introduce errors.
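As an illustration of such a check, the sketch below measures backbone bond lengths directly from PDB `ATOM` records and reports those that deviate from approximate ideal values (N-CA 1.46 Å, CA-C 1.52 Å, C=O 1.23 Å, peptide C-N 1.33 Å). The function names and the 0.1 Å tolerance are assumptions made for this example; dedicated validation software applies residue-specific targets and standard deviations.

```python
import math

# Approximate ideal backbone bond lengths in Å
IDEAL_BONDS = {('N', 'CA'): 1.46, ('CA', 'C'): 1.52, ('C', 'O'): 1.23}
IDEAL_PEPTIDE_CN = 1.33


def read_backbone(pdb_lines):
    """Map (chain, residue number) -> {atom name: (x, y, z)} for backbone atoms."""
    residues = {}
    for line in pdb_lines:
        if not line.startswith('ATOM'):
            continue
        name = line[12:16].strip()
        if name not in ('N', 'CA', 'C', 'O'):
            continue
        key = (line[21], int(line[22:26]))
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        residues.setdefault(key, {})[name] = xyz
    return residues


def bond_length_outliers(pdb_lines, tolerance=0.1):
    """List backbone bonds deviating from ideal length by more than tolerance (Å)."""
    residues = read_backbone(pdb_lines)
    outliers = []
    for (chain, num), atoms in sorted(residues.items()):
        for (a, b), ideal in IDEAL_BONDS.items():
            if a in atoms and b in atoms:
                d = math.dist(atoms[a], atoms[b])
                if abs(d - ideal) > tolerance:
                    outliers.append((chain, num, f'{a}-{b}', round(d, 2)))
        # Peptide bond to the next residue in the same chain
        nxt = residues.get((chain, num + 1))
        if nxt and 'C' in atoms and 'N' in nxt:
            d = math.dist(atoms['C'], nxt['N'])
            if abs(d - IDEAL_PEPTIDE_CN) > tolerance:
                outliers.append((chain, num, 'C-N(+1)', round(d, 2)))
    return outliers
```

Calling `bond_length_outliers(open('prediction.pdb').read().splitlines())` returns an empty list for a clean model; a long C-N distance between consecutive residue numbers usually points to a chain break or a numbering gap rather than a stretched bond.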
Standard Bond Lengths (Å)
Using MolProbity
Tier 3: Biological Plausibility
Beyond geometric correctness, validate that the structure makes biological sense based on experimental data and known protein biology.
3.1 Comparison with Homologous Structures
If experimental structures of homologous proteins exist, compare them to your prediction. High-identity homologs (>50% sequence identity) should have very similar structures.
import requests
from Bio.PDB import PDBParser, Superimposer

def validate_against_homologs(prediction_pdb, uniprot_id):
    """
    Find PDB structures mapped to a UniProt accession and compare
    them with the prediction.
    """
    # Search PDBe for structures
    url = f"https://www.ebi.ac.uk/pdbe/api/mappings/uniprot/{uniprot_id}"
    response = requests.get(url, timeout=30)
    if response.status_code != 200:
        return []  # No mapped structures found
    data = response.json()
    pdb_entries = list(data[uniprot_id]['PDB'].keys())[:5]  # First 5 entries
    parser = PDBParser(QUIET=True)
    pred_structure = parser.get_structure('prediction', prediction_pdb)
    # Index predicted CA atoms by residue number
    pred_ca = {atom.get_parent().get_id()[1]: atom
               for atom in pred_structure.get_atoms() if atom.get_name() == 'CA'}
    comparisons = []
    for pdb_id in pdb_entries:
        # Download PDB structure
        pdb_url = f"https://files.rcsb.org/download/{pdb_id}.pdb"
        pdb_response = requests.get(pdb_url, timeout=30)
        if pdb_response.status_code != 200:
            continue
        # Save temporarily and parse
        with open(f'/tmp/{pdb_id}.pdb', 'w') as f:
            f.write(pdb_response.text)
        exp_structure = parser.get_structure(pdb_id, f'/tmp/{pdb_id}.pdb')
        # Pair CA atoms by residue number, using the first chain of the
        # first model. This assumes the entry follows UniProt numbering;
        # for diverged homologs, pair residues from a sequence alignment.
        exp_chain = next(exp_structure[0].get_chains())
        pairs = [(pred_ca[res.get_id()[1]], res['CA']) for res in exp_chain
                 if res.get_id()[0] == ' ' and 'CA' in res
                 and res.get_id()[1] in pred_ca]
        if len(pairs) < 3:
            continue  # Too few matched residues to superimpose
        # Align and calculate RMSD
        sup = Superimposer()
        sup.set_atoms([p for p, _ in pairs], [e for _, e in pairs])
        comparisons.append({
            'pdb_id': pdb_id,
            'rmsd': sup.rms,
            'aligned_residues': len(pairs)
        })
    return comparisons

# Example validation
comparisons = validate_against_homologs('prediction.pdb', 'P12345')
for comp in comparisons:
    print(f"{comp['pdb_id']}: RMSD = {comp['rmsd']:.2f}Å over {comp['aligned_residues']} residues")
    if comp['rmsd'] < 2.0:
        print("  ✓ Excellent agreement with experimental structure")
    elif comp['rmsd'] < 4.0:
        print("  ~ Moderate agreement - check divergent regions")
    else:
        print("  ⚠️ Poor agreement - prediction may be incorrect")
3.2 Active Site and Functional Region Validation
For enzymes and binding proteins, validate that functional residues are properly positioned. Catalytic triads, binding pockets, and other functional motifs should have correct geometry.
- Catalytic residues: Check distances between active site residues match known values (typically 3-4Å for catalytic dyads/triads)
- Binding pockets: Validate pocket volume and electrostatic properties match ligand requirements
- Disulfide bonds: Verify that the sulfur-sulfur (SG-SG) distance is ~2.05Å for predicted disulfides
- Metal coordination: Check geometry of metal-binding sites (histidine, cysteine, aspartate clusters)
- Post-translational modifications: Validate that modification sites are surface-accessible
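The disulfide check in the list above is simple to script. The sketch below scans PDB `ATOM` records for cysteine SG atoms and reports pairs close enough to be bonded. The function names and the 2.3 Å cutoff (an S-S bond of about 2.05 Å plus some slack for unrefined models) are assumptions for illustration.

```python
import math


def sg_coordinates(pdb_lines):
    """Collect cysteine SG coordinates as {(chain, residue number): (x, y, z)}."""
    sg = {}
    for line in pdb_lines:
        if (line.startswith('ATOM') and line[17:20] == 'CYS'
                and line[12:16].strip() == 'SG'):
            sg[(line[21], int(line[22:26]))] = (
                float(line[30:38]), float(line[38:46]), float(line[46:54]))
    return sg


def candidate_disulfides(pdb_lines, max_distance=2.3):
    """Return ((chain, res), (chain, res), distance) for SG pairs within max_distance Å."""
    sg = sg_coordinates(pdb_lines)
    keys = sorted(sg)
    pairs = []
    for i, a in enumerate(keys):
        for b in keys[i + 1:]:
            d = math.dist(sg[a], sg[b])
            if d <= max_distance:
                pairs.append((a, b, round(d, 2)))
    return pairs
```

Comparing the returned pairs with annotated disulfides for the protein shows whether the predicted fold brings the expected cysteines together; annotated pairs that are missing from the output deserve a closer look at the local pLDDT.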
3.3 Oligomeric State Validation
If your protein forms oligomers, validate the predicted complex:
from Bio.PDB import PDBParser, NeighborSearch

def validate_protein_interface(pdb_file, chain_A='A', chain_B='B'):
    """
    Analyze protein-protein interface quality by counting interface residues.
    """
    parser = PDBParser(QUIET=True)
    structure = parser.get_structure('complex', pdb_file)
    chain_a_atoms = [atom for atom in structure[0][chain_A].get_atoms()]
    chain_b_atoms = [atom for atom in structure[0][chain_B].get_atoms()]
    # Find interface residues (within 5Å of other chain)
    ns_a = NeighborSearch(chain_a_atoms)
    ns_b = NeighborSearch(chain_b_atoms)
    interface_a = set()
    interface_b = set()
    for atom in chain_a_atoms:
        nearby = ns_b.search(atom.coord, 5.0, level='R')
        if nearby:
            interface_a.add(atom.get_parent())
    for atom in chain_b_atoms:
        nearby = ns_a.search(atom.coord, 5.0, level='R')
        if nearby:
            interface_b.add(atom.get_parent())
    # This is a residue count used as a proxy for interface size, not an area
    interface_residues = len(interface_a) + len(interface_b)
    # Typical protein interface: 1200-2000 Ų buried surface area
    # Rough estimate: ~40-70 residues in interface
    validation = {
        'interface_residues_A': len(interface_a),
        'interface_residues_B': len(interface_b),
        'total_interface': interface_residues,
        'quality': 'Good' if 40 <= interface_residues <= 100 else 'Suspicious'
    }
    return validation

interface = validate_protein_interface('complex.pdb')
print(f"Interface size: {interface['total_interface']} residues")
print(f"Quality assessment: {interface['quality']}")
if interface['total_interface'] < 20:
    print("⚠️ Very small interface - may not represent real biological assembly")
3.4 Disorder and Flexibility Assessment
Low pLDDT regions may represent intrinsically disordered regions (IDRs) rather than prediction errors. Validate against disorder predictors:
Cross-Validation Tools
- IUPred3: Predict intrinsically unstructured regions from sequence
- AlphaFold2 disorder prediction: Low pLDDT regions that align with IUPred predictions are likely true disorder
- PONDR: Alternative disorder predictor for validation
- MobiDB: Database of known disorder regions in homologs
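To make this cross-check quantitative, one can measure how many low-pLDDT residues are also predicted to be disordered. The sketch below assumes two equal-length per-residue lists: pLDDT values and disorder scores on a 0-1 scale where 0.5 or higher means disordered (the convention used by IUPred). The function name and the returned keys are hypothetical.

```python
def disorder_agreement(plddt, disorder_scores, plddt_cutoff=50.0, disorder_cutoff=0.5):
    """Summarize overlap between low-pLDDT residues and predicted disorder."""
    if len(plddt) != len(disorder_scores):
        raise ValueError("pLDDT and disorder arrays must cover the same residues")
    low = [p < plddt_cutoff for p in plddt]
    disordered = [d >= disorder_cutoff for d in disorder_scores]
    both = sum(1 for is_low, is_dis in zip(low, disordered) if is_low and is_dis)
    n_low = sum(low)
    return {
        'low_plddt_residues': n_low,
        'predicted_disordered': sum(disordered),
        # Fraction of low-pLDDT residues that the disorder predictor also flags
        'fraction_low_plddt_explained': both / n_low if n_low else None,
    }


summary = disorder_agreement([90, 40, 30, 45, 85], [0.1, 0.8, 0.9, 0.2, 0.3])
print(summary)
```

A high fraction suggests the low-confidence regions reflect genuine disorder. A low fraction means the model is uncertain about residues expected to be ordered, which is closer to a prediction failure.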
Tier 4: Experimental Data Integration
The ultimate validation comes from comparing predictions against experimental data. Even limited experimental information can validate or invalidate predictions.
4.1 Mutagenesis Data
Published mutation studies provide powerful validation:
- Loss-of-function mutations: Should be in or near active sites, binding interfaces, or structural cores
- Neutral mutations: Should be surface-exposed and away from functional regions
- Destabilizing mutations: Should disrupt hydrophobic cores or key salt bridges
- Gain-of-function mutations: Validate that the structural context supports the proposed mechanism
4.2 Cross-Linking Mass Spectrometry (XL-MS)
XL-MS provides distance constraints between lysine pairs. Predicted structures must satisfy these experimental distances.
from Bio.PDB import PDBParser

def validate_crosslinks(pdb_file, crosslinks):
    """
    Validate structure against XL-MS distance constraints.
    BS3 and DSS have an 11.4Å spacer arm; allowing for the two lysine
    side chains, Cα-Cα distances up to ~25-30Å are typically accepted.
    crosslinks: List of tuples [(res1, res2, max_distance), ...]
    """
    parser = PDBParser(QUIET=True)
    structure = parser.get_structure('protein', pdb_file)
    violations = []
    satisfied = []
    for res1, res2, max_dist in crosslinks:
        # Get CA atoms of residues (chain A of the first model)
        try:
            atom1 = structure[0]['A'][res1]['CA']
            atom2 = structure[0]['A'][res2]['CA']
        except KeyError:
            print(f"Residue {res1} or {res2} not found in structure")
            continue
        distance = atom1 - atom2
        if distance <= max_dist:
            satisfied.append((res1, res2, distance, max_dist))
        else:
            violations.append((res1, res2, distance, max_dist))
    return satisfied, violations

# Example: Validate against published XL-MS data
xl_data = [
    (45, 123, 30.0),   # K45-K123: BS3 cross-link
    (78, 201, 30.0),   # K78-K201: BS3 cross-link
    (134, 156, 25.0),  # K134-K156: DSS cross-link
]
satisfied, violations = validate_crosslinks('prediction.pdb', xl_data)
print(f"Cross-link validation: {len(satisfied)}/{len(xl_data)} satisfied")
if violations:
    print(f"⚠️ {len(violations)} violations detected:")
    for res1, res2, dist, max_dist in violations:
        print(f"  K{res1}-K{res2}: {dist:.1f}Å (max: {max_dist}Å)")
4.3 Small-Angle X-ray Scattering (SAXS)
SAXS provides overall shape information. Validate that your predicted structure's radius of gyration (Rg) and maximum dimension (Dmax) match experimental values.
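For a quick first comparison, Rg and Dmax can be estimated from the Cα coordinates alone. The sketch below is a rough approximation under stated assumptions: every Cα is weighted equally, and side chains and the hydration layer are ignored, so the values will tend to come out somewhat smaller than experimental ones. The function names are illustrative; dedicated programs such as CRYSOL or FoXS compute full scattering profiles for a proper fit.

```python
import math


def ca_coordinates(pdb_lines):
    """Read Cα coordinates from PDB ATOM records."""
    return [(float(line[30:38]), float(line[38:46]), float(line[46:54]))
            for line in pdb_lines
            if line.startswith('ATOM') and line[12:16].strip() == 'CA']


def radius_of_gyration(coords):
    """Unweighted radius of gyration (Å): RMS distance from the centroid."""
    n = len(coords)
    center = tuple(sum(c[k] for c in coords) / n for k in range(3))
    return math.sqrt(sum(math.dist(c, center) ** 2 for c in coords) / n)


def max_dimension(coords):
    """Largest pairwise Cα-Cα distance (Å), a simple stand-in for Dmax."""
    return max((math.dist(a, b)
                for i, a in enumerate(coords)
                for b in coords[i + 1:]), default=0.0)
```

If the predicted Rg is far below the experimental value, the protein may be more extended or flexible in solution than the compact model suggests, for example because low-pLDDT linkers are drawn in an arbitrary collapsed conformation.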
4.4 Hydrogen-Deuterium Exchange (HDX)
HDX-MS reports solvent accessibility and dynamics. Fast-exchanging regions should be surface-exposed; slow-exchanging regions should be buried in the core or involved in stable secondary structures.
Complete Validation Checklist
Use this systematic checklist for every predicted structure:
Model Confidence
- Calculate mean pLDDT score (>70 for reliable predictions)
- Identify low-confidence regions (<50 pLDDT)
- Analyze PAE matrix for domain organization
- Check consistency across multiple AlphaFold2 models (RMSD <2Å)
- For multimers: validate ipTM score (>0.5 at minimum; scores above about 0.8 indicate a confident interface)
Structural Geometry
- Ramachandran plot: >98% residues in favored regions
- No serious steric clashes (<2.0Å)
- Bond lengths and angles within expected ranges
- Rotamer outliers <2%
- MolProbity score <2.0, lower is better (if available)
Biological Plausibility
- Compare with homologous structures (RMSD <4Å for close homologs)
- Validate active site geometry (if applicable)
- Check oligomeric state makes biological sense
- Verify disulfide bonds and metal coordination sites
- Confirm low pLDDT regions match disorder predictions
Experimental Validation
- Cross-reference with mutagenesis data
- Validate XL-MS distance constraints (if available)
- Compare Rg and Dmax with SAXS data (if available)
- Check HDX protection patterns (if available)
- Verify binding site predictions with biochemical data
Making Decisions: When to Trust Your Prediction
High Confidence
- Mean pLDDT > 80
- Model RMSD < 1.5Å
- Ramachandran > 98% favored
- Matches homolog (if available)
- No serious geometry errors
Decision: Suitable for structure-based drug design, detailed mechanistic studies, and publication.
Medium Confidence
- Mean pLDDT 60-80
- Some model disagreement
- Minor geometry issues
- Partial match with homologs
- Localized low confidence
Decision: Use with caution. Suitable for hypothesis generation and guiding experiments; avoid relying on low-confidence regions for critical analysis.
Low Confidence
- Mean pLDDT < 60
- High model disagreement
- Ramachandran outliers
- Poor homolog match
- Widespread low pLDDT
Decision: Do not use for detailed analysis. May indicate intrinsic disorder, missing cofactors, or prediction failure. Consider alternative approaches.
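The three tiers above can be encoded as a small triage function for batch processing. This is a sketch that uses only the numeric thresholds stated in the tiers (mean pLDDT, mean pairwise model RMSD, percentage of Ramachandran-favored residues); the function name is hypothetical, and qualitative criteria such as homolog agreement still need manual review.

```python
def classify_prediction(mean_plddt, model_rmsd, rama_favored_pct):
    """Assign a coarse confidence tier from three summary metrics."""
    if mean_plddt > 80 and model_rmsd < 1.5 and rama_favored_pct > 98:
        return 'high'
    if mean_plddt >= 60:
        # Reasonable pLDDT, but at least one criterion falls short of the high tier
        return 'medium'
    return 'low'


for metrics in [(85.0, 1.0, 99.0), (85.0, 2.5, 99.0), (55.0, 1.0, 99.0)]:
    print(metrics, '->', classify_prediction(*metrics))
```

Treat the returned tier as a starting point for sorting many predictions, not as a verdict on any single structure.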
Common Validation Pitfalls to Avoid
Pitfall 1: Ignoring Low-Confidence Regions
Pitfall 2: Over-Interpreting Side Chain Positions
Pitfall 3: Assuming High pLDDT = Functionally Relevant
Pitfall 4: Neglecting Oligomeric State
Pitfall 5: Confirmation Bias
Conclusion: Validation as Scientific Practice
Protein structure prediction has become remarkably accurate, but validation remains a critical scientific practice. Think of AlphaFold2 and ESMFold as sophisticated hypotheses that must be tested against physical constraints, biological knowledge, and experimental data.
A well-validated structure—even with moderate confidence—is far more valuable than a poorly validated high-confidence prediction. By systematically applying this validation checklist, you ensure that your structural insights are built on solid foundations and can withstand scientific scrutiny.
Key Takeaway