Validation Checklist for Predicted Structures


A comprehensive quality assurance guide: from confidence metrics to experimental validation, ensuring your AlphaFold2 predictions are publication-ready.


Protogen Team

Structural Biologists

January 28, 2025

Protein structure prediction has been revolutionized by AI, but validation remains critical. A predicted structure is only as useful as it is accurate—and knowing how to assess that accuracy separates publishable science from wishful thinking. This comprehensive guide provides a systematic checklist for validating AlphaFold2 and ESMFold predictions, from basic confidence metrics to advanced structural quality assessment.

Why Validation Matters

Even high-confidence predictions can contain local errors, domain misplacements, or physically impossible geometries. Published structures in the PDB undergo rigorous validation—your predicted structures deserve the same scrutiny. This is especially critical when using predictions for downstream applications like drug design, mutagenesis studies, or mechanistic hypotheses.

Tier 1: Model Confidence Metrics

Start with the metrics provided by the prediction model itself. These are your first line of defense and can immediately flag problematic predictions.

1.1 pLDDT Scores (Per-Residue Confidence)

The predicted Local Distance Difference Test (pLDDT) is AlphaFold2's per-residue confidence metric, ranging from 0-100. It estimates how accurately the position of each residue's Cα atom is predicted.

  • High confidence (pLDDT > 90): Very accurate prediction, typically within 1Å of the true structure
  • Medium confidence (pLDDT 70-90): Generally correct backbone, some side-chain uncertainty
  • Low confidence (pLDDT < 70): High uncertainty, likely disordered or incorrect

Interpreting pLDDT

Look for continuous stretches rather than individual residues. A region with consistently high pLDDT (>90) is likely accurate. Isolated low-pLDDT residues in otherwise high-confidence regions often represent genuine flexibility rather than prediction errors.
python
import json
import numpy as np

def analyze_plddt_distribution(pdb_file):
    """Extract and analyze pLDDT scores from AlphaFold2 PDB file."""
    plddt_scores = []

    with open(pdb_file, 'r') as f:
        for line in f:
            if line.startswith('ATOM'):
                # pLDDT stored in B-factor column
                plddt = float(line[60:66].strip())
                plddt_scores.append(plddt)

    scores = np.array(plddt_scores)

    analysis = {
        'mean_plddt': np.mean(scores),
        'median_plddt': np.median(scores),
        'high_confidence': np.sum(scores > 90) / len(scores) * 100,
        'medium_confidence': np.sum((scores >= 70) & (scores <= 90)) / len(scores) * 100,
        'low_confidence': np.sum(scores < 70) / len(scores) * 100,
        'min_plddt': np.min(scores),
        'std_plddt': np.std(scores)
    }

    return analysis

# Example usage
results = analyze_plddt_distribution('alphafold_prediction.pdb')
print(f"Overall confidence: {results['mean_plddt']:.1f}")
print(f"High confidence regions: {results['high_confidence']:.1f}%")
print(f"Low confidence regions: {results['low_confidence']:.1f}%")

# Flag for review if mean pLDDT < 70
if results['mean_plddt'] < 70:
    print("⚠️  WARNING: Low overall confidence - prediction may be unreliable")

1.2 PAE (Predicted Aligned Error)

While pLDDT tells you about local accuracy, PAE reveals relative positioning confidence between residue pairs. This is essential for understanding domain organization and inter-domain relationships.

  • Dark blue diagonal: Each residue is confident about its immediate neighbors
  • Off-diagonal blue blocks: Confident domain-domain or subdomain-subdomain relationships
  • Light blue/yellow regions: Uncertain relative positioning - domains may be incorrectly oriented
  • Check PAE values at functional sites: Low PAE (<5Å) needed for reliable active site geometry
python
def validate_domain_confidence(pae_json, plddt_scores, domain_ranges):
    """
    Validate that domains have both high internal confidence (pLDDT)
    and confident relative positioning (PAE).
    """
    with open(pae_json, 'r') as f:
        pae_data = json.load(f)

    pae_matrix = np.array(pae_data['predicted_aligned_error'])

    validation_report = []

    for domain_name, (start, end) in domain_ranges.items():
        # Check internal domain confidence (pLDDT)
        domain_plddt = np.mean(plddt_scores[start:end])

        # Check internal PAE (should be low = confident)
        internal_pae = pae_matrix[start:end, start:end]
        mean_internal_pae = np.mean(internal_pae)

        status = 'PASS' if domain_plddt > 70 and mean_internal_pae < 10 else 'FAIL'

        validation_report.append({
            'domain': domain_name,
            'plddt': domain_plddt,
            'internal_pae': mean_internal_pae,
            'status': status
        })

    return validation_report

# Example: Validate a multi-domain protein
domains = {
    'Kinase_domain': (1, 280),
    'SH3_domain': (300, 360),
    'SH2_domain': (380, 470)
}

# plddt_scores: per-residue pLDDT values as a NumPy array (e.g. extracted as in Section 1.1)
report = validate_domain_confidence('pae.json', plddt_scores, domains)
for entry in report:
    print(f"{entry['domain']}: {entry['status']} "
          f"(pLDDT={entry['plddt']:.1f}, PAE={entry['internal_pae']:.1f}Å)")

1.3 Multiple Model Consistency

AlphaFold2 generates multiple predictions (typically 5 models). Consistency across models indicates robust prediction; variation suggests uncertainty.

python
from Bio.PDB import PDBParser, Superimposer
import numpy as np

def calculate_model_agreement(model_files):
    """Calculate RMSD between all model pairs."""
    parser = PDBParser(QUIET=True)
    structures = [parser.get_structure(f'model_{i}', f) for i, f in enumerate(model_files)]

    # Get CA atoms from each model
    ca_atoms_list = []
    for structure in structures:
        ca_atoms = [atom for atom in structure.get_atoms() if atom.get_name() == 'CA']
        ca_atoms_list.append(ca_atoms)

    # Calculate pairwise RMSD
    rmsds = []
    sup = Superimposer()

    for i in range(len(ca_atoms_list)):
        for j in range(i+1, len(ca_atoms_list)):
            sup.set_atoms(ca_atoms_list[i], ca_atoms_list[j])
            rmsds.append(sup.rms)

    mean_rmsd = np.mean(rmsds)
    max_rmsd = np.max(rmsds)

    # Interpret results
    if mean_rmsd < 1.0:
        confidence = "VERY HIGH - Models are highly consistent"
    elif mean_rmsd < 3.0:
        confidence = "MODERATE - Some structural variation"
    else:
        confidence = "LOW - Significant disagreement between models"

    return {
        'mean_rmsd': mean_rmsd,
        'max_rmsd': max_rmsd,
        'confidence': confidence,
        'rmsds': rmsds
    }

# Example
model_files = [f'ranked_{i}.pdb' for i in range(5)]
agreement = calculate_model_agreement(model_files)
print(f"Model agreement: {agreement['mean_rmsd']:.2f}Å (±{np.std(agreement['rmsds']):.2f}Å)")
print(f"Assessment: {agreement['confidence']}")

Tier 2: Structural Geometry Validation

Even with high model confidence, check that the structure obeys basic physical and chemical constraints. These validations catch errors that confidence metrics might miss.

2.1 Ramachandran Plot Analysis

The Ramachandran plot shows the distribution of backbone dihedral angles (φ, ψ). Most residues should fall in energetically favorable regions.

Expected Quality Standards

For a high-quality structure: >98% of residues in favored regions, >99.5% in allowed regions. Inspect any outliers individually: outliers at functionally important positions (active sites, binding interfaces) may reflect genuinely strained geometry, but outliers elsewhere usually indicate local modeling errors.
python
from Bio.PDB import PDBParser, PPBuilder
import numpy as np

def ramachandran_validation(pdb_file):
    """
    Calculate phi/psi angles and check Ramachandran distribution.
    Uses simplified favored region boundaries.
    """
    parser = PDBParser(QUIET=True)
    structure = parser.get_structure('protein', pdb_file)
    ppb = PPBuilder()

    angles = []
    outliers = []

    for pp in ppb.build_peptides(structure):
        phi_psi = pp.get_phi_psi_list()
        for i, (phi, psi) in enumerate(phi_psi):
            if phi is not None and psi is not None:
                phi_deg = np.degrees(phi)
                psi_deg = np.degrees(psi)
                angles.append((phi_deg, psi_deg))

                # Simplified favored region check
                # (Real validation uses more sophisticated boundaries)
                in_favored = (
                    (-180 <= phi_deg <= -30 and -180 <= psi_deg <= 50) or  # Beta sheet
                    (-90 <= phi_deg <= -30 and -70 <= psi_deg <= 30) or     # Right-handed alpha helix
                    (30 <= phi_deg <= 90 and -30 <= psi_deg <= 90)          # Left-handed alpha helix
                )

                if not in_favored:
                    residue = list(pp)[i]
                    outliers.append({
                        'residue': residue.get_id()[1],
                        'phi': phi_deg,
                        'psi': psi_deg
                    })

    favored_percent = ((len(angles) - len(outliers)) / len(angles)) * 100

    return {
        'total_residues': len(angles),
        'outliers': len(outliers),
        'favored_percent': favored_percent,
        'outlier_details': outliers[:10]  # First 10 outliers
    }

rama = ramachandran_validation('prediction.pdb')
print(f"Ramachandran validation: {rama['favored_percent']:.1f}% in favored regions")

if rama['favored_percent'] < 95:
    print(f"⚠️  WARNING: {rama['outliers']} outliers detected")
    print("Review these residues manually - may indicate local geometry errors")

2.2 Steric Clashes

Check for atoms that are too close together. Serious clashes (<2.0Å between non-bonded atoms) indicate physically impossible geometries.

python
from Bio.PDB import PDBParser, NeighborSearch
import numpy as np

def detect_steric_clashes(pdb_file, clash_distance=2.0):
    """
    Detect atoms that are unrealistically close together.
    """
    parser = PDBParser(QUIET=True)
    structure = parser.get_structure('protein', pdb_file)

    # Get all atoms
    atoms = [atom for atom in structure.get_atoms()]
    ns = NeighborSearch(atoms)

    clashes = []

    for atom in atoms:
        # Find nearby atoms
        nearby = ns.search(atom.coord, clash_distance, level='A')

        for neighbor in nearby:
            if atom == neighbor:
                continue
            res1, res2 = atom.get_parent(), neighbor.get_parent()
            # Skip intra-residue contacts (covalently bonded)
            if res1 == res2:
                continue
            # Skip the peptide bond between sequential residues (C-N is ~1.33Å)
            if (abs(res1.get_id()[1] - res2.get_id()[1]) == 1 and
                    {atom.get_name(), neighbor.get_name()} == {'C', 'N'}):
                continue

            distance = atom - neighbor
            if distance < clash_distance:
                clashes.append({
                    'atom1': f"{res1.get_id()[1]}{res1.get_resname()}:{atom.get_name()}",
                    'atom2': f"{res2.get_id()[1]}{res2.get_resname()}:{neighbor.get_name()}",
                    'distance': distance
                })

    # Remove duplicates
    unique_clashes = []
    seen = set()
    for clash in clashes:
        pair = tuple(sorted([clash['atom1'], clash['atom2']]))
        if pair not in seen:
            seen.add(pair)
            unique_clashes.append(clash)

    return unique_clashes

clashes = detect_steric_clashes('prediction.pdb')
print(f"Found {len(clashes)} steric clashes")

serious_clashes = [c for c in clashes if c['distance'] < 1.5]
if serious_clashes:
    print(f"⚠️  {len(serious_clashes)} SERIOUS clashes (<1.5Å)")
    for clash in serious_clashes[:5]:
        print(f"  {clash['atom1']} ↔ {clash['atom2']}: {clash['distance']:.2f}Å")

2.3 Bond Length and Angle Validation

Check that covalent bonds have reasonable lengths and angles. AlphaFold2 typically produces good geometry, but post-processing or file format issues can introduce errors.

Standard Bond Lengths (Å)

  • C-C: 1.53 ± 0.02
  • C-N: 1.47 ± 0.02
  • C=O: 1.23 ± 0.02
  • C-O: 1.43 ± 0.02
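These reference values can be checked automatically. Below is a minimal sketch that parses backbone atoms directly from the PDB file (as in Section 1.1) and flags intra-residue bonds outside a generous tolerance; the 0.1Å cutoff and the N-CA reference of 1.46Å are illustrative assumptions, not crystallographic standards:

```python
import numpy as np

# Reference backbone bond lengths in Å (illustrative, see table above)
EXPECTED_BONDS = {('N', 'CA'): 1.46, ('CA', 'C'): 1.53, ('C', 'O'): 1.23}
TOLERANCE = 0.1  # loose flagging cutoff, not a validation standard

def check_backbone_bonds(pdb_file):
    """Flag intra-residue backbone bonds that deviate from reference lengths."""
    residues = {}  # (chain, resseq) -> {atom_name: xyz}
    with open(pdb_file) as f:
        for line in f:
            if line.startswith('ATOM'):
                name = line[12:16].strip()
                key = (line[21], int(line[22:26]))
                residues.setdefault(key, {})[name] = np.array(
                    [float(line[30:38]), float(line[38:46]), float(line[46:54])])

    violations = []
    for key, atoms in residues.items():
        for (a1, a2), ideal in EXPECTED_BONDS.items():
            if a1 in atoms and a2 in atoms:
                dist = float(np.linalg.norm(atoms[a1] - atoms[a2]))
                if abs(dist - ideal) > TOLERANCE:
                    violations.append((key, a1, a2, round(dist, 2)))
    return violations
```

`check_backbone_bonds('prediction.pdb')` returns a list of (residue, atom1, atom2, distance) tuples; an empty list means no backbone bond was flagged.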

Using MolProbity

For comprehensive geometry validation, use MolProbity (http://molprobity.biochem.duke.edu/). Upload your PDB file to get detailed reports on Ramachandran outliers, rotamer analysis, clashes, and an overall quality score. Aim for a MolProbity score below 2.0 (lower is better; the score is scaled like crystallographic resolution, so a lower score indicates better-than-average geometry).

Tier 3: Biological Plausibility

Beyond geometric correctness, validate that the structure makes biological sense based on experimental data and known protein biology.

3.1 Comparison with Homologous Structures

If experimental structures of homologous proteins exist, compare them to your prediction. High-identity homologs (>50% sequence identity) should have very similar structures.

python
import requests
from Bio.PDB import PDBParser, Superimposer

def validate_against_homologs(prediction_pdb, uniprot_id):
    """
    Find PDB structures of homologous proteins and compare.
    """
    # Search PDBe for structures
    url = f"https://www.ebi.ac.uk/pdbe/api/mappings/uniprot/{uniprot_id}"
    response = requests.get(url)

    if response.status_code != 200:
        return "No homologous structures found"

    data = response.json()
    pdb_entries = list(data[uniprot_id]['PDB'].keys())[:5]  # Top 5

    parser = PDBParser(QUIET=True)
    pred_structure = parser.get_structure('prediction', prediction_pdb)
    pred_ca = [atom for atom in pred_structure.get_atoms() if atom.get_name() == 'CA']

    comparisons = []

    for pdb_id in pdb_entries:
        # Download PDB structure
        pdb_url = f"https://files.rcsb.org/download/{pdb_id}.pdb"
        pdb_response = requests.get(pdb_url)

        if pdb_response.status_code == 200:
            # Save temporarily and parse
            with open(f'/tmp/{pdb_id}.pdb', 'w') as f:
                f.write(pdb_response.text)

            exp_structure = parser.get_structure(pdb_id, f'/tmp/{pdb_id}.pdb')
            exp_ca = [atom for atom in exp_structure.get_atoms() if atom.get_name() == 'CA']

            # Align and calculate RMSD.
            # NOTE: truncating both chains to a common length assumes identical
            # residue correspondence; for real comparisons use a sequence
            # alignment or a structure-alignment tool such as TM-align.
            sup = Superimposer()
            min_len = min(len(pred_ca), len(exp_ca))
            sup.set_atoms(pred_ca[:min_len], exp_ca[:min_len])

            comparisons.append({
                'pdb_id': pdb_id,
                'rmsd': sup.rms,
                'aligned_residues': min_len
            })

    return comparisons

# Example validation
comparisons = validate_against_homologs('prediction.pdb', 'P12345')
for comp in comparisons:
    print(f"{comp['pdb_id']}: RMSD = {comp['rmsd']:.2f}Å over {comp['aligned_residues']} residues")

    if comp['rmsd'] < 2.0:
        print("  ✓ Excellent agreement with experimental structure")
    elif comp['rmsd'] < 4.0:
        print("  ~ Moderate agreement - check divergent regions")
    else:
        print("  ⚠️  Poor agreement - prediction may be incorrect")

3.2 Active Site and Functional Region Validation

For enzymes and binding proteins, validate that functional residues are properly positioned. Catalytic triads, binding pockets, and other functional motifs should have correct geometry.

  • Catalytic residues: Check distances between active site residues match known values (typically 3-4Å for catalytic dyads/triads)
  • Binding pockets: Validate pocket volume and electrostatic properties match ligand requirements
  • Disulfide bonds: Verify S-S (SG-SG) distances are ~2.05Å for predicted disulfides
  • Metal coordination: Check geometry of metal-binding sites (histidine, cysteine, aspartate clusters)
  • Post-translational modifications: Validate that modification sites are surface-accessible
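The disulfide check in particular is easy to automate. Here is a small sketch that collects Cys SG coordinates straight from the PDB file and reports pairs within plausible bonding distance; the 2.5Å search cutoff is a deliberately loose assumption to catch slightly distorted geometry:

```python
import numpy as np

def check_disulfides(pdb_file, max_ss_dist=2.5):
    """Find Cys SG-SG pairs close enough to form a disulfide (~2.05Å S-S bond)."""
    sg = {}  # residue number -> SG coordinates
    with open(pdb_file) as f:
        for line in f:
            if (line.startswith('ATOM') and line[17:20] == 'CYS'
                    and line[12:16].strip() == 'SG'):
                sg[int(line[22:26])] = np.array(
                    [float(line[30:38]), float(line[38:46]), float(line[46:54])])

    pairs = []
    residues = sorted(sg)
    for i, r1 in enumerate(residues):
        for r2 in residues[i + 1:]:
            dist = float(np.linalg.norm(sg[r1] - sg[r2]))
            if dist <= max_ss_dist:
                pairs.append((r1, r2, round(dist, 2)))
    return pairs
```

Each reported pair should match a disulfide you expect from homologs or UniProt annotations; an expected bridge that is missing here is a red flag for the local geometry.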

3.3 Oligomeric State Validation

If your protein forms oligomers, validate the predicted complex:

python
def validate_protein_interface(pdb_file, chain_A='A', chain_B='B'):
    """
    Analyze protein-protein interface quality.
    """
    parser = PDBParser(QUIET=True)
    structure = parser.get_structure('complex', pdb_file)

    chain_a_atoms = [atom for atom in structure[0][chain_A].get_atoms()]
    chain_b_atoms = [atom for atom in structure[0][chain_B].get_atoms()]

    # Find interface residues (within 5Å of other chain)
    ns_a = NeighborSearch(chain_a_atoms)
    ns_b = NeighborSearch(chain_b_atoms)

    interface_a = set()
    interface_b = set()

    for atom in chain_a_atoms:
        nearby = ns_b.search(atom.coord, 5.0, level='R')
        if nearby:
            interface_a.add(atom.get_parent())

    for atom in chain_b_atoms:
        nearby = ns_a.search(atom.coord, 5.0, level='R')
        if nearby:
            interface_b.add(atom.get_parent())

    interface_area = len(interface_a) + len(interface_b)

    # Typical protein interface: 1200-2000 Ų buried surface area
    # Rough estimate: ~40-70 residues in interface

    validation = {
        'interface_residues_A': len(interface_a),
        'interface_residues_B': len(interface_b),
        'total_interface': interface_area,
        'quality': 'Good' if 40 <= interface_area <= 100 else 'Suspicious'
    }

    return validation

interface = validate_protein_interface('complex.pdb')
print(f"Interface size: {interface['total_interface']} residues")
print(f"Quality assessment: {interface['quality']}")

if interface['total_interface'] < 20:
    print("⚠️  Very small interface - may not represent real biological assembly")

3.4 Disorder and Flexibility Assessment

Low pLDDT regions may represent intrinsically disordered regions (IDRs) rather than prediction errors. Validate against disorder predictors:

Cross-Validation Tools

  1. IUPred3: Predict intrinsically unstructured regions from sequence
  2. AlphaFold2 disorder prediction: Low pLDDT regions that align with IUPred predictions are likely true disorder
  3. PONDR: Alternative disorder predictor for validation
  4. MobiDB: Database of known disorder regions in homologs
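Before cross-referencing with IUPred3 or PONDR, it helps to extract the candidate regions explicitly. A minimal sketch that finds contiguous low-pLDDT stretches; the cutoff of 50 and minimum length of 10 residues are illustrative choices, not fixed standards:

```python
def low_confidence_segments(plddt_scores, cutoff=50, min_length=10):
    """Find contiguous low-pLDDT stretches that are candidate disordered regions."""
    segments = []
    start = None
    for i, score in enumerate(plddt_scores):
        if score < cutoff and start is None:
            start = i  # segment opens
        elif score >= cutoff and start is not None:
            if i - start >= min_length:
                segments.append((start + 1, i))  # 1-based residue numbering
            start = None
    # Handle a segment that runs to the C-terminus
    if start is not None and len(plddt_scores) - start >= min_length:
        segments.append((start + 1, len(plddt_scores)))
    return segments
```

Segments that overlap IUPred3-predicted disorder are likely genuine IDRs; low-pLDDT segments with no disorder support deserve closer scrutiny as possible prediction failures.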

Tier 4: Experimental Data Integration

The ultimate validation comes from comparing predictions against experimental data. Even limited experimental information can validate or invalidate predictions.

4.1 Mutagenesis Data

Published mutation studies provide powerful validation:

  • Loss-of-function mutations: Should be in or near active sites, binding interfaces, or structural cores
  • Neutral mutations: Should be surface-exposed and away from functional regions
  • Destabilizing mutations: Should disrupt hydrophobic cores or key salt bridges
  • Gain-of-function mutations: Validate that the structural context supports the proposed mechanism
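A quick structural sanity check along these lines is to measure how close each characterized mutation site sits to known functional residues. The sketch below uses Cα-Cα distances parsed directly from the PDB file; the single-chain assumption and the idea of passing functional sites as a plain residue list are simplifications:

```python
import numpy as np

def ca_coords(pdb_file, chain='A'):
    """Read Cα coordinates for one chain, keyed by residue number."""
    coords = {}
    with open(pdb_file) as f:
        for line in f:
            if (line.startswith('ATOM') and line[12:16].strip() == 'CA'
                    and line[21] == chain):
                coords[int(line[22:26])] = np.array(
                    [float(line[30:38]), float(line[38:46]), float(line[46:54])])
    return coords

def mutation_site_distances(pdb_file, mutation_sites, functional_sites, chain='A'):
    """Cα-Cα distance from each mutation site to its nearest functional residue."""
    coords = ca_coords(pdb_file, chain)
    report = {}
    for m in mutation_sites:
        dists = [float(np.linalg.norm(coords[m] - coords[s]))
                 for s in functional_sites if m in coords and s in coords]
        report[m] = round(min(dists), 1) if dists else None
    return report
```

Loss-of-function sites clustering near functional residues supports the model; neutral mutations mapping into the predicted active site would argue against it.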

4.2 Cross-Linking Mass Spectrometry (XL-MS)

XL-MS provides distance constraints between lysine pairs. Predicted structures must satisfy these experimental distances.

python
def validate_crosslinks(pdb_file, crosslinks):
    """
    Validate structure against XL-MS distance constraints.
    Typical lysine cross-linkers span ~25-30Å (BS3, DSS).

    crosslinks: List of tuples [(res1, res2, max_distance), ...]
    """
    parser = PDBParser(QUIET=True)
    structure = parser.get_structure('protein', pdb_file)

    violations = []
    satisfied = []

    for res1, res2, max_dist in crosslinks:
        # Get CA atoms of residues
        try:
            atom1 = structure[0]['A'][res1]['CA']
            atom2 = structure[0]['A'][res2]['CA']

            distance = atom1 - atom2

            if distance <= max_dist:
                satisfied.append((res1, res2, distance, max_dist))
            else:
                violations.append((res1, res2, distance, max_dist))
        except KeyError:
            print(f"Residue {res1} or {res2} not found in structure")

    return satisfied, violations

# Example: Validate against published XL-MS data
xl_data = [
    (45, 123, 30.0),   # K45-K123: BS3 cross-link
    (78, 201, 30.0),   # K78-K201: BS3 cross-link
    (134, 156, 25.0),  # K134-K156: DSS cross-link
]

satisfied, violations = validate_crosslinks('prediction.pdb', xl_data)

print(f"Cross-link validation: {len(satisfied)}/{len(xl_data)} satisfied")
if violations:
    print(f"⚠️  {len(violations)} violations detected:")
    for res1, res2, dist, max_dist in violations:
        print(f"  K{res1}-K{res2}: {dist:.1f}Å (max: {max_dist}Å)")

4.3 Small-Angle X-ray Scattering (SAXS)

SAXS provides overall shape information. Validate that your predicted structure's radius of gyration (Rg) and maximum dimension (Dmax) match experimental values.
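Both quantities can be computed directly from the model for comparison. A minimal unweighted sketch follows; note that experimental SAXS Rg is typically slightly larger than the model value because it includes the hydration shell, and mass weighting is a further refinement:

```python
import numpy as np

def atom_coords(pdb_file):
    """Read all ATOM coordinates from a PDB file into an (N, 3) array."""
    coords = []
    with open(pdb_file) as f:
        for line in f:
            if line.startswith('ATOM'):
                coords.append([float(line[30:38]),
                               float(line[38:46]),
                               float(line[46:54])])
    return np.array(coords)

def radius_of_gyration(pdb_file):
    """Unweighted Rg about the geometric center."""
    coords = atom_coords(pdb_file)
    center = coords.mean(axis=0)
    return float(np.sqrt(np.mean(np.sum((coords - center) ** 2, axis=1))))

def max_dimension(pdb_file):
    """Approximate Dmax as the largest pairwise interatomic distance.

    O(N^2) memory; fine for single proteins, not huge assemblies.
    """
    coords = atom_coords(pdb_file)
    diffs = coords[:, None, :] - coords[None, :, :]
    return float(np.sqrt((diffs ** 2).sum(axis=-1)).max())
```

Large discrepancies between these values and the experimental Rg/Dmax suggest a wrong oligomeric state, a misplaced domain, or substantial disorder not captured by the model.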

4.4 Hydrogen-Deuterium Exchange (HDX)

HDX-MS reports solvent accessibility and dynamics. Fast-exchanging regions should be surface-exposed; slow-exchanging regions should be buried in the core or involved in stable secondary structures.

Complete Validation Checklist

Use this systematic checklist for every predicted structure:

1. Model Confidence

  • Calculate mean pLDDT score (>70 for reliable predictions)
  • Identify low-confidence regions (<50 pLDDT)
  • Analyze PAE matrix for domain organization
  • Check consistency across multiple AlphaFold2 models (RMSD <2Å)
  • For multimers: validate ipTM score (>0.5 for confident interfaces)
2. Structural Geometry

  • Ramachandran plot: >98% residues in favored regions
  • No serious steric clashes (<2.0Å)
  • Bond lengths and angles within expected ranges
  • Rotamer outliers <2%
  • MolProbity score <2.0, lower is better (if available)
3. Biological Plausibility

  • Compare with homologous structures (RMSD <4Å for close homologs)
  • Validate active site geometry (if applicable)
  • Check oligomeric state makes biological sense
  • Verify disulfide bonds and metal coordination sites
  • Confirm low pLDDT regions match disorder predictions
4. Experimental Validation

  • Cross-reference with mutagenesis data
  • Validate XL-MS distance constraints (if available)
  • Compare Rg and Dmax with SAXS data (if available)
  • Check HDX protection patterns (if available)
  • Verify binding site predictions with biochemical data

Making Decisions: When to Trust Your Prediction

High Confidence

  • Mean pLDDT > 80
  • Model RMSD < 1.5Å
  • Ramachandran > 98% favored
  • Matches homolog (if available)
  • No serious geometry errors

Decision: Suitable for structure-based drug design, detailed mechanistic studies, and publication.

Medium Confidence

  • Mean pLDDT 60-80
  • Some model disagreement
  • Minor geometry issues
  • Partial match with homologs
  • Localized low confidence

Decision: Use with caution. Suitable for hypothesis generation and guiding experiments; restrict critical structural analysis to the high-confidence regions.

Low Confidence

  • Mean pLDDT < 60
  • High model disagreement
  • Ramachandran outliers
  • Poor homolog match
  • Widespread low pLDDT

Decision: Do not use for detailed analysis. May indicate intrinsic disorder, missing cofactors, or prediction failure. Consider alternative approaches.
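The three tiers above can be encoded as a simple triage helper. This is only a sketch of the headline thresholds from this section; a real decision should weigh every item in the checklist, not three numbers:

```python
def triage_prediction(mean_plddt, model_rmsd, rama_favored_pct):
    """Map headline metrics onto the confidence tiers described above.

    Thresholds follow the text: HIGH needs mean pLDDT > 80, inter-model
    RMSD < 1.5Å, and > 98% Ramachandran-favored residues.
    """
    if mean_plddt > 80 and model_rmsd < 1.5 and rama_favored_pct > 98:
        return 'HIGH'
    if mean_plddt >= 60:
        return 'MEDIUM'
    return 'LOW'
```

For example, a model with mean pLDDT 70 and inter-model RMSD 2.5Å lands in the MEDIUM tier regardless of its Ramachandran statistics.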

Common Validation Pitfalls to Avoid

Pitfall 1: Ignoring Low-Confidence Regions

Just because >70% of your protein has high pLDDT doesn't mean you can ignore the rest. Low-confidence loops might connect critical domains or contain regulatory sites. Always investigate what these regions represent biologically.

Pitfall 2: Over-Interpreting Side Chain Positions

AlphaFold2 is excellent at backbone prediction but less reliable for side chains, especially in medium-confidence regions. Don't base mutagenesis strategies solely on predicted side chain orientations unless pLDDT > 90.

Pitfall 3: Assuming High pLDDT = Functionally Relevant

AlphaFold2 predicts the most stable conformation, but proteins are dynamic. High-confidence predictions might represent inactive states, and functional conformations might have lower confidence due to flexibility.

Pitfall 4: Neglecting Oligomeric State

Monomeric predictions of oligomeric proteins can be misleading. Functional interfaces might appear as "exposed" surfaces. Always consider biological assembly when interpreting predictions.

Pitfall 5: Confirmation Bias

Don't cherry-pick validation metrics that support your hypothesis. Apply the full checklist systematically, and report all results—including contradictory findings.

Conclusion: Validation as Scientific Practice

Protein structure prediction has become remarkably accurate, but validation remains a critical scientific practice. Think of AlphaFold2 and ESMFold as sophisticated hypotheses that must be tested against physical constraints, biological knowledge, and experimental data.

A well-validated structure—even with moderate confidence—is far more valuable than a poorly validated high-confidence prediction. By systematically applying this validation checklist, you ensure that your structural insights are built on solid foundations and can withstand scientific scrutiny.

Key Takeaway

Validation is not a box to check—it's an ongoing dialogue between prediction, experiment, and biological understanding. The most impactful structural biology comes from predictions that are validated thoughtfully, questioned rigorously, and interpreted with appropriate confidence.

Next Steps

  1. Download our validation checklist template to systematically assess your predictions
  2. Explore our structure viewer to visualize confidence metrics interactively
  3. Run your own predictions with AlphaFold2 on Protogen Bio with automated validation reports
  4. Join our community forum to discuss validation strategies and challenging cases