Judging the Quality of Macromolecular Models

A Glossary of Terms from Crystallography, NMR, and Homology Modeling

Learn how to judge model quality! Use the excellent tutorial Model Validation, by Gerard Kleywegt.

Introduction

Crystallographic, NMR, homology, and other models of biomolecules are greeted by a research audience eager to use these models to interpret results of their research on molecular function. With the ready availability of thousands of models comes the need to understand how these models are obtained, and to be aware of the strengths and weaknesses of each method of "structure determination."

No one has ever seen a molecule. All models are painstaking interpretations of hard-won data. Each method of structure determination has its own criteria of progress, success, and final model quality. Often model quality varies from region to region within a model. Wise use of molecular models begins with awareness of each field's criteria of quality.

This document is a glossary of terms from macromolecular crystallography, NMR spectroscopy, and homology modeling. Understanding these terms can help you to assess the quality of the model you are using, and to make full use of all information obtained from structure determination. In the vivid model that floats before us on a computer screen, there may be more or less than meets the eye.

By its nature, a glossary contains definitions out of context. For a more complete discussion of all terms in this list, in the broader context of macromolecular structure determination, look up the terms in the index of a crystallography text and read all associated material. Most of these definitions are taken from Gale Rhodes's book, Crystallography Made Crystal Clear: A Guide for Users of Macromolecular Models (Third Edition, Academic Press, 2006). The CMCC Home Page, a web supplement for this book, contains links to sources of the book, and to many tools for those who explore macromolecular models.

HELP! My goal is to make this page a useful and accurate resource for users of macromolecular models. To that end, I invite criticisms and suggestions about this page from crystallographers, NMR spectroscopists, homology modelers, and theorists.

NOTE: You can obtain models shown in figures from the Protein Data Bank, using the four-character PDB codes provided in figure legends.

Glossary

Accuracy (homology modeling)
Asymmetric unit (crystallography)
B-factor (crystallography)
Completeness
Confidence factor (homology modeling)
Constraint (crystallography)
Correctness (homology modeling)
Data (crystallography)
Data (NMR)
Disordered regions
Electron-density map (crystallography)
Ensemble of models (NMR)
Experimental model
Free R-factor (crystallography)
Functional unit
Luzzati plot (crystallography)
Model versus Structure
Model B-factor (homology modeling)
Occupancy (crystallography)
Precision of atomic positions (crystallography)
R-factor (crystallography)
R-factor, real space (crystallography)
R_free (crystallography)
R_merge (crystallography)
R_symm (crystallography)
Ramachandran diagram
Reasonableness (homology modeling)
Real-space R factor (crystallography)
Real-space correlation coefficient (crystallography)
Redundancy (crystallography)
Refinement (crystallography)
Refinement (NMR)
Resolution (crystallography)
Reflections, number of (crystallography)
Reflections, unique (crystallography)
Restraint (crystallography)
Restraint (NMR)
Restraints per residue (NMR)
Rms deviations from average ensemble coordinate positions (NMR)
Rms deviations from ideal values (crystallography)
RSR (crystallography)
RSCC (crystallography)
Space group (crystallography)
Structure versus Model
Structurally averaged/energy minimized model (NMR)
Structural parameters
Temperature factor (crystallography)
Theoretical model
Threading energy (homology modeling)
Unexplained density (crystallography)
Unit cell dimensions (crystallography)
Water, number of (crystallography)
Which NMR model to use (NMR)

Accuracy (homology modeling)

The accuracy of a homology model refers to how well it fits the templates on which it was built. The rms deviation of a model from its templates should be very small in the core region. If it is not, we say that the model is inaccurate. An inaccurate model implies that the modeling process did not go well. Perhaps the modeling program simply could not come up with a model that aligns well with the coordinates of the templates. Perhaps during energy minimization, coordinates of the model drifted away from the template coordinates. Another possibility is a poor choice of templates. For instance, occasionally a crystallographic model is distorted by crystal contacts, or an NMR model is distorted by the binding of a salt ion. If the homology modeler unwittingly uses such models as templates, energy refinement in the absence of the distorting effect will introduce inaccuracy, as defined here, while perhaps actually improving the model. A good rule of thumb is that if the templates share 30 to 50% sequence identity with the target, rms differences between final positions of alpha carbons in the model and those of corresponding atoms in the templates should be less than 1.5 Å. But it is also essential to look at the template structures and make sure that they are really appropriate. For example, an NMR structure of an enzyme-cofactor complex is likely to be a poor template for a homologous enzyme in the absence of the cofactor.

The rms deviations only apply to corresponding atoms, and templates and targets often do not correspond well outside of core regions. Loop regions often cannot be included in assessment of accuracy because there is nothing to compare them to. In these regions, we should demand correctness, that is, the lack of unfavorable contacts or conformations. But beyond this kind of correctness, our criteria for loop accuracy are limited. If surface loops contain residues known to be important to function, we must proceed with great caution in using homology models to explain function.
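As a rough illustration of the rule of thumb above, the following Python sketch computes the rms deviation between paired alpha-carbon positions and compares it with the 1.5 Å threshold. It assumes that the model and template have already been superimposed and that only corresponding core residues are supplied. The function name `ca_rmsd` and the coordinates are hypothetical, chosen only for illustration; they do not come from any real PDB entry or modeling program.

```python
import math


def ca_rmsd(model_ca, template_ca):
    """Rms deviation (in Å) between paired alpha-carbon positions.

    Assumes both lists hold (x, y, z) tuples for corresponding residues only,
    and that the two sets of coordinates have already been superimposed.
    """
    if len(model_ca) != len(template_ca):
        raise ValueError("coordinate lists must pair up one-to-one")
    if not model_ca:
        raise ValueError("need at least one atom pair")
    total = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(model_ca, template_ca):
        # Squared distance between the two corresponding atoms.
        total += (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return math.sqrt(total / len(model_ca))


# Hypothetical coordinates for three core residues (illustrative values only).
template = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model = [(0.0, 1.0, 0.0), (3.8, 0.0, 1.0), (6.6, 0.0, 0.0)]

rmsd = ca_rmsd(model, template)
within_rule_of_thumb = rmsd < 1.5  # 1.5 Å threshold from the rule of thumb
print(f"C-alpha rmsd = {rmsd:.2f} Å; within rule of thumb: {within_rule_of_thumb}")
```

Loop residues that have no counterpart in the template are simply left out of both lists, which is why a low rmsd from a calculation like this says nothing about the quality of those regions.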

Introduction

Glossary

Accuracy (homology modeling)

Asymmetric unit (crystallography)

Completeness, %

Confidence factor or "Model B-factor" (homology modeling)

Constraint (crystallography)

Correctness (homology modeling)

Data (crystallography)

Data (NMR)

Disordered regions (crystallography and NMR)

Electron-density map (crystallography)

Ensemble of models (NMR)

Experimental model

Free R-factor (crystallography)

Functional unit

Luzzati plot (crystallography)

Model versus structure

Occupancy (crystallography)

Precision of atomic positions (crystallography)

Precision of atomic positions (NMR)

R-factor (crystallography)

R-factor, real space, or RSR (crystallography)

Rmerge

Rsymm

Ramachandran diagram

Reasonableness (homology modeling)

Redundancy

Refinement (crystallography)

Refinement (NMR)

Reflections, unique

Reflections, number of

Resolution (crystallography)

Restraint (crystallography)

Restraint (NMR)

Restraints per residue (NMR)

Rms deviations (rmsd) from average ensemble coordinate positions (NMR)

Rms deviations (rmsd) from ideal values (crystallography)

Space Group

Structure vs. model

Structurally averaged/energy minimized model (NMR)

Structural parameters

Temperature factor (crystallography)

Theoretical model

Threading energy (homology modeling)

Unexplained density (crystallography)

Unit cell dimensions (crystallography)

Water, number of in model (crystallography)

Which model to use (NMR)

Summary

Molecular graphics produced with Swiss-PdbViewer.

Learn how to obtain and use Swiss-PdbViewer: SPdbV Tutorial.
