Technical Note

How Structures Are "Determined" By Homology Modeling

If you know just a little bit about where homology models come from, you'll be better equipped to use them wisely. This page provides that little bit.

You can make homology models yourself (see ADVANCED TUTORIALS in the Contents frame), or you can request them from servers like SwissModel or ModBase. Between these two extremes, you can request a model but supply your own choice of templates, or your own alignment of the target sequence with specified templates. Whether done by you or by a server, homology modeling entails these steps:

  1. Obtain a sequence of the protein to be modeled (the target, a protein whose structure you would like to know), usually from a genome-sequence data bank like NCBI or ExPASy.
  2. Search PDB or ExPDB (using such sequence-search tools as protein BLAST) for experimental (X-ray or NMR) models with sufficient sequence similarity to the target. These models are called templates. A homology model can be based on one template, but results are likely to be better if there are several templates, because different templates might fit the target sequence better in different regions.
  3. Superimpose the templates three-dimensionally, usually by some kind of least-squares procedure.
  4. Based on the best three-dimensional superposition, obtain a high-quality alignment of the template sequences, which is called a structural alignment.
  5. Align the target sequence with the structural alignment of the templates.
  6. Based on this alignment, thread the target sequence onto the templates to produce a raw homology model, in which each aligned region of the target adopts the consensus conformation of the corresponding regions in the templates.
  7. For parts of the target that do not match any of the templates well, search databases of protein loops and, where possible, build loops similar to those found there.
  8. Build remaining poorly aligning loops with reasonable conformations; these loops are primarily guesswork.
  9. Optimize the model by minimizing its energy, using a tool like GROMOS.
  10. Annotate the model by assigning numbers according to the level of agreement of the model with the template(s). Low numbers mean good agreement of the final model with a template; high numbers mean that the region did not fit a template well. Enter these numbers into the B-factor column of the PDB-format file of the new model. In DeepView, the command Color: B-Factor will reveal this annotation, with blue signifying good agreement with templates, and red indicating regions of the model that deviate significantly from any of the templates, and so were modeled by loop building or simply given reasonable conformations.
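
To make the annotation in step 10 concrete, here is a minimal Python sketch of writing per-residue agreement scores into the B-factor field of PDB-format ATOM records (columns 61-66 of each record). The function names, the (chain, residue number) lookup, and the default score of 99.0 for unscored residues are assumptions made for illustration; this is not how SwissModel or DeepView actually compute or store their values.

```python
def set_bfactor(pdb_line, score):
    """Return an ATOM/HETATM record with its B-factor field set to score.

    In PDB format the temperature factor occupies columns 61-66,
    which is the slice [60:66] when counting from zero.
    """
    if not pdb_line.startswith(("ATOM", "HETATM")):
        return pdb_line
    field = f"{score:6.2f}"
    if len(field) != 6:
        raise ValueError(f"score {score} does not fit in six columns")
    # Pad short records so that columns 61-66 always exist.
    padded = pdb_line.rstrip("\n").ljust(80)
    return padded[:60] + field + padded[66:]


def annotate(pdb_lines, residue_scores, default=99.0):
    """Write one score per residue into every atom of that residue.

    residue_scores maps (chain_id, residue_number) to a number; by the
    convention described above, low means good agreement with a template.
    Residues missing from the map get the (hypothetical) default score.
    Insertion codes are ignored in this sketch.
    """
    annotated = []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            chain_id = line[21]              # column 22
            residue_number = int(line[22:26])  # columns 23-26
            score = residue_scores.get((chain_id, residue_number), default)
            line = set_bfactor(line, score)
        annotated.append(line)
    return annotated
```

Calling annotate(lines, {("A", 1): 0.5}) would give every atom of residue 1 in chain A a B-factor of 0.50, which a viewer that colors by B-factor would then show at the "good agreement" end of its color scale.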

Homology models are not experimental models, in the sense that X-ray diffraction and NMR models are. Homology models are based on no direct experimental evidence, but merely on the assumption that similar sequences have similar conformations. Homology modeling provides a rough glimpse of structure where experimental methods have not yet succeeded.
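
The "similar sequences" in this assumption are commonly quantified as percent identity between the aligned target and template sequences. The following Python sketch shows one simple way to compute it. The function name and the choice to count only gap-free columns in the denominator are assumptions for illustration; other definitions of percent identity exist, and modeling servers may use a different one.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of gap-free alignment columns in which the residues match.

    Both arguments are rows of the same alignment, so they must have
    equal length, with the gap character marking insertions/deletions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if res_a.upper() == res_b.upper():
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared
```

For example, percent_identity("ACDE-F", "ACDQGF") compares five gap-free columns, four of which match.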

For a good brief discussion of the meaning of such terms as correctness, accuracy, and error in homology models, click HERE. The linked page is a chapter in an old but still useful broad introduction to protein models.
