Experimental Protein Structures: The Electron Density Map and Resolution

I deliberately use the term “experimental protein structure” to distinguish it from a “homology model”. It is important to keep in mind that an experimental structure can always be verified against the experimental data from which it was derived. A homology model can also be tested experimentally: one could, for example, study the role of individual residues in activity by introducing mutations into the protein. This may answer some questions, but it will not tell us much about the correctness of the overall structure of the protein or about the presence of errors in particular regions of the homology model.

Assuming that the experimental data do not contain any major errors, the structural model should be correct and accurate within the limits of accuracy of the experiment. Another important factor is, of course, the way we handle the structure: we can make mistakes while building the model into the electron density and refining it, but such mistakes may be revealed by the Protein Data Bank when the structure is submitted. Several criteria need to be considered when assessing the quality of an experimental structure, and some of them also apply to homology models. However, we need to remember that a homology model will never be better than its template, the experimental structure used to create it. For this reason, if during template identification you are given a choice of several candidates that are all different PDB entries of the same protein, you need to examine the quality of each candidate carefully before starting a homology modeling project. Since most of the protein structures in the PDB have been determined by protein crystallography, we will focus our discussion on the quality parameters associated with X-ray structures:

  • Resolution
  • Refinement (the R-factor)
  • The B-factors (temperature factors)
  • Model geometry (bond distances, bond angles, Ramachandran plot)

Resolution
The primary factor determining the accuracy of an experimental protein structure is the resolution of the X-ray data. The term “resolution” essentially refers to the amount of information obtained from a crystal in a protein crystallography experiment, which for given crystal lattice dimensions can roughly be described by the total number of unique diffraction intensities collected (unique because the same reflection or its symmetry-related mate may be measured several times during the experiment). The figure below illustrates this principle with diffraction images from two different lysozyme crystals. The crystal on the left was rather small and did not diffract well, so the diffraction spots fade quickly on moving from the center of the image towards its edges. The image on the right comes from a better crystal: the diffraction spots extend much further towards the edge of the image; in other words, this crystal gives better resolution.

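[Figure: diffraction images from two lysozyme crystals. Left: small, weakly diffracting crystal. Right: better-diffracting crystal.]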
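The link between how far the spots extend and the resolution can be sketched with Bragg's law, λ = 2d sin θ. The snippet below is a minimal illustration, not part of any crystallography package: the function names are hypothetical, and it assumes a flat detector perpendicular to the beam with the direct beam hitting the center of the image. The second function uses the standard estimate that the number of reciprocal lattice points inside the resolution sphere is about (4π/3)·V/d³, where V is the unit cell volume; this counts all lattice points, not only the symmetry-unique reflections.

```python
import math


def resolution_at_radius(radius_mm, distance_mm, wavelength_A):
    """Resolution d (in Å) of a spot at a given distance from the beam center.

    Assumes a flat detector perpendicular to the beam (an assumption of
    this sketch). The scattering angle 2*theta follows from the geometry,
    and d follows from Bragg's law: wavelength = 2 * d * sin(theta).
    """
    if radius_mm <= 0 or distance_mm <= 0 or wavelength_A <= 0:
        raise ValueError("radius, distance and wavelength must be positive")
    two_theta = math.atan2(radius_mm, distance_mm)
    return wavelength_A / (2.0 * math.sin(two_theta / 2.0))


def approx_lattice_points(cell_volume_A3, d_min_A):
    """Rough number of reciprocal lattice points out to resolution d_min.

    Volume of a sphere of radius 1/d_min divided by the reciprocal cell
    volume 1/V. Symmetry and Friedel mates are NOT removed here.
    """
    if cell_volume_A3 <= 0 or d_min_A <= 0:
        raise ValueError("cell volume and d_min must be positive")
    return (4.0 * math.pi / 3.0) * cell_volume_A3 / d_min_A ** 3
```

Spots further from the center correspond to smaller values of d, that is, to higher resolution. Because the count grows as 1/d³, going from 2.6 Å to 1.2 Å data on the same cell increases the number of lattice points by a factor of (2.6/1.2)³, roughly ten.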

The difference in resolution will be reflected in the quality of the electron density maps, into which the model of the three-dimensional protein structure is built. A higher resolution of the data means higher resolution of the electron density maps, which in turn means higher accuracy of the positions of the atoms in the structure. This is illustrated in the following figure:

[Figure: three electron density panels, left to right]

PDB ID 2H1W, 2.6 Å resolution
PDB ID 2AC4, 2.1 Å resolution
PDB ID 2H1W, 1.2 Å resolution

The electron density maps in the figures above were generated using the Electron Density Server (EDS). The server offers a quick way to examine the electron density of a PDB entry. This is especially important when some parts of the structure are poorly ordered (due to flexibility), which results in high temperature factors (B-factors) and poor electron density maps. It is important to be aware of such potential problems when choosing a template for homology modeling.
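As a rough complement to inspecting the maps, the B-factors stored in a PDB file can be averaged per residue to spot poorly ordered regions. The sketch below is only an illustration: the function names are hypothetical, it reads the fixed-column ATOM records of the legacy PDB format (B-factor in columns 61-66), and the threshold is an arbitrary value the user must choose, since no universal cutoff exists.

```python
from collections import defaultdict


def mean_b_per_residue(pdb_lines):
    """Return {(chain, residue number, residue name): mean B-factor}.

    Only ATOM records are read; HETATM records are ignored. Column
    positions follow the legacy fixed-width PDB format.
    """
    totals = defaultdict(lambda: [0.0, 0])  # key -> [sum of B, atom count]
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        chain = line[21]
        res_seq = int(line[22:26])
        res_name = line[17:20].strip()
        b_factor = float(line[60:66])
        entry = totals[(chain, res_seq, res_name)]
        entry[0] += b_factor
        entry[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


def flag_disordered(mean_b, threshold):
    """List residues whose mean B-factor exceeds a user-chosen threshold."""
    return sorted(key for key, b in mean_b.items() if b > threshold)
```

Residues flagged this way are candidates for a closer look at the electron density, not proof of an error in the model.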

In the figures you can see residue Trp147 of Bacillus subtilis ferrochelatase. The mesh represents the electron density into which the model was built, while the model is shown in ball-and-stick representation. The major difference between the three pictures is the resolution of the data. From left to right: 2.6 Å (low resolution), 2.1 Å (medium resolution) and 1.2 Å (high resolution). The electron density map for the side chain of Trp147 becomes progressively better resolved as we move from low to high resolution. For example, the hole in the center of the aromatic ring is visible at 1.2 Å resolution, but not at lower resolution. You may even notice the difference between the 2.1 Å and 2.6 Å maps: at 2.1 Å the shape of the side chain is much more pronounced than at 2.6 Å. As a result, at higher resolution the positions of the atoms (shown as spheres) are much better defined than at low resolution, giving a much more accurate structure. When several PDB entries are possible templates for homology modeling, one would therefore choose the structure with the highest resolution, since it will most probably be the most accurate.
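The selection rule above can be written down in a few lines. This is a minimal sketch with hypothetical names (`Candidate`, `best_template`); it ranks candidates by resolution only and ignores the other quality criteria (R-factor, B-factors, geometry), which should still be checked. Note that "highest resolution" means the smallest value in Å.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    pdb_id: str
    resolution_A: float  # resolution in Å; smaller means higher resolution


def best_template(candidates):
    """Return the candidate with the highest resolution (smallest Å value)."""
    if not candidates:
        raise ValueError("no template candidates given")
    return min(candidates, key=lambda c: c.resolution_A)


# Two of the entries shown in the figure above.
candidates = [Candidate("2H1W", 2.6), Candidate("2AC4", 2.1)]
chosen = best_template(candidates)
print(chosen.pdb_id, chosen.resolution_A)
```

With these two entries the 2.1 Å structure is selected over the 2.6 Å one.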