Experimental Protein Structures: Diffraction, Electron Density and Resolution

I use the term "experimental protein structure" to distinguish it from a "model", which often refers to a homology model built using an experimental structure as a template. It is important to keep in mind that an experimental structure is always verified against experimental data. A homology model may also be tested against experimental data: one could, for example, study the role of various residues in activity by introducing mutations into the protein. This may answer some questions, but it will not provide accurate positions for the atoms of the structure.

The primary factor determining the accuracy of an experimental protein structure is the resolution of the X-ray data. The term "resolution" essentially refers to the amount of information obtained from a crystal in a protein crystallography experiment. For given crystal lattice dimensions, it can roughly be described by the total number of unique diffraction intensities collected (unique because the same reflection or its symmetry-related mate may be measured several times during the experiment). The figure below illustrates this principle with diffraction images collected from two different lysozyme crystals. The crystal on the left was rather small and did not diffract well, so the diffraction spots fade out quickly when moving from the center of the image towards its edges. The image on the right comes from a better crystal: the diffraction spots extend much further towards the edge of the image; in other words, it gives better resolution.
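
The link between a spot's distance from the image center and the resolution it represents follows from Bragg's law, λ = 2d sin θ. The sketch below is only an illustration: it assumes a flat detector perpendicular to the beam, and the wavelength, detector distance and spot radii are hypothetical values, not taken from the images discussed here.

```python
import math

def resolution_at_radius(radius_mm, distance_mm, wavelength_A):
    """Resolution d (in Å) of a spot at a given radius from the beam center.

    Assumes a flat detector perpendicular to the incident beam.
    """
    two_theta = math.atan2(radius_mm, distance_mm)  # scattering angle 2θ
    # Bragg's law: λ = 2 d sin θ  ->  d = λ / (2 sin θ)
    return wavelength_A / (2.0 * math.sin(two_theta / 2.0))

# Hypothetical setup: Cu K-alpha wavelength, detector 150 mm from the crystal.
for radius in (30.0, 60.0, 90.0):
    d = resolution_at_radius(radius, 150.0, 1.5418)
    print(f"spot at {radius:5.1f} mm from the center -> d = {d:.2f} Å")
```

Spots further from the center correspond to smaller d values, which is why a pattern that extends towards the edge of the image is said to have higher resolution.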

Provided that the experimental data do not contain any major errors, the structural model should be correct and accurate within the limits of accuracy of the experimental method. Another important factor is, of course, the way we handle the structure: people can make mistakes while building the model into the electron density and refining it. Fortunately, such mistakes are often revealed by the Protein Data Bank when the structure is submitted. When assessing the quality of a structure, several criteria need to be considered. Some of these may also be applied to the quality assessment of homology models. Here we also need to remember that a homology model will never be better than its template, the experimental structure used to create it. For this reason, if template identification gives us a choice of several candidates (different PDB entries of the same protein), we need to examine the quality of each candidate carefully before starting a homology modeling project. Since most of the protein structures in the PDB have been determined by protein crystallography, we will focus our discussion on the quality parameters associated with X-ray structures:

  • Resolution
  • Refinement (the R-factor)
  • The B-factors (temperature factors)
  • Model geometry (bond distances, bond angles, Ramachandran plot)

The image below shows two diffraction images of the same protein at different resolutions:

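[Figure: two diffraction images of the same protein; lower resolution on the left, higher resolution on the right]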

Higher resolution essentially means more information for calculating the electron density map. In the image above, the higher-resolution diffraction pattern on the right has a larger number of spots (reflections), extending towards the edge of the image.
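
To get a feeling for how quickly the amount of information grows with resolution, one can count the reciprocal-lattice points inside a sphere of radius 1/d and divide by the order of the Laue group. The sketch below is a rough estimate for a primitive cell only; the cell dimensions and the Laue group order are illustrative assumptions, not values taken from the structures discussed in this text.

```python
import math

def approx_unique_reflections(cell_volume_A3, d_min_A, laue_order=2):
    """Rough number of unique reflections out to resolution d_min (in Å).

    Counts reciprocal-lattice points of a primitive cell inside a sphere of
    radius 1/d_min and divides by the order of the Laue group.
    laue_order=2 corresponds to Friedel symmetry only (triclinic case).
    """
    total = 4.0 * math.pi * cell_volume_A3 / (3.0 * d_min_A ** 3)
    return total / laue_order

# Illustrative tetragonal cell (assumed): a = b = 79 Å, c = 38 Å,
# Laue group 4/mmm, which has order 16.
volume = 79.0 * 79.0 * 38.0
for d_min in (2.6, 2.1, 1.2):
    n = approx_unique_reflections(volume, d_min, laue_order=16)
    print(f"d_min = {d_min:.1f} Å -> roughly {n:,.0f} unique reflections")
```

Because the count scales with 1/d³, halving d (for example going from 2.4 Å to 1.2 Å) gives about eight times as many unique reflections for the same crystal form.
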
Higher resolution will provide higher quality electron density maps, helping to position the model atoms with higher accuracy during model building. This is illustrated in the following figure:

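[Figure: electron density around Trp147 at 2.6 Å resolution]

[Figure: electron density around Trp147 at 2.1 Å resolution]

[Figure: electron density around Trp147 at 1.2 Å resolution]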

The electron density in the figures above was generated using the Electron Density Server (EDS). The server provides a valuable opportunity to quickly examine the electron density of a PDB entry. This is especially important when some parts of the structure are poorly ordered (due to flexibility), resulting in high temperature factors (B-factors) and poor electron density. It is important to be aware of such potential problems when choosing a template for homology modeling.
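
One simple way to spot poorly ordered regions in a candidate template is to average the B-factors per residue from the ATOM records of a PDB-format file, where the B-factor occupies columns 61-66. The sketch below is a minimal illustration: the helper `_demo_atom` and its records are made up for the example, and `rms_displacement` uses the standard isotropic relation B = 8π²⟨u²⟩.

```python
import math
from collections import defaultdict

def mean_b_per_residue(pdb_lines):
    """Average B-factor per residue, keyed by (chain, residue number, residue name)."""
    sums = defaultdict(lambda: [0.0, 0])
    for line in pdb_lines:
        if line.startswith("ATOM"):
            key = (line[21], int(line[22:26]), line[17:20])
            sums[key][0] += float(line[60:66])  # B-factor, columns 61-66
            sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}

def rms_displacement(b_factor):
    """Root-mean-square atomic displacement (Å) from B = 8 * pi^2 * <u^2>."""
    return math.sqrt(b_factor / (8.0 * math.pi ** 2))

def _demo_atom(serial, name, res, chain, resseq, b):
    """Build a made-up, fixed-column ATOM record (coordinates set to zero)."""
    return (f"ATOM  {serial:>5} {name:<4} {res:>3} {chain}{resseq:>4}    "
            f"{0.0:>8.3f}{0.0:>8.3f}{0.0:>8.3f}{1.0:>6.2f}{b:>6.2f}")

# Hypothetical records, for illustration only.
demo = [
    _demo_atom(1, "N", "TRP", "A", 147, 20.0),
    _demo_atom(2, "CA", "TRP", "A", 147, 30.0),
    _demo_atom(3, "N", "GLY", "A", 148, 60.0),
]
for residue, b in sorted(mean_b_per_residue(demo).items()):
    print(residue, f"mean B = {b:.1f} Å^2, rms displacement = {rms_displacement(b):.2f} Å")
```

Residues whose mean B-factor is well above the rest of the chain are the ones worth checking in the electron density before relying on them in a template.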

In the figures you can see residue Trp147 of Bacillus subtilis ferrochelatase. The mesh represents the electron density into which the model was built, while the model is shown in ball-and-stick representation. The major difference between the three pictures is the resolution of the data. From left to right: 2.6 Å (low resolution), 2.1 Å (medium resolution) and 1.2 Å (high resolution). This illustrates how the electron density map for the side chain of Trp147 becomes better and better resolved as we move from low to high resolution. For example, the hole in the center of the aromatic ring is visible at 1.2 Å resolution, but not at lower resolution. You may even notice the difference between the 2.1 Å and 2.6 Å maps: at 2.1 Å the shape of the side chain is much more pronounced than at 2.6 Å. As a result, at higher resolution the positions of the atoms (shown as spheres) are much better defined than at low resolution, giving a much more accurate structure. When several PDB entries are available as possible templates for homology modeling, one would normally choose the structure with the highest resolution, since it will most probably be the most accurate.
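
When comparing several candidate entries, the resolution can be read from the `REMARK   2 RESOLUTION.` line of a PDB-format file. The following sketch picks the candidate with the highest resolution (the smallest value in Å). The entry names and header lines are hypothetical, and in practice resolution should be weighed together with the other quality criteria listed earlier.

```python
def parse_resolution(pdb_lines):
    """Return the resolution in Å from a PDB-format header, or None if not given."""
    for line in pdb_lines:
        if line.startswith("REMARK   2 RESOLUTION."):
            fields = line.split()
            try:
                return float(fields[3])
            except (IndexError, ValueError):
                return None  # e.g. "NOT APPLICABLE." for non-diffraction entries
    return None

def best_template(candidates):
    """Name of the candidate with the smallest resolution value, or None."""
    resolved = {}
    for name, lines in candidates.items():
        resolution = parse_resolution(lines)
        if resolution is not None:
            resolved[name] = resolution
    return min(resolved, key=resolved.get) if resolved else None

# Hypothetical candidate entries, for illustration only.
candidates = {
    "entry_a": ["REMARK   2 RESOLUTION.    2.60 ANGSTROMS."],
    "entry_b": ["REMARK   2 RESOLUTION.    2.10 ANGSTROMS."],
    "entry_c": ["REMARK   2 RESOLUTION.    1.20 ANGSTROMS."],
    "entry_d": ["REMARK   2 RESOLUTION. NOT APPLICABLE."],
}
print(best_template(candidates))
```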