Protein Crystallography: X-ray Diffraction and Data Collection

Historical outline
The method of protein crystallography originates from the discovery of X-rays by Wilhelm Conrad Röntgen and the subsequent work of Max von Laue, who was the first to observe diffraction of X-rays, which revealed their wave nature. These discoveries were followed by the experiments of the Braggs (father and son), who showed that X-ray diffraction could be used to determine the atomic structure of matter. However, the world had to wait another 45 years before the first protein structure was determined by protein crystallography. This was the structure of myoglobin, determined by John Kendrew, who shared the Chemistry Nobel Prize of 1962 with Max Perutz for their studies of the structures of globular proteins. Since then, several other crystallographic structure determinations have been recognized with the Nobel Prize. Among these are the prizes awarded to Dorothy Hodgkin, known for the structures of vitamin B12 and insulin (Chemistry Prize of 1964); Johann Deisenhofer, Robert Huber and Hartmut Michel for the determination of the structure of the first membrane protein, the photosynthetic reaction center (Chemistry Prize of 1988); and John E. Walker for his role in the determination of the structure of ATP synthase (Chemistry Prize of 1997). More recent prizes related to protein crystallography include those awarded to Peter Agre and Roderick MacKinnon (Chemistry Prize of 2003), Roger Kornberg (Chemistry Prize of 2006), Venki Ramakrishnan, Thomas A. Steitz and Ada Yonath for the elucidation of the structure of the ribosome (Chemistry Prize of 2009), and Brian Kobilka and Robert Lefkowitz for functional and structural studies of GPCR proteins (Chemistry Prize of 2012).

Protein X-ray crystallography and NMR spectroscopy are currently the only two methods that provide atomic-resolution tertiary protein structures. However, with around 140 000 entries in the Protein Data Bank (PDB, Feb 2018), the majority of which were determined by diffraction methods, one could say that crystallography dominates the field of structural biology. Protein structure information is now widely used in many areas of science and industry, including the biotechnology and pharmaceutical industries.

[Figure: an X-ray diffraction image]

1. X-ray diffraction
As mentioned above, X-rays are electromagnetic waves of the same nature as visible light or radio waves, the only difference being their very short wavelength of around 1 Å (Ångström, which is 10⁻¹⁰ meter). For comparison, the wavelength of visible light is approximately between 400 and 700 nm (one nm is 10 Å). X-rays may be generated using various laboratory sources or at synchrotrons, where very intense and highly focused X-ray beams can be produced.

To obtain X-ray data from a crystal, the crystal needs to be placed in a monochromatic (single wavelength) X-ray beam. It is then repeatedly exposed to the beam while its orientation is changed (usually by rotation). Each exposure provides an image, similar to that shown above. Each spot on the image is a diffracted X-ray beam, which emerged from the crystal and was registered by the X-ray detector. Thousands of diffraction spots need to be collected to solve a protein structure. Depending on the type of the crystal (cell dimensions and symmetry), different strategies for data collection are followed and a different amount of data is collected. Usually the crystal is rotated in the X-ray beam one degree at a time and exposed to X-rays for a short period at each step (seconds to minutes, depending on the intensity of the X-ray source). The intensities of the recorded spots are subsequently used to calculate the electron density of the molecules within the crystal. The electron density, in turn, tells us where the atoms are located, information which can be used to build a model of the molecule or molecules in the crystal.
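To make the numbers in this section more concrete, here is a minimal Python sketch. It converts an X-ray wavelength in Å to a photon energy using the standard relation E = hc/λ, and lists the rotation steps of a simple data collection run. The function names and the 180-degree total range are illustrative assumptions, not values taken from any particular instrument or program.

```python
# Illustrative sketch only: the names and the example rotation range are assumptions.
PLANCK_EV_S = 4.135667696e-15   # Planck constant in eV*s
LIGHT_M_S = 299_792_458.0       # speed of light in m/s
ANGSTROM_M = 1e-10              # one Angstrom in meters


def photon_energy_kev(wavelength_angstrom):
    """Photon energy in keV for a wavelength given in Angstrom (E = h*c/lambda)."""
    wavelength_m = wavelength_angstrom * ANGSTROM_M
    return PLANCK_EV_S * LIGHT_M_S / wavelength_m / 1000.0


def rotation_schedule(total_range_deg, step_deg=1.0):
    """Return (start, end) angles in degrees, one pair per exposure."""
    n_images = round(total_range_deg / step_deg)
    return [(i * step_deg, (i + 1) * step_deg) for i in range(n_images)]


if __name__ == "__main__":
    # X-rays of about 1 Angstrom versus green visible light (550 nm = 5500 Angstrom)
    print(f"1 A X-ray photon:   {photon_energy_kev(1.0):.3f} keV")
    print(f"550 nm light photon: {photon_energy_kev(5500.0) * 1000:.2f} eV")

    # A hypothetical run: 180 degrees in one-degree steps
    schedule = rotation_schedule(180.0, step_deg=1.0)
    print(f"{len(schedule)} exposures, first {schedule[0]}, last {schedule[-1]}")
```

A wavelength of 1 Å corresponds to a photon energy of roughly 12.4 keV, several thousand times higher than that of visible light. The rotation range actually needed depends on the crystal symmetry, as noted above.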

In crystallographic terminology, this process is called X-ray data collection. When the X-rays hit the crystal, a phenomenon called X-ray diffraction takes place. Diffraction is a common physical phenomenon and occurs when a wave (of any nature) encounters an obstacle, which can be any material object. This results in bending of the wave around that object, also called scattering of waves. Diffraction also occurs when a wave encounters a small opening, such as a small hole or a slit, which causes the wave to spread in all directions. In practice, in both cases, the obstacle or the hole/slit starts to act as a new wave source, sending out waves with a slightly different direction of propagation than the original wave. The "new" scattered waves interact with each other, resulting in another physical phenomenon called interference, which in plain language simply means addition of waves.

X-ray diffraction is caused by the interaction of electromagnetic waves with the matter inside the crystals, and particularly with the electrons. The waves are scattered by the electrons; in other words, each electron becomes a small X-ray source of its own. Scattered waves from all the electrons within each atom are added to each other, giving diffracted waves from each atom, and so on. When the scattered waves are added, they may either reinforce or cancel each other. Those which are reinforced are registered by the X-ray detector, as in the figure above. Interestingly, we do not need X-rays to observe interference. We can, for example, go to a nearby lake, throw two stones into the water and then observe how the waves from the two stones either reinforce each other or become weaker. There are many demonstrations of wave addition on the web.
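The addition of scattered waves can also be sketched numerically. The Python example below is a simplified illustration, not the method used by any crystallographic program. It represents each wave by an amplitude and a phase, adds the waves as complex numbers, and reports the amplitude of the resulting wave. The function name and the example phase values are assumptions chosen for illustration.

```python
import cmath
import math


def resultant_amplitude(amplitudes, phases_rad):
    """Amplitude of the sum of waves with equal wavelength.

    Each wave is written as amplitude * exp(i * phase); adding the
    complex numbers is equivalent to adding the waves themselves.
    """
    total = sum(a * cmath.exp(1j * p) for a, p in zip(amplitudes, phases_rad))
    return abs(total)


if __name__ == "__main__":
    # Two waves of equal amplitude, shifted by different phase differences
    for shift_deg in (0, 90, 180):
        shift = math.radians(shift_deg)
        amp = resultant_amplitude([1.0, 1.0], [0.0, shift])
        print(f"phase difference {shift_deg:3d} deg -> amplitude {amp:.3f}")
```

Two waves in phase (0 degrees) add up to twice the amplitude, while two waves shifted by half a wavelength (180 degrees) cancel. Intermediate phase differences give intermediate amplitudes.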

Diffraction data processing
The next step in a protein crystallography project, after diffraction data collection, is the processing of the data, which is aimed at extracting the relative intensities of the diffracted X-ray beams. Several computer programs exist for this purpose, among them Mosflm (part of the CCP4 package), XDS and HKL-2000.
At some moment in the future I will try to add a discussion of the experimental techniques used in X-ray data collection.
