X-ray Crystallography & Protein Structure Determination

Historical perspective
The method of X-ray crystallography has its origins in the discovery of X-rays by Wilhelm Conrad Röntgen and in the subsequent work of Max von Laue, who was the first to observe the diffraction of X-rays by crystals, which revealed the electromagnetic wave nature of X-rays. These discoveries were followed by the experiments of the Braggs (father and son), who showed that X-ray diffraction could be used to determine the atomic structure of matter. Max von Laue and the Braggs received Nobel Prizes in 1914 and 1915, respectively.

Although X-ray crystallography found its way into the study of the atomic and molecular structure of matter, the world had to wait roughly another 45 years before the first structure of a protein was determined by crystallography. This was the structure of myoglobin, solved by John Kendrew, who shared the Nobel Prize in Chemistry in 1962 with Max Perutz, who determined the structure of haemoglobin. Since then, many other protein X-ray structures have been recognized with Nobel Prizes. Probably the most spectacular was the structure of the ribosome, for which Ada Yonath, Venki Ramakrishnan and Thomas A. Steitz received the Nobel Prize in Chemistry in 2009.

The structural revolution of the 1990s and the years around 2000 resulted in an explosion in the number of X-ray structures in the Protein Data Bank (PDB). The growth of PDB content has been impressive, with the current number of structures approaching 200 000. One of the direct consequences of this growth was the creation of AlphaFold, an AI system developed by DeepMind for the prediction of protein structures from their amino acid sequences. The latest release of the AlphaFold database includes more than 200 million predicted structures, covering nearly all catalogued proteins known to science. This will undoubtedly boost structural biology to levels unseen until now and will dramatically increase our understanding of biological processes.
What will happen in the AlphaFold age?
My guess is that a lot of old biochemical data will be reinterpreted, and that many oligomeric complexes will be easier to reconstruct from low-resolution SAXS and EM data. In short, many new discoveries are to be made by studying this huge volume of new structures. Does this mean the end of X-ray crystallography? The answer is no: many applications, for example drug discovery, require much higher structural accuracy than what AlphaFold can currently provide. For example, AlphaFold cannot provide accurate predictions of protein-ligand interactions, which are required in drug discovery projects. I also expect that the dynamical behaviour of proteins and protein complexes will still need experimental study.

In any case, at this very moment, in the autumn of 2022, we are entering a new era of biological research. And I think that structural biology is preparing many exciting surprises for us in the coming years!

X-ray diffraction by a protein crystal

A side chain of Trp built into a high resolution electron density map.

An outline of X-ray crystallography

X-ray diffraction
As mentioned above, Max von Laue demonstrated that X-rays are electromagnetic waves, of the same nature as visible light or radio waves. The only difference, of course, is the very short wavelength, around 1-1.5 Å for the X-rays used here (one Ångström, Å, is 10⁻¹⁰ meter). For comparison, the wavelength of visible light is between 400 and 700 nm (one nm is 10 Å).
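The comparison of wavelengths above can be checked with a few lines of arithmetic. This is only a small sketch; the function and constant names are my own and are not taken from any crystallographic software.

```python
# Unit relations quoted in the text: 1 Å = 1e-10 m and 1 nm = 10 Å.
ANGSTROM_IN_M = 1e-10
ANGSTROM_PER_NM = 10


def nm_to_angstrom(wavelength_nm):
    """Convert a wavelength from nanometres to Ångström."""
    return wavelength_nm * ANGSTROM_PER_NM


# Wavelength ranges from the text.
xray_range_a = (1.0, 1.5)                                   # Å
visible_range_a = (nm_to_angstrom(400), nm_to_angstrom(700))  # Å

# Smallest and largest ratio of visible to X-ray wavelength.
smallest_ratio = visible_range_a[0] / xray_range_a[1]
largest_ratio = visible_range_a[1] / xray_range_a[0]

print(f"Visible light: {visible_range_a[0]} to {visible_range_a[1]} Å")
print(f"Visible light is about {smallest_ratio:.0f} to {largest_ratio:.0f} "
      f"times longer in wavelength than these X-rays")
```

In other words, visible light has a wavelength a few thousand times longer than that of the X-rays, which is why X-rays are needed to resolve distances between atoms.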

X-ray diffraction is caused by the interaction of electromagnetic waves with the atoms inside the crystal, and particularly with their electrons. The waves are scattered by the electrons, so that each electron becomes a small X-ray source of its own. The scattered waves from all the electrons within an atom add up to give the wave diffracted by that atom, the waves from all atoms add up in turn, and so on. When scattered waves are added, they may either reinforce or cancel each other. Those that reinforce each other are registered by the X-ray detector, as in the figure above. Interestingly, we do not necessarily need X-rays to observe interference. We can, for example, go to a nearby lake, throw two stones into the water and observe how the waves from the two stones either reinforce or weaken each other. There are many demonstrations of wave addition on the web.
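The addition of waves described above can be illustrated with a minimal sketch. It assumes two waves of the same wavelength that differ only in the distance they have travelled (the path difference); the function name and the example values are hypothetical and chosen only for illustration.

```python
import cmath
import math


def resultant_amplitude(path_difference, wavelength, a1=1.0, a2=1.0):
    """Amplitude of the sum of two waves with the same wavelength.

    The second wave has travelled `path_difference` further than the
    first, which shifts its phase by 2*pi*path_difference/wavelength.
    Each wave is represented as a complex number (amplitude and phase).
    """
    phase_shift = 2.0 * math.pi * path_difference / wavelength
    return abs(a1 + a2 * cmath.exp(1j * phase_shift))


wavelength = 1.5  # Å, within the range quoted for X-rays above

# Path differences of 0, 1/4, 1/2 and 1 wavelength.
for fraction in (0.0, 0.25, 0.5, 1.0):
    amplitude = resultant_amplitude(fraction * wavelength, wavelength)
    print(f"path difference = {fraction:4.2f} wavelengths -> "
          f"amplitude = {amplitude:.3f}")
```

When the path difference is a whole number of wavelengths, the two waves reinforce each other and the amplitude doubles. When it is half a wavelength, they cancel.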

X-ray data collection, electron density calculation and model building
X-rays may be generated using various laboratory X-ray sources or at synchrotrons, where very intense and highly focused X-ray beams can be produced. There are several synchrotrons around the world with stations adapted for collecting X-ray data from protein crystals. Depending on the type of crystal (crystals may have different cell dimensions and different symmetry groups), different strategies for data collection are followed and a different amount of data is required. Usually the crystal is rotated in the X-ray beam one degree at a time and exposed to X-rays for a short period (seconds to minutes) at each step, until a full data set is collected. Depending on the intensity of the X-ray source, the size of the crystal and how well it diffracts, the total time required for data collection may vary considerably. The data are processed using specific computer programs.
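To get a feeling for the numbers, the rotation strategy described above can be turned into a rough estimate of the number of images and the total exposure time. This is only a back-of-the-envelope sketch: the function name, the 10-second exposure and the rotation ranges are assumptions for illustration, and real values depend on the crystal symmetry and the X-ray source.

```python
import math


def collection_plan(total_rotation_deg, step_deg=1.0, exposure_s=10.0):
    """Return (number of images, total exposure time in seconds).

    total_rotation_deg: total angle the crystal is rotated through
    step_deg:           rotation per image (one degree in the text)
    exposure_s:         exposure time per image (assumed value)
    """
    # Small tolerance so that floating-point noise in the division
    # does not add a spurious extra image.
    n_images = math.ceil(total_rotation_deg / step_deg - 1e-9)
    return n_images, n_images * exposure_s


# Hypothetical rotation ranges; the range actually needed depends
# on the symmetry of the crystal.
for rotation in (90.0, 180.0, 360.0):
    n_images, seconds = collection_plan(rotation)
    print(f"{rotation:5.0f} degrees -> {n_images} images, "
          f"{seconds / 60:.0f} minutes of exposure")
```

Under these assumptions, a 180-degree sweep means 180 images and 30 minutes of exposure, not counting the time the instrument needs between images.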

Each spot on the image is a diffracted X-ray beam, which emerged from the crystal and was registered by the X-ray detector. Thousands of diffraction spots need to be collected from a protein crystal in order to get a complete data set. The X-ray data are processed and the intensities of the diffraction spots are extracted and used to calculate an electron density map of the molecules inside the crystal. The electron density, in turn, will tell us where the atoms are located, information which can be used to build a model of the molecule in the crystal.
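The step from diffraction data to electron density can be sketched with a one-dimensional toy model. Everything in it is an assumption made for illustration: the "crystal" is a single repeating unit with three point-like atoms at made-up positions, and the diffracted waves (structure factors) are computed from those atoms with both amplitude and phase. In a real experiment the detector records only the intensities, and the phases have to be obtained by other means, so this is not how actual crystallographic software works.

```python
import cmath
import math

# Hypothetical 1D unit cell: (fractional position, number of electrons).
atoms = [(0.20, 6), (0.55, 8), (0.80, 7)]


def structure_factor(h, atoms):
    """Sum of the waves scattered by all atoms for diffraction order h."""
    return sum(z * cmath.exp(2j * math.pi * h * x) for x, z in atoms)


def density(x, atoms, h_max):
    """Electron density at position x as a Fourier sum over -h_max..h_max."""
    total = 0.0
    for h in range(-h_max, h_max + 1):
        term = structure_factor(h, atoms) * cmath.exp(-2j * math.pi * h * x)
        total += term.real
    return total


h_max = 15                               # the "resolution" of the toy data
grid = [i / 100 for i in range(100)]     # sample points across the cell
rho = [density(x, atoms, h_max) for x in grid]

# Intensity of each spot is the squared amplitude; the phase is lost.
intensities = [abs(structure_factor(h, atoms)) ** 2 for h in range(h_max + 1)]

# Atoms show up as strong local maxima of the density (the cell repeats,
# so the neighbours of the first and last grid points wrap around).
threshold = 0.5 * max(rho)
peak_indices = [
    i for i in range(len(grid))
    if rho[i] > threshold
    and rho[i] > rho[i - 1]
    and rho[i] > rho[(i + 1) % len(grid)]
]
print("density peaks at x =", [grid[i] for i in peak_indices])
```

The peaks of the calculated density fall at the positions where the atoms were placed, which is the sense in which the electron density map tells us where the atoms are. Lowering `h_max` (using fewer diffraction spots) makes the peaks broader, which mirrors what happens in a low-resolution map.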

This is a very short summary of protein X-ray crystallography. I hope to provide more details on this in the near future.