The 20 Amino Acids and Their Role in Protein Structures

Each of the 20 most common amino acids has specific chemical characteristics and a unique role in protein structure and function. For example, based on the propensity of the side chain to be in contact with water, amino acids can be classified as hydrophobic (low propensity to be in contact with water), polar, or charged (the latter two make energetically favorable contacts with water).

Charged amino acids
It is easy to see which amino acids are charged, because at neutral pH (around 7) their side chains carry a single charge. There are four of them: two basic amino acids, lysine (Lys) and arginine (Arg), with a positive charge at neutral pH, and two acidic amino acids, aspartate (Asp) and glutamate (Glu), with a negative charge at neutral pH. Salt bridges, which are formed by the interaction between positively and negatively charged amino acid side chains, have been found to be important for the stabilization of protein three-dimensional structure. For example, charged amino acids in proteins from thermophilic organisms (organisms that live at elevated temperatures, up to 80-90 °C or even higher) often form an extensive network of salt bridges on the protein surface, contributing to thermostability and preventing denaturation at high temperatures. Binding of metal ions is another function of the negatively charged carboxylic groups of Asp and Glu. Metalloproteins and metal binding are a fascinating area of structural biology. I hope to complement this compendium with a chapter on metal binding in proteins at some point in the future.
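As a rough illustration of the salt bridges described above, the sketch below flags pairs of oppositely charged side-chain groups that lie within a distance cutoff. The 4.0 Å cutoff, the function name salt_bridge_candidates, and the coordinates are assumptions made for this example and do not come from the text; a real analysis would read atom positions from a structure file.

```python
import math

# Residues whose side chains are charged at neutral pH (from the text above).
POSITIVE = {"LYS", "ARG"}
NEGATIVE = {"ASP", "GLU"}


def salt_bridge_candidates(groups, cutoff=4.0):
    """Return label pairs of oppositely charged groups within `cutoff` angstroms.

    `groups` is a list of (label, residue_name, (x, y, z)) tuples, one per
    charged side-chain group. The default cutoff is an illustrative assumption.
    """
    pairs = []
    for i, (label_a, res_a, xyz_a) in enumerate(groups):
        for label_b, res_b, xyz_b in groups[i + 1:]:
            opposite = (res_a in POSITIVE and res_b in NEGATIVE) or (
                res_a in NEGATIVE and res_b in POSITIVE
            )
            if opposite and math.dist(xyz_a, xyz_b) <= cutoff:
                pairs.append((label_a, label_b))
    return pairs


# Hypothetical coordinates, chosen only to exercise the function.
groups = [
    ("Lys12", "LYS", (0.0, 0.0, 0.0)),
    ("Asp45", "ASP", (3.0, 0.0, 0.0)),
    ("Glu80", "GLU", (10.0, 0.0, 0.0)),
    ("Arg81", "ARG", (12.5, 0.0, 0.0)),
]
print(salt_bridge_candidates(groups))
```

Pairs with the same sign of charge are never reported, however close they are, because only opposite charges can form a salt bridge.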

Polar amino acids
When considering polarity, some of the amino acids are straightforward to assign, while for others there is disagreement. For example, serine (Ser), threonine (Thr) and tyrosine (Tyr) are clearly polar since they carry a hydroxyl (-OH) group. This polar group can participate in hydrogen bond formation with another polar group by donating or accepting a proton. Asparagine (Asn) and glutamine (Gln) are also polar, since they carry a polar amide group. Histidine (His), on the other hand, may be either polar or charged, depending on the environment and pH. It has two –NH groups with a pKa value of around 6. When both groups are protonated, the side chain has a charge of +1. The pKa may be modulated by the protein environment so that the side chain either gives away a proton and becomes neutral, or accepts a proton and becomes charged. This ability makes histidine useful in enzyme active sites when the chemical reaction requires proton extraction. The aromatic amino acids tryptophan (Trp) and the earlier mentioned Tyr, as well as the non-aromatic methionine (Met), are sometimes called amphipathic because they have both polar and nonpolar character. These residues can be found close to the interface between a protein and solvent. We should note here that the side chains of histidine, tyrosine, phenylalanine and tryptophan are also able to form weak hydrogen bonds of the types OH−π and CH−O, using the electron clouds of their ring structures. For a discussion of OH−π and CH−O hydrogen bonds, see Scheiner et al., 2002.
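To make the pH dependence of histidine concrete, the sketch below applies the Henderson-Hasselbalch relation to estimate which fraction of side chains is protonated (charged) at a given pH. It assumes a pKa of 6.0, the approximate value quoted above, and ignores the modulation by the protein environment that the text describes; the function name is chosen for this example only.

```python
def fraction_protonated(ph, pka=6.0):
    """Fraction of a titratable group in its protonated form.

    Follows from the Henderson-Hasselbalch relation:
    [protonated] / [deprotonated] = 10 ** (pKa - pH).
    The default pKa of 6.0 is the approximate histidine value from the text.
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


for ph in (5.0, 6.0, 7.0, 8.0):
    print(f"pH {ph:.1f}: {fraction_protonated(ph):.2f} protonated")
```

With this pKa, half of the side chains are charged at pH 6 and roughly one in eleven at pH 7, which is why a modest shift in the local pKa is enough to switch histidine between its neutral and charged forms.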

Hydrophobic amino acids
The hydrophobic amino acids include alanine (Ala, A), valine (Val, V), leucine (Leu, L), isoleucine (Ile, I), proline (Pro, P), phenylalanine (Phe, F) and cysteine (Cys, C). These residues are normally located inside the protein core, isolated from solvent. They participate in van der Waals interactions, which are essential for the stabilization of protein structures. In addition, Cys residues stabilize the three-dimensional structure by forming disulfide (S-S) bridges, which may connect different parts of a protein structure, or even different subunits in a complex. We should note here that there is some disagreement about assigning Cys to the hydrophobic group: some schemes treat it as hydrophobic, while others consider it polar, since it is often found close to, or at, the surface of proteins.

Glycine & proline
Glycine (Gly), one of the common amino acids, has no side chain beyond a single hydrogen atom. It is often found at the surface of proteins, within loop or coil regions (regions without defined secondary structure), where it gives the polypeptide chain high flexibility. This flexibility is required to facilitate sharp polypeptide turns in loop regions. Proline, on the other hand, is generally nonpolar and has properties opposite to those of Gly: it provides rigidity to the polypeptide chain by imposing certain torsion angles on the segment of the structure (Morgan & Rubenstein, 2013). The reason is that its side chain makes a covalent bond with the main chain, which constrains the phi angle of the polypeptide at this location (see the section on the Ramachandran plot). Pro is sometimes called a helix breaker, since it is often found at the ends of α-helices and in turns. The importance of Gly and Pro in protein folding has been discussed in Krieger et al., 2005.

The 20 most common amino acids in proteins are listed below with their three-letter and one-letter codes. Tyrosine and methionine appear in two groups, reflecting their amphipathic character:

Charged (side chains often form salt bridges):
• Arginine - Arg - R
• Lysine - Lys - K
• Aspartic acid - Asp - D
• Glutamic acid - Glu - E

Polar (form hydrogen bonds as proton donors or acceptors):
• Glutamine - Gln - Q
• Asparagine - Asn - N
• Histidine - His - H
• Serine - Ser - S
• Threonine - Thr - T
• Tyrosine - Tyr - Y
• Cysteine - Cys - C

Amphipathic (often found at the surface of proteins or lipid membranes, sometimes also classified as polar):
• Tryptophan - Trp - W
• Tyrosine - Tyr - Y
• Methionine - Met - M (may function as a ligand to metal ions)

Hydrophobic (normally buried inside the protein core):
• Alanine - Ala - A
• Isoleucine - Ile - I
• Leucine - Leu - L
• Methionine - Met - M
• Phenylalanine - Phe - F
• Valine - Val - V
• Proline - Pro - P
• Glycine - Gly - G
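The grouping above can be written down as a small lookup table, which is handy for summarizing the composition of a sequence. The sketch below follows the lists exactly as given here, so Tyr and Met each belong to two groups and Cys is counted as polar; the names CLASSES, classes_of and composition, and the example sequence, are made up for this illustration.

```python
# One-letter codes per group, copied from the lists above.
CLASSES = {
    "charged": "RKDE",
    "polar": "QNHSTYC",
    "amphipathic": "WYM",
    "hydrophobic": "AILMFVPG",
}


def classes_of(code):
    """Return every group that a one-letter amino acid code belongs to."""
    return [name for name, members in CLASSES.items() if code in members]


def composition(sequence):
    """Count residues per group; residues in two groups are counted in both."""
    counts = {name: 0 for name in CLASSES}
    for code in sequence.upper():
        for name in classes_of(code):
            counts[name] += 1
    return counts


print(classes_of("Y"))       # tyrosine is listed as polar and amphipathic
print(composition("MKYD"))   # hypothetical four-residue sequence
```

Because residues listed twice are counted twice, the group counts can add up to more than the sequence length; a scheme that assigns each residue to exactly one group would need to resolve those overlaps first.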

Distribution of amino acids in proteins
The preferred location of different amino acids in protein molecules can be characterized quantitatively by calculating the extent to which an amino acid is buried in the structure or exposed to solvent. The image below gives an idea of the distribution of the different amino acids within protein molecules. Hydrophobic amino acids are mostly buried, a smaller fraction of polar residues is also buried, and charged residues are exposed to a much higher degree.

Figure: Fraction of buried amino acids. Buried/exposed fraction of amino acids within protein molecules. The vertical axis shows the fraction of highly buried residues, while the horizontal axis shows the amino acid names in one-letter code. Image from the tutorial by J.E. Wampler.
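A buried fraction of this kind could be computed along the following lines, given a relative solvent accessibility for each residue. This is only a sketch: the criterion used for the figure is not stated here, so the 0.2 threshold, the function name buried_fraction, and the input values are all assumptions made for illustration.

```python
from collections import defaultdict


def buried_fraction(residues, threshold=0.2):
    """Fraction of buried residues per amino acid type.

    `residues` is an iterable of (one_letter_code, relative_accessibility)
    pairs, with accessibility between 0 (fully buried) and 1 (fully exposed).
    A residue counts as buried when its accessibility is below `threshold`,
    which is an assumed cutoff, not one taken from the figure.
    """
    total = defaultdict(int)
    buried = defaultdict(int)
    for code, rel_acc in residues:
        total[code] += 1
        if rel_acc < threshold:
            buried[code] += 1
    return {code: buried[code] / total[code] for code in total}


# Hypothetical accessibility values, not measured data.
residues = [("L", 0.05), ("L", 0.10), ("L", 0.40), ("K", 0.80), ("K", 0.15), ("S", 0.30)]
result = buried_fraction(residues)
print(result)
```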

Water & hydrogen bonds
For a hydrogen bond to form, two electronegative atoms (for example, in an α-helix, the amide N and the carbonyl O) must interact with the same hydrogen. The hydrogen is covalently attached to one of the atoms (the hydrogen bond donor) and interacts with the other (the hydrogen bond acceptor). In proteins, essentially all groups capable of forming H-bonds (both main chain and side chain) are usually H-bonded to each other or to water molecules. Due to their electronic structure, water molecules may accept two hydrogen bonds and donate two, and can thus be engaged in a total of four hydrogen bonds simultaneously. Water molecules may also help stabilize protein structures by making hydrogen bonds with main chain and side chain groups, and even by linking different protein groups to each other. In addition, water is often involved in ligand binding to proteins, mediating ligand interactions with polar or charged side chain or main chain atoms. It is useful to remember that the energy of a hydrogen bond depends on the distance between the donor and the acceptor and on the angle between them, and is in the range of 2-10 kcal/mol. For more detailed information on hydrogen bonds, I can recommend the review by Hubbard & Haider, 2010.
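Since the strength of a hydrogen bond depends on the donor-acceptor distance and on the angle, a simple geometric test is a common way to decide whether three atoms form one. The sketch below checks both quantities; the 3.5 Å distance limit and the 120° minimum donor-hydrogen-acceptor angle are assumed values for illustration, not criteria taken from the text, and the coordinates are invented.

```python
import math


def angle_deg(a, b, c):
    """Angle in degrees at point b, formed by the points a-b-c."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.hypot(*v1) * math.hypot(*v2)
    cos = max(-1.0, min(1.0, dot / norm))  # clamp against rounding errors
    return math.degrees(math.acos(cos))


def is_hydrogen_bond(donor, hydrogen, acceptor, max_distance=3.5, min_angle=120.0):
    """Geometric test using donor-acceptor distance and donor-H-acceptor angle.

    Both thresholds are illustrative assumptions.
    """
    distance = math.dist(donor, acceptor)
    angle = angle_deg(donor, hydrogen, acceptor)
    return distance <= max_distance and angle >= min_angle


# Hypothetical coordinates in angstroms: a linear and a sharply bent arrangement.
linear = is_hydrogen_bond((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.9, 0.0, 0.0))
bent = is_hydrogen_bond((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 2.0, 0.0))
print(linear, bent)
```

The bent arrangement is rejected even though the acceptor is close to the donor, because the angle at the hydrogen is only 90°.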