The 20 Amino Acids and Their Role in Protein Structures

Overview
Each of the 20 most common amino acids has specific chemical characteristics and a unique role in protein structure and function. Based on the propensity of the side chain to be in contact with water (a polar environment), amino acids are classified as polar (contact with water is energetically favorable) or hydrophobic (contact with water is energetically unfavorable). A third class comprises the charged amino acids, which are also expected to be found in a polar environment. Below we look at each of these classes and briefly discuss their role in protein structure and function.

Polar amino acids
When considering polarity, some amino acids are straightforward to define as polar, while for others the assignment is disputed. For example, serine (Ser, S), threonine (Thr, T) and tyrosine (Tyr, Y) are clearly polar since they carry a hydroxyl (-OH) group. This polar group can participate in hydrogen bond formation with another polar group by donating or accepting a proton. Tyrosine is also implicated in metal binding in many functional sites. Asparagine (Asn, N) and glutamine (Gln, Q) are also polar and carry a polar amide group. Histidine (His, H), on the other hand, may be both polar and charged, depending on the environment and pH. It has two –NH groups with a pKa value of around 6. When both groups are protonated, the side chain has a charge of +1. The pKa may be modulated by the protein environment in such a way that the side chain gives away a proton and becomes neutral, or accepts a proton and becomes charged. This ability makes histidine useful in enzyme active sites when the chemical reaction requires proton extraction. The aromatic amino acids tryptophan (Trp, W) and the earlier mentioned Tyr, as well as the non-aromatic methionine (Met, M), are sometimes called amphipathic due to their ability to have both polar and nonpolar character. These residues can often be found close to the interface between a protein and solvent. We should note here that the side chains of histidine, tyrosine, the hydrophobic phenylalanine and tryptophan are also able to form weak hydrogen bonds of the types OH−π and CH−O, using the electron clouds within their ring structures. For a discussion of OH−π and CH−O types of hydrogen bonds see Scheiner et al., 2002.
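
The pH dependence of the histidine charge can be illustrated with the Henderson-Hasselbalch relation. The sketch below is only a rough illustration: it assumes a single titratable group with the unperturbed pKa of about 6 mentioned above, and ignores the shifts that the protein environment can cause. The function name is made up for this example.

```python
def protonated_fraction(ph: float, pka: float = 6.0) -> float:
    """Fraction of side chains in the protonated (charged) form at a given pH.

    Henderson-Hasselbalch for a single titratable group. The default pKa of
    6.0 is the approximate value quoted in the text; the real value in a
    protein depends on the local environment (assumption: no such shift).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


if __name__ == "__main__":
    for ph in (5.0, 6.0, 7.4):
        print(f"pH {ph}: {protonated_fraction(ph):.2f} of side chains protonated")
```

At a pH equal to the pKa half of the side chains are protonated, while at pH 7.4 only a few percent are, which is why a modest pKa shift in an active site is enough to switch histidine between its neutral and charged forms.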

Hydrophobic amino acids
The hydrophobic amino acids include alanine (Ala, A), valine (Val, V), leucine (Leu, L), isoleucine (Ile, I), proline (Pro, P), phenylalanine (Phe, F) and cysteine (Cys, C). These residues normally form the hydrophobic core of the protein, isolated from the polar solvent. The side chains within the core are tightly packed and participate in van der Waals interactions, which are essential for the stabilization of the structure. In addition, Cys residues are involved in the stabilization of the three-dimensional structure through the formation of disulfide (S-S) bridges, which sometimes connect different secondary structure elements, or different subunits in a complex. Another essential function of Cys is metal binding, sometimes in enzyme active sites and sometimes in structure-stabilizing metal centers. Some disagreement also exists on the assignment of Cys to the hydrophobic group. For example, according to some schemes, it is found in the interior due to its high reactivity (Fiser et al., 1996), while others consider it to be polar (REF). The image below showing buried/exposed fraction of amino acids within protein molecules suggests that Cys is rather exposed.

Charged amino acids
The charged amino acids carry a single charge in the side chain at neutral pH (around 7.4). There are four of them: two basic, lysine (Lys, K) and arginine (Arg, R), with a positive charge at neutral pH, and two acidic, aspartate (Asp, D) and glutamate (Glu, E), with a negative charge at neutral pH. A so-called salt bridge is formed by the interaction between positively and negatively charged amino acid side chains. Salt bridges have been found to be important for the stabilization of protein three-dimensional structure, especially in proteins from thermophilic organisms (organisms that live at elevated temperatures, up to 80-90 °C, or even higher, REF). Binding of positively charged metal ions is another function of the negatively charged carboxylic groups of Asp and Glu. Metalloproteins and the function of metal centers are a fascinating area of structural biology.
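
As a small illustration of the charge bookkeeping described above, the following sketch sums the formal side-chain charges of a sequence written in one-letter code. It is a deliberate simplification: it assumes every Lys and Arg carries +1 and every Asp and Glu carries -1, and it ignores histidine, the chain termini and any pKa shifts. The function name and the example sequence are hypothetical.

```python
# Formal side-chain charges at neutral pH, as described in the text.
SIDE_CHAIN_CHARGE = {"K": +1, "R": +1, "D": -1, "E": -1}


def net_side_chain_charge(sequence: str) -> int:
    """Sum of formal side-chain charges for a one-letter-code sequence.

    Assumption: all other residues (including His) are treated as neutral,
    and the N- and C-termini are not counted.
    """
    return sum(SIDE_CHAIN_CHARGE.get(residue, 0) for residue in sequence.upper())


if __name__ == "__main__":
    # Hypothetical peptide used only for illustration.
    print(net_side_chain_charge("MKDELRAK"))
```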

Glycine & proline
Glycine (Gly, G), although one of the common amino acids, has no side chain beyond a single hydrogen atom. It is often found at the surface of proteins, within loop or coil regions (regions without defined secondary structure), where it provides high flexibility to the polypeptide chain. This flexibility is required in sharp polypeptide turns in loop regions. Proline, although considered to be hydrophobic, is also found at the surface, presumably as a result of its presence in turn and loop regions. In contrast to Gly, which gives the polypeptide chain high flexibility, Pro provides rigidity by imposing certain torsion angles on the segment of the structure (Morgan & Rubenstein, 2013). The reason for this is that its side chain is covalently bonded to the main chain nitrogen, which constrains the phi angle of the polypeptide at this location (see the section on the Ramachandran plot). Pro is sometimes called a helix breaker, since it is often found at the end of α-helices and in turns. The importance of Gly and Pro in protein folding has been discussed in Krieger et al., 2005.

The 20 most common amino acids with their three- and one-letter codes


Charged amino acids (side chains often form salt bridges; where given, mnemonic rules for remembering the one-letter code appear in parentheses):
• Arginine - Arg - R
• Lysine - Lys - K
• Aspartic acid - Asp - D (AsparDic acid - hard t sound like D)
• Glutamic acid - Glu - E (GluEtamic acid - feels like it is almost an E there)

Polar amino acids (form hydrogen bonds as proton donors or acceptors):
• Glutamine - Gln - Q (Qlutamine - soft G)
• Asparagine - Asn - N (AsparagiNe - clear N)
• Histidine - His - H
• Serine - Ser - S
• Threonine - Thr - T
• Tyrosine - Tyr - Y
• Cysteine - Cys - C

Amphipathic amino acids (often found at the surface of proteins or lipid membranes, sometimes also classified as polar):
• Tryptophan - Trp - W (the largest side chain and the largest letter)
• Tyrosine - Tyr - Y
• Methionine - Met - M (may function as a ligand to metal ions)

Hydrophobic amino acids (normally buried inside the protein core):
• Alanine - Ala - A
• Isoleucine - Ile - I
• Leucine - Leu - L
• Methionine - Met - M
• Phenylalanine - Phe - F
• Valine - Val - V
• Proline - Pro - P
• Glycine - Gly - G
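
The grouping in the lists above can be written down as a small lookup table, which is handy for getting a quick class composition of a sequence. The sketch below simply mirrors the lists as given here, including the fact that tyrosine and methionine each appear in two groups; other classification schemes exist, and the names used in the code are chosen only for this example.

```python
from collections import Counter

# One-letter codes per class, copied from the lists above.
# Tyr (Y) and Met (M) deliberately appear in two classes, as in the lists.
CLASSES = {
    "charged": set("RKDE"),
    "polar": set("QNHSTYC"),
    "amphipathic": set("WYM"),
    "hydrophobic": set("AILMFVPG"),
}


def classes_of(residue: str) -> list:
    """Return every class a one-letter residue code belongs to."""
    return [name for name, members in CLASSES.items() if residue.upper() in members]


def class_composition(sequence: str) -> Counter:
    """Count class memberships over a sequence.

    A residue listed in two classes contributes to both counts.
    """
    counts = Counter()
    for residue in sequence:
        counts.update(classes_of(residue))
    return counts


if __name__ == "__main__":
    print(classes_of("Y"))
    print(class_composition("MKDELRAKYW"))  # hypothetical sequence
```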

The distribution of amino acids in proteins

The preferred location of different amino acids in protein molecules can be characterized quantitatively by calculating the extent to which an amino acid is buried in the structure or exposed to solvent. The image below gives an idea of the distribution of the different amino acids within protein molecules. While hydrophobic amino acids are mostly buried, a smaller fraction of polar groups is also found to be buried, whereas charged residues are apparently exposed to a much higher degree.
Buried/exposed fraction of amino acids within protein molecules. The vertical axis shows the fraction of highly buried residues, while the horizontal axis shows the amino acid names in one-letter code. Image from the tutorial by J.E. Wampler.
A discussion of the relation between surface accessibility and conservation of amino acids can be found in Fiser et al., 1996.
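
To make the idea of a buried fraction concrete, the sketch below counts, for each amino acid type, how many residues fall below a relative solvent accessibility cutoff. It is not the procedure used for the figure above: the cutoff of 0.2 and the example values are assumptions made for illustration, and in practice the accessibility values would come from a solvent-accessibility calculation on real structures.

```python
from collections import defaultdict


def buried_fraction(residues, cutoff=0.2):
    """Fraction of buried residues per amino acid type.

    residues: iterable of (one_letter_code, relative_accessibility) pairs,
              with relative accessibility between 0 (buried) and 1 (exposed).
    cutoff:   residues below this value count as buried (assumed threshold).
    """
    totals = defaultdict(int)
    buried = defaultdict(int)
    for code, rel_acc in residues:
        totals[code] += 1
        if rel_acc < cutoff:
            buried[code] += 1
    return {code: buried[code] / totals[code] for code in totals}


if __name__ == "__main__":
    # Made-up values, for illustration only.
    example = [
        ("L", 0.02), ("L", 0.10), ("L", 0.45),
        ("K", 0.80), ("K", 0.65),
        ("S", 0.15), ("S", 0.50),
    ]
    for code, fraction in sorted(buried_fraction(example).items()):
        print(f"{code}: {fraction:.2f} buried")
```
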
Water & hydrogen bonds
A few words about hydrogen bonds. For a hydrogen bond to be formed, two electronegative atoms (for example, in the case of an α-helix, the amide N and the carbonyl O) must interact with the same hydrogen. The hydrogen is covalently attached to one of the atoms (called the hydrogen bond donor) and interacts with the other (the hydrogen bond acceptor). In proteins essentially all groups capable of forming H-bonds (both main chain and side chain) are usually H-bonded to each other or to water molecules. Due to their electronic structure, water molecules may accept 2 hydrogen bonds and donate 2, thus being simultaneously engaged in a total of 4 hydrogen bonds. Water molecules may also be involved in the stabilization of protein structures by making hydrogen bonds with main chain and side chain groups, and even by linking different protein groups to each other. In addition, water is often found to be involved in ligand binding to proteins, mediating ligand interactions with polar or charged side chain or main chain atoms. It is useful to remember that the energy of a hydrogen bond depends on the distance between the donor and the acceptor and on the angle between them, and is in the range of 2-10 kcal/mol. For more detailed information on hydrogen bonds I can recommend the review by Hubbard & Haider, 2010.
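
Because a hydrogen bond is characterized by a distance and an angle, a simple geometric test is often used to decide whether two groups are likely to be hydrogen bonded. The sketch below checks the donor-acceptor distance and the donor-hydrogen-acceptor angle for three sets of Cartesian coordinates (in Å). The cutoffs of 3.5 Å and 120° are common rules of thumb assumed here for illustration; they are not taken from the text above, and the function names and coordinates are hypothetical.

```python
import math


def angle_deg(a, b, c):
    """Angle at point b formed by the points a-b-c, in degrees."""
    v1 = [x - y for x, y in zip(a, b)]
    v2 = [x - y for x, y in zip(c, b)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.hypot(*v1) * math.hypot(*v2)
    # Clamp to guard against rounding slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cosine))


def is_hydrogen_bond(donor, hydrogen, acceptor, max_distance=3.5, min_angle=120.0):
    """Geometric hydrogen-bond test.

    max_distance: largest donor-acceptor distance in Angstrom (assumed cutoff).
    min_angle:    smallest donor-hydrogen-acceptor angle in degrees (assumed cutoff).
    """
    close_enough = math.dist(donor, acceptor) <= max_distance
    linear_enough = angle_deg(donor, hydrogen, acceptor) >= min_angle
    return close_enough and linear_enough


if __name__ == "__main__":
    # Idealized linear N-H...O arrangement, coordinates invented for the example.
    n_atom, h_atom, o_atom = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.9, 0.0, 0.0)
    print(is_hydrogen_bond(n_atom, h_atom, o_atom))
```

A purely geometric test like this says nothing about the bond energy itself; it only flags arrangements in which a hydrogen bond is plausible.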