The 20 Amino Acids and Their Role in Protein Structures

Each of the 20 most common amino acids has specific chemical characteristics and a unique role in protein structure and function. For example, based on the propensity of the side chain to be in contact with water, amino acids can be classified as hydrophobic (low propensity to be in contact with water), or as polar or charged (both making energetically favorable contact with water). The charged amino acids are easy to assign. They include two basic residues, lysine (Lys) and arginine (Arg), both positively charged at neutral pH, and two acidic residues, aspartate (Asp) and glutamate (Glu), both negatively charged at neutral pH. Polarity, on the other hand, is not always straightforward to assign. Serine (Ser) and threonine (Thr) are polar since both carry a hydroxyl group, while asparagine (Asn) and glutamine (Gln) carry a polar amide group.

Histidine (His), however, may be either polar or charged, depending on the environment and the pH of the solution. Its imidazole ring has two –NH groups, with a pKa value of around 6. When both groups are protonated, the side chain has a charge of +1. The environment inside the protein can shift the pKa, so that the side chain either gives up a proton and becomes neutral, or accepts a proton and becomes charged. This ability makes histidine useful within enzyme active sites.

Other amino acids, namely the aromatic tyrosine (Tyr) and tryptophan (Trp) and the non-aromatic methionine (Met), are often called amphipathic because they have both polar and non-polar character. These residues are often found close to the surface of proteins. The –OH group of tyrosine is able both to donate and to accept a hydrogen bond. In addition, the side chains of histidine, tyrosine, phenylalanine and tryptophan are able to form weak hydrogen bonds of the OH−π and CH−O types, in other words by using the electron clouds of their ring structures. For a discussion of OH−π and CH−O hydrogen bonds, see Scheiner et al., 2002.
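The pH dependence of the histidine charge can be illustrated with the Henderson-Hasselbalch relation. The sketch below is only an approximation: it treats the side chain as a single titratable group, uses the pKa of about 6 quoted above as a default, and the function name fraction_protonated is introduced here purely for illustration.

```python
def fraction_protonated(ph: float, pka: float = 6.0) -> float:
    """Fraction of side chains in the protonated (+1) form.

    Henderson-Hasselbalch for a single titratable group. The default
    pKa of 6.0 is the approximate value quoted in the text; the value
    inside a real protein may be shifted by the local environment.
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Scan a few pH values around the assumed pKa.
for ph in (5.0, 6.0, 7.0, 8.0):
    print(f"pH {ph:.1f}: {fraction_protonated(ph):.3f} protonated")

# An environment that raises the pKa (7.5 is an arbitrary example value)
# keeps most side chains charged even at neutral pH.
print(f"pH 7.0, pKa 7.5: {fraction_protonated(7.0, pka=7.5):.3f} protonated")
```

With the default pKa, half of the side chains are protonated at pH 6 and roughly one in eleven at pH 7, which is why a modest shift of the pKa by the protein environment is enough to switch the residue between its neutral and charged forms.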

The hydrophobic amino acids include alanine (Ala, A), valine (Val, V), leucine (Leu, L), isoleucine (Ile, I), proline (Pro, P), phenylalanine (Phe, F) and cysteine (Cys, C). Even here, however, there is some disagreement on the classification. For example, some classification schemes consider Cys to be hydrophobic, while others consider it to be polar since it is often found close to or at the surface of proteins. Two Cys residues often connect different parts of a structure, or even different domains or subunits, by forming a disulfide (S-S) bridge. All hydrophobic residues participate in van der Waals interactions and contribute to the stabilization of the protein core.

Glycine (Gly), one of the common amino acids, has only a hydrogen atom in place of a side chain. It is often found at the surface of proteins, within loop or coil regions (regions without defined secondary structure), where it provides high flexibility to the polypeptide chain. This suggests that it is rather hydrophilic. Proline, on the other hand, is generally non-polar and has properties opposite to those of Gly: it provides rigidity to the polypeptide chain by imposing certain torsion angles on that segment of the structure. The reason for this is discussed in the section on torsion angles. Glycine and proline are often highly conserved within a protein family since they are essential for the conservation of a particular protein fold.

The 20 most common amino acids in proteins are listed below with their three-letter and one-letter codes:

Charged (side chains often form salt bridges):
• Arginine - Arg - R
• Lysine - Lys - K
• Aspartic acid - Asp - D
• Glutamic acid - Glu - E

Polar (form hydrogen bonds as proton donors or acceptors):
• Glutamine - Gln - Q
• Asparagine - Asn - N
• Histidine - His - H
• Serine - Ser - S
• Threonine - Thr - T
• Tyrosine - Tyr - Y
• Cysteine - Cys - C

Amphipathic (often found at the surface of proteins or lipid membranes, sometimes also classified as polar or hydrophobic, which is why Tyr and Met also appear in other groups):
• Tryptophan - Trp - W
• Tyrosine - Tyr - Y
• Methionine - Met - M (may function as a ligand to metal ions)

Hydrophobic (normally buried inside the protein core):
• Alanine - Ala - A
• Isoleucine - Ile - I
• Leucine - Leu - L
• Methionine - Met - M
• Phenylalanine - Phe - F
• Valine - Val - V
• Proline - Pro - P
• Glycine - Gly - G
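The grouping above can be written down as a small lookup table, which is convenient when annotating a sequence. This is a minimal sketch that simply copies the lists above. The names CLASSES, classes_of and annotate are hypothetical, and the example sequence is made up. Because Tyr and Met appear in two groups, a residue may map to more than one class.

```python
# One-letter codes per group, copied from the lists above.
# Tyr (Y) and Met (M) are listed in two groups each.
CLASSES = {
    "charged": "RKDE",
    "polar": "QNHSTYC",
    "amphipathic": "WYM",
    "hydrophobic": "AILMFVPG",
}


def classes_of(residue: str) -> list:
    """Return every group whose list contains this one-letter code."""
    code = residue.upper()
    return [name for name, members in CLASSES.items() if code in members]


def annotate(sequence: str) -> list:
    """Pair each residue of a sequence with its group names."""
    return [(res, classes_of(res)) for res in sequence]


# A short made-up peptide, for illustration only.
for res, names in annotate("MKYDG"):
    print(res, ", ".join(names) or "unknown")
```

Returning a list of groups rather than a single label keeps the ambiguity of the classification visible instead of silently choosing one scheme.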

The preferred location of different amino acids in protein molecules can be characterized quantitatively by calculating the extent to which an amino acid is buried in the structure or exposed to solvent. The figure below shows the distribution of the different amino acids within protein molecules. While hydrophobic amino acids are mostly buried within the core, a smaller fraction of polar groups is buried, and charged residues are exposed to solvent to a much higher degree.

Fraction of buried amino acids

The vertical axis shows the fraction of highly buried residues, while the horizontal axis shows the amino acid names in one-letter code. Image from the tutorial by J.E. Wampler.
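A buried fraction of this kind can be computed from per-residue solvent accessibility values. The sketch below assumes that a relative accessibility between 0 and 1 is already available for each residue, and it counts a residue as buried when that value falls below a cutoff. The cutoff of 0.2, the function name buried_fraction and the toy data are all assumptions made for illustration, and are not the values behind the figure.

```python
from collections import Counter


def buried_fraction(residues, threshold=0.2):
    """Per amino acid, the fraction of occurrences counted as buried.

    `residues` is an iterable of (one_letter_code, relative_accessibility)
    pairs. A residue is counted as buried when its relative accessibility
    is below `threshold` (an illustrative cutoff, not a standard).
    """
    total = Counter()
    buried = Counter()
    for code, rel_acc in residues:
        total[code] += 1
        if rel_acc < threshold:
            buried[code] += 1
    return {code: buried[code] / total[code] for code in total}


# Made-up (residue, relative accessibility) pairs, for illustration only.
toy = [
    ("L", 0.05), ("L", 0.10), ("L", 0.45),
    ("K", 0.80), ("K", 0.65),
    ("S", 0.15), ("S", 0.50),
]

for code, frac in sorted(buried_fraction(toy).items()):
    print(code, round(frac, 2))
```

In practice the accessibility values would come from a structure analysis program, and the result depends noticeably on the chosen cutoff.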

For a hydrogen bond to be formed, two electronegative atoms (in an alpha-helix, for example, the amide N and the carbonyl O) have to interact with the same hydrogen. The hydrogen is covalently attached to one of the atoms (called the hydrogen-bond donor) and interacts electrostatically with the other atom (the hydrogen-bond acceptor, here O). In proteins, essentially all groups capable of forming H-bonds (both main chain and side chain, regardless of whether the residues are within a secondary structure or some other type of structure) are usually H-bonded to each other or to water molecules. Due to their electronic structure, water molecules may accept 2 hydrogen bonds and donate 2, thus being simultaneously engaged in a total of 4 hydrogen bonds. Water molecules may also stabilize protein structures by making hydrogen bonds with main-chain and side-chain groups, and even by linking different protein groups to each other. In addition, water is often involved in ligand binding to proteins, mediating ligand interactions with polar or charged side-chain or main-chain atoms. It is useful to remember that the energy of a hydrogen bond, depending on the distance between the donor and the acceptor and the angle between them, is in the range of 2-10 kcal/mol.
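Since the strength of a hydrogen bond depends on the donor-acceptor distance and on the angle, a simple geometric test is often used to decide whether two groups are hydrogen bonded. The sketch below computes both quantities from three atomic positions. The cutoffs (3.5 Å for the donor-acceptor distance, 120° for the donor-hydrogen-acceptor angle) and the function names are assumptions chosen for illustration, and different programs use different criteria.

```python
import math


def hbond_geometry(donor, hydrogen, acceptor):
    """Return (donor-acceptor distance, donor-H...acceptor angle in degrees)."""
    dist = math.dist(donor, acceptor)
    # Vectors pointing from the hydrogen to the donor and to the acceptor.
    hd = [d - h for d, h in zip(donor, hydrogen)]
    ha = [a - h for a, h in zip(acceptor, hydrogen)]
    cos_angle = sum(x * y for x, y in zip(hd, ha)) / (
        math.hypot(*hd) * math.hypot(*ha)
    )
    # Clamp to [-1, 1] to guard against rounding error before acos.
    angle = math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))
    return dist, angle


def looks_like_hbond(donor, hydrogen, acceptor, max_dist=3.5, min_angle=120.0):
    """Apply the illustrative distance and angle cutoffs (assumed values)."""
    dist, angle = hbond_geometry(donor, hydrogen, acceptor)
    return dist <= max_dist and angle >= min_angle


# Idealized coordinates in angstroms: N-H pointing straight at O.
n_atom, h_atom, o_atom = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.9, 0.0, 0.0)
print(hbond_geometry(n_atom, h_atom, o_atom))
print(looks_like_hbond(n_atom, h_atom, o_atom))

# Same donor and hydrogen, but the acceptor sits off to the side.
print(looks_like_hbond(n_atom, h_atom, (1.0, 2.0, 0.0)))
```

The second acceptor is close enough in distance but fails the angle test, which shows why both criteria are needed.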

Salt bridges formed by positively and negatively charged amino acids have also been found to be important for the stabilization of protein three-dimensional structure. For example, proteins from thermophilic organisms (organisms that live at elevated temperatures, up to 80-90 °C or even higher) often have an extensive network of salt bridges on their surface. This network contributes to the thermostability of these proteins, preventing their denaturation at high temperatures.

Being able to recognize the properties of the different amino acids is a valuable skill when making sequence alignments: functional and even structural information can be extracted by analyzing the conservation pattern within an alignment.
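One simple way to look at a conservation pattern is to ask, for each alignment column, which property group all residues in that column share. The sketch below does this with the grouping used in the lists earlier in this section. The function name shared_classes and the three aligned sequences are hypothetical, and real conservation scores are usually more refined than this.

```python
# Same grouping as in the lists earlier in this section; Tyr (Y) and
# Met (M) belong to two groups each.
CLASSES = {
    "charged": "RKDE",
    "polar": "QNHSTYC",
    "amphipathic": "WYM",
    "hydrophobic": "AILMFVPG",
}


def shared_classes(column: str) -> set:
    """Groups common to every residue in one alignment column.

    Gap characters ("-") are ignored. An empty set means the column
    mixes residues with no group in common.
    """
    per_residue = [
        {name for name, members in CLASSES.items() if res in members}
        for res in column.upper()
        if res != "-"
    ]
    return set.intersection(*per_residue) if per_residue else set()


# Three made-up aligned sequences of equal length, for illustration only.
alignment = ["MKLDG", "MRIEG", "LKVDS"]

for position, column in enumerate(zip(*alignment), start=1):
    residues = "".join(column)
    label = ", ".join(sorted(shared_classes(residues))) or "mixed"
    print(position, residues, label)
```

In this toy alignment no column is strictly identical, yet four of the five columns keep the same chemical character, which is the kind of pattern that hints at a conserved structural or functional role.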