Amino acids are organic compounds that contain both amino and carboxylic acid functional groups.^[1] Although over 500 amino acids exist in nature, by far the most important are the 22 α-amino acids incorporated into proteins.^[2] Only these 22 appear in the genetic code of life.^[3]^[4]

Amino acids can be classified according to the locations of the core structural functional groups (alpha- (α-), beta- (β-), gamma- (γ-) amino acids, etc.); other categories relate to polarity, ionization, and side chain group type (aliphatic, acyclic, aromatic, polar, etc.). In the form of proteins, amino acid residues form the second-largest component (water being the largest) of human muscles and other tissues.^[5] Beyond their role as residues in proteins, amino acids participate in a number of processes such as neurotransmitter transport and biosynthesis. They are thought to have played a key role in the emergence of life on Earth.

Amino acids are formally named by the IUPAC-IUBMB Joint Commission on Biochemical Nomenclature in terms of the fictitious "neutral" structure shown in the illustration. For example, the systematic name of alanine is 2-aminopropanoic acid, based on the formula CH₃−CH(NH₂)−COOH. The Commission justified this approach as follows:^[6]

The systematic names and formulas given refer to hypothetical forms in which amino groups are unprotonated and carboxyl groups are undissociated. This convention is useful to avoid various nomenclatural problems but should not be taken to imply that these structures represent an appreciable fraction of the amino-acid molecules.

History

The first few amino acids were discovered in the early 1800s.^[7]^[8] In 1806, French chemists Louis-Nicolas Vauquelin and Pierre Jean Robiquet isolated a compound from asparagus that was subsequently named asparagine, the first amino acid to be discovered.^[9]^[10] Cystine was discovered in 1810,^[11] although its monomer, cysteine, remained undiscovered until 1884.^[12]^[10]^[a] Glycine and leucine were discovered in 1820.^[13] The last of the 20 common amino acids to be discovered was threonine in 1935 by William Cumming Rose, who also determined the essential amino acids and established the minimum daily requirements of all amino acids for optimal growth.^[14]^[15]

The unity of the chemical category was recognized by Wurtz in 1865, but he gave no particular name to it.^[16] The first use of the term "amino acid" in the English language dates from 1898,^[17] while the German term, Aminosäure, was used earlier.^[18] Proteins were found to yield amino acids after enzymatic digestion or acid hydrolysis. In 1902, Emil Fischer and Franz Hofmeister independently proposed that proteins are formed from many amino acids, whereby bonds are formed between the amino group of one amino acid and the carboxyl group of another, resulting in a linear structure that Fischer termed "peptide".^[19]

General structure

The 21 proteinogenic α-amino acids found in eukaryotes, grouped according to their side chains' pK_a values and charges carried at physiological pH (7.4)

2-, alpha-, or α-amino acids^[20] have the generic formula H₂NCHRCOOH in most cases,^[b] where R is an organic substituent known as a "side chain".^[21]

Of the many hundreds of described amino acids, 22 are proteinogenic ("protein-building").^[22]^[23]^[24] It is these 22 compounds that combine to give a vast array of peptides and proteins assembled by ribosomes.^[25] Non-proteinogenic or modified amino acids may arise from post-translational modification or during nonribosomal peptide synthesis.

Chirality

The carbon atom next to the carboxyl group is called the α-carbon. In proteinogenic amino acids, it bears the amine and the R group or side chain specific to each amino acid. With four distinct substituents, the α-carbon is stereogenic in all α-amino acids except glycine. All chiral proteinogenic amino acids have the L configuration. They are "left-handed" enantiomers, a term that refers to the configuration at the α-carbon.

A few D-amino acids ("right-handed") have been found in nature, e.g., in bacterial envelopes, as a neuromodulator (D-serine), and in some antibiotics.^[26]^[27] Rarely, D-amino acid residues are found in proteins, and are converted from the L-amino acid as a post-translational modification.^[28]^[c]

Side chains

Polar charged side chains

Five amino acids possess a charge at neutral pH. Often these side chains appear on the surfaces of proteins to enable their solubility in water, and side chains with opposite charges form important electrostatic contacts called salt bridges that maintain structures within a single protein or between interfacing proteins.^[31] Many proteins bind metals specifically into their structures, and these interactions are commonly mediated by charged side chains such as aspartate, glutamate and histidine. Under certain conditions, each ion-forming group can be charged, forming double salts.^[32]

The two negatively charged amino acids at neutral pH are aspartate (Asp, D) and glutamate (Glu, E). The anionic carboxylate groups behave as Brønsted bases in most circumstances.^[31] Enzymes in very low pH environments, like the aspartic protease pepsin in mammalian stomachs, may have catalytic aspartate or glutamate residues that act as Brønsted acids.

Functional groups found in histidine (left), lysine (middle) and arginine (right)

There are three amino acids with side chains that are cations at neutral pH: arginine (Arg, R), lysine (Lys, K) and histidine (His, H). Arginine has a charged guanidino group and lysine a charged alkyl amino group; both are fully protonated at pH 7. Histidine's imidazole group has a pK_a of 6.0 and is only around 10% protonated at neutral pH. Because histidine is easily found in both its basic and conjugate acid forms, it often participates in catalytic proton transfers in enzyme reactions.^[31]
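The protonation figures quoted above follow from the Henderson–Hasselbalch relation. The following is a minimal sketch rather than a definitive treatment: the function name `fraction_protonated` is hypothetical, the histidine pK_a of 6.0 is taken from the text, and the lysine and arginine side-chain pK_a values are approximate textbook figures assumed here for illustration.

```python
def fraction_protonated(pH: float, pKa: float) -> float:
    """Fraction of an ionizable group present in its protonated form.

    Uses the Henderson-Hasselbalch relation: [base]/[acid] = 10**(pH - pKa).
    """
    return 1.0 / (1.0 + 10.0 ** (pH - pKa))


# 6.0 for histidine comes from the text; the lysine and arginine values are
# approximate textbook side-chain pKa values (an assumption for illustration).
side_chain_pKa = {"His": 6.0, "Lys": 10.5, "Arg": 12.5}

for name, pKa in side_chain_pKa.items():
    share = fraction_protonated(7.0, pKa)
    print(f"{name}: {share:.1%} protonated at pH 7.0")
```

With these inputs, histidine comes out at roughly one part in eleven protonated, consistent with the "around 10%" figure, while lysine and arginine are protonated almost completely.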

Polar uncharged side chains

The polar, uncharged amino acids serine (Ser, S), threonine (Thr, T), asparagine (Asn, N) and glutamine (Gln, Q) readily form hydrogen bonds with water and other amino acids.^[31] They do not ionize in normal conditions, a prominent exception being the catalytic serine in serine proteases. This is an example of severe perturbation, and is not characteristic of serine residues in general. Threonine has two chiral centers, not only the L (2S) chiral center at the α-carbon shared by all amino acids apart from achiral glycine, but also (3R) at the β-carbon. The full stereochemical specification is (2S,3R)-L-threonine.

Hydrophobic side chains

Nonpolar amino acid interactions are the primary driving force behind the processes that fold proteins into their functional three-dimensional structures.^[31] None of these amino acids' side chains ionizes easily, so they do not have pK_a values, with the exception of tyrosine (Tyr, Y). The hydroxyl of tyrosine can deprotonate at high pH, forming the negatively charged phenolate. Because of this, one could place tyrosine into the polar, uncharged amino acid category, but its very low solubility in water matches the characteristics of hydrophobic amino acids well.

Special case side chains

Several side chains are not described well by the charged, polar and hydrophobic categories. Glycine (Gly, G) could be considered a polar amino acid since its small size means that its solubility is largely determined by the amino and carboxylate groups. However, the lack of any side chain provides glycine with a unique flexibility among amino acids, with large ramifications for protein folding.^[31] Cysteine (Cys, C) can also form hydrogen bonds readily, which would place it in the polar amino acid category, though it can often be found in protein structures forming covalent bonds, called disulfide bonds, with other cysteines. These bonds influence the folding and stability of proteins, and are essential in the formation of antibodies. Proline (Pro, P) has an alkyl side chain and could be considered hydrophobic, but because the side chain joins back onto the alpha amino group it becomes particularly inflexible when incorporated into proteins. As with glycine, this influences protein structure in a way unique among amino acids. Selenocysteine (Sec, U) is a rare amino acid that is not directly encoded by DNA but is incorporated into proteins via the ribosome. Selenocysteine has a lower redox potential than the similar cysteine, and participates in several unique enzymatic reactions.^[33] Pyrrolysine (Pyl, O) is another amino acid that is not encoded in DNA but is incorporated into proteins by ribosomes.^[34] It is found in archaeal species, where it participates in the catalytic activity of several methyltransferases.

β- and γ-amino acids

Amino acids with the structure NH₃⁺−CXY−CXY−CO₂⁻, such as β-alanine, a component of carnosine and a few other peptides, are β-amino acids. Ones with the structure NH₃⁺−CXY−CXY−CXY−CO₂⁻ are γ-amino acids, and so on, where X and Y are two substituents (one of which is normally H).^[6]

Zwitterions

Main article: Zwitterion

The common natural forms of amino acids have a zwitterionic structure, with −NH₃⁺ (−NH₂⁺− in the case of proline) and −CO₂⁻ functional groups attached to the same C atom; they are thus α-amino acids, and are the only ones found in proteins during translation in the ribosome. In aqueous solution at pH close to neutrality, amino acids exist as zwitterions, i.e. as dipolar ions with both NH₃⁺ and CO₂⁻ in charged states, so the overall structure is NH₃⁺−CHR−CO₂⁻. At physiological pH the so-called "neutral forms" NH₂−CHR−CO₂H are not present to any measurable degree.^[35] Although the two charges in the zwitterion structure add up to zero, it is misleading to call a species with a net charge of zero "uncharged".

In strongly acidic conditions (pH below 3), the carboxylate group becomes protonated and the structure becomes an ammonio carboxylic acid, NH₃⁺−CHR−CO₂H. This is relevant for enzymes like pepsin that are active in acidic environments such as the mammalian stomach and lysosomes, but does not significantly apply to intracellular enzymes. In highly basic conditions (pH greater than 10, not normally seen in physiological conditions), the ammonio group is deprotonated to give NH₂−CHR−CO₂⁻.

Although various definitions of acids and bases are used in chemistry, the only one that is useful for chemistry in aqueous solution is that of Brønsted:^[36]^[37] an acid is a species that can donate a proton to another species, and a base is one that can accept a proton. This criterion is used to label the groups in the above illustration. The carboxylate side chains of aspartate and glutamate residues are the principal Brønsted bases in proteins. Likewise, lysine, tyrosine and cysteine will typically act as Brønsted acids. Histidine under these conditions can act as both a Brønsted acid and a base.

Isoelectric point

For amino acids with uncharged side chains, the zwitterion predominates at pH values between the two pK_a values, but coexists in equilibrium with small amounts of net negative and net positive ions. At the midpoint between the two pK_a values, the trace amounts of net negative and net positive ions balance, so that the average net charge of all forms present is zero.^[38] This pH is known as the isoelectric point pI, so pI = 1/2(pK_a1 + pK_a2).

For amino acids with charged side chains, the pK_a of the side chain is involved. Thus for aspartate or glutamate with negative side chains, the terminal amino group is essentially entirely in the charged form −NH₃⁺, but this positive charge needs to be balanced by the state in which just one of the two carboxylate groups is negatively charged. This occurs halfway between the two carboxylate pK_a values: pI = 1/2(pK_a1 + pK_a(R)), where pK_a(R) is the side chain pK_a.^[37]

Similar considerations apply to other amino acids with ionizable side chains, including not only glutamate (similar to aspartate), but also cysteine and tyrosine, and histidine, lysine and arginine with positive side chains.
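The midpoint rules above can be checked numerically by summing the average charge of each ionizable group and searching for the pH at which the total is zero. The sketch below assumes independent ionizable groups and uses approximate textbook pK_a values that are not taken from this article; the function names `net_charge` and `isoelectric_point` are hypothetical.

```python
def net_charge(pH, acidic_pKas, basic_pKas):
    """Average net charge of a molecule with independent ionizable groups.

    acidic_pKas: groups that are neutral when protonated (e.g. -COOH)
    basic_pKas:  groups that are positive when protonated (e.g. -NH3+)
    """
    negative = sum(1.0 / (1.0 + 10.0 ** (pKa - pH)) for pKa in acidic_pKas)
    positive = sum(1.0 / (1.0 + 10.0 ** (pH - pKa)) for pKa in basic_pKas)
    return positive - negative


def isoelectric_point(acidic_pKas, basic_pKas, lo=0.0, hi=14.0, tol=1e-6):
    """Locate the pH of zero net charge by bisection.

    Net charge decreases monotonically as pH rises, so a sign test suffices.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(mid, acidic_pKas, basic_pKas) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Approximate textbook pKa values, assumed here for illustration only.
glycine = isoelectric_point(acidic_pKas=[2.34], basic_pKas=[9.60])
aspartate = isoelectric_point(acidic_pKas=[1.88, 3.65], basic_pKas=[9.60])
lysine = isoelectric_point(acidic_pKas=[2.18], basic_pKas=[8.95, 10.53])

print(f"glycine pI   ~ {glycine:.2f}")
print(f"aspartate pI ~ {aspartate:.2f}")
print(f"lysine pI    ~ {lysine:.2f}")
```

Under these assumptions the numerical results agree with the midpoint formulas: the average of the two pK_a values for glycine, of the two carboxyl pK_a values for aspartate, and of the two amino pK_a values for lysine.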

Amino acids have zero mobility in electrophoresis at their isoelectric point, although this behaviour is more usually exploited for peptides and proteins than single amino acids. Zwitterions have minimum solubility at their isoelectric point, and some amino acids (in particular, with nonpolar side chains) can be isolated by precipitation from water by adjusting the pH to the required isoelectric point.

Physicochemical properties

The 20 canonical amino acids can be classified according to their properties. Important factors are charge, hydrophilicity or hydrophobicity, size, and functional groups.^[27] These properties influence protein structure and protein–protein interactions. The water-soluble proteins tend to have their hydrophobic residues (Leu, Ile, Val, Phe, and Trp) buried in the middle of the protein, whereas hydrophilic side chains are exposed to the aqueous solvent. (In biochemistry, a residue refers to a specific monomer within the polymeric chain of a polysaccharide, protein or nucleic acid.) The integral membrane proteins tend to have outer rings of exposed hydrophobic amino acids that anchor them in the lipid bilayer. Some peripheral membrane proteins have a patch of hydrophobic amino acids on their surface that sticks to the membrane. In a similar fashion, proteins that have to bind to positively charged molecules have surfaces rich in negatively charged amino acids such as glutamate and aspartate, while proteins binding to negatively charged molecules have surfaces rich in positively charged amino acids like lysine and arginine. For example, lysine and arginine are present in large amounts in the low-complexity regions of nucleic-acid binding proteins.^[39] There are various hydrophobicity scales of amino acid residues.^[40]

Some amino acids have special properties. Cysteine can form covalent disulfide bonds to other cysteine residues. Proline's side chain forms a ring with the polypeptide backbone, and glycine is more flexible than other amino acids.

Glycine and proline are strongly present within low complexity regions of both eukaryotic and prokaryotic proteins, whereas the opposite is the case with cysteine, phenylalanine, tryptophan, methionine, valine, leucine, isoleucine, which are highly reactive, or complex, or hydrophobic.^[39]^[41]^[42]

Many proteins undergo a range of posttranslational modifications, whereby additional chemical groups are attached to the amino acid residue side chains sometimes producing lipoproteins (that are hydrophobic),^[43] or glycoproteins (that are hydrophilic)^[44] allowing the protein to attach temporarily to a membrane. For example, a signaling protein can attach and then detach from a cell membrane, because it contains cysteine residues that can have the fatty acid palmitic acid added to them and subsequently removed.^[45]

Table of standard amino acid abbreviations and properties

Main article: Proteinogenic amino acid

Although one-letter symbols are included in the table, IUPAC–IUBMB recommend^[6] that "Use of the one-letter symbols should be restricted to the comparison of long sequences".

The one-letter notation was chosen by IUPAC-IUB based on the following rules:^[46]

Initial letters are used where there is no ambiguity: C cysteine, H histidine, I isoleucine, M methionine, S serine, V valine,^[46]
Where arbitrary assignment is needed, the structurally simpler amino acids are given precedence: A alanine, G glycine, L leucine, P proline, T threonine,^[46]
F PHenylalanine and R aRginine are assigned by being phonetically suggestive,^[46]
W tryptophan is assigned based on the double ring being visually suggestive of the bulky letter W,^[46]
K lysine and Y tyrosine are assigned as alphabetically nearest to their initials L and T (note that U was avoided for its similarity with V, while X was reserved for undetermined or atypical amino acids); for tyrosine the mnemonic tYrosine was also proposed,^[47]
D aspartate was assigned arbitrarily, with the proposed mnemonic asparDic acid;^[48] E glutamate was assigned in alphabetical sequence being larger by merely one methylene –CH2– group,^[47]
N asparagine was assigned arbitrarily, with the proposed mnemonic asparagiNe;^[48] Q glutamine was assigned in alphabetical sequence of those still available (note again that O was avoided due to similarity with D), with the proposed mnemonic Qlutamine.^[48]

Amino acid	3-letter symbol	1-letter symbol	Side chain class	Chemical polarity^[52]	Net charge at pH 7.4^[52]	Hydropathy index^[49]	Absorption wavelength, λ_max (nm)^[50]	Molar absorptivity, ε (mM⁻¹·cm⁻¹)^[50]	Molecular mass (Da)	Abundance in proteins (%)^[51]	Standard genetic coding, IUPAC notation
Alanine	Ala	A	Aliphatic	Nonpolar	Neutral	1.8			89.094	8.76	GCN
Arginine	Arg	R	Fixed cation	Basic polar	Positive	−4.5			174.203	5.78	MGR, CGY^[53]
Asparagine	Asn	N	Amide	Polar	Neutral	−3.5			132.119	3.93	AAY
Aspartate	Asp	D	Anion	Brønsted base	Negative	−3.5			133.104	5.49	GAY
Cysteine	Cys	C	Thiol	Brønsted acid	Neutral	2.5	250	0.3	121.154	1.38	UGY
Glutamine	Gln	Q	Amide	Polar	Neutral	−3.5			146.146	3.9	CAR
Glutamate	Glu	E	Anion	Brønsted base	Negative	−3.5			147.131	6.32	GAR
Glycine	Gly	G	Aliphatic	Nonpolar	Neutral	−0.4			75.067	7.03	GGN
Histidine	His	H	Cationic	Brønsted acid and base	Positive, 10% Neutral, 90%	−3.2	211	5.9	155.156	2.26	CAY
Isoleucine	Ile	I	Aliphatic	Nonpolar	Neutral	4.5			131.175	5.49	AUH
Leucine	Leu	L	Aliphatic	Nonpolar	Neutral	3.8			131.175	9.68	YUR, CUY^[54]
Lysine	Lys	K	Cation	Brønsted acid	Positive	−3.9			146.189	5.19	AAR
Methionine	Met	M	Thioether	Nonpolar	Neutral	1.9			149.208	2.32	AUG
Phenylalanine	Phe	F	Aromatic	Nonpolar	Neutral	2.8	257, 206, 188	0.2, 9.3, 60.0	165.192	3.87	UUY
Proline	Pro	P	Cyclic	Nonpolar	Neutral	−1.6			115.132	5.02	CCN
Serine	Ser	S	Hydroxylic	Polar	Neutral	−0.8			105.093	7.14	UCN, AGY
Threonine	Thr	T	Hydroxylic	Polar	Neutral	−0.7			119.119	5.53	ACN
Tryptophan	Trp	W	Aromatic	Nonpolar	Neutral	−0.9	280, 219	5.6, 47.0	204.228	1.25	UGG
Tyrosine	Tyr	Y	Aromatic	Brønsted acid	Neutral	−1.3	274, 222, 193	1.4, 8.0, 48.0	181.191	2.91	UAY
Valine	Val	V	Aliphatic	Nonpolar	Neutral	4.2			117.148	6.73	GUN
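As one illustration of how the hydropathy index column can be used, the sketch below averages the index over a sequence written in one-letter symbols, a quantity sometimes called the GRAVY score. The values are copied from the table above; the function name `mean_hydropathy` and the example sequences are hypothetical.

```python
# Hydropathy index per one-letter symbol, copied from the table above.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(sequence: str) -> float:
    """Average hydropathy index over a one-letter amino acid sequence.

    Raises KeyError for symbols outside the 20 standard amino acids.
    """
    values = [HYDROPATHY[residue] for residue in sequence.upper()]
    return sum(values) / len(values)


# Made-up sequences: one rich in aliphatic residues, one rich in charged ones.
for peptide in ("LIVAL", "DEKRD"):
    print(peptide, round(mean_hydropathy(peptide), 2))
```

A positive average suggests a predominantly hydrophobic stretch and a negative one a hydrophilic stretch; this is a rough guide only, since it ignores sequence order and structural context.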

Two additional amino acids are in some species coded for by codons that are usually interpreted as stop codons:

21st and 22nd amino acids	3-letter	1-letter	Molecular mass
Selenocysteine	Sec	U	168.064
Pyrrolysine	Pyl	O	255.313

In addition to the specific amino acid codes, placeholders are used in cases where chemical or crystallographic analysis of a peptide or protein cannot conclusively determine the identity of a residue. They are also used to summarize conserved protein sequence motifs. The use of single letters to indicate sets of similar residues is similar to the use of abbreviation codes for degenerate bases.^[55]^[56]

Ambiguous amino acids	3-letter	1-letter	Amino acids included	Codons included
Any / unknown	Xaa	X	All	NNN
Asparagine or aspartate	Asx	B	D, N	RAY
Glutamine or glutamate	Glx	Z	E, Q	SAR
Leucine or isoleucine	Xle	J	I, L	YTR, ATH, CTY^[57]
Hydrophobic		Φ	V, I, L, F, W, Y, M	NTN, TAY, TGG
Aromatic		Ω	F, W, Y, H	YWY, TTY, TGG^[58]
Aliphatic (non-aromatic)		Ψ	V, I, L, M	VTN, TTR^[59]
Small		π	P, G, A, S	BCN, RGY, GGR
Hydrophilic		ζ	S, T, H, N, Q, E, D, K, R	VAN, WCN, CGN, AGY^[60]
Positively-charged		+	K, R, H	ARR, CRY, CGR
Negatively-charged		−	D, E	GAN

Unk is sometimes used instead of Xaa, but is less standard.

Ter or * (from termination) is used in notation for mutations in proteins when a stop codon occurs. It corresponds to no amino acid at all.^[61]
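The codon columns in the tables above use IUPAC degenerate base symbols, in which one letter stands for a set of bases (for example R for A or G, and N for any base). The following sketch expands such a pattern into the concrete codons it covers. The function name `expand_codon` is hypothetical, and the DNA alphabet (T rather than U) is assumed, as in the table of ambiguous amino acids.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes, DNA alphabet.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_codon(pattern: str) -> list:
    """List every concrete codon matched by a degenerate IUPAC pattern."""
    choices = (IUPAC[symbol] for symbol in pattern.upper())
    return ["".join(bases) for bases in product(*choices)]


print(expand_codon("RAY"))  # Asx: asparagine or aspartate
print(expand_codon("SAR"))  # Glx: glutamine or glutamate
```

For RNA patterns such as those in the table of standard amino acids, replacing U with T before the call (and T with U afterwards) would be one simple adaptation.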

In addition, many nonstandard amino acids have a specific code. For example, several peptide drugs, such as Bortezomib and MG132, are artificially synthesized and retain their protecting groups, which have specific codes. Bortezomib is Pyz–Phe–boroLeu, and MG132 is Z–Leu–Leu–Leu–al. To aid in the analysis of protein structure, photo-reactive amino acid analogs are available. These include photoleucine (pLeu) and photomethionine (pMet).^[62]

Occurrence and functions in biochemistry

A polypeptide is an unbranched chain of amino acids.

β-Alanine and its α-alanine isomer

The amino acid selenocysteine

Proteinogenic amino acids

Main article: Proteinogenic amino acid

Amino acids are the precursors to proteins.^[25] They join by condensation reactions to form short polymer chains called peptides or longer chains called either polypeptides or proteins. These chains are linear and unbranched, with each amino acid residue within the chain attached to two neighboring amino acids. In nature, the process of making proteins encoded by RNA genetic material is called translation and involves the step-by-step addition of amino acids to a growing protein chain by a ribozyme that is called a ribosome.^[63] The order in which the amino acids are added is read through the genetic code from an mRNA template, which is an RNA derived from one of the organism's genes.
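Because each condensation step releases one molecule of water, the mass of a linear peptide can be estimated from the masses of its free amino acids. The sketch below is an illustration under stated assumptions: the amino acid masses are those listed in the table of standard amino acids, the molar mass of water (18.015) is a value added here, and the function name `peptide_mass` is hypothetical.

```python
# Molecular masses of the free amino acids, from the table of standard amino acids.
MASS = {
    "A": 89.094, "R": 174.203, "N": 132.119, "D": 133.104, "C": 121.154,
    "Q": 146.146, "E": 147.131, "G": 75.067, "H": 155.156, "I": 131.175,
    "L": 131.175, "K": 146.189, "M": 149.208, "F": 165.192, "P": 115.132,
    "S": 105.093, "T": 119.119, "W": 204.228, "Y": 181.191, "V": 117.148,
}
WATER = 18.015  # assumed molar mass of water, not taken from the article


def peptide_mass(sequence: str) -> float:
    """Mass of a linear, unmodified peptide given in one-letter symbols.

    Each of the (n - 1) peptide bonds is formed with the loss of one water.
    """
    residues = sequence.upper()
    total = sum(MASS[residue] for residue in residues)
    return total - WATER * (len(residues) - 1)


print(round(peptide_mass("GA"), 3))  # dipeptide glycylalanine
```

The estimate ignores disulfide bonds, post-translational modifications and isotopic distribution.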

Twenty-two amino acids are naturally incorporated into polypeptides and are called proteinogenic or natural amino acids.^[27] Of these, 20 are encoded by the universal genetic code. The remaining 2, selenocysteine and pyrrolysine, are incorporated into proteins by unique synthetic mechanisms. Selenocysteine is incorporated when the mRNA being translated includes a SECIS element, which causes the UGA codon to encode selenocysteine instead of a stop codon.^[64] Pyrrolysine is used by some methanogenic archaea in enzymes that they use to produce methane. It is coded for with the codon UAG, which is normally a stop codon in other organisms.^[65]
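The codon-by-codon reading of an mRNA, and the reassignment of a stop codon, can be sketched as follows. This is a toy model: the function name `translate` and its `recode` parameter are hypothetical, and the override simply swaps the meaning of a codon everywhere, whereas real selenocysteine incorporation depends on the SECIS element and dedicated cellular machinery.

```python
from itertools import product

# Standard genetic code in the conventional UCAG ordering; "*" marks a stop codon.
BASES = "UCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): residue
    for codon, residue in zip(product(BASES, repeat=3), AMINO)
}


def translate(mrna: str, recode=None) -> str:
    """Translate an mRNA string codon by codon until a stop codon is reached.

    recode: optional mapping such as {"UGA": "U"} that overrides the standard
    table, a crude stand-in for context-dependent recoding.
    """
    table = {**CODON_TABLE, **(recode or {})}
    protein = []
    for start in range(0, len(mrna) - 2, 3):
        residue = table[mrna[start:start + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


message = "AUGUGAGCCUAA"  # made-up sequence for illustration
print(translate(message))                       # UGA read as stop
print(translate(message, recode={"UGA": "U"}))  # UGA read as selenocysteine
```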

Several independent evolutionary studies have suggested that Gly, Ala, Asp, Val, Ser, Pro, Glu, Leu, Thr may belong to a group of amino acids that constituted the early genetic code, whereas Cys, Met, Tyr, Trp, His, Phe may belong to a group of amino acids that constituted later additions of the genetic code.^[66]^[67]^[68]

Standard vs nonstandard amino acids

The 20 amino acids that are encoded directly by the codons of the universal genetic code are called standard or canonical amino acids. A modified form of methionine (N-formylmethionine) is often incorporated in place of methionine as the initial amino acid of proteins in bacteria, mitochondria and plastids (including chloroplasts). Other amino acids are called nonstandard or non-canonical. Most of the nonstandard amino acids are also non-proteinogenic (i.e. they cannot be incorporated into proteins during translation), but two of them are proteinogenic, as they can be incorporated translationally into proteins by exploiting information not encoded in the universal genetic code.

The two nonstandard proteinogenic amino acids are selenocysteine (present in many non-eukaryotes as well as most eukaryotes, but not coded directly by DNA) and pyrrolysine (found only in some archaea and at least one bacterium). The incorporation of these nonstandard amino acids is rare. For example, 25 human proteins include selenocysteine in their primary structure,^[69] and the structurally characterized enzymes (selenoenzymes) employ selenocysteine as the catalytic moiety in their active sites.^[70] Pyrrolysine and selenocysteine are encoded via variant codons. For example, selenocysteine is encoded by a stop codon in combination with a SECIS element.^[71]^[72]^[73]

N-formylmethionine (which is often the initial amino acid of proteins in bacteria, mitochondria, and chloroplasts) is generally considered as a form of methionine rather than as a separate proteinogenic amino acid. Codon–tRNA combinations not found in nature can also be used to "expand" the genetic code and form novel proteins known as alloproteins incorporating non-proteinogenic amino acids.^[74]^[75]^[76]

Non-proteinogenic amino acids

Main article: Non-proteinogenic amino acids

Aside from the 22 proteinogenic amino acids, many non-proteinogenic amino acids are known. Those either are not found in proteins (for example carnitine, GABA, levothyroxine) or are not produced directly and in isolation by standard cellular machinery. For example, hydroxyproline is synthesised from proline. Another example is selenomethionine.

Non-proteinogenic amino acids that are found in proteins are formed by post-translational modification. Such modifications can also determine the localization of the protein, e.g., the addition of long hydrophobic groups can cause a protein to bind to a phospholipid membrane.^[77] Examples:

The carboxylation of glutamate allows for better binding of calcium cations.^[78]
Hydroxyproline, generated by hydroxylation of proline, is a major component of the connective tissue collagen.^[79]
Hypusine, found in the translation initiation factor EIF5A, is formed by modification of lysine.^[80]

Some non-proteinogenic amino acids are not found in proteins. Examples include 2-aminoisobutyric acid and the neurotransmitter gamma-aminobutyric acid. Non-proteinogenic amino acids often occur as intermediates in the metabolic pathways for standard amino acids – for example, ornithine and citrulline occur in the urea cycle, part of amino acid catabolism (see below).^[81] A rare exception to the dominance of α-amino acids in biology is the β-amino acid beta alanine (3-aminopropanoic acid), which is used in plants and microorganisms in the synthesis of pantothenic acid (vitamin B₅), a component of coenzyme A.^[82]

In mammalian nutrition

Main article: Essential amino acid

Further information: Protein (nutrient) and Amino acid synthesis

Amino acids are not a typical component of food: animals eat proteins. The protein is broken down into amino acids in the process of digestion. They are then used to synthesize new proteins or other biomolecules, or are oxidized to urea and carbon dioxide as a source of energy.^[83] The oxidation pathway starts with the removal of the amino group by a transaminase; the amino group is then fed into the urea cycle. The other product of transamination is a keto acid that enters the citric acid cycle.^[84] Glucogenic amino acids can also be converted into glucose, through gluconeogenesis.^[85]

Of the 20 standard amino acids, nine (His, Ile, Leu, Lys, Met, Phe, Thr, Trp and Val) are called essential amino acids because the human body cannot synthesize them from other compounds at the level needed for normal growth, so they must be obtained from food.^[86]^[87]^[88]

Semi-essential and conditionally essential amino acids, and juvenile requirements

In addition, cysteine, tyrosine, and arginine are considered semi-essential amino acids, and taurine a semi-essential aminosulfonic acid, in children, in whom the metabolic pathways that synthesize these monomers are not fully developed.^[89]^[90] Some amino acids are conditionally essential for certain ages or medical conditions. Essential amino acids may also vary from species to species.^[d]

Non-protein functions

Biosynthetic pathways for catecholamines and trace amines in the human brain.^[91]^[92]^[93] Catecholamines and trace amines are synthesized from phenylalanine and tyrosine in humans.

Further information: Amino acid neurotransmitter

Many proteinogenic and non-proteinogenic amino acids have biological functions beyond being precursors to proteins and peptides. In humans, amino acids also have important roles in diverse biosynthetic pathways. Defenses against herbivores in plants sometimes employ amino acids.^[94] Examples:

Standard amino acids

Tryptophan is a precursor of the neurotransmitter serotonin.^[95]
Tyrosine (and its precursor phenylalanine) are precursors of the catecholamine neurotransmitters dopamine, epinephrine and norepinephrine and various trace amines.
Phenylalanine is a precursor of phenethylamine and tyrosine in humans. In plants, it is a precursor of various phenylpropanoids, which are important in plant metabolism.
Glycine is a precursor of porphyrins such as heme.^[96]
Arginine is a precursor of nitric oxide.^[97]
Ornithine and S-adenosylmethionine are precursors of polyamines.^[98]
Aspartate, glycine, and glutamine are precursors of nucleotides.^[99] However, not all of the functions of other abundant nonstandard amino acids are known.

Roles for nonstandard amino acids

Carnitine is used in lipid transport.
Gamma-aminobutyric acid is a neurotransmitter.^[100]
5-HTP (5-hydroxytryptophan) is used for experimental treatment of depression.^[101]
L-DOPA (L-dihydroxyphenylalanine) is used in the treatment of Parkinson's disease.^[102]
Eflornithine inhibits ornithine decarboxylase and is used in the treatment of sleeping sickness.^[103]
Canavanine, an analogue of arginine found in many legumes, is an antifeedant, protecting the plant from predators.^[104]
Mimosine, found in some legumes, is another possible antifeedant.^[105] This compound is an analogue of tyrosine and can poison animals that graze on these plants.

Uses in industry

Animal feed

Amino acids are sometimes added to animal feed because some of the components of these feeds, such as soybeans, have low levels of some of the essential amino acids, especially of lysine, methionine, threonine, and tryptophan.^[106] Likewise amino acids are used to chelate metal cations in order to improve the absorption of minerals from feed supplements.^[107]

Food

The food industry is a major consumer of amino acids, especially glutamic acid, which is used as a flavor enhancer,^[108] and aspartame (aspartylphenylalanine 1-methyl ester), which is used as an artificial sweetener.^[109] Amino acids are sometimes added to food by manufacturers to alleviate symptoms of mineral deficiencies, such as anemia, by improving mineral absorption and reducing negative side effects from inorganic mineral supplementation.^[110]

Chemical building blocks

Further information: Asymmetric synthesis

Amino acids are low-cost feedstocks used in chiral pool synthesis as enantiomerically pure building blocks.^[111]^[112]

Amino acids are used in the synthesis of some cosmetics.^[106]

Aspirational uses

Fertilizer

The chelating ability of amino acids is sometimes used in fertilizers to facilitate the delivery of minerals to plants in order to correct mineral deficiencies, such as iron chlorosis. These fertilizers are also used to prevent deficiencies from occurring and to improve the overall health of the plants.^[113]

Biodegradable plastics

Further information: Biodegradable plastic and Biopolymer

Amino acids have been considered as components of biodegradable polymers, which have applications as environmentally friendly packaging and in medicine in drug delivery and the construction of prosthetic implants.^[114] An interesting example of such materials is polyaspartate, a water-soluble biodegradable polymer that may have applications in disposable diapers and agriculture.^[115] Due to its solubility and ability to chelate metal ions, polyaspartate is also being used as a biodegradable antiscaling agent and a corrosion inhibitor.^[116]^[117]

Synthesis

Main article: Amino acid synthesis

Chemical synthesis

The commercial production of amino acids usually relies on mutant bacteria that overproduce individual amino acids using glucose as a carbon source. Some amino acids are produced by enzymatic conversions of synthetic intermediates. For example, 2-aminothiazoline-4-carboxylic acid is an intermediate in one industrial synthesis of L-cysteine. Aspartic acid is produced by the addition of ammonia to fumarate using a lyase.^[110]

Biosynthesis

In plants, nitrogen is first assimilated into organic compounds in the form of glutamate, formed from alpha-ketoglutarate and ammonia in the mitochondrion. For other amino acids, plants use transaminases to move the amino group from glutamate to another alpha-keto acid. For example, aspartate aminotransferase converts glutamate and oxaloacetate to alpha-ketoglutarate and aspartate.^[118] Other organisms use transaminases for amino acid synthesis, too.

Nonstandard amino acids are usually formed through modifications to standard amino acids. For example, homocysteine is formed through the transsulfuration pathway or by the demethylation of methionine via the intermediate metabolite S-adenosylmethionine,^[119] while hydroxyproline is made by a post-translational modification of proline.^[120]

Microorganisms and plants synthesize many uncommon amino acids. For example, some microbes make 2-aminoisobutyric acid and lanthionine, which is a sulfide-bridged derivative of alanine. Both of these amino acids are found in peptidic lantibiotics such as alamethicin.^[121] However, in plants, 1-aminocyclopropane-1-carboxylic acid is a small disubstituted cyclic amino acid that is an intermediate in the production of the plant hormone ethylene.^[122]

Primordial synthesis

The formation of amino acids and peptides is assumed to have preceded, and perhaps induced, the emergence of life on Earth. Amino acids can form from simple precursors under various conditions.^[123] Surface-based chemical metabolism of amino acids and very small compounds may have led to the build-up of amino acids, coenzymes and phosphate-based small carbon molecules.^[124] Amino acids and similar building blocks could have been elaborated into proto-peptides, with peptides being considered key players in the origin of life.^[125]

In the famous Urey–Miller experiment, the passage of an electric arc through a mixture of methane, hydrogen, and ammonia produces a large number of amino acids. Since then, scientists have identified a range of routes and components by which the potentially prebiotic formation and chemical evolution of peptides may have occurred, such as condensing agents, the design of self-replicating peptides, and a number of non-enzymatic mechanisms by which amino acids could have emerged and been elaborated into peptides.^[125] Several hypotheses invoke the Strecker synthesis, in which hydrogen cyanide, simple aldehydes, ammonia, and water produce amino acids.^[123]
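As an illustration of the Strecker route mentioned above, the overall stoichiometry can be written in two steps: formation of an α-aminonitrile, followed by its hydrolysis to the amino acid. This is a simplified sketch; R stands for a generic side chain, and the intermediate steps and conditions that would apply in any particular prebiotic scenario are not specified here.

```latex
\begin{align*}
\mathrm{RCHO} + \mathrm{NH_3} + \mathrm{HCN}
  &\longrightarrow \mathrm{RCH(NH_2)CN} + \mathrm{H_2O} \\
\mathrm{RCH(NH_2)CN} + 2\,\mathrm{H_2O}
  &\longrightarrow \mathrm{RCH(NH_2)COOH} + \mathrm{NH_3}
\end{align*}
```

With R = CH₃ (acetaldehyde as the aldehyde), the product is alanine, CH₃−CH(NH₂)−COOH.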

According to a review, amino acids, and even peptides, "turn up fairly regularly in the various experimental broths that have been allowed to be cooked from simple chemicals. This is because nucleotides are far more difficult to synthesize chemically than amino acids." As a chronological order, the review suggests that there must have been a 'protein world' or at least a 'polypeptide world', possibly later followed by the 'RNA world' and the 'DNA world'.^[126] Codon–amino acid mappings may be the biological information system at the primordial origin of life on Earth.^[127] While amino acids and consequently simple peptides must have formed under different experimentally probed geochemical scenarios, the transition from an abiotic world to the first life forms is to a large extent still unresolved.^[128]

Reactions

Amino acids undergo the reactions expected of the constituent functional groups.^[129]^[130]

Peptide bond formation

See also: Peptide synthesis and Peptide bond

Figure: The condensation of two amino acids to form a dipeptide. One amino acid loses a hydrogen and an oxygen from its carboxyl group (COOH) and the other loses a hydrogen from its amino group (NH₂), producing a molecule of water (H₂O). The two amino acid residues are linked through a peptide bond (–CO–NH–).

As both the amine and carboxylic acid groups of amino acids can react to form amide bonds, one amino acid molecule can react with another and become joined through an amide linkage. This polymerization of amino acids is what creates proteins. This condensation reaction yields the newly formed peptide bond and a molecule of water. In cells, this reaction does not occur directly; instead, the amino acid is first activated by attachment to a transfer RNA molecule through an ester bond. This aminoacyl-tRNA is produced in an ATP-dependent reaction carried out by an aminoacyl-tRNA synthetase.^[131] This aminoacyl-tRNA is then a substrate for the ribosome, which catalyzes the attack of the amino group of the incoming aminoacyl-tRNA on the ester bond that links the elongating protein chain to its tRNA.^[132] As a result of this mechanism, all proteins made by ribosomes are synthesized starting at their N-terminus and moving toward their C-terminus.

However, not all peptide bonds are formed in this way. In a few cases, peptides are synthesized by specific enzymes. For example, the tripeptide glutathione is an essential part of the defenses of cells against oxidative stress. This peptide is synthesized in two steps from free amino acids.^[133] In the first step, gamma-glutamylcysteine synthetase condenses cysteine and glutamate through a peptide bond formed between the side chain carboxyl of the glutamate (the gamma carbon of this side chain) and the amino group of the cysteine. This dipeptide is then condensed with glycine by glutathione synthetase to form glutathione.^[134]

In chemistry, peptides are synthesized by a variety of reactions. One of the most widely used approaches in solid-phase peptide synthesis uses aromatic oxime derivatives of amino acids as activated units. These are added in sequence onto the growing peptide chain, which is attached to a solid resin support.^[135] Libraries of peptides are used in drug discovery through high-throughput screening.^[136]
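The condensation stoichiometry described in this section (one molecule of water released per peptide bond formed) can be illustrated with a short calculation of a peptide's elemental composition. The following is a minimal sketch, not a general-purpose tool: the function names `condense` and `formula` and the small `AMINO_ACIDS` table are hypothetical and introduced here only for illustration, and only glycine and alanine are included.

```python
from collections import Counter

# Elemental compositions of two free (neutral) amino acids.
# This table is illustrative and deliberately incomplete.
AMINO_ACIDS = {
    "glycine": Counter({"C": 2, "H": 5, "N": 1, "O": 2}),
    "alanine": Counter({"C": 3, "H": 7, "N": 1, "O": 2}),
}
WATER = Counter({"H": 2, "O": 1})


def condense(names):
    """Return the elemental composition of the linear peptide formed
    by joining the named amino acids, removing one H2O per bond."""
    total = Counter()
    for name in names:
        total.update(AMINO_ACIDS[name])
    # A chain of n residues contains n - 1 peptide bonds.
    for _ in range(max(len(names) - 1, 0)):
        total.subtract(WATER)
    return dict(total)


def formula(composition):
    """Format a composition as a simple C, H, N, O formula string."""
    parts = []
    for element in ("C", "H", "N", "O"):
        count = composition.get(element, 0)
        if count > 0:
            parts.append(element + (str(count) if count > 1 else ""))
    return "".join(parts)


dipeptide = condense(["glycine", "alanine"])
print(formula(dipeptide))  # C5H10N2O3
```

The sum of the two free amino acids is C₅H₁₂N₂O₄; removing one water gives C₅H₁₀N₂O₃ for the dipeptide, which matches the residue-based view of a peptide chain.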

The combination of functional groups allows amino acids to be effective polydentate ligands for metal–amino acid chelates.^[137] The multiple side chains of amino acids can also undergo chemical reactions.

Catabolism

Degradation of an amino acid often involves deamination by moving its amino group to α-ketoglutarate, forming glutamate. This process involves transaminases, often the same as those used in amination during synthesis. In many vertebrates, the amino group is then removed through the urea cycle and is excreted in the form of urea. However, amino acid degradation can produce uric acid or ammonia instead. For example, serine dehydratase converts serine to pyruvate and ammonia.^[99] After removal of one or more amino groups, the remainder of the molecule can sometimes be used to synthesize new amino acids, or it can be used for energy by entering glycolysis or the citric acid cycle.

Complexation

Amino acids are bidentate ligands, forming transition metal amino acid complexes.^[139]

Chemical analysis

The total nitrogen content of organic matter is mainly formed by the amino groups in proteins. The Total Kjeldahl Nitrogen (TKN) is a measure of nitrogen widely used in the analysis of (waste) water, soil, food, feed and organic matter in general. As the name suggests, the Kjeldahl method is applied. More sensitive methods are available.^[140]^[141]
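Because most of the nitrogen in organic matter comes from protein, a measured nitrogen content is often converted into an approximate protein content. The sketch below illustrates that arithmetic. The function name `crude_protein_percent` is hypothetical, and the default factor of 6.25 is an assumption not stated in the text above: it is the conventional general-purpose factor, based on protein being roughly 16% nitrogen by mass, and other factors are used for specific materials.

```python
def crude_protein_percent(nitrogen_percent, factor=6.25):
    """Estimate crude protein (% by mass) from nitrogen (% by mass).

    The default factor of 6.25 assumes protein is about 16% nitrogen;
    it is a convention, not a property of any particular sample.
    """
    if nitrogen_percent < 0:
        raise ValueError("nitrogen_percent must be non-negative")
    return nitrogen_percent * factor


# A sample measured at 2.0% nitrogen by mass.
print(crude_protein_percent(2.0))  # 12.5
```

An estimate of this kind counts all measured nitrogen as protein nitrogen, so it overstates protein content when the sample contains significant non-protein nitrogen.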





