N-linked glycosylation is the attachment of an oligosaccharide (a carbohydrate consisting of several sugar molecules, also referred to as a glycan) to a nitrogen atom, specifically the amide nitrogen of an asparagine (Asn) residue of a protein.^[1] The process is also called N-glycosylation. The attached oligosaccharide is called an N-linked glycan, or simply an N-glycan, and the resulting protein is an N-linked glycoprotein.

This type of linkage is important for both the structure^[2] and function^[3] of many eukaryotic proteins. N-linked glycosylation occurs in eukaryotes and widely in archaea, but very rarely in bacteria. The nature of the N-linked glycans attached to a glycoprotein is determined by the protein and the cell in which it is expressed.^[4] It also varies across species, as different species synthesize different types of N-linked glycan.

Energetics of bond formation

There are two types of bonds involved in a glycoprotein: the bonds between the saccharide residues in the glycan, and the linkage between the glycan chain and the protein molecule.

The sugar moieties are linked to one another in the glycan chain via glycosidic bonds. These bonds are typically formed between carbons 1 and 4 of the sugar molecules. The formation of a glycosidic bond is energetically unfavourable; the reaction is therefore coupled to the hydrolysis of two ATP molecules.^[4]

On the other hand, the attachment of a glycan residue to a protein requires the recognition of a consensus sequence. N-linked glycans are almost always attached to the nitrogen atom of an asparagine (Asn) side chain that is present as part of an Asn–X–Ser/Thr consensus sequence, where X is any amino acid except proline (Pro).^[4]
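
The consensus sequence can be expressed as a simple pattern over one-letter amino acid codes. The following is a minimal sketch, assuming the sequence is given as a plain string; the function name `find_sequons` and the example peptide are hypothetical and chosen only for illustration.

```python
import re

# Asn, then any residue except Pro, then Ser or Thr.
# The lookahead lets overlapping matches (e.g. "NNTS") all be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")

def find_sequons(sequence):
    """Return 1-based positions of Asn residues that start an Asn-X-Ser/Thr match."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence.upper())]

# Hypothetical peptide, not a real protein sequence.
peptide = "MKNGTAANPSLLNVSNNTS"
print(find_sequons(peptide))  # [3, 13, 16, 17]; Asn8 is skipped because X is Pro
```

A match marks only a candidate site: the pattern says nothing about whether the asparagine is accessible or whether it is actually glycosylated.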

In animal cells, the sugar attached to the asparagine is almost invariably N-acetylglucosamine (GlcNAc) in the β-configuration.^[4] This β-linkage is similar to the glycosidic bond between the sugar moieties in the glycan structure described above, except that the anomeric carbon atom is attached to an amide nitrogen instead of a sugar hydroxyl group. The energy required for this linkage comes from the hydrolysis of a pyrophosphate molecule.^[4]

Biosynthesis

The biosynthesis of N-linked glycans occurs via three major steps:^[4]

Synthesis of dolichol-linked precursor oligosaccharide
En bloc transfer of precursor oligosaccharide to protein
Processing of the oligosaccharide

Synthesis, en bloc transfer and initial trimming of the precursor oligosaccharide occur in the endoplasmic reticulum (ER). Subsequent processing and modification of the oligosaccharide chain are carried out in the Golgi apparatus.

The synthesis of glycoproteins is thus spatially separated in different cellular compartments. Therefore, the type of N-glycan synthesized depends on its accessibility to the different enzymes present within these cellular compartments.

However, in spite of the diversity, all N-glycans are synthesized through a common pathway with a common core glycan structure.^[4] The core glycan structure is essentially made up of two N-acetylglucosamine and three mannose residues. This core glycan is then elaborated and modified further, resulting in a diverse range of N-glycan structures.^[4]

Synthesis of precursor oligosaccharide

The process of N-linked glycosylation starts with the formation of dolichol-linked GlcNAc sugar. Dolichol is a lipid molecule composed of repeating isoprene units. This molecule is found attached to the membrane of the ER. Sugar molecules are attached to the dolichol through a pyrophosphate linkage^[4] (one phosphate was originally linked to dolichol, and the second phosphate came from the nucleotide sugar). The oligosaccharide chain is then extended through the addition of various sugar molecules in a stepwise manner to form a precursor oligosaccharide.

The assembly of this precursor oligosaccharide occurs in two phases: Phase I and II.^[4] Phase I takes place on the cytoplasmic side of the ER and Phase II takes place on the luminal side of the ER.

The precursor molecule, ready to be transferred to a protein, consists of two GlcNAc, nine mannose, and three glucose molecules.

Phase I
Steps	Location
Two GlcNAc residues (donated by UDP-GlcNAc) are attached to the dolichol molecule embedded in the ER membrane. The sugar and dolichol form a pyrophosphate linkage. Five mannose residues (donated by GDP-Man) are attached to the GlcNAc disaccharide. These steps are performed by glycosyltransferases. Product: Dolichol–GlcNAc₂–Man₅	Cytoplasmic side of ER

At this point, the lipid-linked glycan is translocated across the membrane, making it accessible to enzymes in the endoplasmic reticulum lumen. This translocation process is still poorly understood, but it is thought to be performed by an enzyme known as a flippase.

Phase II
Steps	Location
The growing glycan is exposed on the luminal side of the ER membrane and subsequent sugars (4 mannose and 3 glucose) are added. Dol-P-Man is the mannose residue donor (formation: Dol-P + GDP-Man → Dol-P-Man + GDP) and Dol-P-Glc is the glucose residue donor (formation: Dol-P + UDP-Glc → Dol-P-Glc + UDP). These additional sugars are transported into the lumen from the cytoplasm via attachment to the dolichol molecule and subsequent translocation into the lumen with the help of a flippase enzyme. (Various dolichols in the membrane are used to translocate multiple sugars at once.) Product: Dolichol–GlcNAc₂–Man₉–Glc₃	Luminal side of ER
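
The residue counts of the two phases can be checked with a short tally. This is a minimal sketch that tracks only the composition of the glycan, not its branching or linkages; the names `PHASE_I`, `PHASE_II`, `assemble_precursor` and `formula` are hypothetical.

```python
from collections import Counter

# Residues added in each phase of precursor assembly, as described above.
PHASE_I = Counter({"GlcNAc": 2, "Man": 5})   # cytoplasmic side of the ER
PHASE_II = Counter({"Man": 4, "Glc": 3})     # luminal side of the ER

def assemble_precursor():
    """Sum the residues added in both phases."""
    return PHASE_I + PHASE_II

def formula(composition):
    """Render a composition in the order GlcNAc, Man, Glc."""
    order = ["GlcNAc", "Man", "Glc"]
    return "-".join(f"{s}{composition[s]}" for s in order if composition[s])

precursor = assemble_precursor()
print(formula(PHASE_I))          # GlcNAc2-Man5
print(formula(precursor))        # GlcNAc2-Man9-Glc3
print(sum(precursor.values()))   # 14
```

The total of fourteen residues matches the precursor composition of two GlcNAc, nine mannose and three glucose residues.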

Transfer of glycan to protein

Once the precursor oligosaccharide is formed, the completed glycan is transferred to the nascent polypeptide in the lumen of the ER. This reaction is driven by the energy released from the cleavage of the pyrophosphate bond between dolichol and the glycan. Three conditions must be fulfilled before a glycan is transferred to a nascent polypeptide:^[4]

Asparagine must be located in a specific consensus sequence in the primary structure (Asn–X–Ser or Asn–X–Thr or in rare instances Asn–X–Cys).^[5]
Asparagine must be located appropriately in the three-dimensional structure of the protein (sugars are polar molecules and thus need to be attached to asparagine residues on the surface of the protein, not buried within it).
Asparagine must be found on the luminal side of the endoplasmic reticulum for N-linked glycosylation to be initiated. Target residues are found either in secretory proteins or in the regions of transmembrane proteins that face the lumen.

Oligosaccharyltransferase is the enzyme responsible for the recognition of the consensus sequence and the transfer of the precursor glycan to a polypeptide acceptor that is entering the endoplasmic reticulum lumen while still being translated. N-linked glycosylation is, therefore, a co-translational event.

Processing of glycan

N-glycan processing is carried out in the endoplasmic reticulum and the Golgi body. Initial trimming of the precursor molecule occurs in the ER and the subsequent processing occurs in the Golgi.

Upon transferring the completed glycan onto the nascent polypeptide, two glucose residues are removed from the structure. Enzymes known as glycosidases remove some sugar residues. These enzymes can break glycosidic linkages by using a water molecule. These enzymes are exoglycosidases as they only work on monosaccharide residues located at the non-reducing end of the glycan.^[4] This initial trimming step is thought to act as a quality control step in the ER to monitor protein folding.

Once the protein is folded correctly, two glucose residues are removed by glucosidase I and II. The removal of the third and final glucose residue signals that the glycoprotein is ready for transit from the ER to the cis-Golgi.^[4] Glucosidase II catalyses the removal of this final glucose. However, if the protein is not folded properly, the glucose residues are not removed and thus the glycoprotein cannot leave the endoplasmic reticulum. A chaperone protein (calnexin/calreticulin) binds to the unfolded or partially folded protein to assist protein folding.

The next step involves further addition and removal of sugar residues in the cis-Golgi. These modifications are catalyzed by glycosyltransferases and glycosidases respectively. In the cis-Golgi, a series of mannosidases remove some or all of the four mannose residues in α-1,2 linkages.^[4] In the medial portion of the Golgi, by contrast, glycosyltransferases add sugar residues to the core glycan structure, giving rise to the three main types of glycans: high-mannose, hybrid and complex glycans.

High-mannose oligosaccharides are, in essence, just two N-acetylglucosamines with many mannose residues, often almost as many as are seen in the precursor oligosaccharides before they are attached to the protein.
Complex oligosaccharides are so named because they can contain almost any number of the other types of saccharides, including more than the original two N-acetylglucosamines.
Hybrid oligosaccharides contain mannose residues on one side of the branch, while on the other side an N-acetylglucosamine initiates a complex branch.
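
The distinction between the three types can be illustrated with a toy classifier. This is a simplified sketch, assuming each glycan is described only by the residues on its branches beyond the shared core, listed outward from the core; the representation and the function name `classify_glycan` are hypothetical and ignore linkage details.

```python
def classify_glycan(branches):
    """Classify an N-glycan from its branches beyond the shared core.

    `branches` is a list of residue-name lists, one list per branch,
    each read outward from the core.
    """
    if not branches or any(not branch for branch in branches):
        raise ValueError("each branch needs at least one residue")
    mannose_only = [all(r == "Man" for r in branch) for branch in branches]
    glcnac_started = [branch[0] == "GlcNAc" for branch in branches]
    if all(mannose_only):
        return "high-mannose"
    if all(glcnac_started):
        return "complex"
    if any(mannose_only) and any(glcnac_started):
        return "hybrid"
    return "unclassified"

print(classify_glycan([["Man", "Man"], ["Man", "Man", "Man"]]))          # high-mannose
print(classify_glycan([["GlcNAc", "Gal", "Neu5Ac"], ["GlcNAc", "Gal"]]))  # complex
print(classify_glycan([["Man", "Man"], ["GlcNAc", "Gal"]]))               # hybrid
```

Real classification schemes work on full glycan structures, so this sketch should be read only as a restatement of the three definitions.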

The order of addition of sugars to the growing glycan chains is determined by the substrate specificities of the enzymes and their access to the substrate as they move through the secretory pathway. Thus, the organization of this machinery within a cell plays an important role in determining which glycans are made.
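
The trimming steps described in this section can be followed at the level of composition. The sketch below assumes the counts given in the text (three glucose residues removed in the ER, then up to four mannose residues removed in the cis-Golgi); the names `trim` and `process` are hypothetical, and later additions in the medial and trans Golgi are not modelled.

```python
from collections import Counter

# Composition of the precursor transferred to the protein.
PRECURSOR = Counter({"GlcNAc": 2, "Man": 9, "Glc": 3})

def trim(glycan, sugar, count):
    """Return a copy of `glycan` with `count` residues of `sugar` removed."""
    if glycan[sugar] < count:
        raise ValueError(f"cannot remove {count} {sugar} from {dict(glycan)}")
    trimmed = glycan.copy()
    trimmed[sugar] -= count
    return +trimmed  # unary plus drops residues whose count reached zero

def process(glycan, golgi_mannoses_removed=4):
    """Apply ER glucose trimming, then cis-Golgi mannose trimming."""
    if not 0 <= golgi_mannoses_removed <= 4:
        raise ValueError("cis-Golgi mannosidases remove at most four mannoses")
    after_er = trim(glycan, "Glc", 3)
    return trim(after_er, "Man", golgi_mannoses_removed)

print(dict(process(PRECURSOR)))                            # GlcNAc 2, Man 5
print(dict(process(PRECURSOR, golgi_mannoses_removed=0)))  # GlcNAc 2, Man 9
```

Keeping each step as a separate call mirrors the spatial separation of the enzymes: a glycan can only be trimmed by the enzymes of the compartment it has reached.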

Enzymes in the Golgi

Golgi enzymes play a key role in determining the synthesis of the various types of glycans. The order of action of the enzymes is reflected in their position in the Golgi stack:

Enzymes	Location within Golgi
Mannosidase I	cis-Golgi
GlcNAc transferases	medial Golgi
Galactosyltransferase and sialyltransferase	trans-Golgi

In archaea and bacteria

Similar N-glycan biosynthesis pathways have been found in bacteria and archaea.^[6] However, unlike in eukaryotes, the final glycan structure in bacteria and archaea does not seem to differ much from the initial precursor. In eukaryotes, the original precursor oligosaccharide, made in the endoplasmic reticulum, is extensively modified en route to the cell surface.^[4]

Function

N-linked glycans have intrinsic and extrinsic functions.^[4]^[7]

Within the immune system, the N-linked glycans on an immune cell's surface help dictate the migration pattern of the cell; for example, immune cells that migrate to the skin have specific glycosylations that favor homing to that site.^[8] The glycosylation patterns on the various immunoglobulins, including IgE, IgM, IgD, IgA, and IgG, bestow them with unique effector functions by altering their affinities for Fc and other immune receptors.^[8] Glycans may also be involved in "self" and "non-self" discrimination, which may be relevant to the pathophysiology of various autoimmune diseases.^[8]

Functions of N-linked glycans
Intrinsic	Provides structural components to the cell wall and extracellular matrix. Modifies protein properties such as stability and solubility^[9] (for example, greater stability to high temperature and pH). Protects proteins against aggregation.^[10]
Extrinsic	Directs trafficking of glycoproteins. Mediates cell signalling (cell–cell and cell–matrix interactions).

In some cases, interaction between the N-glycan and the protein stabilizes the protein through complex electronic effects.^[11]

Clinical significance

Changes in N-linked glycosylation have been associated with different diseases including rheumatoid arthritis,^[12] type 1 diabetes,^[13] Crohn's disease,^[14] and cancers.^[15]^[16]

Mutations in eighteen genes involved in N-linked glycosylation result in a variety of diseases, most of which involve the nervous system.^[3]^[16]

Importance in therapeutic proteins

Many therapeutic proteins on the market are antibodies, which are N-linked glycoproteins. For example, Etanercept, Infliximab and Rituximab are N-glycosylated therapeutic proteins.

The importance of N-linked glycosylation is becoming increasingly evident in the field of pharmaceuticals.^[17] Although bacterial or yeast protein production systems have significant potential advantages such as high yield and low cost, problems arise when the protein of interest is a glycoprotein. Most prokaryotic expression systems such as E. coli cannot carry out post-translational modifications. On the other hand, eukaryotic expression hosts such as yeast and animal cells have different glycosylation patterns. The proteins produced in these expression hosts are often not identical to the human protein and can thus cause immunogenic reactions in patients. For example, S. cerevisiae (yeast) often produces high-mannose glycans, which are immunogenic.

Non-human mammalian expression systems such as CHO or NS0 cells have the machinery required to add complex, human-type glycans. However, glycans produced in these systems can differ from glycans produced in humans, as they can be capped with both N-glycolylneuraminic acid (Neu5Gc) and N-acetylneuraminic acid (Neu5Ac), whereas human cells only produce glycoproteins containing N-acetylneuraminic acid. Furthermore, animal cells can also produce glycoproteins containing the galactose-alpha-1,3-galactose epitope, which can induce serious allergic reactions, including anaphylactic shock, in people who have alpha-gal allergy.

These drawbacks have been addressed by several approaches such as eliminating the pathways that produce these glycan structures through genetic knockouts. Furthermore, other expression systems have been genetically engineered to produce therapeutic glycoproteins with human-like N-linked glycans. These include yeasts such as Pichia pastoris,^[18] insect cell lines, green plants,^[19] and even bacteria.

References

^ "Glycosylation". UniProt: Protein sequence and functional information.
^ Imperiali B, O'Connor SE (December 1999). "Effect of N-linked glycosylation on glycopeptide and glycoprotein structure". Current Opinion in Chemical Biology. 3 (6): 643–9. doi:10.1016/S1367-5931(99)00021-6. PMID 10600722.
^ Patterson MC (September 2005). "Metabolic mimics: the disorders of N-linked glycosylation". Seminars in Pediatric Neurology. 12 (3): 144–51. doi:10.1016/j.spen.2005.10.002. PMID 16584073.
^ Drickamer K, Taylor ME (2006). Introduction to Glycobiology (2nd ed.). Oxford University Press, USA. ISBN 978-0-19-928278-4.
^ Mellquist JL, Kasturi L, Spitalnik SL, Shakin-Eshleman SH (May 1998). "The amino acid following an Asn–X–Ser/Thr sequon is an important determinant of N-linked core glycosylation efficiency". Biochemistry. 37 (19): 6833–7. doi:10.1021/bi972217k. PMID 9578569.
^ Dell A, Galadari A, Sastre F, Hitchen P (2010). "Similarities and differences in the glycosylation mechanisms in prokaryotes and eukaryotes". International Journal of Microbiology. 2010: 1–14. doi:10.1155/2010/148178. PMC 3068309. PMID 21490701.
^ GlyGen. "GlyGen glycan structure dictionary". GlyGen. Retrieved 1 Apr 2021.
^ Maverakis E, Kim K, Shimoda M, Gershwin ME, Patel F, Wilken R, et al. (February 2015). "Glycans in the immune system and The Altered Glycan Theory of Autoimmunity: a critical review". Journal of Autoimmunity. 57 (6): 1–13. doi:10.1016/j.jaut.2014.12.002. PMC 4340844. PMID 25578468.
^ Sinclair AM, Elliott S (August 2005). "Glycoengineering: the effect of glycosylation on the properties of therapeutic proteins". Journal of Pharmaceutical Sciences. 94 (8): 1626–35. doi:10.1002/jps.20319. PMID 15959882.
^ "N-glycosylation as a eukaryotic protective mechanism against protein aggregation". Science Advances. 10 (5). 2024. doi:10.1126/sciadv.adk8173. PMC 10830103.
^ Ardejani MS, Noodleman L, Powers ET, Kelly JW (15 March 2021). "Stereoelectronic effects in stabilizing protein–N-glycan interactions revealed by experiment and machine learning". Nature Chemistry. 13 (5): 480–487. doi:10.1038/s41557-021-00646-w. ISSN 1755-4349. PMC 8102341. PMID 33723379.
^ Nakagawa H, Hato M, Takegawa Y, Deguchi K, Ito H, Takahata M, et al. (June 2007). "Detection of altered N-glycan profiles in whole serum from rheumatoid arthritis patients". Journal of Chromatography B. 853 (1–2): 133–7. doi:10.1016/j.jchromb.2007.03.003. hdl:2115/28276. PMID 17392038.
^ Bermingham ML, Colombo M, McGurnaghan SJ, Blackbourn LA, Vučković F, Pučić Baković M, et al. (January 2018). "N-Glycan Profile and Kidney Disease in Type 1 Diabetes". Diabetes Care. 41 (1): 79–87. doi:10.2337/dc17-1042. hdl:20.500.11820/413dce5a-e852-4787-aac9-62c2c6d4389f. PMID 29146600.
^ Trbojević Akmačić I, Ventham NT, Theodoratou E, Vučković F, Kennedy NA, Krištić J, et al. (June 2015). "Inflammatory bowel disease associates with proinflammatory potential of the immunoglobulin G glycome". Inflammatory Bowel Diseases. 21 (6): 1237–47. doi:10.1097/MIB.0000000000000372. PMC 4450892. PMID 25895110.
^ Kodar K, Stadlmann J, Klaamas K, Sergeyev B, Kurtenkov O (January 2012). "Immunoglobulin G Fc N-glycan profiling in patients with gastric cancer by LC-ESI-MS: relation to tumor progression and survival". Glycoconjugate Journal. 29 (1): 57–66. doi:10.1007/s10719-011-9364-z. PMID 22179780. S2CID 254501034.
^ Chen G, Wang Y, Qin X, Li H, Guo Y, Wang Y, et al. (August 2013). "Change in IgG1 Fc N-linked glycosylation in human lung cancer: age- and sex-related diagnostic potential". Electrophoresis. 34 (16): 2407–16. doi:10.1002/elps.201200455. PMID 23766031. S2CID 11131196.
^ Dalziel M, Crispin M, Scanlan CN, Zitzmann N, Dwek RA (January 2014). "Emerging principles for the therapeutic exploitation of glycosylation". Science. 343 (6166): 1235681. doi:10.1126/science.1235681. PMID 24385630. S2CID 206548002.
^ Hamilton SR, Bobrowicz P, Bobrowicz B, Davidson RC, Li H, Mitchell T, et al. (August 2003). "Production of complex human glycoproteins in yeast". Science. 301 (5637): 1244–6. doi:10.1126/science.1088166. PMID 12947202. S2CID 38981893.
^ Strasser R, Altmann F, Steinkellner H (December 2014). "Controlled glycosylation of plant-produced recombinant proteins". Current Opinion in Biotechnology. 30: 95–100. doi:10.1016/j.copbio.2014.06.008. PMID 25000187.
