N-glycosylation profiles of the SARS-CoV-2 spike D614G mutant and its ancestral protein characterized by advanced mass spectrometry

Preparation and analysis of recombinant S-614G and S-614D proteins

The glycosylation profiles of the recombinant ectodomains of SARS-CoV-2 S-614G and its progenitor S-614D were examined. To allow an accurate comparison, the two recombinant proteins were expressed under identical experimental conditions. The constructs of the two isogenic proteins included two widely used modifications: a double proline substitution at residues 986 and 987 to stabilize the prefusion conformation, and mutation of amino acids RRAR (positions 682–685) to GSAS to disrupt the furin cleavage site between the S1 and S2 subunits2, which aids purification of the whole S ectodomain. Figure 1 shows the 1D SDS-PAGE gels of purified S-614D and S-614G together with the results of affinity pulldowns of the two spike variants with ACE2 and with monoclonal antibodies. The gels show the purity of both S-614D and S-614G and their ability to bind the ACE2 receptor and two monoclonal antibodies34 against the SARS-CoV-2 spike receptor binding domain, indicating that the two recombinant proteins are functionally active and properly folded, and therefore suitable for this study.

Figure 1

SDS gel analysis of the SARS-CoV-2 S proteins interacting with other proteins. (A) Binding of S to biotinylated ACE2. Lane 1: S-614D; lane 2: S-614D + ACE2; lane 3: S-614G; lane 4: S-614G + ACE2. (B) Binding of S to monoclonal antibodies. Lanes 1–6: S-614D, S-614D + 3G7, S-614D + 3A2, S-614G, S-614G + 3G7, and S-614G + 3A2. The gels were run under reducing conditions and visualized with SYPRO Ruby stain.

The site-specific distribution and abundance of heterogeneous N-linked glycans on 21 of 22 potential sequons were characterized by mass spectrometry analysis of glycopeptides generated by cleavage with α-lytic protease or with the combination of trypsin and chymotrypsin, using the same EThcD instrument parameters described previously (the N149 glycopeptides were not of sufficient quality for quantification). For direct comparison, the ion intensity of the precursor MS1 peak of each glycopeptide was used to represent the abundance of each individual glycosylation form on a specific sequon of the proteins (Fig. 2). The relative abundances of the three major types of N-glycans, high-mannose (HexNAc2Hex>4X, green), hybrid (3 HexNAc, purple), and complex (> 3 HexNAc, gray), on the asparagine of each sequon of the S-614D and S-614G proteins are depicted in the inset pie charts.
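The classification and quantification rule described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names and the intensity values are hypothetical, and the "other" label for compositions that fit none of the three stated rules is an assumption.

```python
from collections import defaultdict

def glycan_type(hexnac: int, hexose: int) -> str:
    """Classify an N-glycan composition using the rules stated in the text."""
    if hexnac == 2 and hexose > 4:
        return "high-mannose"   # HexNAc2 with more than 4 Hex
    if hexnac == 3:
        return "hybrid"         # 3 HexNAc
    if hexnac > 3:
        return "complex"        # more than 3 HexNAc
    return "other"              # assumption: not covered by the stated rules

def type_percentages(glycoforms):
    """Sum MS1 precursor intensities per glycan type and convert to percent.

    `glycoforms` is a list of (HexNAc, Hex, intensity) tuples for the
    glycopeptides observed on one sequon.
    """
    totals = defaultdict(float)
    for hexnac, hexose, intensity in glycoforms:
        totals[glycan_type(hexnac, hexose)] += intensity
    grand_total = sum(totals.values())
    return {kind: 100.0 * value / grand_total for kind, value in totals.items()}

# Hypothetical intensities for one sequon (illustrative values only).
example = [(2, 5, 3.0e6), (3, 5, 1.0e6), (4, 5, 6.0e6)]
print(type_percentages(example))
```

Each pie chart in Fig. 2 corresponds to one such per-sequon breakdown, computed separately for S-614D and S-614G.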

Figure 2

Comparison of N-linked glycosylation profiles of the ectodomain spike protein between S-614D (blue) and S-614G (orange) samples. Glycosylation abundances were calculated from one representative native peptide sequence for each of the glycan sites (A) N17, (B) N61, (C) N74, (D) N122, (E) N165, (F) N234, (G) N282, (H) N331, (I) N343, (J) N603, (K) N616, (L) N657, (M) N709, (N) N717, (O) N801, (P) N1074, (Q) N1098, (R) N1134, (S) N1158, and (T) N1173. Inset pie charts (upper: 614D; lower: 614G) depict the relative composition of high-mannose (green), hybrid (purple), and complex (gray) types of glycoforms. In the short names of individual glycans, N, H, A, and F symbolize HexNAc, Hex, NeuAc, and Fuc, respectively. The x-axis represents individual glycans.
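The short-name convention in the caption can be expanded programmatically. The sketch below assumes that short names are written as a letter followed by a count (for example `N4H5F1` for HexNAc4Hex5Fuc1); that exact spelling is an assumption, and `parse_short_name` is a hypothetical helper.

```python
import re

# Letter codes from the figure caption.
RESIDUES = {"N": "HexNAc", "H": "Hex", "A": "NeuAc", "F": "Fuc"}

def parse_short_name(name: str) -> dict:
    """Expand a short glycan name such as 'N4H5F1' into residue counts."""
    tokens = re.findall(r"([NHAF])(\d+)", name)
    # Reject names with characters that the letter+count pattern did not consume.
    if not tokens or "".join(letter + number for letter, number in tokens) != name:
        raise ValueError(f"unrecognized glycan short name: {name!r}")
    counts = {full: 0 for full in RESIDUES.values()}
    for letter, number in tokens:
        counts[RESIDUES[letter]] += int(number)
    return counts

print(parse_short_name("N4H5F1"))
# {'HexNAc': 4, 'Hex': 5, 'NeuAc': 0, 'Fuc': 1}
```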

Glycosylation sites (sequons) containing unchanged glycan profiles between two S proteins

The N-glycosylation composition of the ectodomain of the SARS-CoV-2 S-614D protein was similar to that detected on recombinant SARS-CoV-2 S proteins by different laboratories, including ours6,29,33,35. Analysis of the S-614G variant, however, revealed differences from S-614D in the N-glycans at some glycosylation sites. Among the 21 detected and quantified sequons, 9 (N17, N61, N74, N331, N343, N657, N1074, N1158, and N1173) showed little to no variation in the distribution of either individual glycans or glycan types between S-614D and S-614G, while alterations were observed on 11 sequons (N122, N165, N234, N282, N603, N616, N709, N717, N801, N1098, and N1134) (Fig. 2). Within each of the unchanged glycosylation sites, the numbers and forms of the heterogeneous glycans, as well as their relative abundances, remained unchanged.

The unchanged glycosylation sites spanned the entire surface of the S trimer, covering both the head and stalk regions and both the S1 and S2 subunits (Fig. 3). N1158 and N1173 are two N-glycosylation sites that reside in the stalk region, the C-terminal portion of the S2 (membrane fusion) subunit proximal to the viral membrane. The conserved glycosylation on these two sites observed between the two variants in our study suggests that the shielding of the stalk region by complex glycans remains intact on the S-614G variant virus.

Figure 3

(A) Schematic of the SARS-CoV-2 S protein primary structure. (B) N-glycans depicted on a representative full-length, fully glycosylated prefusion conformation of the trimeric SARS-CoV-2 spike protein (file 6vsb_1_1_1.pdb from the CHARMM-GUI Archive displayed in PyMOL)36,37. Blue-colored glycans indicate no change in the glycosylation site between the S-614G mutant and the S-614D wild type. Magenta-colored glycans indicate a modification in the glycan distribution and type between the mutant and wild type. The RBD is shown in green. The N149 and N1194 glycans are gold. The glycans depicted do not necessarily match those described in this report, and the O-linked glycans in the model are hidden due to low occupancy.

Among the 7 unchanged sites in the head region, three (N17, N61, and N74) are located in the N-terminal domain (NTD), upstream of the receptor binding domain (RBD), implying that site-specific glycosylation in this region distal to the RBD may not be involved in the enhanced binding affinity of S-614G for the hACE2 receptor. N331 and N343 are the only two residues in the RBD that are modified by N-glycans. However, they are not located within the receptor binding motif of the RBD and do not directly interact with hACE2. Our analysis revealed that the N-glycan microheterogeneity on each of these two sites did not change between S-614G and S-614D (Fig. 2H, I), suggesting that the glycan complement on these sites does not affect RBD-ACE2 binding directly or indirectly, and may only provide protection for the RBD region. The effect of the glycosylation of these two sites, if any, on the differences between the two S proteins observed in bioassays38 requires further investigation.

Glycosylation sites with altered N-glycan profiles between S-614G and wild-type S protein

All sequons bearing glycan variations between spike proteins 614D and 614G reside in the head region of the S protein, with some in the S1 subunit (N122, N165, N234, N282, N603, and N616) and others in the top half of the S2 subunit (N709, N717, N801, N1098, and N1134), at the lower portion of the head (Fig. 3). It is notable that nearly all sequons in the lower portion of the head (N1074 was the exception) showed a significant difference in glycan content between S-614D and S-614G, in contrast to the top head area (S1 subunit), where only half of the sequons displayed a change in glycosylation forms (Fig. 2). There are two possible reasons for this phenomenon. One is adaptation of the N-glycan shielding layer to the structural changes in the protein caused by the S-D614G substitution. The other is that N-glycosylation in this part of the S2 subunit may play a role in viral membrane fusion, because the fusion peptide is buried in the prefusion structure and S2 is responsible for fusion of the viral and host-cell membranes. Further investigation is needed to address these hypotheses.
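The "half of the S1 sequons" versus "nearly all lower-head sequons" comparison can be checked with a short tally of the sites listed in the text. This is a sketch under stated assumptions: sites up to residue 685 (the end of the furin site at 682–685) are counted as S1, N1158 and N1173 are treated as the stalk as described in the text, and the helper names are hypothetical.

```python
# Quantified sequons as listed in the text (residue numbers).
UNCHANGED = [17, 61, 74, 331, 343, 657, 1074, 1158, 1173]
ALTERED = [122, 165, 234, 282, 603, 616, 709, 717, 801, 1098, 1134]

STALK = {1158, 1173}   # stalk sequons named in the text
S1_END = 685           # assumption: S1 ends at the furin site (682-685)

def region(site: int) -> str:
    """Assign a sequon to a region of the spike from its residue number."""
    if site in STALK:
        return "S2 stalk"
    return "S1 (upper head)" if site <= S1_END else "S2 (lower head)"

def tally() -> dict:
    """Return {region: (altered sequons, total quantified sequons)}."""
    counts = {}
    for site in UNCHANGED + ALTERED:
        altered, total = counts.get(region(site), (0, 0))
        counts[region(site)] = (altered + (site in ALTERED), total + 1)
    return counts

for name, (altered, total) in tally().items():
    print(f"{name}: {altered} of {total} quantified sequons altered")
# S1 (upper head): 6 of 12 quantified sequons altered
# S2 (lower head): 5 of 6 quantified sequons altered
# S2 stalk: 0 of 2 quantified sequons altered
```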

Based on the nature of the alteration, these altered glycosylation sites can be categorized into two major groups: (1) sites with increased high-mannose and decreased hybrid and complex glycans (N122, N234, N603, N709, and N801); and (2) sites with increased high-mannose and hybrid glycoforms and decreased complex glycans (N165, N282, N616, N1098, and N1134). The alteration on N717 followed a different trend from the other sequons. Although only glycosylated peptides were detected and non-glycosylated N717 was not observed, the ion signals of the N717 glycopeptides were significantly reduced. This site was mainly occupied by oligomannose, with little hybrid and no complex content (Table 1). The variations occurred on the two major oligomannoses, Man5 and Man6, whose mass spectral abundances were reduced approximately eight- and four-fold, respectively, when the aspartic acid at position 614 was replaced by glycine (Fig. 2N). Interestingly, the cryo-EM structures of the two proteins (PDB: 6vsb for S-614D, 6xs6 for S-614G) show significant differences in the secondary structure proximal to the N717 residue, which could affect enzymatic digestion of the spike protein near this site and lead to diminished ion signals for the glycopeptides.
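The two-group categorization amounts to a rule on the sign of the change in each glycan type. The sketch below is illustrative only: `alteration_group` is a hypothetical helper, the "ungrouped" fallback is an assumption, and the numbers passed to it are placeholders rather than values from Table 1.

```python
def alteration_group(d_high_mannose: float, d_hybrid: float, d_complex: float) -> str:
    """Assign a sequon to Group 1 or Group 2 from the sign of each change.

    Arguments are changes in relative abundance (S-614G minus S-614D),
    in percentage points.
    """
    if d_high_mannose > 0 and d_complex < 0:
        if d_hybrid < 0:
            return "Group 1"   # high-mannose up; hybrid and complex down
        if d_hybrid > 0:
            return "Group 2"   # high-mannose and hybrid up; complex down
    return "ungrouped"         # assumption: any other pattern

# Placeholder changes, chosen only to exercise each branch.
print(alteration_group(25.0, -5.0, -20.0))   # Group 1
print(alteration_group(30.0, 14.0, -44.0))   # Group 2
print(alteration_group(0.0, 0.0, 0.0))       # ungrouped
```

A site such as N717, which carries no complex glycans in either protein, has no change in complex content and therefore falls outside both groups under this rule.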

Table 1 Relative abundance (%) of the N-glycans on some of the sequons of the recombinant SARS-CoV-2 Spike D614G mutant and its ancestor.

The relative abundance of complex-type glycans was reduced at all altered N-glycan sites (except N717, which carries no complex glycans on either S-614D or S-614G), often substantially (by 13–45%) (Table 1). While 6 of the 11 sequons (N165, N282, N603, N616, N1098, and N1134) carried more than 50% complex glycans in the S-614D protein, only one of them (N282) retained a majority of complex glycans (65%) after the S-D614G substitution. Meanwhile, the number of sequons bearing more than 50% high-mannose glycans increased from 3 (N234, N709, and N801) to 6 (N122, N165, N234, N603, N709, and N801). This phenomenon is unlikely to be caused by a lack of processing enzymes, because the two proteins were expressed in identical Expi293F cells at the same time under the same conditions. The increase in under-processed oligomannose glycans and the decrease in fully processed complex glycoforms on these S-614G sequons imply that less dense glycans in the spike variant might be sufficient to maintain the protein structure (or facilitate protein folding).

The glycosylation site closest to the S-D614G substitution is N616, which resides only two amino acid residues downstream of the substitution site. Glycosylation patterns at N616 varied significantly between the S-614D and S-614G proteins (Fig. 2K). In S-614D, this site was occupied predominantly by several fucose-containing bi-antennary or tri-antennary complex glycans, including HexNAc4Hex5Fuc1, HexNAc4Hex3Fuc1, and HexNAc5Hex6Fuc1, with only 12% high-mannose glycans (Fig. 2K, Table 1). However, when aspartic acid 614 was replaced by glycine, the smallest amino acid, the percentage of complex glycans was reduced by almost half, while the content of oligomannose increased by 30% and that of hybrid glycans by 14%. The smallest high-mannose glycan, HexNAc2Hex5 (high-mannose glycans are referred to as Man5–Man9 hereafter), became the most abundant glycan at N616 in the S-614G protein: the MS1 peak intensity of this glycan increased three-fold, while the intensity of HexNAc4Hex5Fuc1, the most abundant complex-type glycan in S-614D, decreased more than two-fold. Although the intensity values might not reflect real changes in abundance because they were derived from two different peptides, the trends in glycan distribution should not be affected significantly. Based on the cryo-EM structures, it has been proposed that the S-D614G substitution allosterically leads to more “open” conformations, or a higher percentage of RBD in an “up” position, which facilitates the interaction of the RBD with the ACE2 receptor13,39. Lower glycan complexity at the N616 site might adapt to or compensate for the changes in protein structure within the allosteric pathway and support the allosteric effect in the mutated S protein.

Similar changes occurred on N603 (Fig. 2J) and N165 (Fig. 2E), where a more than 30% increase in high-mannose glycans and a corresponding decrease in complex glycans were observed, but the content of hybrid glycans did not change as much as that of the complex or high-mannose structures. Considering the proximity of N603 to the S-D614G substitution site, the glycan variation on N603 might have a similar effect to that on N616. MD simulation using a Man5 oligomannose glycan at N165 has revealed that the glycan plays an important role in modulating the conformational “Up” and “Down” transitions of the S RBD by occupying the space between the RBD and NTD regions in the “Up” position7. Our data showed a large decrease in the complex glycan content and an increase in oligomannose glycans at this site, with Man5 becoming the most abundant glycan in the S-614G variant. Further study is required to elucidate the effect of varied N-glycosylation on this particular site.

N234 is another glycosylation site within the NTD region that has been proposed to modulate the RBD conformational dynamics7. MD simulation indicates that a Man9 high-mannose N-glycan at this site fills the vacancy left by an opened RBD, reaching to the apical core of the trimer and stabilizing the receptor binding domain of the S protein in the “up” or “open” conformation. Our data showed that N234 was predominantly occupied by five high-mannose glycans in both S-614D and S-614G, but their distributions differed. In S-614G, N234 was dominated by Man8 and Man9, whereas in S-614D the most abundant glycan was Man8 and the four other high-mannose glycans (Man5, Man6, Man7, and Man9) had relatively similar abundances (Fig. 2F). How this change affects the binding of the RBD to the human ACE2 receptor requires more investigation using biological or biophysical methods.

N709 and N801 in Group 1 are two oligomannose-dominated sequons that displayed similar variation between the S-614D and S-614G proteins (Fig. 2M, O). Whereas the combined population of complex and hybrid N-glycans declined by approximately 20%, significant changes appeared among the four individual high-mannose glycans ranging from Man5 to Man8. Three of these glycans were more abundant in S-614G than in S-614D. N122 is the last member of Group 1 with elevated high-mannose and reduced hybrid and complex glycans. The main change occurred on one high-mannose glycan (Man5), whose abundance increased by 50% in S-614G and was much higher than that of the other glycans (Fig. 2D). Because N709 and N801 are located downstream of the S1/S2 cleavage site, the reduced complexity of their micro-heterogeneous glycoforms might be related to the enhanced protease susceptibility at the furin cleavage site determined by bioassays39.

In addition to N165 and N616, Group 2 includes three other sequons, N282, N1098, and N1134, which were originally occupied predominantly by highly processed complex-type glycans in the S-614D protein, suggesting that further processing beyond the high-mannose forms was favored at these sites. Unlike in Group 1, the reduced populations of complex glycans were compensated by increased hybrid populations (Table 1). In contrast to some sites where a few glycoforms dominated the occupancy, the abundances of the various glycoforms at these sites were distributed more evenly in both the S-614D and S-614G proteins (Fig. 2G, Q, R). Since N1098 and N1134 are located in the C-terminal S2 domain, far from the RBD and NTD, the glycosylation of these sites might not affect interactions between the spike RBD and the human ACE2 receptor.