Preloader

Integrated analysis of plasma and single immune cells uncovers metabolic changes in individuals with COVID-19

Multiomics analysis of the peripheral immune response

We evaluated plasma metabolomic profiles and transcriptional networks within circulating immune cells identified through single-cell RNA-seq (scRNA-seq) analysis (Methods). These profiles were collected from 374 blood samples from 198 individuals with COVID-19 with varying disease severities4 (Fig. 1a and Supplementary Table 1; severity measured on the World Health Organization (WHO) ordinal scale for COVID-19, Supplementary Table 5). To define the early trajectory of acute illness, for each individual, data were collected near clinical diagnosis (T1) and several days later (T2) when individuals were still symptomatic in the acute phase.

Fig. 1: Overview of metabolic analysis of COVID-19 peripheral immune response.
figure1

a, Overall workflow of metabolic analysis; PBMCs, peripheral blood mononuclear cells; ICU, intensive care unit; WHO, World Health Organization. b, Plasma metabolite levels per individual sample correlated with severity on the WHO ordinal scale per sample. Metabolites are listed in descending order of two-sided Spearman’s rank correlation value. Metabolites of particular interest are labeled and colored by category; red, metabolites associated with canonical energy-producing metabolic pathways; green, amino acids and products of amino acid metabolism; purple, linoleic acid-containing lipids; blue, sphingolipids. The solid vertical line intersecting the heat map indicates separation between non-hospitalized versus hospitalized individuals, and the horizontal line indicates separation between metabolite levels positively and negatively correlated with WHO score. Only the most significantly correlated metabolites are shown (P < 0.001 with Benjamini–Hochberg false discovery rate (FDR) correction). Significant P values are summarized in Supplementary Table 2. c, Known metabolic pathways affected by plasma metabolites that are positively correlated with WHO score. The red dashed line indicates threshold of statistical significance on the y axis (P < 0.05) calculated by hypergeometric test. Exact P values are listed in Supplementary Table 7. Biosynth, biosynthesis; unsat’d, unsaturated; FAs, fatty acids.

Plasma metabolites distinguish varying disease severities

We first examined plasma metabolites to obtain a broad view of metabolic status. Of the 1,050 metabolites measured per individual, 227 and 510 were significantly (P < 0.05) positively and negatively correlated with disease severity, respectively (Fig. 1b and Supplementary Table 2). Phenylalanine, which is a mortality risk marker in severe infection5, phenylalanine-related metabolites (for example, N-formylphenylalanine, also associated with rhinovirus6) and previously reported kynurenine3 were each positively correlated with disease severity. Mannose also strongly correlated with severity and may be derived from oligomannose residues of SARS-CoV-2 spike protein7, potentially reflecting elevated viral loads, or may indicate complement pathway activation8. Also positively correlated with disease severity were glucose, supporting the importance of glycolytic/gluconeogenic processes and anaerobic metabolism in COVID-19 (ref. 9), and inflammation-associated lipids, such as linoleic acid10. Thus, increased COVID-19 severity is associated with metabolite alterations, suggesting increased immune-related activity.

We next performed metabolic pathway enrichment analysis of metabolites that correlated with severity. The positively correlated metabolites were enriched for metabolism of linoleic acid and ketone bodies (Fig. 1c). Ketone bodies increased with severity, potentially indicating a developing catabolic state. Metabolites that negatively correlated with severity were enriched for metabolism of sphingolipids and glycerophospholipids (Extended Data Fig. 1a), components of lipid membranes critical for cytokine synthesis11 and stability against hypoxic stress12. Within the alanine, aspartate and glutamate metabolic pathway, glutamine is negatively correlated with disease severity, suggesting increased uptake that sustains proliferation of adaptive immune cells13. Thus, altered circulating metabolite levels may reflect altered metabolic processes generating components of a COVID-19 immune response.

We next tested whether plasma metabolite levels could distinguish disease severity between individuals by principal component analysis (PCA). A distinct separation of individuals with mild COVID-19 from individuals with more severe disease was evident along PC1 (Extended Data Fig. 1b), which comprises mannose, linoleic acid-containing glycerophospholipids and sphingomyelins as top contributors.

Taken together, COVID-19 severity is associated with many changes in plasma metabolite levels, which can be ascribed to distinct metabolic pathways that potentially impact immune responses. We next turn to integrating these disease-altered metabolite profiles with other omics measures.

Metabolism by immune cells at whole-PBMC resolution

We evaluated peripheral immune cell metabolism using single-cell gene expression-based inferences (Extended Data Fig. 1c) and metabolic flux. We considered all PBMCs together and identified metabolic pathway activities that correlated with disease severity (Fig. 2a). For example, we identified that glutathione metabolism within immune cells positively correlated with disease severity, suggesting that cellular metabolism may be compensating for increasing levels of oxidative stress to maintain immune cell self-renewal capacity14. Catabolism of arachidonic acid, a critical precursor to prostaglandins and leukotrienes10, negatively correlated with disease severity, suggesting decreased inflammatory cytokine production and thus decreasingly effective immune responses. This analysis suggests that metabolic reprogramming of immune cells accompanies COVID-19.

Fig. 2: Total metabolic changes with changes in disease severity from T1 to T2.
figure2

a, Metabolic pathway activities for all PBMCs pooled together that are significantly correlated with WHO score by two-sided Spearman’s rank correlation (P < 0.05; labeled on right). Exact P values are listed in Supplementary Table 7. b, UMAP for dimensional reduction of scRNA-seq data of peripheral immune cells collected from each individual. UMAP was performed separately using only genes included in metabolic pathways (left) and using all genes (right). c, Changes in metabolic pathway activities over time (from T1 to T2) per peripheral immune cell type for individuals whose WHO scores worsened, improved or remained stable. A relative activity of 0 indicates no change in the activity of a metabolic pathway from T1 to T2.

Immune cell types are metabolically compartmentalized

To assess metabolic networks within individual PBMCs, we performed uniform manifold approximation and projection (UMAP) based on the 1,387 genes involved in known metabolic pathways (classical immune markers were excluded). Remarkably, this resolved major immune cell types, similar to an analysis using all genes from our scRNA-seq dataset (Fig. 2b). This indicates that each major immune cell type has a distinct metabolic signature.

We further explored immune cell-type-specific gene expression-based metabolic pathway activity15 and metabolic flux by extracting pathway activity changes with respect to disease progression from T1 to T2. Individuals with increased disease severity at T2 showed a generally more activated metabolic profile, but the extent of activation was distinct per cell type (Fig. 2c). CD8+ T cells and B cells had the most increased metabolic activity, consistent with their major antiviral activities, while CD4+ T cells, natural killer (NK) cells and monocytes each exhibited relatively less elevation. Among clinically improved individuals, B cells had the greatest decrease in metabolic activity, potentially reflecting decreased activity with reduced antigen load, while NK cell activity slightly increased. Thus, examining metabolic activity at the resolution of individual cell types reveals insights not apparent from whole-PBMC analysis.

We further performed metabolic flux analysis16 by integrating scRNA-seq data with genome-scale reconstruction of human metabolism (Methods). Reaction fluxes for each immune cell type per individual yielded three cell clusters (Extended Data Fig. 1d,e). Cluster 2 exclusively contained monocytes, suggesting distinct metabolic programs between the myeloid (monocytes) and lymphoid (T, B and NK cells) lineages. This cluster exhibited significantly lower nucleotide synthesis and increased eicosanoid metabolism (mediating inflammation17) and glycolysis/gluconeogenesis, suggesting reduced proliferation but sustained proinflammatory activity. Cluster 0 was composed of NK cells, B cells and CD4+ and CD8+ T cells (Extended Data Fig. 1e). Cluster 1, containing mostly CD4+ and CD8+ T cells, was metabolically similar to cluster 0 but had significantly higher extracellular transport (that is, SLC7A6-mediated amino acid transport) and nucleotide interconversion (Supplementary Fig. 1a), potentially indicating increased intercellular signaling and cell proliferation, respectively. Thus, flux-defined clusters 0 and 1 may contain naive and activated lymphocytes, respectively. By contrast, when we calculated reaction fluxes at the whole-PBMC level rather than at the cell-type level, no discrete clusters formed (Supplementary Fig. 1b). This highlights major metabolic differences between immune cell types (and potential metabolic subtypes within each cell type) and underscores the importance of analyzing immune cell types individually. The relationship of these subtypes to immune functions is a key question.

Metabolically hyperactive CD8+ T cell subpopulation emerges

We further investigated the metabolic heterogeneity within individual cell types at single-cell resolution starting with CD8+ T cells. Clustering and UMAP projections with metabolic genes alone separated CD8+ T cells into seven metabolically defined clusters (Fig. 3a and Extended Data Fig. 2a). These metabolic clusters were consistent with clusters previously identified4 using the expression levels of all measured genes (Fig. 3b) and so were phenotypically distinct as defined by classical phenotype markers (Fig. 3c and Extended Data Fig. 2b–e). As examples, metabolic clusters 0, 4 and 5 highly express effector genes GZMB, GNLY and PRF1; thus, we labeled them as effector-like cells. Cluster 4, while of low abundance, notably increases with disease severity (Fig. 3d).

Fig. 3: Heterogeneous metabolic activity of CD8+ T cells is dominated by small, highly active subpopulations.
figure3

a,b, Unsupervised clustering and UMAP of CD8+ T cells using the expression of only genes involved in known metabolic pathways. Clusters defined from the expressions of only metabolic genes (a) and from the expressions of all genes (b) are overlaid. c, Heat map of mean expression of key phenotypic genes and proteins for each metabolic cluster in pseudobulk. d, Frequency of metabolically defined CD8+ T cell clusters per individual grouped by WHO score-based categories of disease severity; mild/healthy (WHO 0–2), moderate disease (WHO 3–4) and severe disease (WHO 5–7). Statistical significances of population increases or decreases were calculated by two-sided Spearman’s rank correlation. e, Metabolic pathway activities that are significantly correlated with WHO score by two-sided Spearman’s rank correlation (P < 0.05; right) for all CD8+ T cells per individual and for only effector CD8+ T cells (metabolic clusters 0, 4 and 5), each in pseudobulk. Exact P values are in Supplementary Table 7. f, Kernel density estimates of expressions of genes included in key energy-producing metabolic pathways for effector CD8+ T cells (metabolic clusters 0, 4 and 5) among individuals with mild disease and/or healthy individuals (WHO 0–2) and individuals with severe disease (WHO 5–7) in pseudobulk and separately for each of the metabolic clusters 0, 4 and 5. g, Comparison of metabolic pathway activities between CD8+ T cell metabolic clusters 0, 4 and 5 for all samples combined. h, Expression of T cell proliferation marker MKI67 and replicative senescence marker B3GAT1 (encoding CD57) in each metabolically defined cluster of CD8+ T cells (metabolic cluster 0, n = 8,525 cells; 1, n = 5,983; 2, n = 5,270; 3, n = 4,230; 4, n = 1,008; 5, n = 475; 6, n = 475). Boxes indicate 25th, 50th and 75th percentiles. Whiskers indicate 1.5 interquartile ranges below and above the 25th and 75th percentiles, respectively. i, Grouping of lymphocyte interactions by paratope hotspots (GLIPH) analysis of cluster 4 TCRs with published antigen-specific TCRs; left, pie chart illustrating fractions of TCRs from cluster 4 cells that are within the same GLIPH group with published SARS-CoV-2-specific TCRs; right, one representative GLIPH group that contains SARS-CoV-2-specific TCRs and cluster 4 cell TCRs.

We examined CD8+ T cell metabolic pathways with cluster-specific trends (Fig. 3e). For the effector-like metabolic clusters 0, 4 and 5, we resolved that almost all metabolic pathways positively correlated with WHO score. By contrast, relatively few metabolic pathways correlated when analyzing all CD8+ T cells pooled together. The energy-producing pathways of glycolysis/gluconeogenesis, oxidative phosphorylation and fatty acid degradation and biosynthesis were activated among these effector-like cells (especially cluster 4) but not among all CD8+ T cells. (Fig. 3f and Supplementary Table 6). Indeed, metabolic cluster 4 was by far the most active relative to all other clusters (Fig. 3g and Extended Data Fig. 2f) and so was defined as metabolically dominant. This was corroborated by flow cytometry, where metabolic cluster 4 resembled Ki-67+ proliferative-exhausted cells and exhibited significantly elevated hexokinase 2 (HK2) protein levels compared to other CD8+ T cells (Extended Data Fig. 3a–f). While all effector-like clusters 0, 4 and 5 expressed PRF1 and PRDM1, indicating effector differentiation18, only metabolic cluster 4 had high expression of MKI67 and low expression of B3GAT1 (encoding CD57 protein), suggesting minimal replicative senescence and antigen-induced T cell apoptosis19 (Fig. 3h) despite also expressing high levels of PD-1. Comparing T cell antigen receptor (TCR) sequences from this hyperactive cluster 4 with previously published SARS-CoV-2-specific TCR sequences (Fig. 3i) suggested that this cluster is SARS-CoV-2 specific. Thus, metabolic effector-like clusters 0 and 5 were relatively terminally differentiated, while cluster 4 comprised proliferative-exhausted, effector-like cells that were likely SARS-CoV-2 specific. In summary, CD8+ T cells include a small, metabolically hyperactive subpopulation, presumably SARS-CoV-2 specific, that exhibits increasing metabolic activity as disease severity increases (Extended Data Fig. 3g).

We interrogated for similar, metabolically dominant CD8+ T cells in sepsis20 and human immunodeficiency virus (HIV)21 using public scRNA-seq data (Supplementary Table 1). The metabolically hyperactive, proliferative-exhausted subpopulation comprises a small fraction of CD8+ T cells in both COVID-19 and sepsis but constitutes a larger, albeit less, metabolically active fraction in HIV (Extended Data Fig. 4a,b). This is consistent with the dysfunctional state of expanded CD8+ T cells in HIV22. Differential expression analysis revealed 151 uniquely highly expressed metabolic genes in this subpopulation (Extended Data Fig. 4c and Supplementary Table 3), with elevated processes related to amino acid metabolism and protein synthesis and targeting (Extended Data Fig. 4d).

Metabolic balance between CD4+ T cell subpopulations

CD4+ T cells also exhibited marked metabolic heterogeneity with eight metabolically defined clusters (Extended Data Fig. 5a) associated with distinct phenotypes. Similar to metabolic cluster 4 CD8+ T cells, metabolic cluster 6 CD4+ T cells were the most metabolically active in severe disease (Extended Data Fig. 5d–g) and exhibited intermediate expression of metabolic regulators MYC, HIF1A, MAPK1 and PRKAA1 as well as proliferative markers MKI67 and TYMS (Extended Data Fig. 5a–d). A population of cytotoxic CD4+ T cells, metabolic cluster 4, exhibited a high degree of clonal expansion (Extended Data Fig. 5c), intermediately expressed HIF1A, MAPK1 and PRKAA1 and highly expressed cytolytic markers GZMB, GNLY and PRF1 akin to effector CD8+ T cells.

We examined metabolic trends among CD4+ T cell subpopulations (Extended Data Fig. 5g). Proliferative cluster 6 exhibited increased activity compared to the pooled population of CD4+ T cells, including the core energy-producing pathways of glycolysis/gluconeogenesis, oxidative phosphorylation and fatty acid biosynthesis and degradation (Extended Data Fig. 5e–g and Supplementary Table 6). Meanwhile, the cytolytic cluster 4 remained metabolically stable with increased disease severity, despite increasing in percentages of the overall CD4+ population. Regulatory T cells (Treg cells), within metabolic cluster 1, exhibited mildly increased metabolism with increased severity, potentially countering increased proinflammatory activity. Overall, CD4+ T cells exhibit a similar pattern as CD8+ T cells in which a small subpopulation becomes metabolically dominant with increased activity with disease severity. Thus, the average increase in CD4+ T cell metabolic activity in more severe disease may be driven by this hyperactive subpopulation. On the contrary, the cytolytic CD4+ T cells, which are the most clonally expanded, increase as a fraction of the CD4+ T cell population but are relatively stable in per cell metabolic activity. Thus, different subpopulations of CD4+ T cells exhibit divergent metabolic activities with increasing disease severity, an observation missed by bulk analyses.

Metabolic uniqueness and dominance of plasma B cells

Next, we examined B cells and found that metabolic clustering and UMAP projections isolate plasma B cells as a metabolically and phenotypically distinct cell cluster (Supplementary Fig. 2a–c,g), consistent with activated antibody-producing cells having a distinct metabolic program23. Compared to non-plasma B cells, this plasma cell cluster (metabolic cluster 3) exhibited globally increased metabolism (Supplementary Fig. 2d–e), including positive correlation of amino acid metabolic activity with disease severity (Supplementary Fig. 2f,h and Supplementary Table 6), perhaps reflecting the antibody-secreting functional demands in severe disease.

A metabolically dominant NK cell subpopulation emerges

NK cells were similarly resolved into six metabolic clusters (Supplementary Fig. 3a–c), among which cluster 4 was distinguished as a proliferative, metabolically active phenotype by high levels of MKI67 and HIF1A and low levels of B3GAT1 (encoding CD57 protein) (Supplementary Fig. 3d,g). Indeed, this proliferative cluster had increased metabolic activity in key pathways such as folate biosynthesis, allowing for proliferation24, and sphingolipid metabolism, impacting immune regulation25 (Supplementary Fig. 3e,h and Supplementary Table 6). The metabolic activities of these proliferative cells also positively correlated with disease severity (Supplementary Fig. 3f).

Inflammatory and non-classical monocytes split metabolically

Monocyte dysregulation is prominently featured in COVID-19 (ref. 26). In fact, we found several metabolically defined monocyte clusters (Fig. 4a and Extended Data Fig. 6a–d) that also separate phenotypes (Fig. 4b). For instance, metabolic clusters 0 and 1, which are CD14+CD16HLA-DRmid monocytes highly expressing M1 macrophage-associated metabolic regulator STAT1 (ref. 27) and HIF1A (ref. 28), represent an inflammatory subpopulation consistent with previous clustering4 (Fig. 4c and Extended Data Fig. 6a–d). Cluster 2, consisting of CD14lowCD16+HLA-DRhigh cells, was composed of non-classical monocytes that were potentially immunomodulatory29. With more severe disease, the inflammatory monocytes increased not only in population percentage (Fig. 4d) (as reported in26) but also in metabolic activity per cell (Fig. 4e and Extended Data Fig. 6e). The non-classical monocytes decreased in both aspects (Fig. 4d,e).

Fig. 4: Inflammatory monocytes become metabolically dominant with increased COVID-19 severity.
figure4

a,b, Unsupervised clustering and UMAP of monocytes using the expression of only genes involved in known metabolic pathways. Clusters defined from expressions of only metabolic genes (a) and from expressions of all genes (b) are depicted. c, Summary of markers for defining inflammatory and non-classical monocyte subpopulations. Left, CD14, FCGR3A and HLA-DR expression per metabolically defined monocyte cluster. Clusters are labeled by number. S100A9high and S100A9low inflammatory monocytes and non-classical monocytes are specifically labeled. Cluster 9 is not shown due to its lack of valid gene expression data. Right, comparison of S100A9 expression between the two inflammatory monocyte subpopulations (metabolic cluster 0, n = 36,651 cells; 1, n = 33,242). Boxes indicate 25th, 50th and 75th percentiles. Whiskers indicate 1.5 interquartile ranges below and above the 25th and 75th percentiles, respectively. Statistical significance was calculated by two-sided Mann–Whitney U test; ***P < 0.001. d, Frequency of metabolically defined monocyte clusters per individual, grouped by WHO score-based disease severity categories: mild/healthy (WHO 0–2), moderate disease (WHO 3–4) and severe disease (WHO 5–7). e, Metabolic pathway activities that are significantly correlated with WHO score by two-sided Spearman’s rank correlation (P < 0.05; right) for all monocytes per individual, for only inflammatory monocytes (metabolic clusters 0 and 1) and for only non-classical monocytes (metabolic cluster 2), each in pseudobulk. Exact P values are in Supplementary Table 7. f, Fraction of monocytes per disease (peripheral monocytes in COVID-19, sepsis and HIV) that are S100A9high inflammatory, S100A9low inflammatory or non-classical. g, Venn diagram of metabolic pathway-related genes that are uniquely elevated in S100A9high inflammatory monocytes of each disease. h, Top five most highly enriched biological processes by gene ontology for metabolic genes uniquely elevated in S100A9high inflammatory monocytes of COVID-19 relative to sepsis and HIV. The x axis indicates –log10 (P value) for enrichment of each process. None of the displayed processes were significantly enriched in the sepsis or HIV conditions. Asterisks in g and h indicate the relation of the 417 genes in g to the biological processes in h.

Within inflammatory monocytes, cluster 0 is distinct from cluster 1 by higher expression of S100A9 transcript and HK2 (both confirmed at the protein level by flow cytometry) (Extended Data Fig. 7a–f). This suggests that the S100A9high subset may be responsible for the increased monocyte metabolic activity in severe COVID-19.

We compared the three major subpopulations (S100A9high inflammatory, S100A9low inflammatory and non-classical) in COVID-19 to other acute and chronic diseases (sepsis20 and HIV21) (Fig. 4f–h, Extended Data Fig. 8a,b and Supplementary Table 3). Intriguingly, the S100A9high inflammatory monocytes in COVID-19 uniquely highly expressed 417 metabolic genes enriched in mitochondrial protein synthesis-related processes, particularly respiratory chain assembly (Fig. 4h, Extended Data Fig. 8b and Supplementary Table 3), suggesting that there is increased oxidative phosphorylation for these monocytes in COVID-19.

Taken together, monocytes in individuals with COVID-19 diverge into two metabolically and functionally heterogeneous groups: an inflammatory, metabolically activated subpopulation and a potentially immunomodulatory, metabolically repressed non-classical subpopulation. These monocyte subpopulations further bifurcate by relative cell count and by metabolic reprogramming.

Integrated multiomics reveal cell-type-specific signatures

We next integrated the plasma metabolic signatures with metabolic pathways of each immune cell class and phenotype. To obtain a general view of such interactions, we generated networks of known gene–metabolite interactions that included transcripts in our metabolic pathways and plasma metabolites and proteins that were significantly correlated with disease severity (Fig. 5a and Extended Data Fig. 9a–c). Positively correlated metabolic transcripts and metabolites were integrated into gene–metabolite networks broadly classified into categories of lipid metabolism, nucleic acid modification and carbohydrate/glycan metabolism. Based on the resulting associations, we next asked whether metabolite levels correlated with cell-type-specific metabolic programs. Indeed, cell-type-specific metabolic pathways, especially among CD4+ T cells, B cells and monocytes, associated with unique signatures of metabolites (Fig. 5b). All three metabolic categories shown in Fig. 5a were found within the immune cell-type-specific metabolic changes shown in Fig. 5b, indicating the relevance of these metabolic categories’ networks to the COVID-19 immune response.

Fig. 5: Integrated analysis of metabolites and cell-type-specific pathway activities reveals cell-type-specific metabolite signatures in COVID-19.
figure5

a, Metabolic pathway-related gene expression was calculated at the whole-individual level, and transcripts significantly positively correlated with WHO score were integrated with plasma metabolites that also positively correlated with WHO score to create a gene–metabolite network in which genes are connected to metabolites in known metabolic pathways; PC, phosphatidylcholine. b, Selected correlations of plasma metabolites with peripheral immune cell-type-specific metabolic pathway activities. Correlation was calculated by two-sided Spearman’s rank correlation. Hierarchical clustering was performed for metabolites to resolve metabolite signatures; TCA, tricarboxylic acid.

The above associations suggested that integrating information from both plasma metabolites and immune cell metabolic pathway activities would yield a higher-resolution classification of individuals with COVID-19 compared to plasma metabolites alone. We thus performed PCA using both plasma metabolites and immune metabolic activities on a set of samples from individuals with COVID-19 whose single-cell transcriptomes were used to calculate metabolic pathway activities (Extended Data Fig. 9d) and then projected 242 samples that had both immune cell gene expression and plasma metabolite levels available onto the PC space. In contrast to our classification based on plasma metabolites alone (Extended Data Fig. 1b), even moderate and severe disease were distinguished. In fact, k-means clustering yielded three clusters separating individuals with mild (WHO score 1–2; cluster 0), moderate (WHO score 3–4; cluster 1) and severe (WHO score 5–7; cluster 2) disease (Fig. 6a,b and Extended Data Fig. 9e). Other clinical metrics, such as hospitalization status and level of respiratory support, were also well separated (Extended Data Fig. 9e). Furthermore, the top two PCs both correlated with clinical biomarkers from these individuals, including immune activation (cell counts) and inflammation (for example, C-reactive protein and fibrinogen) (Fig. 6c). PC2 correlated more strongly with markers of immune response to severe infections, such as peripheral immature granulocyte count. Both plasma metabolites and cell-type-specific metabolic pathway activities, especially the pathway activities that correlated with multiple metabolite patterns (Fig. 5b), contributed to PC1 and PC2 (Fig. 6d and Extended Data Fig. 9f), suggesting their potential roles in driving the overall immune response. Altogether, each immune cell class has unique metabolic signatures. Integrating cell-type-specific metabolic activities with plasma metabolite levels classified individuals with COVID-19 at higher resolution with respect to severity than what is achieved using metabolites alone.

Fig. 6: Integrated metabolomic analysis reveals high-resolution classification of individuals with COVID-19 and inputs for a machine learning model to predict clinical outcomes.
figure6

a, PCA of integrated plasma metabolites and metabolic pathways per individual with COVID-19. PCs were applied from the PCs that were calculated for the samples that had scRNA-seq data (Extended Data Fig. 9d). Resulting k-means clusters are colored, accompanied by contours and kernel density estimates of each cluster. b, Violin plot distribution of WHO scores per k-means cluster shown in a (k-means cluster 0: n = 111 samples; 1: n = 101; 2: n = 30). Within each violin, thick black lines are bounded by 25th and 75th percentiles. White dots indicate 50th percentiles. Thin black lines indicate 1.5 interquartile ranges below and above the 25th and 75th percentiles, respectively. Differences between the WHO scores of each k-means cluster were tested for statistical significance by two-sided Mann–Whitney U tests. c, Correlation of resulting PC1 and PC2 with clinical measurements per individual sample; BUN, blood urea nitrogen; GFR, glomerular filtration rate; RBC, red blood cell; WBC, white blood cell; APTT, activated partial thromboplastin time; INR, international normalized ratio; ALT, alanine aminotransferase; AST, aspartate aminotransferase. d, Loading scores of cell-type-specific metabolic pathway activities and plasma metabolites to PC1 and PC2. Pathway activities for only the cell types that exhibited strong metabolite signatures are shown (CD4+ T cells, B cells and monocytes); TCA, tricarboxylic acid. e, Workflow of machine learning model training and testing to predict survival versus death at a later timepoint using metabolite levels measured at an initial timepoint T1. f, Left, receiver operating characteristic (ROC) for iterations of the machine learning model to predict survival versus death, initially tested within random subsets of the INCOV dataset. Right, ROCs after applying the machine learning model to an independent cohort of samples from individuals with COVID-19; AUC, area under curve. For both plots, shaded bands indicate standard error of the mean ROC. g, Relative importance of each metabolite level or clinical information from a trained model, after feature selection, for predicting survival. h, Correlation of each selected metabolite with change in WHO score, T2 versus T1. Statistical significance was calculated by two-sided Spearman’s rank correlation with Benjamini–Hochberg FDR correction; *P < 0.05 (palmitoyl-linoleoyl-glycerol, P = 0.41; α-ketobutyrate, P = 0.019; 1-palmitoyl-2-oleoyl-GPE, P = 0.41; acetoacetate, P = 0.019; maltose, P = 0.41).

Identifying plasma metabolites to predict clinical outcomes

We next sought to identify a small number of key plasma metabolites measured near the time of clinical diagnosis (T1) to predict COVID-19 survival among individuals who were hospitalized. We first used our integrated plasma and immune cell class-specific analysis in Fig. 6a–d to select top metabolite contributors to each PC (Fig. 6d) as inputs for a machine learning model. A random forest classifier was used to refine a small panel of T1 metabolites for predicting survival within hospitalized INCOV individuals (Fig. 6e and Methods). We split our INCOV dataset into separate training and test sets with repeated randomized trials then used these datasets to train and test our model to predict death using levels of these selected T1 metabolites (Fig. 6f, left). Only five metabolite levels from T1 were sufficient to predict survival: palmitoyl-linoleoyl-glycerol (16:0/18:2), α-ketobutyrate, 1-palmitoyl-2-oleoyl-GPE (16:0/18:1), acetoacetate and maltose (Fig. 6g,h). To further validate this signature, we applied the model to an independent cohort of samples from individuals with COVID-19 collected and processed from individuals that were hospitalized at a different site by different investigators (Supplementary Table 4). Indeed, the predictive model predicted survival in the independent cohort with comparable accuracy (Fig. 6f, right).

Source link