Identification of differentially expressed tumor stemness-related genes
Differential expression analysis with cutoffs of |logFC| > 1 and p-value < 0.05 identified 13,729 differentially expressed genes (DEGs) between the high and low mRNAsi groups (Fig. 1A), 4,718 between the high and low EREG-mRNAsi groups (Fig. 1B), and 3,817 between the tumor and normal groups (Fig. 1C). Intersecting the DEGs from these three ccRCC-related datasets yielded 732 tumor stemness-associated DEGs in ccRCC (Fig. 1D).
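The filtering and intersection steps described above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the function names, the dictionary layout (gene ID mapped to a logFC and p-value pair), and the toy gene values are all assumptions introduced here.

```python
def select_degs(results, lfc_cutoff=1.0, p_cutoff=0.05):
    """Return the gene IDs passing |logFC| > lfc_cutoff and p < p_cutoff.

    `results` maps gene ID -> (logFC, p_value); this layout is an assumption.
    """
    return {
        gene
        for gene, (log_fc, p_value) in results.items()
        if abs(log_fc) > lfc_cutoff and p_value < p_cutoff
    }


def shared_degs(*comparisons):
    """Intersect the DEG sets obtained from each comparison."""
    deg_sets = [select_degs(result) for result in comparisons]
    return set.intersection(*deg_sets)


# Toy values for illustration only (not data from the study).
mrnasi = {"GENE_A": (2.1, 0.001), "GENE_B": (-1.5, 0.020), "GENE_C": (0.4, 0.001)}
ereg_mrnasi = {"GENE_A": (1.3, 0.010), "GENE_B": (-1.2, 0.200), "GENE_C": (1.8, 0.001)}
tumor_normal = {"GENE_A": (-3.0, 0.0001), "GENE_B": (-2.2, 0.001), "GENE_C": (2.5, 0.030)}

common = shared_degs(mrnasi, ereg_mrnasi, tumor_normal)
print(sorted(common))
```

A gene is retained only if it passes both cutoffs in every comparison, regardless of the direction of change; whether the study additionally required a consistent direction is not stated in the text.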


Identification of stemness indices-related DEGs. Notes: Volcano plots for DEGs between high-mRNAsi and low-mRNAsi ccRCC samples (A), between high-EREG-mRNAsi and low-EREG-mRNAsi ccRCC samples (B), and between tumor and normal tissues (C). (D) Venn diagram of the DEGs from the 3 data sets. DEGs: differentially expressed genes; ccRCC: clear cell renal cell carcinoma.
Function and interaction analysis of tumor stemness-related DEGs
Gene ontology (GO) enrichment analysis and protein–protein interaction (PPI) analysis were conducted to explore the potential biological functions of the 732 stemness-related DEGs in ccRCC and to clarify their intrinsic roles. GO annotation showed that the DEGs were closely associated with the construction of extracellular matrix (ECM) structure across the cellular component, molecular function, and biological process categories (Fig. 2A). Based on node scores, the most significant modules of the PPI network (Fig. 2B), with 15 nodes (Fig. 2C) and 30 nodes (Fig. 2D), highlighted potential hub genes and the interactions between them.


Function analysis on DEGs. Notes: (A) GO enrichment analysis of 732 DEGs, BP: biological process; CC: cell component; MF: molecular function. (B) PPI network of 732 DEGs after removing nodes without edges. (C) Top 30 nodes of PPI network. (D) Top 15 nodes of PPI network. GO: gene ontology; PPI: protein–protein interaction.
Construction, evaluation, and validation of the stemness indices-based 6-gene risk signature
A total of 529 ccRCC patients with complete follow-up information in the TCGA-KIRC database were randomly divided into training and validation cohorts at a ratio of 1:1 to establish a prognostic prediction model based on the stemness indices (mRNAsi and EREG-mRNAsi). First, univariate Cox analysis was performed to identify prognosis-related genes at a threshold of p < 0.05. Then, the Random Forest (RF) algorithm was applied to reduce dimensionality and to calculate a relative importance score for each prognosis-related gene. The six genes with the highest importance scores were associated with the OS of patients (Fig. 3A) and, after multivariate Cox analysis, were used to establish a risk signature for the prognosis of ccRCC patients. In the 6-gene risk signature, the risk score of each patient was calculated with the “∑coef ∗ Exp(genes)” formula: Risk Score = 0.133 * AC010973.2 expression + 0.182 * AP001029.2 expression + (−0.271) * RNU6-125P expression + (−0.025) * Z98885.1 expression + (−0.114) * KDM5C-IT1 expression + (−0.175) * AL021368.3 expression. Based on the median risk score, patients in the training (Fig. 3B) and validation (Fig. 3C) cohorts were divided equally into low-risk and high-risk groups. The figures show the risk score (left top), survival status (left middle), gene expression heat map (left bottom), time-dependent ROC curves (right top), and Kaplan–Meier survival curves (right bottom). The time-dependent ROC curves gave AUCs for the prognostic 6-gene signature of 0.67 at 1 year, 0.79 at 3 years, and 0.73 at 5 years in the training cohort (Fig. 3B), with similar values of 0.66 at 1 year, 0.71 at 3 years, and 0.70 at 5 years in the validation cohort (Fig. 3C). Survival analysis showed significantly longer OS in the low-risk group than in the high-risk group in both the training (p < 0.0001, Fig. 3B) and validation (p < 0.0001, Fig. 3C) cohorts.
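The risk score calculation and the median split described above can be sketched as follows. The coefficients are those reported in the text; the patient IDs, expression values, function names, and the rule that a score exactly equal to the median falls in the low-risk group are illustrative assumptions, not details taken from the study.

```python
from statistics import median

# Coefficients as reported in the text for the 6-gene signature.
COEFFICIENTS = {
    "AC010973.2": 0.133,
    "AP001029.2": 0.182,
    "RNU6-125P": -0.271,
    "Z98885.1": -0.025,
    "KDM5C-IT1": -0.114,
    "AL021368.3": -0.175,
}


def risk_score(expression):
    """Sum of coefficient * expression over the six signature genes."""
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


def split_by_median(scores):
    """Label each patient 'high' or 'low' relative to the cohort median.

    Assumption: scores equal to the median are assigned to the low-risk group.
    """
    cutoff = median(scores.values())
    return {pid: ("high" if score > cutoff else "low") for pid, score in scores.items()}


# Hypothetical expression profiles for illustration only.
zeros = {gene: 0.0 for gene in COEFFICIENTS}
patients = {
    "P1": {**zeros, "AC010973.2": 2.0, "AP001029.2": 2.0},
    "P2": {**zeros, "RNU6-125P": 2.0},
    "P3": {gene: 1.0 for gene in COEFFICIENTS},
    "P4": {**zeros, "AC010973.2": 1.0},
}

scores = {pid: risk_score(expr) for pid, expr in patients.items()}
groups = split_by_median(scores)
print(groups)
```

Because AC010973.2 and AP001029.2 carry positive coefficients and the other four genes carry negative ones, higher expression of the first two raises the score while higher expression of the remaining four lowers it.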


Establishment and assessment of the risk signature for ccRCC. Notes: (A) Relative importance of each DEG in the prognosis of ccRCC patients, calculated with the Random Forest algorithm. For the (B) training cohort and (C) validation cohort, the diagrams are the risk score map (left top), survival status map (left middle), gene expression heat map (left bottom), time-dependent ROC curves (right top), and Kaplan–Meier survival curves (right bottom). ROC: receiver operating characteristic.
Clinical parameters-based model and comparison between two prognosis prediction models
With OS as the outcome, a nomogram was constructed from the clinical features and risk score of ccRCC patients (Fig. 4A). As shown in Fig. 4A, age (p < 0.001), stage (p < 0.001), T stage (p < 0.05), and risk score (p < 0.001) were significantly associated with OS. To compare the prognostic capacity of the clinical prediction model and the 6-gene signature in ccRCC patients, decision curve analysis (DCA) was conducted, with the risk signature model labeled ‘Riskscore’, the clinical prediction model ‘Clinical’, and the combination of the two models ‘Combined’ (Fig. 4B). A noticeable difference in net benefit (NB) between the ‘Riskscore’ and ‘Clinical’ curves was observed when the threshold probability (Pt) ranged from 0.2 to 0.5, suggesting that the gene signature offers greater clinical benefit than the clinical model across the whole cohort. Comparison of the blue (‘Riskscore’) and yellow (‘Combined’) curves showed that the predictive capacity of the combined model slightly exceeded that of the gene signature alone. The calibration of the model in predicting 1-year (Fig. 4C), 3-year (Fig. 4D), and 5-year (Fig. 4E) survival probability was assessed by bootstrapping with 1000 resamples (gray line = ideal).
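For readers unfamiliar with decision curve analysis, the net benefit at a given threshold probability can be sketched as below. This uses the standard binary-outcome definition, NB = TP/n − (FP/n) × Pt/(1 − Pt); it is an assumption that this simplified form conveys the idea, since a survival outcome such as OS typically requires a time-to-event extension that is not shown here. The function names and toy data are hypothetical.

```python
def net_benefit(predicted_risk, outcome, threshold):
    """Net benefit of flagging patients whose predicted risk >= threshold.

    `outcome` holds 1 for an event and 0 for no event (binary simplification).
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(outcome)
    flagged = [(p >= threshold, y) for p, y in zip(predicted_risk, outcome)]
    true_pos = sum(1 for is_flagged, y in flagged if is_flagged and y == 1)
    false_pos = sum(1 for is_flagged, y in flagged if is_flagged and y == 0)
    return true_pos / n - false_pos / n * threshold / (1.0 - threshold)


def net_benefit_treat_all(outcome, threshold):
    """Reference curve: every patient is assumed to be high risk."""
    prevalence = sum(outcome) / len(outcome)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy predictions and outcomes for illustration only.
risk = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
event = [1, 1, 0, 1, 0, 0]

for pt in (0.2, 0.5):
    print(pt, net_benefit(risk, event, pt), net_benefit_treat_all(event, pt))
```

The "all patients low risk" reference is simply a net benefit of zero at every threshold, which is why it appears as a horizontal line in a decision curve plot.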


Decision curve analysis (DCA) of the clinical model and risk signature. Notes: (A) Nomogram of the clinical parameters-based prediction model for patient prognosis. (B) Decision curves for the different prediction models. Horizontal line: all patients assumed to be low risk (Pi < Pr); green curve: all patients assumed to be high risk (Pi > Pr); red curve: clinical prediction model; blue curve: gene signature; yellow curve: combination of the two models. (C-E) Calibration curves for the nomogram in predicting survival probability at 1 year, 3 years, and 5 years, obtained by bootstrapping with 1000 resamples in the TCGA-KIRC database.
Clinical relevance and potential bio-functional pathways of hub gene AC010973.2
AC010973.2 was selected for further study because of its importance in the risk signature (Fig. 3A). Clinical correlation analysis showed no significant relationship between AC010973.2 expression and age, gender, pathological grade, or M stage. However, AC010973.2 expression was significantly higher in stage III-IV than in stage I-II (p < 0.05), and in T stage III-IV than in T stage I-II (p < 0.05) (Fig. 5A). These results suggest that AC010973.2 may participate in tumor proliferation and growth as a pro-cancer gene.


Clinical correlation analysis and GSVA of AC010973.2. Notes: (A) Violin plots of AC010973.2 expression across groups defined by clinicopathological features in XXX patients from the KIRC database. (B) GSVA of AC010973.2 using the H (hallmark) gene set collection. The Y-axis shows the hallmark gene sets; the X-axis shows gene set expression levels in the high (yellow bars) and low (blue bars) AC010973.2 groups. (C) ROC curves for AC010973.2 alone in predicting 1-year, 3-year, and 5-year OS of ccRCC patients, with the corresponding AUC values shown in the plot.
To explore the pathways associated with AC010973.2, Gene Set Variation Analysis (GSVA) was used to detect differences in gene set signatures between the high and low expression groups. Several representative cancer pathways were closely related to AC010973.2, including E2F targets, IL6-JAK-STAT3 signaling, apoptosis signaling, hypoxia signaling, and fatty acid metabolism (Fig. 5B). Furthermore, when AC010973.2 alone was used as a predictor of OS in ccRCC patients, the ROC curves showed AUCs of 0.645, 0.616, and 0.677 for survival at one year, three years, and five years, respectively (Fig. 5C).
Upregulation of AC010973.2 expression in ccRCC tissues and cell lines
In 13 paired ccRCC tissue samples, qPCR revealed marked upregulation of AC010973.2 mRNA expression in tumor tissues (p < 0.01, Fig. 6A); this was also verified at the cellular level (Fig. 6B). ACHN and Caki-1 cells, which showed the highest AC010973.2 expression, were transfected with siRNA-AC010973.2 for further study. Knockdown of AC010973.2 was confirmed by measuring mRNA expression (Fig. 6C).


Upregulation of AC010973.2 mRNA expression in ccRCC. Notes: (A) AC010973.2 mRNA expression in tumors and non-malignant tissues from ccRCC patients. (B) AC010973.2 mRNA expression in ccRCC cell lines and normal renal tubular epithelial cells. (C) mRNA expression in AC010973.2 knockdown cell lines.
AC010973.2 promoted tumor proliferation through multiple oncogenic pathways in ccRCC
Based on GSVA, the E2F targets, JAK/STAT3, G2/M checkpoint, and apoptosis signaling pathways were significantly upregulated in the high AC010973.2 group (Fig. 5B). Notably, Caspase-3 is one of the most vital terminal effectors of apoptosis in cells, and Bcl-2 and Bax are pivotal regulators of Caspase-3. After silencing AC010973.2 in ACHN and Caki-1 cells (Fig. 6C), the expression of the three proteins was measured by Western blot. Caspase-3 and Bax protein expression increased markedly, whereas Bcl-2 was distinctly downregulated, in both si-ACHN and si-Caki-1 cells compared with ACHN and Caki-1 cells (Fig. 7A-B).


AC010973.2 regulated the apoptosis signaling pathway and promoted cell proliferation. Notes: Protein levels of Caspase-3, Bax, and Bcl-2 measured by Western blot in ACHN (A) and Caki-1 (B) cell lines after transfection with siRNA-AC010973.2; the blots were cut prior to hybridization with antibodies to conserve antibodies. (C) Colony formation assay of ACHN and Caki-1 after the knockdown of AC010973.2. (D) CCK8 assay of ACHN and Caki-1 after the knockdown of AC010973.2.
Colony formation assays revealed significantly fewer colonies in the si-AC010973.2 group than in the control group in both ACHN and Caki-1 cell lines (Fig. 7C), suggesting that AC010973.2 knockdown weakened the colony-forming ability of tumor cells. Moreover, CCK8 assays demonstrated that AC010973.2 knockdown markedly inhibited the proliferation of ccRCC cell lines (Fig. 7D). Altogether, these results indicate that AC010973.2 might promote tumor growth and survival by regulating multiple oncogenic pathways in ccRCC.

