Preloader

Single cell transcriptomic analysis reveals cellular diversity of murine esophageal epithelium

Ethical considerations

All research for the current study complies with all relevant ethical regulations. All murine studies were performed under a Temple University Institutional Animal Care & Use Committee-approved protocol (#5018).

Murine epithelial tissue collection and processing

Wild type C57Black6 mice (Cat# 000664) were purchased from Jackson Laboratories at age 10 weeks or 70 weeks. Mice were allowed to acclimate for at least 2 weeks prior to use for experiments. Mice were housed in individually ventilated caging racks on corncob bedding with 12:12 light-dark cycle. Mice had ad libitum access to filter-sterilized water and standard irradiated chow. Cages were changed every 2 weeks. Temperature was maintained at 68–72° F. Humidity was maintained between 30% and 70%. Whole esophagi were dissected from young (≤4 months; Range 12–13 weeks) and aged (≥19 months; Range 19–20 months) mice. For experiments using epithelium-enriched mucosal layer, muscle layer was physically removed using forceps then the esophagus was cut open longitudinally to expose epithelial surface. For single cell isolation, peeled esophageal epithelium of 5 young (12–13 weeks of age; 2 male, 3 female) and 5 aged (84–85 weeks of age; 2 male, 3 female) mice were incubated in 1 ml of 1X Dispase I (Corning 354235) in Hank’s Balanced Salt Solution (HBSS) (Gibco 14025-076) containing penicillin/streptomycin (1% v/v Gibco 15140-122), gentamycin (5 μg/ml, Apex 25–533), Fungizone (500 μg/ml, Genesis 25–541) for 10 min at 37 °C with shaking at 1,000 RPM (ThermoMixer F1.5 Eppendorf). Following removal from Dispase I, esophageal epithelium was chopped into 3 pieces with sharp scissors then incubated in 1 ml of 0.25% Trypsin-EDTA for 10 min at 37 °C with shaking at 1000 RPM. Trypsin and tissue pieces were forced through a cell strainer (70 μm) into a 50 ml conical tube containing 4 ml of soybean trypsin inhibitor (STI). Cells were pelleted at 1200 RPM for 5 min and pellets were then resuspended in 500 μl of complete mouse keratinocyte–serum-free medium (Gibco Cat# 37010022). Cell number and viability were measured by Automated Cell Count (Invitrogen Countess II FL) by mixing 10 μl of cell suspension with 10 μl 0.4% trypan blue solution (1:1). For single-cell experiments, at least 300,000 cells were isolated from each mouse, serving as individual biological replicates. Dead cells were removed by Miltenyi Biotec dead cell removal kit (Cat# 130-090-101) and OctoMACS starting kit (130-042-108) according to manufacturer’s instructions. Cells with 80–95% viability were used for single cell encapsulation. For downstream molecular studies, epithelium-enriched mucosal layer or whole esophagus were processed as described below.

scRNA library preparation and sequencing

The single cell droplets were generated with chromium single-cell controller using Chromium Next GEM Single Cell 3’ Kit v3.1 (10x Genomics, cat# 1000121). 5000–7000 cells were collected to make cDNA at the single cell level. Full-length cDNA with UMI was synthesized via reverse-transcription in the droplet. After PCR amplification and purification, cDNA was fragmented to around 270 bp and the Illumina adapters with index were ligated to the fragmented cDNA. After PCR, purification, and size selection, the single cell RNA libraries were 450 bp in length and sequenced on Illumina sequencer at R1 = 28 bp, R2 = 91 bp.

Deconvolution of scRNA-seq reads

Deconvolution of scRNA-Seq reads followed the 10X Genomics Cell Ranger (v6.0.0) pipeline48. Massively parallel digital transcriptional profiling of single cells was performed using the command ‘cellranger count with FASTQ files’ as input from each sample. For cellranger count, R1 and R2 were trimmed to 28 and 91 bp, respectively, to remove PCR adapters. The mouse genome mm10 (GENCODE vM23/Ensembl 98) was used as the reference for genome alignment and feature counting. From the output, the filtered matrices are used for downstream analyses.

Data filtering, integration, dimensionality reduction, and clustering

The matrices for each murine peeled epithelium sample were imported and transformed into Seurat (v4) objects for further processing. Cells with over 10% of their transcripts consisting of mitochondrial genes, over 3,000 unique genes, and over 10,000 total UMI were excluded to remove doublets or dead/dying cells. A total of 173,396 cell reads (replicate reads from 45,003 unique cells) passed this threshold for further analyses. Analysis of the filtered matrices follows the Seurat integration workflow described by Stuart, et. al.25 using the SCTransform function, which normalizes counts while accounting for read depth and subsequently searches for the top 2000 most variable feature per sample with the corrected counts for integration. Reciprocal PCA was then used to find integration transcript anchors between all of the matrices. Genes used for integration were ranked by the number of matrices they appear in. From this point on, dimensionality reduction used the genes and values that were pre-processed using the integration workflow. However, raw and normalized counts were stored for downstream differential expression tests. The resulting dataset was then reduced dimensionally via PCA, resulting in 30 principal components. An elbow plot was generated to see the standard deviations of each component, which verifies that the first 30 principal components contain most of the sources of variation in the dataset. To capture all the sources of variation in the dataset, all principal components were then used as input to the UMAP dimensionality reduction procedure (arXiv:1802.03426). Because of our interest in the relationship between cell cycle phases and cell fates, we opted not to regress cell cycle genes in our dimensionality reduction steps. A Shared Nearest Neighbor (SNN) graph was then constructed with the principal components of PCA by first determining the nearest neighbors for each cell and subsequently creating the SNN guided by the neighborhood overlap between each cell and its nearest neighbors. Clusters were then determined by a modularity optimization algorithm by Waltman and van Eck49. The initial clustering discovered non-epithelial populations that were subsequently excised, leaving in 172,283 cell reads from 44,679 unique cells identified as epithelial and used for further analyses. The remaining cells were then re-clustered with a repeat of the integration workflow described above. To choose an optimal clustering resolution, a clustering tree was generated with the R package clustree which depicts the movement of cells across clusters as resolution is increased from 0.1 to 1 with 0.1 increments. The resolution 0.2 was chosen, as it is the earliest resolution that created several clusters that were stable as the resolution was increased, as well as having a minimal number of clusters that were composed of multiple clusters from the next lowest resolution. Described dimensionality reduction and clustering procedures were also used for the murine organoid samples. Because of the presence of dead and dying cells, cells with <1500 unique genes were excluded from the dataset before dimensionality reduction and clustering. As a result, 9487 unique cells were used for dimensionality reduction and clustering for further analyses.

Cell cluster analyses

For each cluster, DEGs were calculated by comparing the expression of genes within the cells of the cluster over the expression of the genes in all other clusters. The statistical workflow to determine differential expression was Seurat’s implementation of the Wilcoxon’s signed-rank test. The significance cutoff for DEGs is a Bonferroni-adjusted p value of 0.05, and the fold change cutoff is below −0.25 or above 0.25 natural log fold change. To characterize the differential regulation of pathways in each cluste, DEGs that pass the cutoff from each cluster were exported into “IPA [http://www.ingenuity.com]” (Qiagen) for IPA core and comparison analysis. The Seurat function CellCycleScoring was used to predict the cell cycle of each cell. The function takes as input a list of S phase upregulated genes and G2/M upregulated genes, and outputs the score for each phase. The S and G2/M phase genes were provided within the Seurat package as objects “cc.genes.updated.2019$s.genes” and “cc.genes.updated.2019$g2m.genes”, respectively. The cell cycle phase is determined by the dominating score. Cells with weak scores for both phases are classified as G0/G1 phase cells. To compare each clusters’ proportional size between the different age groups, each sample’s cluster proportions were calculated, and a Wilcoxon signed-rank test was performed to compare the mean cluster proportions between the two age groups.

Pseudotime

For pseudotemporal inference, cells analyzed with Seurat were exported to Monocle 3. Pre-processing in Monocle 3 follows the methodology outlined by Cao et al.26, which includes a batch correction step treating each mouse as a different batch to find a commonly shared reduced dimension. To verify that the cluster stratification of our primary murine epithelial dataset was reproducible, we used Monocle’s default parameters without any changes. Monocle 3’s procedure results in a UMAP structure, from which a trajectory graph was inferred. To model how epithelial cells cycle and assume different cell fates, the cells of basal cluster 6 prior to entrance into the S phase of the cell cycle were chosen as the root population. Finally, to further elucidate the biological processes that govern epithelial proliferation and differentiation, Monocle 3 was also used to find modules of co-expressed genes.

Bulk RNA-seq comparison of epithelial and stromal tissue

To identify DEGs between epithelial and stromal tissue, FASTQ files from both epithelial and stromal bulk RNA-Seq experiments were first aligned to the GRCm38.p6 mouse genome from GENCODE using the library Rsubread on R50. The resulting BAM files were then summarized at the gene level and counted using Rsubread’s featureCounts functionality, producing a counts matrix for all the epithelial and stromal samples and their gene counts. The count matrix was then used to compare epithelial and stromal gene counts using the library DESeq2 in R51 with an alpha of 0.05.

In situ studies

RNA fluorescence in situ hybridization (FISH), immunohistochemistry (IHC), and immunofluorescence (IF) were performed in formalin-fixed paraffin-embedded murine esophageal specimens. IHC was performed for COL17A1 (Invitrogen, MA5-24848l, Clone 2C3 1:100), ATP1B3 (Abcam, ab137055; Clone EPR8981, 1:100) and CNFN (Novus Biologicals, NBP2-14668; 1:100). Slides were counterstained with Hematoxylin and imaged on a Leica DM30 microscope at 400X magnification. IF was performed for KRT14 (NeoMarkers, MS-115-PABX; Clone LL002; 1:200). Slides were counterstained with DAPI and imaged at 200X magnification. IHC and IF were performed using standard protocols39,43. RNA FISH was performed using RNAscope technology (Advanced Cell Diagnostics) following the manufacturer’s protocol and RNAscope probes for murine Krt5, Krtdap, positive control, and negative control. Slides were counterstained with DAPI and imaged on a Leica SP8 confocal microscope at 400X magnification.

Immunoblotting

2 × 105 primary murine epithelial cells were seeded in 6-well plates. After 72 h, cells were treated with 0.018 or 0.6 mM CaCl2 for an additional 72 h. Cells were lysed in cell lysis buffer (Cat# 9830 S, Cell Signaling Technology) containing protease/phosphatase inhibitor cocktail (Cat# 5872 S, Cell Signaling Technology). Protein concentration was determined by Qubit™ protein assay kit (Cat# Q33211, Invitrogen). Protein samples were solubilized in NuPAGE™ LDS Sample Buffer (Cat# NP0007, Invitrogen) and denatured with NuPAGE™ sample reducing agent (Cat# NP0009, Invitrogen) containing 50 mM dithiothreitol. 30 μg of denatured protein was fractionated on NuPAGE™ Bis-Tris 4–12% gel (Cat# NP0335BOX, Invitrogen). Following electrotransfer, Immobilon-P PVDF membranes (Cat# IPVH00010, Millipore Sigma) were blocked in blocking buffer containing 5% nonfat milk (Cat# LP0031B, ThermoFisher Scientific) in PBST (PBS and 0.1% Tween 20) for 1 h at room temperature. Membranes were then incubated overnight with primary antibodies diluted in blocking buffer and then with the appropriate HRP-conjugated secondary antibody for 1 h at room temperature. β-actin served as a loading control. A list of antibodies with dilutions used is provided in Supplementary Table S1.

3D organoid assays

Murine esophageal 3D organoid formation assays were performed on freshly isolated primary murine epithelial cells (PMECs)43. Briefly, a single cell suspension of PMECs in keratinocyte serum free medium was mixed with 90% Matrigel. For each well of a 24-well plate, 500–1000 cells in 50 μl Matrigel were seeded to initiate 3D organoid formation. After solidification, 500 μl of advanced DMEM/F12 supplemented with 1X Glutamax, 1X HEPES, 1X penicillin-streptomycin, 1X N2 Supplement, 1X B27 Supplement, 0.1 mM N-acetyl-L-cysteine, 50 ng/ml human recombinant EGF, and 2.0% Noggin/R-Spondin-conditioned media was added and replenished every 3–5 days. At the time of plating, 10 μM Y27632 was added to the culture medium. Organoids were grown for 15 days before recovering from Matrigel with Dispase I. Then, organoids were dissociated in 1 ml of 0.25% Trypsin-EDTA for 1 h at 37 °C with shaking at 1000 RPM. Trypsin and cells were forced through a cell strainer (70 μm) with 4 ml of 250 μg/ml STI in 1X PBS. Cells were pelleted at 1000 RPM for 5 min then resuspended in 500 μl of complete mouse KSFM. Cell number and viability were measured by Automated Cell Count.

Imputation of cell populations in scRNA-seq data from human biopsy specimens and 3D murine

Esophageal organoids

Human epithelial scRNA-Seq data was obtained from Nowicki-Osuch and Zhang et al.24. A reference dataset for the normal human esophagus was made publicly available on the “Esophagus Cancer Atlas [https://www.esophaguscancercellatlas.org]”, from which the R object NE.rds was downloaded. For each cell in either the murine esophageal organoid or human epithelial dataset, an imputation was done to infer which of the cell populations in our integrated murine peeled epithelium data is analogous. This is done with Seurat’s label transfer pipeline (FindTransferAnchors and TransferData), which finds anchors between the reference integrated murine epithelium dataset and each query dataset and subsequently transfers the cell population labels onto the cells in the query datasets. Population labels are chosen based on which of the reference population has the max prediction score (a scale ranging from 0 to 1 signifying confidence for each label) for each of the query cells.

Mitochondrial assays

The activity of mitochondria Complex I was measured in peeled murine epithelium-enriched mucosal layer using Complex I Enzyme Activity Assay kit (Abcam, ab109721) according to the manufacturer’s instructions. Briefly, epithelium-enriched mucosal layer from young and aged mice was suspended in 500 µl chilled PBS and completely homogenized using a Dounce homogenizer with 20–40 passes. Protein lysis buffer was added to the tissue for protein extraction followed by 30 min ice incubation to allow solubilization. Samples were centrifuged at 16,000 x g for 20 min at 4 °C. Supernatant was collected as tissue lysate and diluted to a desired concentration after protein estimation. Tissue lysate was added to 96-well microplates precoated with capture antibodies specific for Complex I. Once target was immobilized, Complex I activity was determined following the oxidation of NADH to NAD and the simultaneous reduction of a dye. Absorbance was measured at OD = 450 nm using a spectrophotometer. mtDNA level was measured by qPCR of DNA from epithelium-enriched mucosal layer. DNA isolation was performed using DNeasy Blood and Tissue Kit (Qiagen Cat# 69506) according to the manufacturer’s instructions. qPCR was performed using PowerUp SYBR green master mix (Thermo Fisher) with the following primers: Ikbβ For: GCTGGTGTCTGGGGTACAGT Rev: ATCCTTGGGGAGGCATCTAC, and mtDNA D-Loop Fwd:ACTATCCCCTTCCCCATTTG Rev: TGTTGGTCATGGGCTGATTA. The relative fold change between samples of mtDNA D-loop was calculated with normalization to the nuclear encoded Ikbβ.

Statistics

Descriptive statistics are presented as mean ± standard error of the mean or median (minimum–maximum) for continuous variables and frequency counts (percentages) for categorical variables. Two-sample t test or Wilcoxon rank-sum test and one-way analysis of variance or Kruskal–Wallis test comparing two and three groups, respectively for continuous variables and chi-square test or Fisher’s exact test for categorical variables were used.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Source link