Spatial transcriptomics using combinatorial fluorescence spectral and lifetime encoding, imaging and analysis

MOSAICA workflow

In a typical MOSAICA workflow (Fig. 1), primary oligonucleotide probes, each carrying a target region (25–30 bases long) complementary to an mRNA target, are incubated with fixed cell or tissue samples (Fig. 1a, b). These primary probes also contain an adjacent adaptor region consisting of two readout sequences for modular secondary probe binding. In this study, double-ended secondary probes with fluorophores on each end are hybridized to the readout region on the primary probes (Fig. 1c). Through combinatorial labeling, each target is encoded with dyes that give it a distinct spectral and lifetime signature. The labeled samples are then imaged using a custom-built or commercial microscope (e.g., the Leica SP8 Falcon used in this study) equipped with spectral and lifetime imaging capabilities (Fig. 1d). Both spectral and fluorescence lifetime data are captured and then analyzed using phasor plots (Fig. 1e). Finally, our automated machine learning algorithm and a codebook reveal the locations, identities, counts, and distributions of the mRNA targets present in a 3D context (Fig. 1f).

Probe design pipeline

To rapidly design oligonucleotide probes for the transcript of each gene, we modified OligoMiner28, a validated Python pipeline for rapid design of oligonucleotide FISH probes. Briefly, as shown in Supplementary Fig. 1a, the blockParse.py script screens the mRNA or coding sequence file of the target gene and outputs a file of candidate probes, while allowing us to maintain consistent and customized length, GC content, melting temperature, spacing, and prohibited sequences. Using Bowtie2, the candidate probes are rapidly aligned to the genome to provide specificity information, which the outputClean.py script uses to generate a file of unique candidates only. The primary probes comprise a complementary sequence of typically 27–30 nucleotides and are designed mostly within the coding sequence region, which has less variation than the untranslated region20. We wrote a script, seqAnalyzer.py, to automate the alignment of primary probes to sequencing data (Supplementary Fig. 1b) so that probes that aligned to regions of lower read counts would be discarded. Furthermore, primary probe “readout” domains and secondary probes (typically 15–20 nucleotides long) are designed to be orthogonal to each other to avoid off-target binding. Libraries and databases of over 200,000 orthogonal sequences are available online, and we simply used those that have been previously validated29. Fluorophores exhibiting distinct spectra (typically with excitation/emission spectra in the 400–700 nm range) and lifetimes (typically in the 0.3–10 ns range) can be conjugated to oligos; ours were obtained from commercial vendors (see Methods).
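The screening step can be pictured with a much-simplified stand-in for candidate generation. The sketch below is not the blockParse.py script and does not use OligoMiner's interface: the function names, the GC window (40–60%), the homopolymer list, and the 2-nt spacing are all assumptions chosen for illustration, and the melting-temperature and genome-alignment filters are omitted.

```python
# Simplified candidate screen (illustrative only; not OligoMiner's blockParse.py).
# All thresholds below are assumed values, not the settings used in the study.
# Input is expected in the DNA alphabet (A/C/G/T).

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def gc_fraction(seq):
    """Fraction of G or C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def candidate_probes(target, length=27, gc_min=0.40, gc_max=0.60,
                     prohibited=("AAAAA", "CCCCC", "GGGGG", "TTTTT"),
                     spacing=2):
    """Scan the target 5'->3' and keep non-overlapping windows that pass.

    Returns (start, probe) tuples, where probe is the reverse complement
    of the accepted target window.
    """
    target = target.upper()
    probes = []
    i = 0
    while i + length <= len(target):
        window = target[i:i + length]
        passes = (gc_min <= gc_fraction(window) <= gc_max
                  and not any(p in window for p in prohibited))
        if passes:
            probes.append((i, reverse_complement(window)))
            i += length + spacing  # enforce a gap before the next probe
        else:
            i += 1  # slide by one base and try again
    return probes
```

Calling candidate_probes on a coding sequence returns the start position and probe sequence of each accepted window; in a real pipeline these candidates would still need the alignment-based specificity check described above.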

Probe labeling validation and optimization

We first investigated the specificity of our labeling condition using a simple cell mixture model comprising wild-type HEK293T-X cells and HEK293T-X cells engineered with mNeonGreen (Supplementary Fig. 2a), with mNeonGreen mRNA as the gene expression target. Since only fluorescent mNeonGreen-positive cells can express the corresponding mRNA transcripts, this cell mixture model provides a straightforward tool to assess specificity and nonspecific binding. Using a Nikon epifluorescence microscope to image the samples following staining with primary and secondary probes (all probe sequences used in this study are provided in Supplementary Data 1), we detected on average 43.5 puncta per mNeonGreen-positive cell (n = 76 cells) and 0.25 puncta per wild-type cell (n = 164) (Supplementary Fig. 2b, c), indicating minimal nonspecific binding with our probe labeling strategy. To further validate the baseline level of nonspecific binding, we included a negative control with the primary probe designed toward dopachrome tautomerase, a gene in the mouse genome that is not expressed in our HEK293T-X model system, along with a condition with secondary probes only. Similarly, an average of 43.5 puncta per cell was detected for the mNeonGreen cells, whereas a mean of 2.5 puncta per cell, with a lower signal-to-noise ratio, was detected for the wild-type cells and negative controls. We next optimized labeling efficiency by testing the number of primary probes and the incubation times of primary and secondary probes (Supplementary Fig. 3). We determined our optimal condition to comprise a minimum of 12 primary probes for each target mRNA (in practice, we always maximize the number of primary probes per mRNA depending on the size of the mRNA). Accordingly, 40 primary probes per channel per mRNA were used in subsequent experiments, with incubation times of 16 h for primary probe hybridization and 1 h for secondary probe hybridization.

Imaging and phasor analysis

Lifetime imaging is a tool that measures the spatial distribution of probes with different fluorescence lifetimes. Samples are stimulated with modulated or pulsed lasers at a particular frequency, typically around 40–80 MHz, which allows the fluorescence to decay within the stimulation period, typically in the ns range. After acquiring for sufficient time, i.e., after enough laser pulses or periods, one can construct a histogram of photon arrival times at each pixel. This histogram has a rapid rise followed by a faster or slower decay that is characteristic of the fluorescent molecule(s) present in the pixel. To model this decay data, an exponential decay model can be fitted, or alternatively one can use the fit-free phasor approach30,31. We used the second approach because it requires no a priori knowledge of an underlying model (i.e., the number of fluorescent species at the pixel) and it is computationally inexpensive by virtue of the Fast Fourier Transform algorithm. The phasor transform extracts two values from the decay curve that characterize its shape (and importantly not its size, so that the transform is independent of the number of photons), and these two values, namely S and G, correspond to the two coordinates of the pixel on the phasor plot (see equations in Supplementary Note 1). The values are obtained by integrating the product of the decay with each of two trigonometric functions, sine and cosine, whose period matches the stimulation period, and they correspond to the first-order terms of the Fourier series decomposition of the decay curve.
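To make the transform concrete, here is a minimal sketch of the first-harmonic phasor computed directly from a photon arrival-time histogram. It is not the analysis code used in the study: the function names, the 256-bin histogram, and the 80 MHz repetition rate are assumptions, and the sums are written out explicitly rather than taken from an FFT. For a single-exponential decay with lifetime τ and angular frequency ω, the result should land at G = 1/(1 + ω²τ²), S = ωτ/(1 + ω²τ²).

```python
import math


def phasor(histogram, harmonic=1):
    """Return (G, S) for a histogram sampled uniformly over one laser period.

    G and S are the photon-normalised cosine and sine sums, so scaling the
    histogram (more or fewer photons) leaves the result unchanged.
    """
    n = len(histogram)
    total = sum(histogram)
    if total == 0:
        raise ValueError("empty histogram: no photons in this pixel")
    g = 0.0
    s = 0.0
    for k, counts in enumerate(histogram):
        phase = 2.0 * math.pi * harmonic * (k + 0.5) / n  # bin centre
        g += counts * math.cos(phase)
        s += counts * math.sin(phase)
    return g / total, s / total


def mono_exponential(tau_ns, rep_rate_mhz=80.0, bins=256):
    """Ideal single-exponential decay sampled over one period (assumed setup)."""
    period_ns = 1e3 / rep_rate_mhz
    dt = period_ns / bins
    return [math.exp(-(k + 0.5) * dt / tau_ns) for k in range(bins)]


g, s = phasor(mono_exponential(tau_ns=1.0))
omega_tau = 2.0 * math.pi * 80e6 * 1.0e-9
print(g, 1.0 / (1.0 + omega_tau ** 2))        # the two values should agree closely
print(s, omega_tau / (1.0 + omega_tau ** 2))
```

Because the output depends only on the shape of the histogram, a dim pixel and a bright pixel containing the same fluorophore map to the same phasor position, up to photon noise.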

Similarly, if one uses a spectral detector, i.e., a separate detector for different spectral bands, then for each pixel, one can obtain another histogram, in this case with the number of photons arriving in each channel, i.e., at each wavelength. This curve can also be transformed to an analogous spectral phasor space to map the recorded spectra at each pixel onto the 2D spectral phasor space32,33. Combining the lifetime measurement with a spectral detector, one effectively has a 5-dimensional space in which to characterize each pixel. On top of the spatio-temporal coordinates (x,y,z,t), each pixel now carries information in five additional coordinates: its intensity value (however many photons arrived at that pixel), the two phasor coordinates for the lifetime phasor transform, and the two phasor coordinates for the spectral phasor transform34. A typical image, on the order of 10⁶ pixels, obtained with this method provides 10⁶ points in this 5D space34. If the sample presents different populations of fluorescent molecules at different locations, the pixel phasor data at these different locations map to different positions in this phasor space and a clustering technique can be used to resolve each population35.
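The five per-pixel coordinates can be assembled as in the sketch below, which applies the same first-harmonic transform to the spectral histogram (photons per channel) and to the lifetime histogram (photons per time bin). The function names and the bin-centre phase convention are assumptions made for illustration; other spectral phasor conventions exist.

```python
import math


def first_harmonic(counts):
    """Photon-normalised first-harmonic (cosine, sine) coordinates of a histogram."""
    n = len(counts)
    total = sum(counts)
    if total == 0:
        raise ValueError("no photons in this pixel")
    g = sum(c * math.cos(2.0 * math.pi * (k + 0.5) / n)
            for k, c in enumerate(counts))
    s = sum(c * math.sin(2.0 * math.pi * (k + 0.5) / n)
            for k, c in enumerate(counts))
    return g / total, s / total


def pixel_features(spectrum_counts, decay_counts):
    """Return (intensity, G_spectral, S_spectral, G_lifetime, S_lifetime)."""
    intensity = sum(spectrum_counts)
    g_spec, s_spec = first_harmonic(spectrum_counts)
    g_life, s_life = first_harmonic(decay_counts)
    return (intensity, g_spec, s_spec, g_life, s_life)


# One pixel: photons per spectral channel and photons per arrival-time bin.
print(pixel_features([10, 0, 0, 0], [5, 3, 1, 1]))
```

Stacking these tuples over all pixels gives the point cloud that a clustering step would then partition.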

There is a direct analogy between the phasor transform in spectral and lifetime fluorescence microscopy (Fig. 2). As an example in this figure, we use a hypothetical experiment where transcripts from 4 different target genes are targeted with 4 fluorescent species. We construct the example so that two of the fluorescent species emit in one color and the other two in another color. At the same time, within each color, one has a short lifetime and the other has a long lifetime. This hypothetical sample is excited, and the individual photons are detected at each pixel (Fig. 2a). In each pixel, we accumulate enough photons to build a spectral histogram and a lifetime histogram (Fig. 2b). These curves are phasor-transformed to reveal two distinct populations in each phasor space, corresponding to the two colors and the two lifetimes. By means of our previously published automatic clustering using machine learning35, we identify these populations and return to the image space to label each pixel depending on the group it belongs to in the phasor space (Fig. 2c). By combining the spectral and lifetime information, we have automatically segmented the image into regions, i.e., identified the pixels that belong to the different species (Fig. 2d). Again, note that in this example in Fig. 2, we have chosen the probes to be the most convoluted case possible: one pair shares a similar spectrum and the other pair shares another spectrum. At the same time, one member of each pair shares a similar lifetime with one member of the other pair, and the two remaining members share another lifetime. This is why, even though there are four distinct fluorescent probes, only two populations are detected in each of the spectral and lifetime phasor spaces, and the combinations of these populations yield the four distinct groups. The four probes cannot be resolved unless both the lifetime and spectral information are accessed.

Fig. 2: Image and phasor analysis with spectrum and lifetime analysis in MOSAICA.

a As an example, four different probes are used to target the transcripts of four different genes. The fluorescence is collected using the spectral and fluorescence lifetime imaging microscopy (FLIM) instrument to form images where each pixel carries information on the spectra and lifetime. b At each pixel we compute the photon distribution in the spectral and temporal dimensions. The phasor transform maps these distributions in each pixel to a position in the phasor space. c The phasor plots reveal the presence of different populations. These populations are identified and then mapped back to the original image. d We color code the pixels based on the combination of the two properties. This allows us to separate by lifetime probes that emit with similar spectra and, vice versa, to separate by spectra probes that fluoresce with similar lifetimes.

Combinatorial target spectral and lifetime encoding and decoding

In the previous section, we showed how combining the time dimension with the spectral dimension enhances multiplexing, squaring the number of targets that can be resolved. To further increase multiplexing and improve detection efficiency, we employ combinatorial labeling, a method in which targets are labeled with two or more unique fluorophores, to greatly increase the number of targets we can label with a given number of fluorophores/probes. To illustrate this concept, here we demonstrate a minimal working example of combinatorial labeling where two probes are used to label three targets. In this situation, each probe labels one target and the third target is labeled with both probes simultaneously. Figure 3 shows a real case with such a configuration, both for spectra and for lifetime. The cartoon (Fig. 3a) represents the case of using two probes with distinct spectra. When imaging this sample, we can use two spectral channels (Fig. 3b, c), where some targets appear in only one channel, other targets appear in only the other channel, and the target that is labeled with both probes appears in both channels. All targets are then detected and color-coded depending on their presence in one channel, the other, or the two simultaneously (Fig. 3d), and the overall counts of each combination in the field of view can be provided (Fig. 3e).

Fig. 3: Working example of combinatorial labelling of three mRNA targets with two probes.

a Transcripts of three different target genes are tagged using two probes with different spectra. Targets 1 and 3 are tagged each with one probe, Target 2 is tagged with both simultaneously. b, c The fluorescence is collected in the two expected spectral channels for the known emission of the two probes (representative small regions of a whole 3D field of view). d The maximum projection of the two channels is shown and pseudo-colored depending on the presence in the respective channels (as an inset within the whole field of view). e The actual counts of each target within the whole field of view. f As a parallel example, transcripts of three different target genes are tagged using two probes with different lifetime. Targets 1 and 3 are tagged each with one probe, Target 2 is tagged with both simultaneously. g The phasor plot presents three populations, corresponding to the pixels with the three combinations; the two components by themselves plus the linear combination falling in the middle. h Machine learning clustering technique is used to identify the groups (Gaussian mixture model). i The multicomponent method is used to extract the fraction of one of the components in each detected puncta. j The same inset is shown with the pseudocoloring now depending on the lifetime clustering. k The counts for each lifetime cluster in the whole field of view. l The combination of the information in both the spectral and the lifetime dimension yields a final 6-plex. m The overall counts for the 6-plex detection of transcripts including POLR2A (Alexa647 & ATTO565), MTOR (ATTO647 & ATTO565), KI67 (Alexa647 & ATTO647), BRCA1 (Alexa647), NCOA2 (ATTO647), NCOA3 (ATTO565) with the appropriate expressed genes that correspond to each combination. Experiments were conducted with cultures of mNeonGreen cells. Scale bar 10 µm in large image and 2 µm in insets. Source data are provided as a Source Data file.

Similarly, we show a case in which the targets are now labeled with two probes that have similar spectra but different lifetimes (Fig. 3f). In this case, we also introduce the use of the phasor approach to reveal the three expected populations: the pixels that contain both probes appear at the midpoint between the phasor positions of the pixels that contain only one of the probes. Figure 3g shows the phasor distribution obtained from the same field of view as in the spectral example, in which we also show the theoretical locations of the probes (corresponding to Alexa647 and ATTO647 with respective lifetimes of 1 ns and 3.5 ns). As is expected in real experimental conditions, there are additional fluorescent components in the sample. We broadly refer to the bulk of these additional components as autofluorescence, which pulls the data away from the expected positions and toward the mean phasor position of the autofluorescent components. We have previously shown that Gaussian mixture models are the optimal machine learning clustering approach for modeling phasor data35, and we use this technique to infer the phasor locations of the probe combinations (Fig. 3h). We can now classify each pixel of the original image into one of the clusters and obtain a probability of belonging to each, i.e., the posterior probability of the model. This allows us to color code the transcripts depending on their assignment to one of the three clusters (Fig. 3j) and obtain the counts of the three lifetime components (Fig. 3k). Additionally, we apply our lifetime multicomponent analysis technique36, in which for each detected puncta we estimate the fraction of one of the lifetime components, in this case lifetime1 (Alexa647, purple in the figure), and obtain the expected result: there are clearly three populations with respective fractions centered around 0, ½, and 1 (Fig. 3i).
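The midpoint behavior follows from the fact that the phasor of a mixture is the photon-weighted average of the phasors of its components. The sketch below illustrates this with a simple two-component projection; it is not the multicomponent method of ref. 36, and the function names and the 80 MHz repetition rate are assumptions. The lifetimes of 1 ns and 3.5 ns are the ones quoted above.

```python
import math


def lifetime_phasor(tau_ns, rep_rate_mhz=80.0):
    """Theoretical phasor (G, S) of a single-exponential decay."""
    omega_tau = 2.0 * math.pi * rep_rate_mhz * 1e6 * tau_ns * 1e-9
    denom = 1.0 + omega_tau ** 2
    return 1.0 / denom, omega_tau / denom


def component_fraction(point, pure_a, pure_b):
    """Photon fraction of component A for a phasor between pure A and pure B.

    The point is projected onto the segment B -> A and the result is
    clipped to [0, 1] so that noisy points stay in a valid range.
    """
    dx = pure_a[0] - pure_b[0]
    dy = pure_a[1] - pure_b[1]
    t = ((point[0] - pure_b[0]) * dx + (point[1] - pure_b[1]) * dy) / (dx * dx + dy * dy)
    return min(1.0, max(0.0, t))


alexa = lifetime_phasor(1.0)   # shorter lifetime quoted in the text
atto = lifetime_phasor(3.5)    # longer lifetime quoted in the text
both = ((alexa[0] + atto[0]) / 2.0, (alexa[1] + atto[1]) / 2.0)
print(component_fraction(both, alexa, atto))  # equal photon contributions
```

Note that the fraction recovered this way is a photon fraction, so probes of unequal brightness would shift the doubly labeled population away from ½.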

In the general case, we combine the lifetime and spectral dimensions, and we perform the clustering of the data in a 4D spectral/lifetime phasor space. The clustering technique has the power to not only identify which puncta belong to each cluster but also to assign a probability of belonging to that cluster, which can be used to quantify the certainty of the labeling. For example, in the inset in Fig. 3j, we show two cases of puncta that have relatively low confidence in the cluster assignment; they are depicted with blended colors because they fall in the regions of the phasor space where the two clusters are merging.
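The cluster-membership probability can be sketched as the posterior of a Gaussian mixture evaluated at a punctum's phasor coordinates. The example below assumes the cluster weights, means, and widths are already known and uses isotropic Gaussians for brevity; the fitting step and the full covariance structure of the published clustering35 are omitted, and all names and numbers are hypothetical. The code is dimension-agnostic, so the same function applies to 2D or 4D phasor coordinates.

```python
import math


def gaussian_density(point, mean, sigma):
    """Isotropic Gaussian density in any number of dimensions."""
    dist2 = sum((p - m) ** 2 for p, m in zip(point, mean))
    dim = len(point)
    norm = (2.0 * math.pi * sigma ** 2) ** (dim / 2.0)
    return math.exp(-dist2 / (2.0 * sigma ** 2)) / norm


def posteriors(point, clusters):
    """Posterior probability of each cluster given the point.

    clusters is a list of (weight, mean, sigma) tuples (assumed known).
    """
    weighted = [w * gaussian_density(point, mean, sigma)
                for w, mean, sigma in clusters]
    total = sum(weighted)
    if total == 0.0:
        raise ValueError("point is too far from every cluster")
    return [value / total for value in weighted]


# Two hypothetical clusters in a 2D phasor plane.
clusters = [(0.5, (0.2, 0.3), 0.05), (0.5, (0.6, 0.3), 0.05)]
print(posteriors((0.2, 0.3), clusters))  # confidently the first cluster
print(posteriors((0.4, 0.3), clusters))  # ambiguous: between the clusters
```

A punctum whose largest posterior is close to 1 is assigned with high confidence, while roughly equal posteriors correspond to the blended colors described above.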

In this combinatorial example in Fig. 3, the three clusters in the lifetime domain multiplexed with the channel-based detection in the spectral domain yield a 6-plex image using only 3 probes (Fig. 3l, m). The specific transcripts for genes targeted for this experiment with the combined probes were POLR2A (Alexa647 & ATTO565), MTOR (ATTO647 & ATTO565), KI67 (Alexa647 & ATTO647), BRCA1 (Alexa647), NCOA2 (ATTO647), NCOA3 (ATTO565). In the general combinatorial experiment using pairs drawn from N probes, the total number of possible target genes grows quadratically:

$$\left(\begin{array}{c}N\\ 2\end{array}\right)=\frac{N!}{2(N-2)!}=\frac{{N}^{2}-N}{2}$$

(1)
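As a quick check of Eq. 1, the pair codes can be enumerated directly. This is a minimal sketch with an assumed helper name (pair_codes); the three pair assignments in the dictionary are the ones quoted in the text for the Fig. 3 example, and the single-fluorophore codes of that example are deliberately left out because Eq. 1 counts pairs only.

```python
from itertools import combinations
from math import comb


def pair_codes(fluorophores):
    """All unordered two-fluorophore codes available from the given set."""
    return [frozenset(pair) for pair in combinations(fluorophores, 2)]


# Pair assignments quoted in the text for the Fig. 3 example.
FIG3_PAIR_CODEBOOK = {
    frozenset({"Alexa647", "ATTO565"}): "POLR2A",
    frozenset({"ATTO647", "ATTO565"}): "MTOR",
    frozenset({"Alexa647", "ATTO647"}): "KI67",
}

for n in (3, 5, 8):
    # Enumerated count, binomial coefficient, and closed form of Eq. 1.
    print(n, len(pair_codes(range(n))), comb(n, 2), (n * n - n) // 2)
```

With three fluorophores the enumeration returns exactly the three pairs in the dictionary, and with five it returns ten codes.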

Simultaneous 10-plex mRNA detection in fixed colorectal cancer SW480 cells using MOSAICA

We next applied MOSAICA to a 10-plex panel of mRNA targets in colorectal cancer SW480 cell culture samples. This cell line was chosen because its xenograft model exhibits spatial patterns of heterogeneity in WNT signaling37, which will allow us to study tumorigenesis in the spatial context and potentially identify cancer stem cell populations in colorectal cancer in future studies. Here, we selected this model as a validation platform to demonstrate the multiplexing scalability and error-detection capabilities of our approach. We began by first identifying a set of 10 genes with known expression levels from our bulk sequencing data. Using the aforementioned probe design pipeline, we designed 80 probes (two pairs of 40 probes) for the transcript of each gene: BRCA1, BRCA2, CENPF, CKAP5, POLR2A, KI67, MTOR, NCOA1, NCOA2, and NCOA3. These genes were chosen due to their housekeeping status or involvement in tumorigenesis in colorectal cancer. By encoding the transcript of each gene with a distinct combination of two fluorophores, we generated a codebook of 10 labelling combinations from only five fluorophores following Eq. 1: \(\left(\begin{array}{c}5\\ 2\end{array}\right)=10\) (Fig. 4a) (see Supplementary Table 1 and Supplementary Table 2 for the fluorophores and probes, respectively, used for each target). To assess the baseline nonspecific binding events of our assay, we included a negative probe control sample, which was labelled with primary probes not targeting any specific sequence in the human genome or transcriptome but still containing readout regions for secondary fluorescent probes hybridization (Fig. 4a, right). Matching numbers and concentrations of primary and secondary probes that were used in the 10-plex panel were used in this sample.

Fig. 4: Simultaneous 10-plex detection of transcripts for genes in colorectal cancer SW480 cells in a single round of labeling and imaging.

a 10 different gene transcripts are labeled with primary probes followed by respective and complementary fluorescent secondary probes. Each transcript is labeled with a combination of 2 out of 5 fluorophores for 10 combinations. Negative control probes (mNeonGreen, DCT, TYRP1, and PAX3) targeting transcripts not present in the sample were used with their respective secondary fluorophore probes. b Spectral image (max-projection in z) of a field of view of the labeled 10-plex sample (5-channel pseudo coloring). c Lifetime image (max-projection in z) of a field of view of the labeled 10-plex sample (phasor projection on universal circle pseudo coloring). d Spectral image of the labeled negative control probe sample. e Lifetime image of the labeled negative control probe sample. f Final puncta detection after being processed in our analysis software showing highlighted example puncta of each target (insets, right). g 3D representation of the field of view for the 10-plex sample. h Number of puncta detected for each gene target expression in each cell for the labeled 10-plex samples (overlaid lines correspond to quantiles [10,50,90]%, n=364 cells). i Mean puncta counts per cell of transcripts for each gene in the 10-plex samples (left, n=3 experimental replicates, 364 total cells profiled) and negative control probe samples (right, n=3 experimental replicates, 189 total cells profiled). j Correlation of detected puncta (mRNA puncta count) vs. RNA-bulk sequencing (normalized counts) is shown for each target (mean ± standard deviation, n=3 experimental replicates), yielding a correlation (Pearson r) of 0.96. Scale bar 20 µm in large images and 1 µm in insets. Source data are provided as a Source Data file.

An example field of view is shown in Fig. 4; first the spectral image overlay (five fluorescent channels including DAPI) of the labeled 10-plex SW480 sample (Fig. 4b) and additionally, in the same measurement, the orthogonal lifetime information attained by interrogating each pixel for their lifetime components (Fig. 4c). These pixels were phasor-transformed and pseudo-colored based on their projected phasor coordinates on the universal circle. In doing so, both dimensions of data can now be simultaneously accessed to determine which cluster of pixels meet the appropriate and stringent criteria for puncta classification. Similarly, the composite spectral and lifetime images of the corresponding negative control probe sample are shown (Fig. 4d, e). Figure 4f depicts the now detected pseudo-colored clusters which were successfully classified as one of the RNA markers. A representative inset image for each marker and its targeted detection is provided on the right. Because these are image stacks, the segmentation provides a 3D spatial distribution of the field of view, which can be rendered to visualize the spatial analysis in a 3D context (Fig. 4g).

MOSAICA employs an error-detection strategy that gates for specific and pre-encoded fluorophore combinations and rejects any fluorescent signatures that do not meet these criteria. For instance, of the total detected puncta (n = 65,562), we observed a considerable fraction, n = 25,053 (38%), that was rejected because these puncta emitted fluorescence in only a single channel (Supplementary Fig. 4c). We characterize this group as the “undetermined group” because each event can belong to: 1) the nonspecific binding of probes, 2) autofluorescent moieties, or 3) mRNA transcripts that were not fully labeled with both dyes. For the first case, as previously characterized by several groups, nonspecific binding is a common inherent issue with single-molecule FISH techniques; it arises from the stochastic binding of DNA probes to cellular components such as proteins, lipids, or nonspecific regions of RNA and follows a random distribution14,20. Together with autofluorescent moieties (e.g., porphyrins, flavins), which can exist as isolated diffraction-limited structures and emit strong fluorescence in any particular single channel38, and mRNA transcripts that were labeled with only one set of fluorophores, these events represent a confounding issue for standard intensity-based measurements and analysis: they share similar SNR and intensities with real labeled puncta and cannot be differentiated without additional lengthy or complex techniques such as sample clearing or iterative labeling and imaging with error correction39. Therefore, the main benefit of implementing the combinatorial encoding criteria is to ensure target detection fidelity by rejecting stochastic and nonspecific labeling events, as well as any event eliciting a lifetime signature that deviates from those of the utilized fluorophores.
Finally, we also observed a relatively small group of puncta (n = 2,439) emitting fluorescent signal across more than two spectral channels but still eliciting the same spectral and lifetime signatures as the utilized fluorophores. To characterize this population, we performed a simulation running 20,000 iterations of various puncta densities and fitted the corresponding exponential model that characterizes the probability of puncta overlap (described in the Methods section and Supplementary Fig. 4a, b). We obtained an interval for the fraction of puncta lost due to optical crowding ranging from 2.0 to 6.6%, which is consistent with the 2,439 puncta observed (3.7% of the total detected puncta). We name this group the “overlapping” group in Supplementary Fig. 4c.
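The gating logic described in the last two paragraphs can be summarized in a few lines. This is a minimal sketch under stated assumptions, not the MOSAICA software: the function names, the string labels, and the toy codebook are hypothetical, the lifetime-signature check is omitted, and a two-fluorophore signature absent from the codebook is simply treated as undetermined. The percentages are recomputed from the counts reported above.

```python
def gate_punctum(active_fluorophores, codebook):
    """Classify a punctum from the set of fluorophores detected in it.

    One (or no) fluorophore -> "undetermined"; more than two -> "overlapping";
    exactly two -> the target assigned to that pair in the codebook.
    """
    active = frozenset(active_fluorophores)
    if len(active) < 2:
        return "undetermined"
    if len(active) > 2:
        return "overlapping"
    # Assumption: an unassigned pair is also rejected as undetermined.
    return codebook.get(active, "undetermined")


def percent_of_total(count, total):
    """Share of the total, in percent."""
    return 100.0 * count / total


# Hypothetical codebook with placeholder dye and target names.
toy_codebook = {frozenset({"dye1", "dye2"}): "targetA"}
print(gate_punctum({"dye1"}, toy_codebook))                  # undetermined
print(gate_punctum({"dye1", "dye2"}, toy_codebook))          # targetA
print(gate_punctum({"dye1", "dye2", "dye3"}, toy_codebook))  # overlapping

# Fractions for the counts reported in the text (65,562 puncta in total).
print(round(percent_of_total(25053, 65562), 1))
print(round(percent_of_total(2439, 65562), 1))
```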

The number of puncta detected for the transcripts of each gene in each cell of the labeled 10-plex samples was plotted (Fig. 4h), together with the mean number of detected puncta per cell for each gene as classified using MOSAICA phasor analysis with combinatorial labeling (Fig. 4i). For comparison, we also show the MOSAICA pipeline results for the negative control sample, which yielded counts of less than five per thousand, mainly due to noise in the images (Fig. 4i). To validate these puncta counts, we compared them to matching RNA-seq data from the same cell type with n=3 experimental replicates (see replicate comparison in Supplementary Fig. 5). Shown in Fig. 4j is a scatter plot of the average mRNA puncta count per cell plotted against the normalized counts from DESeq2 of our bulk RNA-sequencing data for each expressed gene. We obtained a Pearson correlation coefficient of r = 0.96, indicating a significant positive association between the two methods. Furthermore, to assess the rate of false positives and determine whether one bright mRNA target could potentially be misidentified as another target, we repeated our experiment leaving out probes for some expressed genes and then compared the detection rate of the remaining targets with the 10-plex data. Specifically, we performed two additional experiments with an 8-plex panel, as well as two additional experiments with a 2-plex panel, to compare the detected transcript abundance values and correlation coefficients against the 10-plex sample (Supplementary Fig. 6). We observed no significant differences in target detection rate between these panel sizes, indicating that target misidentification was not an issue for these panel sizes.

To further evaluate the detection efficiency, we benchmarked our method against LGC Stellaris™ and RNAscope™, which are commercial gold standard FISH methods (Supplementary Fig. 7). Using the transcript of the housekeeping gene POLR2A as an exemplary target, we found no significant difference between the number of puncta detected by our method and by LGC Stellaris™ (t test p value = 0.4). When compared to RNAscope™, we observed that for this cell type and target, the counts from both our assay and LGC Stellaris™ differed significantly (p = 7.8 × 10⁻⁴ and p = 3.4 × 10⁻⁴), indicating a discrepancy in detection efficiency between the two types of method. We attribute this difference to MOSAICA and LGC Stellaris™ utilizing a direct labeling and amplification-free method, while RNAscope™ utilizes a tyramide signal amplification reaction, which generates thousands of fluorophore substrates per transcript and can lead to overlapping puncta or undercounting of detected puncta. Together, these data show that MOSAICA can robustly detect target mRNAs across a broad dynamic range of expression levels, from single digits to hundreds of copies per cell.

Multiplexed mRNA analysis in clinical melanoma skin FFPE tissues

We next investigated whether MOSAICA can provide multiplexed mRNA detection and phasor-based background correction and error detection in clinically relevant and challenging sample matrices. Assaying biomarkers in situ in tissue biopsies has great clinical value in disease diagnosis, prognosis, and stratification, including in oncology40,41,42. Specifically, we applied an mRNA panel consisting of KI67 (indicative of cell proliferation), POLR2A, BRCA1, MTOR, NCOA2, and NCOA3 to highly scattering and autofluorescent human melanoma skin biopsy FFPE tissues obtained from and characterized by the UCI Dermatopathology Center. Using the same probe design pipeline, primary probes were encoded with a combination of two fluorophores for the transcript of each gene to exhibit a unique fluorescent signature.

Figure 5b depicts a spectral image overlay (four fluorescent channels including DAPI) of the epidermis region of a labeled 6-plex skin tissue sample. As in the previous section, the orthogonal lifetime image was obtained using phasor analysis (Fig. 5c). Figure 5d, e depict the composite spectral and lifetime images of the corresponding negative probe sample, also in the epidermis region. Figure 5f depicts the pseudo-colored puncta that were successfully classified and identified as their assigned mRNA markers. A representative inset image for each marker and its targeted detection is provided on the right. We observed a population of puncta consisting of nonspecific, autofluorescent, or unknown sample artifacts that was rejected from analysis: 1,100 puncta, or 37.5% of the total detected puncta (2,934). In addition to this group, MOSAICA rejected a small group of puncta that emitted fluorescence in multiple spectral channels (62). This fraction (2.1%) is in concordance with the optical crowding range (2.0–6.6%) that our simulations and models predict (Supplementary Fig. 4). With conventional intensity-based measurements and analysis, both contaminating groups are inherent image artifacts that compromise the integrity of puncta detection unless complicated quenching steps or additional rounds of stripping, hybridization, and imaging are utilized14,43. With MOSAICA, these contaminating artifacts can be accounted for through the integration of spectral, lifetime, and shape-fitting algorithms.

Fig. 5: Multiplexed mRNA detection in epidermis region of human skin melanoma FFPE tissue.

a 6 different types of gene transcripts were labeled with primary probes followed by respective and complementary fluorescent secondary probes. Each transcript was labeled with a combination of two different fluorophores for six combinations. Negative control probes targeting transcripts not present in the sample were used with their respective secondary fluorophore probes. b Spectral image (max-projection in z) of a field of view of the labeled 6-plex sample (three channel pseudo coloring). c Lifetime image (max-projection in z) of a field of view of the labeled 6-plex sample (phasor projection on universal circle pseudo coloring). d Spectral image of the labeled negative control probe sample. e Lifetime image of the labeled negative control probe sample. f Final puncta detection of the 6-plex field of view after being processed in our analysis software showing highlighted example puncta of each target (insets, right). g Mean puncta counts per cell of transcripts for each gene in the 6-plex sample (n=2 experimental replicates, 174 cells). h Puncta count for the negative control probe sample (n=2 experimental replicates, 375 cells). i Correlation of detected puncta (mRNA puncta count) vs. bulk sequencing (fragments per kilobase per million) is shown for each target. j Transcript density in the field of view for each of the expressed genes reveals clustering of specific genes, as an example KI67 appears highly expressed in three cells, one of them marked with a dotted ellipse that corresponds to location in f. Scale bars 10 µm in large images and 1 µm in insets. Source data are provided as a Source Data file.

Figure 5g, h plot the total number of detected puncta for the labeled 6-plex sample and the negative control probe sample to highlight the final counts obtained using MOSAICA. To validate these puncta counts and their relative expression, we compared the decoded puncta with matching bulk RNA-sequencing data obtained from The Cancer Genome Atlas (TCGA) database (see Methods section). Figure 5i shows a scatter plot of MOSAICA puncta count against fragments per kilobase per million. We obtained a Pearson correlation of r=0.97 for this 6-plex sample, indicating a significant positive association between the two methods. We acknowledge that this strong correlation depends heavily on the presence of the highly expressed POLR2A gene. The correlation for the other, lower expressed targets excluding POLR2A is r=0.44, which, although still positive, is weaker. We attribute this discrepancy to preanalytical variables typically associated with FFPE sample preservation and pretreatment. For instance, multiple studies have documented increased variability in quantifying lowly expressed genes in FFPE tissues due to RNA degradation or cross-linking of proteins with nucleic acids44,45,46. Last, the density map of the detected transcripts provides a visual method to identify the spatial localization of gene clusters, such as KI67 (indicative of proliferating tumor cells) being more prevalent in the dermis region while POLR2A is dispersed throughout the region (Fig. 5j). Overall, in situ profiling of biomarkers such as KI67 and their spatial clustering can have diagnostic and prognostic value in malignant diseases, and MOSAICA provides a robust platform to profile these markers47.
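The sensitivity of a Pearson correlation to one highly abundant gene can be illustrated with a small sketch. The values below are hypothetical and are not the measured puncta counts or TCGA expression levels; they only show how a single high-expression point can raise r well above the value obtained from the remaining low-expression targets.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical illustration data (NOT the paper's values): five lowly
# expressed genes plus one highly expressed gene in the last position.
fpkm = [2.0, 4.0, 5.0, 7.0, 9.0, 60.0]
puncta = [3.0, 1.5, 4.0, 2.0, 5.0, 40.0]

r_all = pearson_r(fpkm, puncta)             # all six targets
r_low = pearson_r(fpkm[:-1], puncta[:-1])   # excluding the abundant gene

print(f"r (all targets) = {r_all:.2f}")
print(f"r (excluding abundant gene) = {r_low:.2f}")
```

Reporting both values, as done in the text, makes clear how much of the agreement rests on the dynamic range contributed by the single abundant target.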

Simultaneous co-detection of protein and mRNA

Spatial multiomics analysis, especially simultaneous detection of protein and transcript within the same sample, can reveal genotypic and phenotypic heterogeneity and provide enriched information for biology and disease diagnosis. As a pilot experiment to demonstrate MOSAICA’s potential for multiomics profiling, we used MOSAICA to detect 2 protein targets, Tubulin and Vimentin, and 2 mRNA targets, POLR2A and MTOR, in colorectal cancer SW480 cell culture samples (Fig. 6). After staining the sample with the primary antibodies, secondary antibodies were added to fluorescently label the protein targets. After protein labeling, we applied the same probe design pipeline and labeling strategy for mRNA detection: primary probes were generated and hybridized to the sample after antibody staining, and the corresponding secondary probes were then hybridized. Figure 6a–f depict the individual channels of the sample, with Fig. 6g showing the merged channels of the 4-plex panel. As both POLR2A and MTOR are assigned to the 647 nm channel and cannot be separated spectrally (Fig. 6d), lifetime analysis is used to separate POLR2A (Fig. 6e) and MTOR (Fig. 6f). The signal-to-noise ratio, defined as the intensity of the detected puncta over the intensity of the surrounding pixels, was measured for the two mRNA targets (Fig. 6h). In summary, we have demonstrated MOSAICA as a potential spatial multiomics tool, which harmonizes sample treatment between both labeling processes. MOSAICA utilizes staining protocols with efficient target retrieval, blocking, and pretreatment steps, such that the viability and labeling of both the target RNA sequences and the protein markers were not compromised after each assay.
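To illustrate how two dyes sharing one spectral channel can be told apart by lifetime, the sketch below maps a single-exponential lifetime to its phasor coordinates, g = 1/(1+(ωτ)²) and s = ωτ/(1+(ωτ)²), and assigns a punctum to the nearest reference phasor. This is a simplified stand-in, not the machine learning classifier used by MOSAICA. The 80 MHz repetition rate and the reference lifetimes of 1.0 ns and 3.0 ns are assumptions chosen for illustration, not measured values for Alexa647 or ATTO647.

```python
import math

# Assumed laser repetition rate (hypothetical, for illustration only).
REP_RATE_HZ = 80e6

# Hypothetical reference lifetimes in ns for the two targets in the 647 nm
# channel; these are NOT the measured lifetimes of the dyes in the study.
REFERENCE_LIFETIMES_NS = {"POLR2A": 1.0, "MTOR": 3.0}


def phasor(tau_ns, rep_rate_hz=REP_RATE_HZ):
    """Phasor coordinates (g, s) of a single-exponential decay."""
    omega_tau = 2.0 * math.pi * rep_rate_hz * tau_ns * 1e-9
    denom = 1.0 + omega_tau ** 2
    return 1.0 / denom, omega_tau / denom


def classify_by_lifetime(tau_ns, references=REFERENCE_LIFETIMES_NS):
    """Assign a punctum to the target whose reference phasor is closest."""
    point = phasor(tau_ns)
    return min(
        references,
        key=lambda name: math.dist(point, phasor(references[name])),
    )


# Two hypothetical puncta measured in the same spectral channel.
print(classify_by_lifetime(1.2))
print(classify_by_lifetime(2.7))
```

Single-exponential decays fall on the universal circle, (g − 0.5)² + s² = 0.25, so puncta with well-separated lifetimes occupy distinct positions along the arc even when their emission spectra overlap.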

Fig. 6: Simultaneous 4-plex co-detection of protein and mRNA in colorectal cancer SW480 cells.

a Intensity image showing nuclei labeled with DAPI. b Intensity image showing Tubulin protein labeled with Alexa488. c Intensity image showing Vimentin protein labeled with TRITC. d Intensity image at 647 nm showing mRNA targets, POLR2A and MTOR, which were further resolved by lifetime. e Unmixed lifetime image showing POLR2A puncta labeled with Alexa647. f Unmixed lifetime image showing MTOR puncta labeled with ATTO647. g Merged image of all channels. Scale bar is 10 µm. h Signal-to-noise and puncta count analysis for the mRNA targets. Overlaid lines correspond to quantiles [10,50,90]%; n=1757 and n=681 transcripts, respectively. Source data are provided as a Source Data file.
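The signal-to-noise summary in panel h can be sketched as follows. The text defines the ratio as puncta intensity over the intensity of surrounding pixels; the use of mean intensities and the inclusive quantile method here are assumptions, and the pixel values are hypothetical.

```python
import statistics


def puncta_snr(puncta_pixels, surround_pixels):
    """Mean puncta intensity divided by mean intensity of surrounding pixels."""
    return statistics.fmean(puncta_pixels) / statistics.fmean(surround_pixels)


def snr_quantiles(snr_values):
    """Return the 10%, 50%, and 90% quantiles of a list of SNR values."""
    deciles = statistics.quantiles(snr_values, n=10, method="inclusive")
    return deciles[0], deciles[4], deciles[8]


# Hypothetical pixel intensities for one punctum and its local background.
example_snr = puncta_snr([120, 140, 130], [10, 12, 8, 10])
print(f"SNR of example punctum: {example_snr}")

# Hypothetical per-punctum SNR values for one target.
q10, q50, q90 = snr_quantiles([4.0, 6.5, 7.0, 8.2, 9.1, 10.4, 12.0, 15.3])
print(f"quantiles 10/50/90%: {q10:.2f}, {q50:.2f}, {q90:.2f}")
```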
