Preloader

SM-Omics is an automated platform for high-throughput spatial multi-omics

Ethical statement

All work involving C57BL/6 J mice was performed under specific-pathogen-free conditions and the guidelines of the Division of Comparative Medicine, in accordance with the Institutional Animal Care and Use Committees (IACUC) relevant guidelines at the Broad Institute of Harvard and MIT, and consistent with the Guide for Care and Use of Laboratory Animals, National Research Council, 1996 (institutional animal welfare assurance no. A4711-01), with protocol 0122-10-16.

Bravo system requirements

Bravo Automated Liquid Handling Platform (Agilent Technologies, USA) was equipped with a 96LT pipetting head (G5498B#042, Agilent Technologies, USA) and two Peltier thermal stations (CPAC Ultraflat HT 2-TEC, #7000166 A, Agilent Technologies, USA) with PCR adapter having a mounting frame at positions 4 and 6 on the Bravo Deck and connected to an Inheco MTC Controller. On position 7, we recommend the MAGNUM FLX™ Enhanced Universal Magnet Plate (#A000400, Alpaqua, USA) to serve for magnetic bead-based clean ups. In addition, a BenchCel NGS Workstation (Front-load rack at 660 mm height) and BenchCel Configuration Labware MiniHub (option #010, Agilent Technologies, USA) were included in the automation platform setup. In case in situ reactions were performed, the PCR adapter was removed from position 6 to be replaced with Aluminum Heat Transfer Plate (#741I6-GS-4, V&P Scientific, Inc, USA). This liquid handling setup enables running in situ reactions using the ProPlate Multi-Array slide system (GraceBioLabs, USA), where 64 reactions can be run in parallel using the standard 96LT pipetting head. Note that every third column in the 96-tip pipette box needs to be removed when using the ProPlate Multi-Array system with standard Agilent Bravo pipetting instrumentation. All library preparation reactions are run in a maximum 96-well mode, however lower throughput adjustments are predefined as 8-sample increments and easily loaded in our automated SM-Omics settings. Further details in the SM-Omics protocol sections below, and at: https://github.com/klarman-cell-observatory/sm-omics/tree/master/SM_Omics_v.B1.0.2.

Sample collection and cryosectioning

All work involving C57BL/6 J mice was performed under specific-pathogen-free conditions and the guidelines of the Division of Comparative Medicine, in accordance with the Institutional Animal Care and Use Committees (IACUC) relevant guidelines at the Broad Institute of Harvard and MIT, and consistent with the Guide for Care and Use of Laboratory Animals, National Research Council, 1996 (institutional animal welfare assurance no. A4711-01), with protocol 0122-10-16. A small piece of freshly collected tissue (~25–50 mg, about 5 × 5 mm) was placed on a dry and sterile Petri dish, which was placed on top of wet ice. The tissue was then very gently moved using forceps and placed on another dry part of the Petri dish to ensure little liquid was present around the tissue. The bottom of a cryomold (5 × 5 mm, 10 × 10 mm or 25 × 20 mm) was filled with pre-chilled (4 °C) OCT (Tissue-Tek; Sakura Finetek, USA) and the tissue transferred with forceps into the OCT-prefilled mold. The entire tissue surface was covered with pre-chilled OCT. The mold was then placed on top of dry ice and allowed the tissue to freeze for up to 5 min until OCT has turned completely white and hard. The tissue cryomolds were stored at −80 °C until use. For cryosectioning, the ST slide and the tissue molds first reached the temperature of the cryo chamber. The OCT-embedded tissue block was attached onto a chuck with pre-chilled OCT and allowed to freeze ~5–10 min. The chuck was placed in the specimen holder and adjusted the position to enable perpendicular sectioning at 10 µm thickness. Sections were gently transferred to a ST array24 and then the back side of the slide was warmed ~10–15 s with a finger. ST slides with tissue sections on top could be stored at −80 °C for up to 6 days.

Tissue fixation and H&E staining

The ST slide with the tissue section was warmed to 37 °C for 1 min on a thermal incubator (Eppendorf Thermomixer Option C, Germany). The tissue was then covered with 4% formaldehyde (Sigma-Aldrich, USA) in 1X PBS (Thermo Fisher Scientific, USA) for 10 min at room temperature (RT). The whole slide was then washed in 1X PBS in a vertical orientation to be placed back on a horizontal place for drying. 500 µl isopropanol covered the tissue and ensured drying. The slide was put into an EasyDip Slide Jar Staining System (Weber Scientific) holder and the same system used for H&E staining. Five ~80 ml containers were prepared with Dako Mayers hematoxylin (Agilent, USA), Dako Bluing buffer (Agilent, USA), 5% Eosin Y (Sigma–Aldrich, USA) in 0.45 M Tris acetate (Sigma–Aldrich, USA) buffer at pH 6 and two jars with nuclease-free water (ThermoFisher Scientific, USA). The slide rack was fully immersed in hematoxylin for 6 min and then washed by dipping the slide rack in a nuclease-free water jar 5 times following another destaining wash by dipping the slide rack in 800 mL nuclease-free water for 30 times. The slide rack was put into the Dako bluing buffer and incubated for 1 min. The slide was again washed by dipping the rack 5 times in the second nuclease-free water jar. The slide rack was finally put into the eosin and incubated for 1 min to be washed by dipping the rack 7 times in the second water jar. The slide was removed from the rack to allow it to dry.

Tissue fixation and IF staining

The ST slide with the tissue section was warmed to 37 °C for 4 min on a thermal incubator (Eppendorf Thermomixer Option C, Germany) and in situ fixed and washed as described above. The slide was then mounted in the plastic slide holder (ProPlate Multi-Array slide system; GraceBioLabs, USA) compatible with the Aluminum Heat Transfer Plate (#741I6-GS-4, V&P Scientific, Inc, USA) on position 6 on the Bravo deck. All following antibody incubations were performed at 4 °C. First, the tissues were blocked with the TruStain FcX™ PLUS (anti-mouse CD16/32, Biolegend, USA) antibody (1:100 dilution) in 0.5% Triton X-100 (Sigma-Aldrich, USA) for mouse brain tissues and 1× perm/wash buffer (ThermoFisher Scientific, USA) for splenic tissues. This simultaneous blocking and permeabilization step lasted for 30 min. Next, the slide was washed 3× with 1× PBS (ThermoFisher Scientific, USA). After discarding the last wash, the slides were incubated with 1× PBS for 2 min. Then, antibodies were added at 1:100 dilution for 90 min. The complete list of antibody clones and suppliers is available in Supplementary Data 3. The slide was again washed in the same fashion and counterstained with DAPI (Sigma–Aldrich, USA) diluted 1:1000 in 0.5% Triton X-100 (Sigma–Aldrich, USA) for 5 min. In case the reactions were performed on a SM-Omics array and not a mock polyd(T) array, the DAPI reaction was also supplemented with a Cy3 labeled anti-frame DNA probe (5′-Cy3-GGTACAGAAGCGCGATAGCAG-3′, IDT, USA) at 10 nM concentration. In case DAPI counterstaining was not used, the step was skipped. This was followed by another wash cycle. The slides were then air dried and mounted with 85% glycerol prior to imaging.

Tissue fixation and DAPI-only staining

Similarly to performing Tissue fixation and IF staining, tissue sections were attached to slides and in situ fixed. The slide was then mounted in the plastic slide holder (ProPlate Multi-Array slide system; GraceBioLabs, USA) and all reactions performed at 4 °C. Tissues were first incubated with 0.5% Triton X-100 (Sigma–Aldrich, USA) for 25 min. Next, the slide was washed 1x PBS (ThermoFisher Scientific, USA) and the tissue stained with DAPI (Sigma–Aldrich, USA) diluted 1:1000 in 0.5% Triton X-100 (Sigma–Aldrich, USA) for 15 min. If the reactions were performed on a SM-Omics array and not a mock polyd(T) array, the DAPI reaction was also supplemented with a Cy3 labeled anti-frame DNA probe (5′-Cy3-GGTACAGAAGCGCGATAGCAG-3′, IDT, USA) at 10 nM concentration in order to facilitate image registration to the SM-Omics array coordinates. This was followed by another wash cycle. The slides were then air dried and mounted with 85% glycerol prior to imaging.

Automated imaging

Images of stained H&E tissue sections on the ST slides were taken using a Metafer Vslide scanning system (MetaSystems, Germany) installed on an Axio Imager Z2 microscope (Carl Zeiss, Germany) using an LED transmitted light source and a CCD camera (BF scanning). All images were taken with the A-P 10x/0.25 Ph1 objective lens (Carl Zeiss, Germany). For fluorescent scanning, a PhotoFLuor LM-75 lightsource (89North, USA) was used in combination with a Plan-APOCHROMAT 20x/0.8 objective (Carl Zeiss, Germany). A configuration program was made to enable automatic tissue detection, focusing and scanning on all ST arrays present on a glass slide. In short, tissue detection was based on contrast as compared to normalized background in all channels. Upon finding maximum contrast in a 12-step spiral-like search window field of view (FOV) pattern, the automated focal alignment in every second of each FOV (4096 × 3000 px) was initiated. The alignment search considered the maximum contrast z-position as in-focus using 5 µm stage intervals (n = 19 focal planes). The BF scanning of the predefined ST array areas was done in a total of 48 FOVs and ~30 s in 3 channels (RGB); or epifluorescent scanning of 228 FOVs and ~6 min for 3 fluorescent channels. Images were stitched using 60 µm overlap and linear blending between FOVs with the VSlide software (v1.0.0) and then extracted using jpg compression. Multiple ST slides can be processed in the same manner without any user input for a total of 6 min processing time per H&E stained slide (3 channels) or 45 min for fluorescently stained slide (3 channels), including image stitching.

Microarray design and production

Both for quality control experiments and library preparation, the Codelink amine activated slides (#DN01-0025, Surmodics, USA) were exposed with polyadenylated oligonucleotides (IDT, USA) and microarray production proceeded as according to manufacturer’s instructions (Surmodics, USA). The surface oligonucleotides are presented here for clarity:

([AmC6]UUUUUGACTCGTAATACGACTCACTATAGGGACACGACGCTCTTCCGATCT[18nt]NNNNNNN[20 T]VN).

This chemistry design enabled covalent linking upon binding to the Codelink slide surface. For library preparation slide production, 33 μM spatially barcoded oligonucleotides (IDT, USA) were deposited as 100pL droplets onto Codelink slides as suggested by the manufacturer (Surmodics, USA). This resulted in about ~200 million copies of the oligonucleotide per spatial spot. Array printing was performed by ArrayJet LTD (Scotland, UK) according to the ArrayJet Spider system requirements. Each library preparation slide active area had a total of 1,007 spatially barcoded positions distributed over a ~42 mm2 area. Each spatially barcoded ST spot had a diameter of 100 μm, with a center-to-center distance of 200 μm between the spots.

SM-Omics automation

The SM-Omics protocol is divided into three main parts. The first part (1) processes all in situ reactions on a ST slide: tissue pre-permeabilization, permeabilization, reverse transcription with or without the release of the spatial capture probes and tissue removal. This material is collected to a standard 96-well PCR microplate (Eppendorf, Germany) and all of the following reactions (protocols 2 and 3) are run in 96-well plates. The second protocol (2) contains second strand synthesis reaction, cDNA bead purifications and T7 in vitro transcription. The third protocol (3) includes aRNA adapter ligation, bead purifications and second cDNA synthesis. The material is then quantified using a standard qPCR protocol and the libraries accordingly indexed for Illumina sequencing.

Reference material preparation

In order to test reproducibility of library preparation reactions, we prepared reference material as input. 7.5 µg of universal mouse reference RNA (#740100, Agilent Technologies, USA) was fragmented using NEBNext Magnesium RNA fragmentation module (NEB, USA) for 1 min at 94 °C. The sample was purified with a MinElute Cleanup kit (Qiagen, Germany) according to manufacturer’s instructions and RNA concentration and size were assessed using a Qubit RNA HS kit (ThermoFisher Scientific, USA) and Bioanalyzer Pico 6000 kit (Agilent Technologies, USA), respectively. ~2 µg of fragmented RNA was incubated with either 3.3 µM custom hexamer primer (GACTCGTAATACGACTCACTATAGGGACACGACGCTCTTCCGATCTNNNNNNNN, T7handle_IlluminaAhandle_hexamer) or poly(d)T primer (T7handle_IlluminaAhandle_hexamer_20TVN) in the presence of 0.8 mM dNTP (ThermoFisher Scientific, USA) at 65 °C for 5 min. First strand reverse transcription was performed with a final concentration of 1X First Strand Buffer, 5 mM DTT, 2U/µl RNaseOUT and 20U/µl of Superscript III (all from Thermo Fisher Scientific, USA). The reaction was incubated at 25 °C for 10 min (when using hexamer priming), followed by 50 °C for 1 h and 70 °C for 15 min or 50 °C for 1 h and 70 °C for 15 min for poly(d)T priming. The reaction was purified with AMPure XP beads (Beckman Coulter, USA) at a beads/DNA ratio of 0.8:1. The concentration of the material was measured on a Qubit RNA HS kit (ThermoFisher Scientific, USA) and diluted in EB (Qiagen, Germany). A release mixture of ~100 ng (hexamer priming) or ~200 ng (poly(d)T priming) first strand cDNA, 1X Second strand buffer (ThermoFisher Scientific, USA), 0.2 µg/µl BSA and 0.5 mM dNTP (ThermoFisher Scientific, USA) was used to test all library preparation reactions. Hexamer primed cDNA was used to test the reproducibility and poly(d)T primed cDNA was used to test adapter concentrations and ligation time.

In situ SM-Omics protocol (1)

Tissue-stained ST slides we provided as input. The ST slide was attached into the ProPlate Multi-Array slide system (GraceBioLabs, USA), with up to four ST slides fitted. The ProPlate Multi-Array system was then fixed in position by Aluminum Heat Transfer Plate (VP 741I6-GS-4, V&P Scientific, Inc, USA) on the Agilent Bravo deck. The protocol started with tissue pre-permeabilization (30 min at 33 °C) with addition of 120 µl reagent per well of exonuclease I buffer for brain samples (NEB, USA) or 120 µl reagent per well of collagenase I (200U) in 1x HBSS (both from Thermo Fisher Scientific, USA) for colorectal samples. For spleen sections, the pre-permeabilization step was skipped. For complete removal of the reagents and wash solutions from the subarrays all of the robotic dispensing and aspiration steps took place in all four corners of the square wells. Pre-permeabilization reagent removal was followed by a 180 µl wash in 0.1X Saline Sodium Citrate (SSC, Sigma–Aldrich, USA) at 33 °C. Next, tissue permeabilization was done using 75 µl 0.1% pepsin (pH 1, Sigma–Aldrich, USA) at 33°C for 10 min (mouse brain) and 15 min (colorectal cancer) and for 60 min (spleen) 75 µl 0.1% pepsin prepared at pH 2.5 in Tris-HCl (Sigma-Aldrich, USA). After a 180 µl 0.1X SSC wash at 33 °C, in situ cDNA synthesis reaction was performed by the addition of 75 µl RT reagents: 50 ng/µl actinomycin D (Sigma–Aldrich, USA), 0.5 mM dNTPs (Thermo Fisher Scientific, USA), 0.20 µg/µl BSA, 1 U/µl USER enzyme (both from NEB, USA), 6% v/v lymphoprep (STEMCELL Technologies, Canada), 1 M betaine (#B0300-1VL, Sigma-Aldrich, USA), 1X First strand buffer, 5 mM DTT, 2 U/µl RNaseOUT, 20 U/µl Superscript III (all from Thermo Fisher Scientific, USA). The reactions were sealed with 70 µl of white mineral oil Drakerol#7 (Penreco, USA). Incubation at 30 °C was performed for a minimum of 6 h, after which 70 µl of the released material was collected in a new 96-well PCR plate (Eppendorf, Germany). When a Cy3 fluorescent cDNA activity print was needed for tissue optimization, the 75 µl in situ cDNA reaction mix was as follows: 50 ng/µl actinomycin D (Sigma-Aldrich, USA), 0.20 µg/µl BSA (NEB, USA), 1X M-MuLV buffer, 5 mM DTT, 2U/µl RNaseOUT, 20U/µl M-MuLV (all from Thermo Fisher Scientific, USA), 4 µl dNTP mix (dATP; dGTP and dTTP at 10 mM and dCTP at 2.5 mM) and 2.2 µl Cy3-dCTPs (0.2 mM, Perkin Elmer, USA).

In situ manual ST protocol

The manual ST in situ protocol was performed as described in Salmén et al.47. The protocol is, if not mentioned below, identical to the robotic protocol except as further described. Tissue-stained ST slide was attached in an ArrayIT hybridization chamber (ArrayIT, CA). All incubations took place on an Eppendorf Thermocycler R (Eppendorf, Germany), and reactions were covered with Microseal ‘B’ PCR Plate Seals (Biorad, CA) to avoid evaporation. Pre-permeabilization and washes were performed with 100 µl reagent at 37 °C and the in situ cDNA synthesis reaction was run without the USER enzyme, lymphoprep and betaine, at 42 °C. The manual protocol then encompassed tissue removal and probe release as described47. Tissue removal took place in two separate steps with RLT buffer with β-mercaptoethanol and Proteinase K. 80 µl of 1% β-mercaptoethanol (Sigma-Aldrich, USA) in RLT buffer (Qiagen, Germany) were added to the wells and incubated at 56°C for 1 h. Following removal of the reaction mix and wash with 0.1X SSC solution, 80 µl of second tissue removal mixture; 2.5 µg/µl Proteinase K in PDK buffer (Qiagen, Germany) were added and the reaction was performed at 56 °C for 1 h. The complete reaction mix was again removed and a slide wash with one 10 minute wash of the wells with 2X SSC/0.1% SDS (Sigma-Aldrich, USA), followed by 1 min wash with 0.2X SSC and finally 0.1X SSC was performed. Cleavage of probes from the surface was performed in the next steps and not during in situ cDNA synthesis. The reaction mix consisted of 1.1X Second strand buffer (ThermoFisher Scientific, USA), 0.1 mM dNTPs and 1 U/µl USER enzyme (NEB, USA). 75 µl of the mix was added and incubated for 3 h at 37 °C. The released material was collected in a new 96-well PCR plate (Eppendorf, Germany) by aspirating 70 µl of the released material.

SM-Omics library preparation (2)

Upon initiating the Agilent Bravo form the user was prompted to select either: 1, 2, 3, 4, 6 or 12 columns of the 96-well plate to run. Two positions on the Bravo deck had Peltier thermal stations (4–95 °C) in the standard 96-well format. A reagent plate was prepared for robotic aspiration, transfer and dispensing of reagents. First, single-stranded cDNA was made to double-stranded material using 5 µl of the reaction mix (2.7X First strand buffer, 3.7 U/µl DNA polymerase I and 0.2 U/µl Ribonuclease H (all from ThermoFisher Scientific, USA)) for 2 h at 16°C. Thereafter, the material was blunted by the addition of 5 µl of 3U/µl T4 DNA polymerase (NEB, USA) for 20 min at 16 °C. The reaction was stopped by addition of Invitrogen UltraPure 0.5 M EDTA (pH 8.0, ThermoFisher Scientific, USA) to a final concentration of 20 mM. The material was then purified using Ampure XP (Beckman Coulter, USA) at a bead to cDNA ratio of 1:1. Next, 27.8 µl of the T7 reaction mix (46.2 mM rNTPs, 1.5X T7 reaction buffer, 1.54 U/µl SUPERaseIN inhibitor and 2.3 U/µl T7 enzyme; all from ThermoFisher Scientific, USA) was added and sealed with 40 µl of Vapor-Lock oil (Qiagen, Germany) for an overnight 14 h incubation at 37 °C. After incubation, 2.1 µl of nuclease-free water (ThermoFisher Scientific) was added and the Vapor-Lock was removed, followed by a bead cleanup with RNAclean Ampure XP beads (Beckman Coulter, USA) at a ratio of 1.8:1 of beads:aRNA. The material was then assessed with a Bioanalyzer RNA 6000 Pico kit (Agilent Technologies, USA). 8 µl of the eluted 10 µl aRNA was transferred into a new 96-well PCR plate (Eppendorf, Germany).

SM-Omics library preparation (3)

2.5 µl of either 3 µM (standard) or 15 µM aRNA adapters (efficient) [rApp]AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC[ddC] were added to 8 µl of aRNA. The reaction was then incubated at 70 °C in a PCR machine for 2 min and immediately chilled on wet ice. The user then again selected the number of columns they wished to run. 4.5 µl T4 RNA ligation mix (3.3X T4 RNA ligase buffer, 66U/µl truncated T4 ligase 2 and 13U/µl murine RNAse inhibitor (all from NEB, USA)) were added to the aRNA/adapter solution. The ligation reaction took place at 25 °C for 1 h (standard) or 3 h (efficient). For the SM-Omics protocol, the ligation reaction was performed for 3 h in the presence of 15 µM aRNA adapters. The ligation was followed by an Ampure XP (Beckam Coulter, USA) bead purification at a ratio of 1.8:1 bead:cDNA. Elution volume was 12 µl. After bead purification, 2 µl of a primer and dNTP mix (1:1 v/v of either 20 µM or 40 µM GTGACTGGAGTTCAGACGTGTGCTCTTCCGA and 10 mM dNTPs) were added to the ligated samples. For the SM-Omics protocol, 40 µM primer amount was added using the same volumes. Then, the samples were sealed with 40 µl Vapor-Lock (Qiagen, Germany) and heated to 65 °C for 5 min. The Vapor-Lock was thereafter removed and 8 µl of reverse transcription mix were added (2.5X First strand buffer, 13 mM DTT, 5 U/µl RNaseOUT and 25 U/µl Superscript III; all from Thermo Fisher Scientific, USA), with the addition of 40 µl Vapor-Lock to reseal the reaction. The samples were incubated at 50 °C for 1 h. 10 µl of nuclease-free water was added followed by a final Ampure XP bead purification at 1.7:1 bead:cDNA ratio with a final elution of 10 µl nuclease-free water.

Staining tissues with oligonucleotide-conjugated antibodies

As described above, the fresh frozen tissue was placed on the spatial array slide and fixed at RT, followed by antibody incubations at 4 °C. First, tissues were blocked and permeabilized as described above. This was followed by a series of 3 washes in 1X PBS and a last wash that was incubated for 2 min. After discarding the wash, oligonucleotide-conjugated antibodies and fluorescently labeled antibodies (Biolegend, USA) were both added at a 1:100 dilution in the same buffer as in the initial permeabilization step and incubated for 1 h. The tissue was then washed and the antibody conjugates fixed to the array surface in 4% PFA (Sigma-Aldrich, USA). Tissues were then fluorescently imaged and SM-Omics libraries created. The following steps were added in the library preparations to ensure collection of spatially DNA-barcoded antibody tags. First, cDNA synthesis was performed in situ under the same conditions as described above. Next, second strand synthesis was also performed as described followed by an Ampure XP bead clean up as according to manufacturer’s instructions. During this clean up, material that would otherwise have been discarded after binding to the beads in standard SM-Omics library preparations, was saved and represented a population of spatially DNA-barcoded antibody tags. This elute contained short products that required a bead clean up procedure as well, where a 1.4X bead-to-material ratio was used and the final product eluted in 50 µL EB (Qiagen, Germany). This material was then indexed for Illumina sequencing using Small RNA Illumina indexes in a KAPA indexing reaction as described in Quantification, indexing and sequencing.

Manual ST library preparation

Manual library preparation was performed as described in Salmén et al.47 and included the same experimental steps as the robotic library preparation protocol, but performed manually, incubations took place in a PCR System Eppendorf Mastercycler (Eppendorf, Germany) and instead of Vapor-Lock, reactions were sealed using MicroAmp Optical 8-Cap Strips (ThermoFisher Scientific, USA). The manual procedure also included the following deviations from the robotic library preparation: T7 reaction mix of 18.6 µl was used and 1.4 µl of nuclease-free water was added after the 14 h incubation.

Manual visium preparation

Cortical tissues from an adult mouse brain were cryosectioned at 10 µm thickness and placed on Visium capture areas. The protocol was followed as in the Visium Spatial Gene Expression User Guide CG000239 Rev B as provided by 10X Genomics.

Quantification, indexing and sequencing

qPCR library quantification and indexing were performed as described in Salmén et al.47. The indexed SM-Omics cDNA libraries were diluted with 40 µl of nuclease-free water to allow for a final library bead cleanup with 0.8:1 ratio Ampure XP beads to PCR products, according to the manufacturer’s protocol. Final elution was done in 16 µl EB (Qiagen, Germany). Individual libraries’ fragment lengths and concentrations were evaluated on a Bioanalyzer HS (Agilent Technologies, USA) or DNA1000 Tapestation (Agilent Technologies, USA) and DNA HS Qubit assays (ThermoFisher Scientific, USA), respectively. Samples were then diluted to the desired concentration for sequencing (~1.08 pM final for NextSeq sequencing with 10% PhiX) and sequenced 27–30nt in the forward read and 55–58nt in the reverse read. For antibody tags, the final clean-up was performed at 0.9:1 ratio of beads to PCR products and elution again done in 16 µl EB (Qiagen, Germany). Samples were diluted to 8pM final concentration before sequencing on an Illumina Miseq (2 × 25nt).

Statistics and reproducibility

Raw reads processing and mapping

ST, SM-Omics, Visium or antibody tag fastq reads were generated with bcl2fastq2. ST Pipeline59 v.1.7.6 was used to demultiplex the spatial barcodes and collapse duplicate UMI sequences for ST, SM-Omics and Visium. In short, 5nt trimmed R2 was used for mapping to the mouse genome (GRCm38 primary assembly available at https://www.ncbi.nlm.nih.gov/assembly/GCF_000001635.20/) using STAR (v2.6.0)60. After that, mapped reads were annotated using HTseq-count (v0.11.4)61 using the m11 gtf file (https://www.gencodegenes.org/mouse/release_M11.html). To collapse UMIs, the annotated reads needed to first be connected to a spatial barcode using a TagGD59,62 (v0.3.6) demultiplexer (k-mer 6, mismatches 2). Then, UMIs mapping to the same transcript and spatial barcode were collapsed using naive clustering with one mismatch allowed in the mapping process. The output file was a genes-by-barcode matrix that was used in all further processing steps. To map antibody tags to their respective spatial barcodes, we used the tag quantification pipeline originally developed for CITE-Seq (v.1.4.3) available at https://github.com/Hoohm/CITE-seq-Count. The pipeline was run with default parameters (maximum Hamming distance of 1). We additionally provided the spatial barcodes and corrected the spatial mapping (1 mismatch) for a total of 1007 different barcodes.

SpoTteR: automated image registration for spatial transcriptomics arrays

For efficient processing, HE images were scaled to approximately 500 × 500 pixels using the imagemagick (https://imagemagick.org/index.php) mogrify command. Other image operations as mentioned below were performed using the R package imager (http://dahtah.github.io/imager/) unless specified differently. In order to reconstruct the positions of all ST spots, visible (i.e., not covered by the tissue section) barcode (x,y) spots were registered through blob detection and then refined by keeping only those blobs (i.e. potential grid points) that were likely to be part of a regular grid. Blob detection refers to finding circular features of a predefined size in the image. To prepare the H&E image for blob detection, the tissue section was masked generously through 10% quantile thresholding in a user-defined color channel as obtained through the function imsplit. The borders of the resulting image were cropped four pixels from each of the four image borders in order to remove any abnormality or border effects that might interfere with blob detection. For blob detection, we first blurred the cropped image isotropically using the function isoblur with sigma = 3 and then computed the image hessian (function imhessian). This allowed us to detect probable blob centers using the function dilate_square with size = 3 and the function pad with nPix = 4 and pos = −1. Blob centers (i.e., potential grid points) that were likely part of a regular grid structure were selected by calculating the x and y distances between all detected blob centers. Those blob centers that based on their 8 nearest neighbors had a high (empirically determined) combined grid score (metric based on distance and grid angle between the neighboring centers) were kept. A regular grid was then fitted to these potential grid points using a custom optimizer built around the function nlminb of the R package stats, that minimizes the distance of potential grid points to the suggested regular grid, while assuming 90° angles and 42 grid points per row and column. This first, rough grid initialized an iterative process in which the 0.1% potential grid points that least fit the grid were removed in each iteration, the number of grid points per row and column were updated accordingly, and a new grid was fitted until the target number of grid points per row (here 35) and column (here 33) was reached. The grid design and target number of rows and columns is fully adjustable in SpoTteR. Finally, those grid points that overlapped the tissue sections were identified by building a mask that represented the tissue area and registering all grid points that were present in this mask. To build this mask, we calculated the mean and standard deviation of the background intensity based on the first 20 pixels from the image border, because no tissues were expected in that area. Pixels with an intensity greater than the mean background intensity adjusted for its standard deviation were set to 1 (likely tissue) and those below or equal the mean background intensity adjusted for its standard deviation were set to 0 (likely background), creating the primary tissue mask. Complementarily, a background mask was created by selecting all likely background pixels using the function px.flood with sigma = 0.1 and removing those pixels from the primary tissue mask in order to remove dark artefacts. Two final rounds of isotropic blurring using the function isoblur with sigma = 10 and sigma = 1 in combination with intensity thresholding enhanced the detection of weakly colored tissue regions such areas around the tissue edges. In order to further accommodate atypical tissue coloring, bubbles, and smears present as imaging artifacts, we introduced a parameter to specify the usage of the green color channel instead of the red color channel for tissue detection, which exploits the observation that smears and H&E staining artefacts often lead to spurious pink coloring, which is especially strong in the red channel. To address bubbles, another common image artefact, we introduced a parameter that allows the creation of an additional bubble mask based on all three color channels that specifically identifies bubbles as features that have roughly the same low intensity in all three color channels as they are typically dark gray or even black in color. The thus identified likely bubble pixels are then also removed from the tissue mask. These two additional (TRUE/FALSE) parameters enable to easily process data from tissues of various degrees of coloration and bubble artefacts. Finally, an intermediate report notifies the user of irregularities in the automatic alignment process and allows for visual inspection. The output.tsv file contained barcode spots (x,y) as centroid pixel coordinates of the detected grid, as well as a TRUE/FALSE value, set as TRUE if the barcode spot was detected as under the tissue section area.

SpoTter integration with ST pipeline and quality control reporting

The following steps integrate the output from the automated image alignment steps with the output gene-by-barcode expression file as produced by the ST Pipeline v.1.7.6. The barcode (x,y) spots approximated as under the tissue section were used for subsetting the ST Pipeline gene-by-barcode file. Then, the original H&E images were downscaled and cropped using the following imagemagick commands: convert HE_image.jpg -crop width“x“height+xa+ya; where width and height represented the Euclidean lengths between (x,y) grid detected barcode spots (33,35), (1,35) and (1,35), respectively. xa and ya were described as the centroid pixel coordinates of the grid point (33,35). The cropped H&E image was then rotated as follows: mogrify -flop -flip HE_image.jpg and this image was then used as input to the QC reporting system and for the GUI annotation tool. A final quality control (QC) report was created when running SpoTteR. All code for running image registration and QC reporting with SpoTteR has been made available at: https://github.com/klarman-cell-observatory/SpoTteR.

Comparison of SpoTter vs. ST spot detector vs. manual alignment

To be able to compare the automated image processing developed here to that of manually processed images, we acquired an additional image of the ST array area after the experiment was performed and the tissue had been removed from the array surface. Briefly, complementary and Cy3 labeled oligonucleotides (IDT, USA) were diluted in 2X SSC with 0.05% SDS to a final concentration of 1 µM. 50 µl of the diluted solution was added to the array surface and incubated with shaking (50 rpm) for 10 min at RT. This was followed by washing the slide in 4XSCC with 0.1% SDS and 0.2X SSC. The array frame and all ST barcode positions had then efficiently been labeled and acquired on the same imaging system as described. All input images in the following comparisons were the same approximate input sizes and resolution. The ST spot detector tool previously developed48 uses the H&E and Cy3 images as input. Due to its intrinsic scaling factor and input image size requirements, initial pre-processing of both images was needed, such that images be linearly downscaled to 30% of their original size and both images individually cropped to represent the same FOVs as collected during the imaging step. However, cropping was only needed if the user did not have the possibility to automatically acquire the same FOVs using the same starting (x,y) positions. For manual alignment, we used Adobe Photoshop for initial pre-processing, same as in the previous step. Both the H&E and Cy3 acquired images were downscaled to 30% of their original size, rotated 180 degrees and aligned to the same starting (x,y) pixel coordinates. This was followed by cropping both images along the middle of the first and last row and column. The tissue boundaries were detected using the magic wand function (32px) and the selection subtracted in the Cy3 image. Spots boundaries were again detected using the same magick wand function and the background noise cleaned up using the bucket fill function (250px) in a grayscale image. This grayscale image was further used in Fiji63 to detect the centroid coordinates of each ST barcode spot. Following Fiji processing, we translated (x,y) pixel centroid coordinates to ST barcode spot coordinates (as given during the demultiplexing step in the ST pipeline). For SpoTteR input, we only provided the original H&E imaged as acquired by the imaging system with no GUI-based preprocessing. For speed comparisons, total time needed for preprocessing steps was measured first. For manual processing, the pre-processing steps included alignment of the H&E and Cy3 images with Adobe Photoshop 2019 and creation of an ST array spots files. For ST Detector pre-processing time, we only took into consideration the time needed to open the same images in Adobe Photoshop, downscale them to 30% size and crop them the same size without any other image handling processes performed. For SpoTteR, preprocessing included the downscaling step performed with imagemagick and incorporated into the workflow. Processing steps were then performed and time was measured as described before. Total speed was considered as 1/t [s−1] where t represents the sum of time needed for both the pre-processing and processing steps. False positive and negative rates were calculated as percentage of spots present or absent in SpoTteR or ST Detector as compared to manually processed spot coordinates.

Estimating lateral diffusion

Two consecutive mouse cortex fresh frozen sections were processed. One was processed manually as described earlier47 while the other was processed using our devised robotic liquid handling setup. For these tests, we created poly(d)T arrays in-house according to manufacturer’s instructions (Codelink, Surmodics, USA) using amine-activated slides. The surface area covered with poly(d)T probes was 6x6mm. Both the H&E and gene activity Cy3 images were processed in Fiji63. First, in order to detect the nuclear boundaries of cells chosen at random throughout the tissue, we drew a line (Straight > Freehand line) through each visible nucleus (n = 50). Secondly, we collected pixel intensities and distances reaching through each of the chosen nuclei and its surrounding area (Analyze > Plot Profile). To distinguish nuclear boundaries in the collected intensity vs. distance data, we first fit a 5th degree polynomial of the curve. Then, we found local minima and maxima in each curve and determined cell boundaries as local minima present at above 10% signal intensity of the local maximum value for each curve. After cell boundaries were defined, we repeated the process using the Cy3 fluorescent gene activity image. Finally, we measured the distance between the detected Cy3 and nuclear signals for each selected cell. Left and right cell boundaries representing opposite sides of each cell were used in the estimate in each condition. A 0.1728 pixel to distance conversion ratio was used to transform pixels to micrometers reported in this paper. If a diffusion distance measure was scored as negative it implied that the Cy3 signal was contained within the detected cell boundaries, and positive if outside those same boundaries.

Estimating reproducibility of SM-Omics in situ reactions

Scikit-image64 was used to process the H&E and respective fluorescent gene expression images. First, a grayscale fluorescent image was smoothed using a Gaussian filter (sigma = 0.01). Then, we applied morphological reconstruction by dilating the image edges through filtering its regional maxima. This enabled us to create a background image value that could be subtracted from the original image and used in further analysis. Then, we created an elevation map with a Sobel filter to mask the elevated points. This image could then be used in a tissue (i.e., object) detection step using watershedding. The inverted tissue boundaries were subtracted from the detected fluorescent tissue gene expression signals and used in all further analysis. The means of the fluorescent signals were compared using a two-sided Wilcoxon’s rank-sum test. If the expected signal-to-noise ratio between the detected gene expression signature and background signals was less than 3:1 new tissue optimizations are recommended.

Annotation patterns through manual image annotation and registration

To manually annotate tissue images based on their H&E features, we used a previously adapted graphical and cloud-based user interface26. We assigned each ST (x,y) coordinate with one or more regional tags. The region names used to annotate MOB were: granular cell layer (GR), outer plexiform layer (OPL), mitral layer (MI), internal plexiform layer (IPL) and glomerular layer (GL) and to annotate mouse cortex were: cerebral nuclei (CNU), cortical subplate (CTXsp), fiber tracts, hippocampal formation (HIP), hypothalamus (HY), isocortex (ISOCTX), midbrain (MB), piriform area (PIR) and thalamus (TH). For annotating spleen, we used four major areas: red pulp, B-follicle, marginal zone and periarteriolar lymphoid sheaths (PALS). To overlay tissue images through an image registration task, we used centroids of each annotated region as anchor points in the image translation and rotation tasks, as previously described32. This allowed us to display the data in a common coordinate system and to highlight genes and annotation areas of interest.

Comparisons between spatial gene expression profiles

For comparisons between the SM-Omics and ST datasets, reads were first downsampled to the same saturation level before invoking the ST pipeline mapper, annotator and counter run to receive UMIs per spatial (x,y) barcode. Depending on sequencing depth, a gene was counted as expressed if the corresponding transcript was present in more than 10−6 of the sequencing depth. The total count over all spots per gene and sample were then normalized65. Spearman’s correlation coefficient between the average and normalized samples was calculated using Scipy v1.2.066. To compare the performance of Visium and SM-Omics, we sequenced both libraries to an average depth of ~65 million paired end reads. For Visium, we sequenced 29 nt in the forward and 43 nt in the reverse read. Reads were downsampled to the same saturation level. Both datasets were processed using the ST pipeline as described above. Conventional GTF files used in the annotation step with HTseq-count were converted so that all transcript features now carried an exon tag used in counting transcripts. UMI collapsing was done using a naive approach and allowing for 1 low quality base present in either of the datasets. Unique molecular identifiers per measurement were calculated as described earlier.

To visualize the counts data per condition, total numbers of detected genes or UMIs were plotted as violin plots and summarized mean values for all replicate libraries overlayed as dotplots; similar as presented in Lord et al.67. To compare between different spatial RNA-seq protocol versions, we followed an approach similar to that previously described in Svensson et al.68. Raw data were first processed as described in the Saturation curve generation section, and each replicate (at least n = 3) from each condition (i.e., spatial RNA-seq protocol version) was represented by the counts mean67 at each of 9 different saturation points. Following processing, summarized counts data in each comparison were first scaled [0,1] and then used to estimate a generalized linear mixed model (glmm). We used a glmm (R package glmmTMB v1.1.1) modeled as a proportional binomial logit response between counts, protocol version (fixed effect) and replicate (random effect). Log proportions of annotated reads were used as offsets in the model. All glmm estimates were performed using the R stats package (v4.0.1) and Wald’s p-values reported.

Saturation curve generation

Number of unique molecules was calculated by subsampling the same proportion of mapped and annotated reads from each sample. First, each library was randomly down-sampled to three sequencing saturation points (defined as percentage of raw reads in a library) and numbers of UMIs or unique genes and annotated reads in a sample collected after running the ST Pipeline v.1.7.6 as described in the Raw reads processing and mapping section above. Using this information, we could solve the Lineweaver–Burk equation and accurately estimate the number of raw reads R in each sample s that are needed to reach a certain saturation level S in a given library:

$${R}_{s}=frac{{S}_{s}times {K}_{M}}{{V}_{{max }}-{S}_{s}},$$

(1)

where Vmax is the maximum saturation point and KM represents the number of raw reads at half of Vmax After randomly down-sampling all the libraries to the same library saturation, we considered this our maximum saturation point (100%) in all comparisons and sampled a total of 9 different points (0.001–100%) to be included in the saturation curve plots presented in this paper.

Quantitative immunofluorescence profiles per SM-Omics spot

First, we trained a random forest classifier using the Ilastik69 (v1.3.3) framework to extract probabilities of the positive class assignment ie. positive antibody signals from our IF mouse brain images. Separate classifiers were trained to each antibody used and a total of ~10 images with at least 10 fields of view were used in the training process. In each classifier, we used two labels for classification: signal and background. Respective full-sized fluorescent microscopy images were then processed and output probabilities used in the following steps. For spleen data, raw fluorescent images were used as input in the following steps. First, images were processed as described in Estimating reproducibility of SM-Omics in situ reactions. Calculated background was removed from each image, signal boundaries estimated using watershedding followed by creating a binary mask image. This mask was then overlaid with the original fluorescent image and this image was then used in all following steps. To quantify the fluorescent signal intensities per ST spot, the image was cropped into a 33 × 35 matrix creating smaller patches; each patch sized at ±1% image from the centroid of each ST spot. Finally, the intensity from each spot area was calculated as the sum of the fluorescent signal detected in that spot patch.

Spatial gene and antibody-based expression analysis

Statistical analysis of the spatial gene and antibody tag expression data was performed using Splotch’ one- or two-level hierarchical model as previously described32. In short, the model captures spatial expression in anatomical regions while accounting for experimental parameters such as, in our case, different animals, and calculates gene or antibody expression estimates for each single gene or antibody in each annotated spatial spot. To find targets which were differentially expressed in an annotated morphological region, we computed a one-vs-all comparison and took those values with a positive log Bayesian factor (BF). Posterior probabilities presented hereinafter normalized expression estimates and were used throughout the analyses presented. For scaling per annotated region, normalized expression values were first grouped by annotated region and then scaled from 0 to 1 within each sample. The correlation between gene expression and fluorescent signal was calculated in the same way, but the fluorescent signal matrix, prepared as explained in Calculating quantitative immunofluorescence profiles per SM-Omics spot, was used instead of the antibody tag counts matrix.

Comparison to Allen Brain Atlas data

To validate our findings, we downloaded in situ hybridization (ISH) gene expression data from the Allen Brain Atlas50 (https://mouse.brain-map.org/) with Image Credit: Allen Institute for Brain Science. The following gene expression images were used from the ABA as denoted with appropriate image and experiment identifiers: CTGF 478 [https://mouse.brain-map.org/experiment/show/79556634], CAMK4 474 [https://mouse.brain-map.org/experiment/show/75038464], LANCL3 474 [https://mouse.brain-map.org/experiment/show/73925716], CBLN4 476 [https://mouse.brain-map.org/experiment/show/72283804], NR2F2 466 and 250 [https://mouse.brain-map.org/experiment/show/112646890], NRSN1 478 [https://mouse.brain-map.org/experiment/show/71358557], NOS1AP 472 [https://mouse.brain-map.org/experiment/show/77280574], CDH23 469 [https://mouse.brain-map.org/experiment/show/72283805], PRSS12 474 [https://mouse.brain-map.org/experiment/show/71836879], CABP7 253 [https://mouse.brain-map.org/experiment/show/73930835], SEMA4G 266 [https://mouse.brain-map.org/experiment/show/71587856], DKKL1 237 [https://mouse.brain-map.org/experiment/show/70634395], SLC17A6 272 [https://mouse.brain-map.org/experiment/show/73818754] and PENK 262 [https://mouse.brain-map.org/experiment/show/74881286]. For comparisons in MOB samples, we used the following regions from ABA: GL, GR, MI and OPL. For comparison in cortex samples, we used the following regions from ABA: piriform-amygdalar area (PAA), postpiriform transition area (TR) in addition to CNU, CTXsp, HIP, HY, ISOCTX, MB and TH. Prior to enrichment analysis, genes found in PAA, TR and PIR in ABA were merged into one region name: PIR. We filtered genes with fold change >1 and expression threshold >2.5 in ABA and compared to genes with positive fold change and log(BF) in our Splotch data and computed a one-sided Fisher’s exact test using Scipy v1.2.066. FDR was estimated using the Benjamini-Hochberg70 procedure. Heatmaps denoting regions present in both conditions were plotted. One of the top most differentially expressed genes in both SM-Omics and ABA was chosen from each region and its expression visualized. A reference ST dataset24 was also analyzed using Splotch with the same settings as used for SM-Omics, visualized and compared to SM-Omics. To create correlations between ABA expression patterns and SM-Omics, Visium and ST expression patterns, normalized expression data was first grouped by annotated region and then scaled from 0 to 1 within each sample. To compare SM-Omics and ST, we compared top genes per MOB region: NRSN1, NOS1AP, CDH23 and PRSS12. To compare SM-Omics and Visium, we compared top genes per mouse brain cortex as found in ABA: ADORA2A, CABP7, SLC6A11, IER5, SLC17A6 and GREM2.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Source link