Incorporation of universal bases into Cas9 crRNAs enables targeting of polymorphic sequences
Past studies have shown that the inclusion of sugar11,40 and backbone41,42 chemical modifications in Cas9 crRNAs can be tolerated. In addition, crRNAs containing locked/bridged nucleic acids (LNA/BNA) and DNA have been demonstrated to reduce Cas9 off-target DNA cleavage activity relative to their unmodified counterparts11,40. Given these findings, we speculated that incorporation of non-canonical bases into crRNAs might also be permitted. In particular, we wondered if universal bases could be incorporated into crRNAs so as to enable Cas9 recognition of polymorphic target sequences. To test this possibility, we selected a highly polymorphic sequence from the ABO gene that determines the most clinically important blood group system in mammals43. We generated a series of 16 DNA target sequences (ABO-T1–16), derived from prevalent alleles in the human population, containing naturally occurring single nucleotide polymorphisms (SNPs) within that region (Fig. 1b, Supplementary Fig. 2a). Next, we tested the ability of Cas9 to cleave these sequences in vitro using an unmodified guide RNA (ABO-RNA) corresponding to the reference sequence (ABO-T1). Consistent with previous studies on Cas9 specificity11,44, we observed robust cleavage of the on-target sequence (ABO-T1) and two sequences containing single SNPs (ABO-T2, ABO-T4), but weak or absent activity on all of the other sequence variants (Supplementary Fig. 2b). These results reinforce the negative impact that natural genetic variation can have on Cas9 on-target activity.
To generate guide RNAs capable of simultaneously recognizing a broader set of ABO sequence variants, we selected two ABO sequences bearing 2 (ABO-T5) or 3 (ABO-T6) polymorphisms relative to ABO-T1 (Fig. 1b), and designed a panel of corresponding crRNAs in which ribose inosine (ABO-rI-1, ABO-rI-2), deoxyribose inosine (ABO-dI-1, ABO-dI-2), 2′OMe ribose inosine (ABO-mI-1, ABO-mI-2), deoxyribose 5′-nitroindole (ABO-dN-1, ABO-dN-2), deoxyribose K (ABO-dK-1, ABO-dK-2), or deoxyribose P (ABO-dP-1, ABO-dP-2) bases were substituted at positions overlapping with the SNPs (Fig. 1c, d). Using ABO-rI-1, ABO-dI-1, ABO-mI-1, ABO-dN-1, ABO-dK-1, and ABO-dP-1, we assayed Cas9 cleavage activity on ABO-T1, the corresponding ABO-T5 double SNP variant sequence, and sequences containing each SNP in isolation (ABO-T2, ABO-T4). Using the ABO-mI-1 and ABO-dK-1 crRNAs, Cas9 cleaved ABO-T5 >5-fold and >10-fold more efficiently than with ABO-RNA, respectively (Fig. 1c, Supplementary Fig. 3a, b). Both of these crRNAs also supported Cas9 cleavage of the single variant (ABO-T2, ABO-T4) and reference (ABO-T1) sequences (Fig. 1c). Similarly, we found that ABO-rI-2, ABO-dI-2, ABO-mI-2, and ABO-dK-2 guided efficient Cas9 cleavage of ABO-T6 (>50% compared to 0% with ABO-RNA) (Fig. 1d, Supplementary Fig. 3a, c). ABO-rI-2 and ABO-mI-2 were also able to direct the cleavage of the reference sequence (ABO-T1) and the single variant sequences ABO-T2 and ABO-T3, but not ABO-T4 (Fig. 1d). These results demonstrate that universal bases with diverse chemistries can be incorporated into crRNAs to allow simultaneous targeting of complex SNP variants in vitro.
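The design principle described above, placing a universal base at each guide position that overlaps a SNP, can be illustrated with a minimal sketch. The sequences below are hypothetical placeholders rather than the ABO targets, the function name `degenerate_spacer` is our own, and the sketch assumes that the variants are pre-aligned protospacers written in the same orientation as the crRNA spacer.

```python
def degenerate_spacer(variants, universal="I"):
    """Build an RNA spacer with a universal base at every polymorphic position.

    `variants` are aligned DNA protospacers of equal length, written in the
    same 5'->3' orientation as the crRNA spacer (an assumption of this sketch).
    """
    if not variants:
        raise ValueError("need at least one variant sequence")
    length = len(variants[0])
    if any(len(v) != length for v in variants):
        raise ValueError("variants must be aligned and of equal length")

    spacer = []
    for column in zip(*variants):
        if len(set(column)) == 1:
            # Invariant position: keep the base, converting DNA T to RNA U.
            spacer.append("U" if column[0] == "T" else column[0])
        else:
            # Polymorphic position: substitute the universal base.
            spacer.append(universal)
    return "".join(spacer)


# Hypothetical 20-nt protospacers differing at two positions (not from the paper).
reference = "GACCTGAATCGGTACCTAGA"
variant = "GACTTGAATCGGTACATAGA"
print(degenerate_spacer([reference, variant]))  # GACIUGAAUCGGUACIUAGA
```

The sketch captures only where universal bases are placed; as the results in this section show, whether a given substitution pattern retains activity has to be determined experimentally.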
Our findings in Fig. 1c, d indicated that among the various universal bases we tested, inosine derivatives (ribose, deoxy, 2′OMe) appeared to be the most consistently well tolerated in vitro. Therefore, we chose to focus our studies on this naturally occurring non-canonical base. Previous work has shown that, unlike synthetic bases such as deoxyribose 5′-nitroindole, inosine exhibits a slight base pairing preference in certain contexts30. We wondered if a base pairing bias might manifest in our in vitro Cas9 DNA cleavage reactions. To test this, we designed two sets of 16 target sequences covering all combinations of bases at the two SNP locations in ABO-T5 and ABO-T7, and evaluated cleavage of these sequences by Cas9 using either ABO-rI-2 or the unmodified crRNA. As shown in Fig. 1e, Cas9 was able to cut 9 of the 16 targets with >25% efficiency using ABO-rI-2 (the remaining seven sequences were also cut at low levels), compared to only the reference sequence being cleaved to this extent using the unmodified crRNA. The results using the ABO-T7 derivative sequences were even more striking. All 16 of the derivative sequences were cleaved at >50% efficiency by Cas9 using ABO-rI-2, while only the reference sequence was cut at appreciable levels using the unmodified crRNA (three other sequences were cleaved at lower levels). These results suggest that incorporation of inosine bases into crRNAs enables targeting of all four canonical bases at the corresponding DNA target sites in a relatively unbiased and independent manner.
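For illustration, a set of 16 targets covering every base combination at two SNP positions can be enumerated as follows. This is a sketch only: the reference sequence and the SNP positions are hypothetical, and `snp_combinations` is a helper name introduced here, not part of the study's methods.

```python
from itertools import product


def snp_combinations(reference, positions, alphabet="ACGT"):
    """Return every sequence obtained by placing each base at each SNP position."""
    targets = []
    for bases in product(alphabet, repeat=len(positions)):
        seq = list(reference)
        for pos, base in zip(positions, bases):
            seq[pos] = base  # 0-based index into the reference
        targets.append("".join(seq))
    return targets


# Hypothetical reference protospacer and SNP positions (0-based), not from the paper.
reference = "GACCTGAATCGGTACCTAGA"
targets = snp_combinations(reference, positions=[3, 15])
print(len(targets))          # 16 sequences: 4 bases x 4 bases
print(reference in targets)  # True: the reference is one of the 16
```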
To characterize the patterns of inosine modifications permitted by Cas9, we synthesized an additional 13 crRNAs containing 1–4 ribose inosine modifications (Supplementary Fig. 4a) and tested the ability of these to direct cleavage of the ABO-T1 sequence by Cas9. We found that inclusion of a single inosine was tolerated in all instances, albeit with reduced activity, while crRNAs containing 2–4 inosine substitutions supported Cas9 cleavage of ABO-T1 in certain cases (Supplementary Fig. 4b). Next, we sought to determine the cause of the reduced activity and to establish if the effect was general or target-specific. Using ABO-RNA, and two crRNAs containing two inosine modifications, ABO-rI-1 (low activity) and ABO-rI-8 (no activity on the ABO-T1 sequence using the given conditions), we performed a titration of tracrRNA:crRNA to determine if inosines within the spacer sequence might somehow impair the ability of these two RNA elements to hybridize. Altering this ratio did not result in increased activity, ruling out this possibility (Supplementary Fig. 5a, b). In addition, the low cleavage activity observed in vitro using ABO-rI-1 or ABO-rI-8 could not be augmented by increasing reaction time (Supplementary Fig. 5c). Based on these results, we hypothesized that the lowered activity observed using certain inosine-modified crRNAs may be due to decreased ribonucleoprotein (RNP) complex binding to the target DNA sequence. A titration of activity versus RNP concentration provided evidence to support this assertion (Supplementary Fig. 5d). Moreover, data from electrophoretic mobility shift assays (EMSAs) confirmed that RNP binding to ABO-T1 was substantially reduced using ABO-rI-1 and ABO-rI-8 compared to the unmodified crRNA (Supplementary Fig. 6). Interestingly, we found that ABO-rI-8, which did not support Cas9 cleavage of ABO-T1, did support cleavage of ABO-T7, establishing its activity on other target sequences (Supplementary Fig. 7). 
In all instances, we observed a strong correlation between RNP-target engagement and activity in DNA cleavage assays (Supplementary Figs. 6, 7). Previous work has shown that I-G and I-A pairs decrease thermodynamic duplex stability by 0.84 kcal/mol and 0.52 kcal/mol compared to C-G and A-T pairs, respectively30. We found that in the absence of Cas9, Tm values for inosine-modified crRNA-target DNA duplexes were in fact reduced compared to the unmodified counterpart (Supplementary Fig. 8). Thus, it is likely that incorporation of inosines into crRNAs destabilizes Cas9-DNA target binding, although the extent to which this affects overall activity appears to be context-dependent and minimal in some cases.
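To give a sense of scale for the destabilization values cited above, the following sketch converts a free-energy penalty into a fold change in duplex association constant. It assumes a simple two-state equilibrium at 37 °C and ignores any contribution from Cas9 itself, so it is an order-of-magnitude illustration rather than a model of RNP binding; the function name is our own.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def fold_change_in_affinity(ddg_kcal_per_mol, temp_c=37.0):
    """Fold reduction in association constant for a destabilization of ddG.

    Uses K_ratio = exp(ddG / RT), i.e. a two-state duplex model (assumption).
    """
    rt = R_KCAL * (temp_c + 273.15)
    return math.exp(ddg_kcal_per_mol / rt)


for ddg in (0.84, 0.52):
    print(f"{ddg} kcal/mol -> ~{fold_change_in_affinity(ddg):.1f}-fold")
# 0.84 kcal/mol -> ~3.9-fold
# 0.52 kcal/mol -> ~2.3-fold
```

Under these assumptions a single inosine pair would lower duplex affinity by only a few fold, which fits the observation that the effect on overall activity is context-dependent and minimal in some cases.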
Inclusion of universal bases into crRNAs alters the specificity only at the site of incorporation
A prerequisite for the practical application of guide RNAs containing universal bases to targeting SNPs is that they must alter Cas9 specificity in a localized and predictable manner. That is to say, they should impart selective degeneracy rather than globally impacting the precision of Cas9 DNA cleavage. To evaluate this, we employed a previously described high-throughput specificity profiling assay11,16,45 that measures Cas9 cleavage of a library of >10^12 off-target sequences, containing a tenfold coverage of all sequences with ≤8 mutations relative to the ABO-T1 sequence (Fig. 2a). We performed the assay on the unmodified crRNA and all 15 of the ribose inosine-modified crRNAs listed in Supplementary Fig. 4a, as well as all of the crRNAs modified using alternative universal bases listed in Fig. 1c, d. We used the datasets for each crRNA to calculate enrichment scores for each base at each position within the ABO-T1 sequence and generated specificity heatmaps to visualize the results. For the collection of inosine-modified crRNAs, we found that in nearly all cases the specificity profile for the crRNAs containing universal bases was similar to that of ABO-RNA at all positions except those that overlapped with the locations of the universal bases (Fig. 2b, Supplementary Figs. 9–11). Moreover, substitution of the indicated bases with inosine rendered the crRNA virtually non-specific at that position (Fig. 2b), and was associated with changes in specificity scores ranging from approximately −0.6 to −1.0 at those sites (Supplementary Fig. 10). Similar results were observed from the analysis of the crRNAs bearing deoxyribose inosine, 2′OMe ribose inosine, deoxyribose 5′-nitroindole, deoxyribose K and deoxyribose P base modifications (Fig. 2, Supplementary Figs. 12–15).
Overall, we found that specificity at the site of universal base incorporation was virtually abolished, while specificity at other locations appeared to be preserved, or even enhanced in certain cases (Supplementary Fig. 13). Substitution of the indicated PAM-distal uracil with ribose inosine (ABO-rI-1 and ABO-rI-2), deoxyribose inosine (ABO-dI-1 and ABO-dI-2), or 2′OMe ribose inosine (ABO-mI-1 and ABO-mI-2) rendered the crRNA non-specific at this position (Supplementary Fig. 13) and was associated with a difference in specificity score in excess of −0.6 (Supplementary Fig. 14). Similar results were observed when the indicated PAM-proximal cytosine base was replaced by a universal base, while specificity at the PAM-proximal guanine position was less affected, ostensibly due to an initial lack of specificity at this position in ABO-RNA (Supplementary Figs. 13, 14). Finally, to generalize our findings to other DNA target sequences, we synthesized a separate set of 8 crRNAs with inosine modifications at positions corresponding to SNPs present in a region of the major histocompatibility complex HLA gene. As shown in Supplementary Figs. 16–19, inclusion of inosine bases in these crRNAs similarly abolished specificity in a site-restricted manner. Collectively, these data reveal that inclusion of universal bases in crRNAs imparts selective degeneracy at the site of incorporation without otherwise altering specificity, and that this effect extends to compositionally distinct DNA targets.
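The specificity scores discussed above range from 1.0 (complete enrichment for a base pair) to −1.0 (complete enrichment against it). The text does not state the formula, so the sketch below uses one normalization commonly applied in this type of pre- versus post-selection library profiling; it should be read as an assumption, and `specificity_score` is a name introduced here for illustration.

```python
def specificity_score(pre_freq, post_freq):
    """Normalized enrichment of a base pair at one position.

    pre_freq:  frequency of the base pair in the pre-selection library
    post_freq: frequency of the base pair among cleaved (post-selection) sequences
    Returns a value in [-1.0, 1.0]; 0.0 means no enrichment (no specificity).
    """
    if not 0.0 < pre_freq < 1.0:
        raise ValueError("pre_freq must lie strictly between 0 and 1")
    if not 0.0 <= post_freq <= 1.0:
        raise ValueError("post_freq must lie between 0 and 1")
    change = post_freq - pre_freq
    if change >= 0:
        # Enrichment: scale by the maximum possible increase.
        return change / (1.0 - pre_freq)
    # Depletion: scale by the maximum possible decrease.
    return change / pre_freq


print(specificity_score(0.25, 1.0))   # 1.0, base pair fully enriched
print(specificity_score(0.25, 0.0))   # -1.0, base pair fully depleted
print(specificity_score(0.25, 0.25))  # 0.0, position shows no preference
```

Under this definition, a position occupied by a universal base that accepts all four bases equally would score close to 0 for every base pair, in line with the loss of specificity reported at the inosine sites.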


a Diagram depicting the workflow for the high-throughput specificity profiling assay. b Heatmaps corresponding to the specificity profiles of the indicated ribose inosine-modified crRNAs. The positions of inosine bases are indicated by black arrows. Specificity scores of 1.0 (dark blue) correspond to 100% enrichment for, while scores of −1.0 (dark red) correspond to 100% enrichment against a base-pair at a specific position. Black boxes denote the intended target nucleotide.
crRNAs containing universal bases can direct Cas9 cleavage of polymorphic sequences in cells, but with limitations
Knowing that inclusion of universal bases in crRNAs could impart selective degeneracy while broadly maintaining cleavage specificity in vitro, we sought to determine if our results could be translated to cells. As an initial test, we adapted a plasmid-based fluorescence reporter system46 and used it to measure the cleavage of eight heterologous ABO sequences in cells. First, we selected ABO-rI-2, which bears three inosine modifications, and tested its ability to direct Cas9 cleavage of ABO-T1, the corresponding triple SNP variant (ABO-T6), three double SNP sequences (ABO-T5, T7, T8) and three single SNP sequences (ABO-T2, T3, T4) in vitro (Fig. 3a). We found ABO-rI-2 directed >50% cleavage of 6/8 sequences tested, the exceptions being ABO-T4 (~20%) and ABO-T5 (<10%) (Fig. 3b). In contrast, ABO-RNA supported robust Cas9 cleavage (>50%) of only its matched sequence (ABO-T1) and ABO-T2 (Fig. 3b). Next, we cloned all of the target DNAs into a plasmid in which sequences were flanked by an in-frame mRFP gene at the 5′ end and two out-of-frame eGFP genes at the 3′ end (Fig. 3c). Past work has shown that double-strand breaks formed in the intervening target sequence can be repaired by non-homologous end-joining (NHEJ), resulting in frameshift mutations that generate a multifluorescent mRFP-eGFP fusion protein (Fig. 3c)46. We co-transfected all eight constructs with either ABO-RNA or ABO-rI-2 into HeLa cells stably expressing Cas9 and used fluorescence-activated cell sorting (FACS) to quantify the resulting cell populations (Fig. 3d). Using ABO-RNA, Cas9 cleaved 3/8 sequences with >20% efficiency (ABO-T1, ABO-T2, ABO-T4). In contrast, when the ABO-rI-2 guide RNA was used, 7/8 sequences were cleaved with >20% efficiency, and the remaining sequence (ABO-T4) was cleaved at 16% efficiency (Fig. 3e).


a List of ABO variant DNA target sequences (ABO-T1–T8) assayed in cells. Positions of SNPs are indicated with red lettering. The PAM sequence is underlined. b Bar graphs showing the relative amount of DNA cleavage resulting from in vitro reactions containing Cas9 with ABO-RNA or ABO-rI-2 versus the indicated DNA target sequences. Assays were performed using fixed concentrations of gRNA (80 nM) and Cas9 (40 nM); Mean with individual data points shown (n = 2 independent experiments). c Schematic outlining the framework for the fluorescence-based assay used to evaluate cleavage of the ABO variant target sequences in cells. d Representative FACS plot showing the distribution of RFP and GFP positive cells. Dual positive cells appear in the top right quadrant. e Table showing normalized %GFP+/all %RFP+ events corresponding to cleavage of the indicated target sequences in cells by Cas9 using either ABO-RNA or ABO-rI-2; Mean ± S.D. shown (n = 3 independent samples).
To examine the utility of this approach for targeting endogenous sites in cells, we sequenced several loci containing PAM sequences in 293T and HeLa cells that were predicted to contain SNPs based on the Ensembl47 and HEK293T48 reference genomes. We identified a homozygous sequence within the HLA-C gene differing at 2 base positions between 293T and HeLa cells (Supplementary Fig. 20a). We generated a crRNA corresponding to the HLA-C sequence in 293T cells (HLA-C-RNA), and verified its ability to direct Cas9 cleavage of the HLA-C gene in 293T cells (HLA-C-T1) but not HeLa cells (HLA-C-T2) (Supplementary Fig. 20b). We then synthesized crRNAs containing ribose inosine, deoxyribose inosine, or deoxyribose P bases at positions overlapping with the locations of the mismatches in the HLA-C-T1 and HLA-C-T2 sequences (Supplementary Fig. 20c). We tested the ability of these universal base-modified crRNAs and the unmodified crRNA to direct Cas9 cleavage of the two target sequences in vitro. We found that Cas9 was able to robustly cut both the HeLa and 293T HLA-C sequences when the rI, dI, and dP base-modified crRNAs were used (Supplementary Fig. 20d). The unmodified crRNA induced ~60% cleavage of its corresponding target (HLA-C-T1), but only ~30% when HLA-C-T2 was used as a substrate (Supplementary Fig. 20d). The fact that Cas9 cleavage of this off-target sequence was absent in cells using the unmodified crRNA is consistent with previous reports in the literature showing higher stringency against off-target cutting in cells11,40,49. Finally, we tested the ability of the unmodified and the rI-, dI-, and dP-modified crRNAs to direct Cas9 cleavage of the HLA-C locus in 293T and HeLa cells. As previously observed, we found that the HLA-C-RNA supported Cas9 cleavage of the HLA-C locus in 293T cells (~40%) but not HeLa cells (0%) (Supplementary Fig. 20e).
In contrast to our in vitro findings, HLA-C-rI and HLA-C-dI showed either weak or undetectable Cas9 cleavage activity in both 293T and HeLa cells. However, HLA-C-dP was able to direct Cas9 cleavage of the HLA-C locus in both 293T (~40%) and HeLa cells (~12%). Collectively, these results demonstrate the potential for universal base-modified crRNAs to drive Cas9 cleavage of polymorphic sequences in cells, but also reveal some limitations to their general use.
We wondered if the discrepancy between the activity of the HLA-C-rI and HLA-C-dI crRNAs in vitro and in cells could be the result of delayed Cas9 cleavage kinetics. Previous work has shown that modification of the ribose sugar in crRNAs can lead to slower enzyme kinetics that manifests as reduced activity in cells11. To test this hypothesis, we performed a Cas9 cleavage time course on DNA substrates corresponding to either HLA-C-T1 or HLA-C-T2 using HLA-C-RNA or HLA-C-rI, -dI, or -dP crRNAs. As shown in Supplementary Fig. 21, we found that Cas9 cleavage of HLA-C-T1 using HLA-C-rI or HLA-C-dI crRNAs was ~4-fold slower than with HLA-C-RNA or HLA-C-dP. Furthermore, we found that Cas9 cleavage of the HLA-C-T2 substrate using HLA-C-dP was substantially quicker than cleavage using the HLA-C-RNA, -rI, and -dI crRNAs (Supplementary Fig. 21). This strong correlation between cellular modification rates and in vitro kinetics suggests that delayed enzyme kinetics could underlie the low activity of the HLA-C-rI and HLA-C-dI crRNAs in cells.
DETECTR probes containing universal bases identify evolved variants of a pathogen
In addition to its use as a gene-editing agent, Cas12a/Cpf1 has also successfully been harnessed for diagnostic purposes as part of the DETECTR system13. Point-of-need technologies using this platform to diagnose swine flu50 as well as COVID-1951 have now been deployed. However, the prospect of viral evolution presents a unique challenge for the identification of these pathogens, as mutations could subvert detection by Cas12a guide probes designed to target only reference sequences, leading to false negative results. We hypothesized that inosine bases could be incorporated into Cas12a guide RNAs to impart them with selectively degenerate targeting capabilities in order to circumvent this limitation. To test this possibility, we selected a DNA sequence from the HIV-1 protease gene and identified seven clinically relevant sequence variants bearing 1, 2, or 3 SNPs encoding mutations that confer resistance of the virus toward HIV protease inhibitor drugs52,53 (Fig. 4a). Next, we synthesized two crRNAs: HIV-RNA, to target the canonical sequence, and HIV-rI-1, which contains three inosine substitutions designed to enable flexible targeting of both the canonical and evolved variant sequences. An in vitro cleavage assay of all target sequences using HIV-RNA with Cas12a revealed that HIV-T1, HIV-T3, HIV-T4, and HIV-T7 were cleaved at efficiencies of 55%, 30%, 55%, and ~30%, respectively (Fig. 4b, c). In stark contrast, all eight sequences were fully cleaved when HIV-rI-1 was used as the guide RNA (Fig. 4b, c), supporting our hypothesis and revealing a high degree of tolerance for the presence of inosine substitutions in Cas12a guide RNAs. To ensure that the lack of cleavage activity observed with HIV-RNA on sequences such as HIV-T8 was not simply due to insufficient RNP, we performed titrations of RNP concentration.
Consistent with our model, we found that overall Cas12a cleavage activity (combined cis and trans) was comparable between HIV-RNA and HIV-rI-1 using the HIV-T1 substrate (Supplementary Fig. 22a, b). However, HIV-RNA was unable to direct cleavage of HIV-T8, in contrast to HIV-rI-1, which induced complete cleavage of this substrate at an RNP concentration of ~25 nM (Supplementary Fig. 22c, d). Subsequently, we ported these probes into the DETECTR system, outlined in Fig. 4d. To simulate pathogen DNA, we cloned each of our eight target sequences into pUC19 plasmids and performed recombinase polymerase amplification (RPA) as described in the protocol13. Next, we set up individual reactions containing each DNA sample paired with either HIV-RNA or HIV-rI-1 probes in the presence of a fluorescent detection substrate. As shown in Fig. 4e, the HIV-rI-1 probe positively identified all eight of the HIV-1 variant sequences, while the HIV-RNA probe only identified three sequences and provided false negatives for the other five variants. These findings provide justification for the use of universal base-modified crRNAs in CRISPR-based diagnostic platforms.


a List of DNA target sequences derived from the HIV-1 protease gene containing evolved SNPs detected in patient samples. SNP position(s) are indicated with red lettering. The PAM sequence is underlined. b Bar graphs showing the relative amount of DNA cleavage resulting from in vitro reactions containing Cas12a with HIV-RNA or HIV-rI-1 versus the indicated DNA target sequences. A scrambled crRNA with the sequence 5′-AUUCUUGCUCUGCUCUCUUCGUC-3′ was used as a negative control. Assays were performed using fixed concentrations of crRNA (125 nM) and Cas12a (100 nM); Mean with individual data points shown (n = 2 independent experiments). c Representative gels of the in vitro cleavage assay results for Cas12a with HIV-RNA or HIV-rI-1 versus the indicated DNA target sequences. The bottom two bands in the gel represent the cleaved DNA substrate while the top band corresponds to the undigested substrate. Cleavage experiments were performed in duplicate with similar results. d Diagram outlining the DETECTR assay. e Bar graph indicating the fluorescence signal obtained in the DETECTR assay using Cas12a in combination with either HIV-RNA or HIV-rI-1 and samples containing the indicated target sequences. Max fluorescence values were normalized to background; Mean with individual data points shown (n = 3 independent experiments).
Viral escape due to mutation of the target site to a variant is a major roadblock to using CRISPR therapeutics as antivirals54. Based on our results demonstrating effective targeting of polymorphic sequences using HIV-rI-1 in vitro, we wondered if this crRNA could direct Cas12a cleavage of variant viral sequences in cells. To test this possibility, we used the Flp/FRT system55 to stably integrate single copies of the HIV-T1 and HIV-T8 sequences into 293 cells (Supplementary Fig. 23a, b). We found that the unmodified HIV-RNA crRNA directed robust Cas12a cleavage of the HIV-T1 site (~28%) but virtually no cleavage at the HIV-T8 site (<3%) (Supplementary Fig. 23c). In contrast, the HIV-rI-1 RNA induced cleavage of both sites with relatively equal efficacy (HIV-T1: 6%, HIV-T8: 8%) (Supplementary Fig. 23c). This corresponds to a change in HIV-T1:HIV-T8 cleavage preference of >12-fold. Importantly, we did not detect any DNA cleavage using either crRNA at two predicted genomic off-target sites (Supplementary Fig. 23d, e). These data demonstrate that crRNAs containing inosine modifications can be used in combination with Cas12a to cleave polymorphic sequences in cells, albeit with reduced activity.
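The >12-fold change in cleavage preference quoted above follows directly from the reported efficiencies, as the short calculation below shows. Because HIV-T8 cleavage with the unmodified crRNA is reported only as <3%, the sketch uses 3% as an upper bound, which makes the result a lower bound on the shift; the variable and function names are our own.

```python
def preference_ratio(t1_percent, t8_percent):
    """Ratio of HIV-T1 to HIV-T8 cleavage efficiency for one crRNA."""
    return t1_percent / t8_percent


# Values from the text; 3.0 is an upper bound for the reported "<3%".
unmodified = preference_ratio(28.0, 3.0)  # HIV-RNA: strong preference for HIV-T1
modified = preference_ratio(6.0, 8.0)     # HIV-rI-1: roughly equal cleavage

shift = unmodified / modified
print(f"HIV-RNA  T1:T8 ratio >= {unmodified:.1f}")
print(f"HIV-rI-1 T1:T8 ratio  = {modified:.2f}")
print(f"Change in preference >= {shift:.1f}-fold")
```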

