FrCas9 is a CRISPR/Cas9 system with high editing efficiency and fidelity

FrCas9 edits distinct 5′-NNTA-3′ PAM

By bioinformatic screening, we identified a Type II-A system in the genome of Faecalibaculum rodentium. Phylogenetic analysis showed that FrCas9 is dissimilar to SpCas9, at a distance of 1.80 (Fig. 1a and Supplementary Table 1), indicating that it may possess different characteristics. The FrCas9 locus included a CRISPR array composed of 31 spacer-direct repeat units (Supplementary Table 2), located adjacent to the cas1, cas2, csn2 and cas9 genes (Fig. 1b). By searching for sequences complementary to the direct repeats [21], we identified a 71 nt tracrRNA sequence (not including the poly-T) upstream of the cas9 gene (Fig. 1b and Supplementary Table 3). Next, we predicted 7 catalytic residues in the FrCas9 HNH and RuvC domains (Fig. 1b and Supplementary Table 4).

Fig. 1: FrCas9 edits distinct 5′-NNTA-3′ PAMs.
a The phylogenetic tree of FrCas9 and 13 active Cas9 orthologs. b The schematic of the Faecalibaculum rodentium system. The upper inset displays the domains of FrCas9, with active residues indicated by asterisks. I, II and III represent the three RuvC domains. The lower inset shows the crRNA and tracrRNA expressed in small RNA-seq of E. coli harboring the pET-28A plasmid with the simplified FrCas9 locus. c Distribution of the length of spacer sequences derived from small RNA sequencing. d Distribution of the length of DR sequences derived from small RNA sequencing. e The schematic of the depletion assay and web-logo results for SpCas9 and FrCas9. f The deletion effects of the first four nucleotides of the FrCas9 PAM. Data are presented as mean values ± S.E.M. (n = 4). g The plasmid interference assay of FrCas9 at 4 sites that differed in the 2nd PAM position. A dilution series was performed. h The bar plot of cell units from the above plasmid interference assay. Source data are provided with this paper.

To confirm the tracrRNA prediction, we validated its expression by small RNA-seq of E. coli containing the simplified FrCas9 locus. The tracrRNA was actively transcribed in the opposite orientation to the cas genes, while the crRNA was transcribed in the same direction as the cas genes (Fig. 1b). Further analysis indicated that the spacers and direct repeats of the mature crRNA products were mainly 21–22 nt and 20 nt long, respectively (Fig. 1c, d).

Next, we conducted a depletion assay to determine the PAM requirement of FrCas9 [22]. A plasmid library containing a 30 bp target site followed by 6 bp random PAM sequences was constructed and electroporated into E. coli containing the FrCas9 plasmid (Fig. 1e and Supplementary Table 5). Only targets meeting the PAM requirement would be depleted from E. coli. The weblogo and PAM wheel showed that FrCas9 may recognize a 5′-NRTA-3′ PAM (Fig. 1e and Supplementary Fig. 1; N: A, T, C, G; R: A, G). Repeated depletion assays (n = 4) revealed that the 2nd PAM position had a slight preference for G (mean log2 fold-change = 0.21) or A (mean log2 fold-change = 0.11) (Fig. 1f). We then specifically tested the 2nd-position preference with the plasmid interference assay [23]. Compared to control vectors, all four 5′-NNTA-3′ groups displayed inhibited cell growth under the dual-antibiotic screening (Fig. 1g, h), indicating that FrCas9 has no nucleotide preference at the 2nd PAM position. Together, FrCas9 edited DNA sequences with a 5′-NNTA-3′ PAM in prokaryotic cells.
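The logic of a depletion assay can be illustrated with a minimal sketch. It assumes that PAM read counts are available for a control library and for the library recovered after selection. The function names, the pseudocount, the threshold and the toy 4-nt counts are hypothetical and are not taken from the study; the sign convention (positive score means depleted) is also an assumption.

```python
import math
from collections import Counter

def pam_depletion(control_counts, selected_counts, pseudocount=1.0):
    """Return log2(control frequency / selected frequency) for each PAM.

    Positive scores mean the PAM was depleted after selection
    (sign convention chosen for this sketch).
    """
    pams = set(control_counts) | set(selected_counts)
    ctrl_total = sum(control_counts.values()) + pseudocount * len(pams)
    sel_total = sum(selected_counts.values()) + pseudocount * len(pams)
    scores = {}
    for pam in pams:
        ctrl_freq = (control_counts.get(pam, 0) + pseudocount) / ctrl_total
        sel_freq = (selected_counts.get(pam, 0) + pseudocount) / sel_total
        scores[pam] = math.log2(ctrl_freq / sel_freq)
    return scores

def position_profile(scores, threshold):
    """Count the nucleotides seen at each position among depleted PAMs."""
    depleted = [pam for pam, score in scores.items() if score >= threshold]
    length = len(depleted[0]) if depleted else 0
    return [Counter(pam[i] for pam in depleted) for i in range(length)]

# Toy read counts (hypothetical, 4-nt PAMs for brevity).
control = {"AATA": 100, "GGTA": 100, "CCGG": 100, "TTTT": 100}
selected = {"AATA": 10, "GGTA": 12, "CCGG": 95, "TTTT": 105}

scores = pam_depletion(control, selected)
profile = position_profile(scores, threshold=1.0)
```

In this toy input only the two PAMs ending in TA pass the threshold, so the per-position profile is fixed at positions 3 and 4 and variable at positions 1 and 2, which is the kind of pattern a weblogo summarizes.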

FrCas9 is active in eukaryotic cells

To test the genome editing activity of FrCas9 in mammalian cells, we joined the whole 71 nt tracrRNA and the 42 nt crRNA into a single guide RNA (sgRNA) with a GAAA tetraloop [3]. This sgRNA and a synthesized, human codon-optimized FrCas9 sequence were then cloned into the PX330 plasmid vector.

We developed a puromycin depletion assay to confirm the PAM sequence of FrCas9 in the HEK293T cell line (Fig. 2a and Supplementary Table 5). As a positive control, the 5′-NGG-3′ PAM of SpCas9 validated our workflow (Fig. 2b). FrCas9 had an obvious preference for T at the 3rd position (log2 fold-change = 0.17) and A at the 4th position (log2 fold-change = 0.23), validating the 5′-NNTA-3′ PAM requirement observed in prokaryotic cells (Fig. 2c). To further characterize the PAM preference of FrCas9 in living human cells, we conducted a high-throughput PAM determination assay (HT-PAMDA) [24], which showed that FrCas9 had a canonical 5′-NNTA-3′ PAM with scattered NNTG, NNAN and NNGT non-canonical PAMs (Supplementary Fig. 2b).

Fig. 2: FrCas9 is active in eukaryotic cells.
a The schematic of the puromycin depletion assay. b, c The PAM results of SpCas9 and FrCas9 from the puromycin depletion assays. Data are presented as mean values ± S.E.M. (n = 3 biologically independent replicates). d The TIDE assay showed FrCas9-induced indel rates in 3 human genes with 12 sgRNAs, which differed in the 2nd PAM base. e Genome editing by FrCas9 in the human RNF2 gene, validated by double-stranded oligodeoxynucleotide (dsODN) breakpoint PCR. The Sanger sequencing is shown in Supplementary Fig. 2. The uncropped gel image is provided in Source Data. f The schematic representation of the sgRNA:target DNA complex. g The efficiency of sgRNAs with truncated scaffolds of FrCas9, assayed by target amplicon sequencing. Data are presented as mean values ± S.E.M. (n = 3 biologically independent replicates). h The FrCas9 editing efficiencies with 19–23 bp spacer lengths at three human sites, by target amplicon sequencing. i The FrCas9 editing efficiencies with 21 and 22 bp spacer lengths at five human sites, by TIDE (n = 3 biologically independent replicates). Source data are provided with this paper.

To further confirm the editing rates on 5′-NNTA-3′ PAM sequences in human cells, we first tested the genome editing ability of FrCas9 at 12 sites in 3 human endogenous genes (Supplementary Table 5). The 12 sites covered all four nucleotides at the 2nd PAM position (NGTA, NATA, NCTA, NTTA). The TIDE assay showed cleavage activity at all sites, to various extents (Fig. 2d). Further, we validated the 5′-NNTA-3′ editing events at an additional 32 sites in 8 gene loci, indicating that FrCas9 cleaved 5′-NNTA-3′ PAM sequences efficiently in human cells (Supplementary Fig. 3). After double-stranded breaks (DSBs), Cas9 nucleases commonly trigger non-homologous end joining (NHEJ) repair events, which can, for instance, incorporate end-protected double-stranded oligonucleotides (dsODNs), as exploited in GUIDE-seq [4]. We further detected the incorporation of dsODNs for 4 sgRNAs targeting the RNF2 gene through dsODN-PCR [25], and successfully verified the FrCas9 on-target modification (Fig. 2e and Supplementary Fig. 4). Therefore, FrCas9 required a 5′-NNTA-3′ PAM in eukaryotic cells.

The optimal sgRNA designs for FrCas9

Previous studies showed that an sgRNA consisting of full-length crRNA and tracrRNA may not achieve the best editing efficiency [5,26]. Therefore, we further investigated the optimal sgRNA architecture of FrCas9 by truncating both the 3′ end of the crRNA (referred to as repeat:antirepeat truncation) and the tracrRNA (referred to as tracrRNA truncation) (Fig. 2f). The ability of the truncated sgRNAs to generate indels in the human RNF2 gene (Supplementary Table 5) was tested and compared with SpCas9 [5]. Similar to other Cas9 orthologs [5,15], our data showed that truncations of the 3′ tracrRNA tail decreased the efficiency of FrCas9 dramatically (Fig. 2g). Meanwhile, truncation of the 3′ crRNA affected the cleavage ability only modestly when truncated from +18 to +12 (Fig. 2g). Gene editing by FrCas9 at two other endogenous sites (GRIN2B-T3 and GRIN2B-T9) confirmed the above conclusions (Supplementary Fig. 5). Notably, at the RNF2 site, the editing efficiency of FrCas9 was higher than that of SpCas9 (Fig. 2g, 70.08% vs. 38.36%).

Next, to test the optimal length of FrCas9 guide RNAs, we designed 19–23 bp guide sequences at three sites (Fig. 2h and Supplementary Table 5). All 5 lengths effectively edited the targets, but the highest genome editing efficiencies were achieved with 21–22 bp (Fig. 2h, DNMT1-T3 21 bp = 21.0%, RNF2-T6 21 bp = 54.39%, HEK293 SITE-T2 22 bp = 34.45%). To compare the editing efficiencies of 21 bp and 22 bp guides, we included 5 additional targets and observed that 22 bp showed better editing efficiency than 21 bp at 3 sites (Fig. 2i, GRIN2B-T6, 8.95% vs. 7.3%; DNMT1-T2, 2.9% vs. 1.2%; ANAPC15, 2.1% vs. 1.2%). Therefore, the optimal length of the FrCas9 guide RNA was 22 bp, which was consistent with the small RNA-seq results (Fig. 1d).
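As an illustration of these design rules, the following minimal sketch enumerates candidate FrCas9 target sites as a 22 nt protospacer immediately followed by a 5′-NNTA-3′ PAM, on both strands. The function names and the example sequence are hypothetical, and placing the PAM directly 3′ of the protospacer is an assumption of this sketch rather than the authors' design pipeline.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def find_frcas9_sites(seq, spacer_len=22):
    """List (strand, start, protospacer, pam) for each protospacer + NNTA.

    `start` is 0-based on the strand that was scanned; for the "-" strand
    it is a coordinate on the reverse complement of `seq`.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]{2}TA))" % spacer_len)
    sites = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            sites.append((strand, match.start(), match.group(1), match.group(2)))
    return sites

# Hypothetical 22 nt target followed by a CGTA PAM and two flanking bases.
target = "GATTACAGATTACAGATTACAG"
sites = find_frcas9_sites(target + "CGTA" + "GG")
```

Changing `spacer_len` to any value from 19 to 23 reproduces the other guide lengths tested above.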

The tolerance of mutations in guide sequences is related to the specificity of Cas9 orthologs. We therefore investigated the sgRNA specificity of FrCas9 using 22 sgRNAs, each of which contained a single-nucleotide mutation (Supplementary Fig. 6 and Supplementary Table 5). The results showed that the mutated sgRNAs produced no obvious cleavage, suggesting that FrCas9 has a long seed region and possesses high specificity (Supplementary Fig. 6).

FrCas9 genome editing shows high specificity and activity

Since SpCas9 and FrCas9 both have 2-nucleotide PAMs (SpCas9: 5′-NGG-3′, FrCas9: 5′-NNTA-3′), we compared their genome editing efficiency and specificity at sequences containing 5′-GGTA-3′. Based on this principle, we selected 11 human endogenous sites with a 5′-GGTA-3′ PAM (Supplementary Table 6) and compared the cutting efficiency and off-target effects of SpCas9 and FrCas9 using GUIDE-seq [4].

First, we detected dsODN integration at all 11 sites by dsODN-breakpoint PCR (Supplementary Fig. 7). The GUIDE-seq experiments confirmed that the most frequent locations of dsODN incorporation for both FrCas9 and SpCas9 were the 3rd or 4th base upstream of the PAM (Fig. 3a, FrCas9 3rd: n = 5, 4th: n = 5; SpCas9, 3rd: n = 3, 4th: n = 4). In total, among the 11 sites, only one FrCas9 off-target was detected, at the GRIN2B-T3 site, while 2–3 SpCas9 off-targets per sgRNA were detected at the HEK293 SITE-T2, DYRK1A-T2, GRIN2B-T8 and GRIN2B-T9 sites (Fig. 3d). Importantly, we observed a significantly higher on:off ratio (defined as on:off target reads) for FrCas9 than for SpCas9 at the 11 sites (Fig. 3e, P < 0.05, paired Wilcoxon test). The above results indicated that FrCas9 genome editing showed high specificity and activity. The same trend was also observed in the U2OS cell line (Supplementary Fig. 8).
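The on:off ratio used here is a simple quotient of GUIDE-seq read counts, sketched below. The read counts are hypothetical, and returning infinity when no off-target reads are detected is an assumption of this sketch; the text does not state how such sites were handled.

```python
import math

def on_off_ratio(on_target_reads, off_target_reads):
    """Divide on-target reads by the total reads over all off-target sites."""
    total_off = sum(off_target_reads)
    if total_off == 0:
        # Assumption of this sketch: no detected off-targets -> infinite ratio.
        return math.inf
    return on_target_reads / total_off

# Hypothetical read counts for one sgRNA and two nucleases.
ratio_with_off_targets = on_off_ratio(1500, [300, 200])
ratio_without_off_targets = on_off_ratio(1200, [])
```

Because each site yields one ratio per nuclease, the per-site pairs can then be compared with a paired test, as reported above.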

Fig. 3: The genome-wide specificities of FrCas9 and SpCas9.
a The start mapping positions of GUIDE-seq reads for SpCas9 and FrCas9 targeting GRIN2B-T9. The 1st bases of PAM sequences (red) were position 0 and the most frequent dsODN incorporation sites were colored in red. b The off-targets of SpCas9 and FrCas9 for 11 sites, generated by GUIDE-seq in HEK293T cells. The sgRNA and PAM ranges of SpCas9 (20 nt sgRNA and 3 nt PAM) and FrCas9 (22 nt sgRNA and 4 nt PAM) were marked. GUIDE-seq read counts of each site were shown on the right side. c Summary of GUIDE-seq on-target reads of SpCas9 and FrCas9 at the above 11 sites. d Summary of off-target counts of SpCas9 and FrCas9 at the above 11 sites. e The on:off ratio of GUIDE-seq reads. N = 11 sites. Box plots indicate median (middle line), 25th, 75th percentile (box) and 5th and 95th percentile (whiskers). P = 0.032, two-sided Student’s t-test, * representing P < 0.05. f, g The off-targets of FrCas9, SpCas9, SpCas9-HF1, HiFi-Cas9 and eSpCas9 in DYRK1A-T2 (f) and GRIN2B-T9 (g) detected by GUIDE-seq in HEK293T cell line. h The on:off ratio of FrCas9, SpCas9, SpCas9-HF1, HiFi-Cas9 and eSpCas9 in DYRK1A-T2 and GRIN2B-T9. N = 5 Cas9 nucleases. Box plots indicate median (middle line), 25th, 75th percentile (box) and 5th and 95th percentile (whiskers). The ratio is defined by GUIDE-seq on-target reads dividing total off-target reads. Source data are provided with this paper.

Further, we compared the efficiency and specificity of FrCas9 with those of SpCas9 and its high-fidelity variants (SpCas9-HF1, HiFi-Cas9 and eSpCas9) at the DYRK1A-T2 and GRIN2B-T9 sites. At the DYRK1A-T2 site, the numbers of off-targets were FrCas9 (0), SpCas9 (15), SpCas9-HF1 (2), HiFi-Cas9 (2) and eSpCas9 (1). At the GRIN2B-T9 site, they were FrCas9 (0), SpCas9 (8), SpCas9-HF1 (6), HiFi-Cas9 (12) and eSpCas9 (7) (Fig. 3f, g). As expected, FrCas9 exhibited the highest on:off ratio at both sites (Fig. 3h).

FrCas9 can be used to target HPV genomes

We next set out to test the efficiency and specificity of FrCas9 as a potential tool for targeting HPV genomes [27]. For the HPV 18 genome, we observed nearly 100% target coverage by FrCas9, with sgRNAs spaced 5.65 bp apart on average (Fig. 4a, b), a higher coverage and denser distribution than for SpCas9 (81.17% coverage and a mean sgRNA spacing of 8.32 bp).

Fig. 4: FrCas9 is promising in anti-HPV18 treatments.
a The target coverages of FrCas9 and SpCas9 in the HPV18 genome. b The sgRNA distribution densities for FrCas9 in the HPV18 genome. c The sgRNA distribution densities for SpCas9 in the HPV18 genome. d The distribution in the HPV18 genome of the 19 sites used for GUIDE-seq comparisons between FrCas9 and SpCas9. e The schematic illustration and GUIDE-seq results of SpCas9 and FrCas9 targeting the URR, E6 and E7 genes. The read numbers were normalized. f The GUIDE-seq normalized on-target reads of FrCas9 and SpCas9 at the 19 sites. g The GUIDE-seq off-target counts of FrCas9 and SpCas9 at the 19 sites. h The on-target vs. off-target ratios of FrCas9 and SpCas9 (N = 19 sites, P = 0.0000267, two-sided paired Wilcoxon test, **** representing P < 0.0001). Box plots indicate median (middle line), 25th, 75th percentile (box) and 5th and 95th percentile (whiskers). The ratio is defined as GUIDE-seq on-target reads divided by total off-target reads. i FrCas9-induced cell apoptosis in the HPV18-positive HeLa cell line. Source data are provided with this paper.

To evaluate the efficacy of SpCas9 and FrCas9 in HPV 18, we conducted GUIDE-seq at 19 HPV sites, each of which contained overlapping sgRNAs for SpCas9 and FrCas9 (Fig. 4d, e). All 19 sites showed editing activity with both SpCas9 and FrCas9, and the efficiencies were comparable between the two nucleases (Fig. 4f). SpCas9 sgRNAs had an average of 35.58 off-targets per sgRNA, while FrCas9 sgRNAs had an average of 1.68 off-targets per sgRNA (Fig. 4g). Based on the on/off-target ratios of the GUIDE-seq reads, FrCas9 exhibited high efficiency and specificity in HPV 18 gene editing (Fig. 4h, P < 0.0001, two-sided paired Wilcoxon test).

Further, we selected the sgRNAs targeting HPV URR and E7 to investigate the apoptosis induced by FrCas9 in HPV 18 positive HeLa cell line. Compared to the 13.74% apoptosis rate of the negative controls, FrCas9 with the URR and E7 sgRNAs achieved 26.89% and 36.7% apoptosis rates, respectively (Fig. 4i). Therefore, FrCas9 can be used in targeting HPV 18.

FrCas9 has characteristics for wide applications in genome engineering

Since the 5′-NNTA-3′ PAM of FrCas9 offers distinct targets for correcting human pathogenic variants, we repurposed FrCas9 for base editing. First, the point mutations E796A, H1010A and D1013A were each introduced to generate different FrCas9 nickases (nFrCas9) [28] (Supplementary Fig. 9a). Then, we combined E796A nFrCas9 with the optimized fourth-generation cytidine base editor BE4Gam [29] and the seventh-generation adenine base editor ABE7.10 [30]. We observed that the editing window of FrCas9-BE4Gam spanned the 6th–10th bases and that of FrCas9-ABE7.10 spanned the 6th–8th bases (Supplementary Fig. 9b, c). Based on these characteristics, we calculated the targeting scopes of FrCas9-BE4Gam and FrCas9-ABE7.10 in the ClinVar database [31]. Of the pathogenic mutations that could be precisely corrected by FrCas9-BE4Gam, 90.38% (235/260) were different from those correctable by SpCas9-BE4Gam. Of the pathogenic mutations that could be precisely corrected by FrCas9-ABE7.10, 92.21% (1196/1297) were different from those correctable by SpCas9-ABE7.10 (Fig. 5a). Therefore, the TA-rich PAM greatly expanded the targets in the human genome available to base editors for correcting human disease-associated mutations.
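A minimal sketch of how an editing window restricts the editable bases within a protospacer is given below. It assumes that positions are counted from 1 at the 5′ end of the protospacer, which is a common convention but is not stated in the text; the function name and the example protospacer are hypothetical.

```python
# Editing windows reported above, as inclusive (first, last) positions.
CBE_WINDOW = (6, 10)  # FrCas9-BE4Gam, acts on C
ABE_WINDOW = (6, 8)   # FrCas9-ABE7.10, acts on A

def editable_positions(protospacer, target_base, window):
    """Return 1-based positions of target_base that fall inside the window.

    Assumption of this sketch: position 1 is the 5' end of the protospacer.
    """
    first, last = window
    return [
        position
        for position, base in enumerate(protospacer.upper(), start=1)
        if base == target_base and first <= position <= last
    ]

# Hypothetical 22 nt protospacer.
protospacer = "GATTACAGATTACAGATTACAG"
cbe_sites = editable_positions(protospacer, "C", CBE_WINDOW)
abe_sites = editable_positions(protospacer, "A", ABE_WINDOW)
```

A pathogenic variant would count as precisely correctable in such a scheme only if it is the sole editable base inside the window, which is one plausible reading of the ClinVar calculation rather than a description of the authors' exact criteria.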

Fig. 5: The wide applications of FrCas9 due to its 5′-NNTA-3′ PAM.
a The Venn diagram of pathogenic mutations in the ClinVar database that could be corrected by SpCas9 and FrCas9 base editors. b The C > T transition efficiencies of FrCas9-BE4Gam using 2 “back-to-back” sgRNAs at the same time. The efficiencies were assayed by amplicon sequencing (n = 2 biologically independent replicates). c The 5′-GG-3′ (representing the SpCas9 PAM) and d the 5′-TA-3′ (representing the FrCas9 PAM) distribution in the GRCh38 human genome. e The schematic of FrCas9 targeting the TATA-box of the ABCA1 gene. CRISPRi f and CRISPRa g of FrCas9 and SpCas9 by targeting TATA-boxes. The experiments were conducted in HEK293T cells and expression was quantified by qPCR. Data are presented as mean values ± S.D. (n = 3 biologically independent replicates). The p values of Control vs. FrCas9/TATA-box sgRNAs on ABCA1, UCP3, and RANKL were 0.0010, 0.017, 0.00038, respectively. The p values of Control vs. dFrCas9/TATA-box sgRNAs on ABCA1, UCP3, and RANKL were 0.00024, 0.015, 0.00098, respectively. The p values of Control vs. dFrCas9-VP64 forward sgRNA, dSpCas9-VP64 sgRNA1 vs. dFrCas9-VP64 forward sgRNA and dSpCas9-VP64 sgRNA1 vs. dFrCas9-VP64 reverse sgRNA on ABCA1 were 0.021, 0.026, 0.042, respectively (*P < 0.05, **P < 0.01, ***P < 0.001, two-sided Student’s t-test). Source data are provided with this paper.

Notably, the PAM of FrCas9 (5′-NNTA-3′) is palindromic, which allows sgRNAs to occur in “back-to-back” pairs (Fig. 5b). This feature could broaden the scope of FrCas9 base editors by modifying two nearby editing windows at the same time (Fig. 5b), and it increases the target distribution and density of FrCas9 sgRNAs. We calculated the distributions of 5′-GG-3′ (representing the SpCas9 PAM) and 5′-TA-3′ (representing the FrCas9 PAM) in the human genome (Fig. 5c, d). Compared to SpCas9 (median = 5 bp, mean = 8.66 bp) [5], FrCas9 showed a denser distribution (median = 1 bp, mean = 6.16 bp) in the human genome, providing additional applicable loci.
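The palindrome property and the spacing comparison can be sketched as follows. A dinucleotide is counted on either strand by searching the forward strand for the motif or its reverse complement; 5′-TA-3′ is its own reverse complement, so every occurrence serves both strands. The helper names and the toy sequence are hypothetical, and the exact distance definition behind the reported medians and means is not given in the text, so this is only one plausible way to compute it.

```python
from statistics import mean, median

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def motif_gaps(seq, motif):
    """Distances between consecutive motif occurrences on either strand."""
    seq = seq.upper()
    width = len(motif)
    # A reverse-strand occurrence appears as the reverse complement
    # of the motif on the forward strand.
    wanted = {motif, reverse_complement(motif)}
    starts = [i for i in range(len(seq) - width + 1) if seq[i:i + width] in wanted]
    return [b - a for a, b in zip(starts, starts[1:])]

# Toy sequence (hypothetical).
toy = "TATAGGCCTA"
ta_gaps = motif_gaps(toy, "TA")  # TA at 0, 2, 8
gg_gaps = motif_gaps(toy, "GG")  # GG at 4, CC at 6

ta_summary = (median(ta_gaps), mean(ta_gaps))
```

Applied to a full genome sequence, the same gap lists give median and mean spacings of the kind quoted above.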

Interestingly, the palindromic TA PAM also occurs at multiple positions within the TATA box (Fig. 5e), which is a crucial promoter element in eukaryotic organisms [32,33]. We tested FrCas9 CRISPR interference (CRISPRi) on three genes with TATA-box promoters, ABCA1, UCP3 and RANKL (Fig. 5e). By cleaving the TATA-box, FrCas9 reduced ABCA1, UCP3 and RANKL expression by 31.37%, 49.91% and 39.62%, respectively. Meanwhile, dFrCas9 binding directly to the TATA-box decreased the expression of ABCA1, UCP3 and RANKL by 61.67%, 45.61% and 42.60%, respectively (Fig. 5f). Together, FrCas9 possesses potential for efficient genome engineering of TATA-box related genetic diseases.

Further, we tested FrCas9 CRISPR activation (CRISPRa) using dFrCas9-VP64 directly targeting the TATA-box, and compared its performance with dSpCas9-VP64 targeting the region upstream of the TATA-box. The CRISPRa experiments were conducted on the ABCA1, SOD1, GH1 and MBL2 genes. The results showed that dFrCas9-VP64 enabled effective transcriptional activation. Moreover, the fold activation by dFrCas9-VP64 of ABCA1, GH1 and MBL2 was higher than that by dSpCas9-VP64, while the fold activation of the SOD1 gene was comparable to that by dSpCas9-VP64 (Fig. 5g). Therefore, FrCas9 is a promising tool for CRISPR screening due to its 5′-NNTA-3′ PAM.
