
Characterization and engineering of Streptomyces griseofuscus DSM 40191 as a potential host for heterologous expression of biosynthetic gene clusters

Genome mining and comparative analysis

Streptomyces griseofuscus DSM 40191 (= NRRL B-5429) was received from the German Collection of Microorganisms and Cell Cultures. We re-sequenced and assembled the genome of S. griseofuscus de novo33. It consists of a linear 8,721,740 bp chromosome and three plasmids: pSGRIFU1 (220 kb), pSGRIFU2 (88 kb), and pSGRIFU3 (86 kb) (Fig. 1).

Figure 1

Overview of S. griseofuscus strains and mutations. The positions of genomic inverted repeats are highlighted in blue, BGCs in black and transposases in green. The positions of the identified mutations are highlighted in red. The mutations detected in the wild type Illumina dataset can be considered technical noise. The position of the STOP-codon introduced with CRISPR-cBEST is indicated with a black triangle. The strains p057_0D and p057_20D relate to the long-term cultivation experiment, in which the CRISPR-cBEST generated strain S. griseofuscus HEP81_06602 (p057_0D), which contains an introduced STOP-codon in BGC 30, was transferred 20 consecutive times in liquid ISP2 medium without selective pressure, generating strain p057_20D. The alignment was created with CLC Genomics Workbench 12.0.3 https://digitalinsights.qiagen.com/ and visualised with Adobe Illustrator 23.0.6 https://www.adobe.com/products/illustrator.html.

Whole genome comparison between S. griseofuscus DSM 40191, S. coelicolor A3(2) and S. venezuelae ATCC 10712

We analyzed how much genomic content is shared between S. griseofuscus DSM 40191 and the well-studied model Streptomyces strains S. coelicolor A3(2) and S. venezuelae ATCC 10712. Their genomes were downloaded from NCBI (accession IDs NC_003888 and NZ_CP029197) and compared to S. griseofuscus DSM 40191 by calculating bidirectional best blastp hits between the genes34. We found that 3918 genes were shared across all three genomes (core genome), whereas S. griseofuscus shared an additional 937 and 522 genes with S. coelicolor and S. venezuelae, respectively (Fig. 2A, Supplementary Data 1). The total number of genes present across the three strains (pangenome set) was 13415. To understand the biological functions of the shared genes, we annotated all three genomes with the KEGG biological subsystems35. For the genome of S. griseofuscus, we found 2842 orthologous genes in KEGG, which are involved in 1498 KEGG reactions. The genomes of S. coelicolor and S. venezuelae contained 8152 and 7112 genes, which map to 2900 and 2670 KEGG gene IDs and to 1475 and 1452 KEGG reaction IDs, respectively (Supplementary Data 1). After comparing the KEGG gene and reaction content, we found that the three genomes shared 1488 common KEGG gene IDs and 1280 reactions. Next, we compared the distribution of the number of genes and the number of reactions belonging to different KEGG pathways across the three organisms (Fig. 2C,D). We found that the number of genes involved in subsystems such as membrane transport, signalling and cellular processes was significantly lower in S. griseofuscus than in the other two strains. However, the number of genes involved in subsystems such as the metabolism of terpenoids and polyketides, genetic information processing, amino acid metabolism, and xenobiotics biodegradation and metabolism was significantly larger in S. griseofuscus (Fig. 2C).
Comparison of the number of reactions across different metabolic pathways showed that the three strains shared very similar metabolism. However, the number of reactions involved in some pathways, such as amino acid metabolism, was still larger in S. griseofuscus (Fig. 2D). Overall, we found that S. griseofuscus and S. coelicolor had a larger genomic and metabolic content than S. venezuelae. Additionally, a large part of the genomic and metabolic content was shared between S. griseofuscus and S. coelicolor. In a later section, we compare the phenotype microarray data of these strains to gain an experimental understanding of their metabolic growth capabilities.
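The bidirectional best hit criterion used for the gene comparison can be illustrated with a minimal Python sketch. It assumes that blastp results have already been reduced to (query, subject, bit score) tuples; the gene names and scores below are invented for illustration, and the study's actual pipeline and thresholds may differ.

```python
from typing import Dict, List, Tuple

# A BLAST hit in simplified form: (query_id, subject_id, bit_score).
Hit = Tuple[str, str, float]


def best_hits(hits: List[Hit]) -> Dict[str, str]:
    """Keep, for every query gene, the subject with the highest bit score."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(a_vs_b: List[Hit], b_vs_a: List[Hit]) -> List[Tuple[str, str]]:
    """Return gene pairs (a, b) in which each gene is the other's best hit."""
    a_best = best_hits(a_vs_b)
    b_best = best_hits(b_vs_a)
    return sorted((a, b) for a, b in a_best.items() if b_best.get(b) == a)


# Toy hit tables with hypothetical gene names and scores.
sgri_vs_scoe = [
    ("sgri_1", "scoe_1", 850.0),
    ("sgri_1", "scoe_2", 120.0),
    ("sgri_2", "scoe_2", 400.0),
    ("sgri_3", "scoe_2", 390.0),
]
scoe_vs_sgri = [
    ("scoe_1", "sgri_1", 845.0),
    ("scoe_2", "sgri_2", 402.0),
]

pairs = bidirectional_best_hits(sgri_vs_scoe, scoe_vs_sgri)
print(pairs)
```

In this toy example sgri_3 is not paired, because its best hit scoe_2 points back to sgri_2. Genes paired in all pairwise comparisons would contribute to a core genome count.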

Figure 2

Genome and phenotype microarray comparison of three strains. (A) Number of genes shared among the three organisms—S. griseofuscus (sgri), S. coelicolor (scoe), S. venezuelae (sven). (B) Phenotype microarray data represented by activity index rings generated using DuctApe across 4 biolog plates consisting of 379 carbon, nitrogen, phosphate and sulphate nutrient sources across the three organisms. (C) Distribution of the number of genes per KEGG subsystem across the three organisms calculated using KEGG Automatic Annotation Server35. 1—Protein families: signalling and cellular processes; 2—Protein families: genetic information processing; 3—Protein families: metabolism; 4—Carbohydrate metabolism; 5—Amino acid metabolism; 6—Metabolism of cofactors and vitamins; 7—Unclassified: metabolism; 8—Energy metabolism; 9—Poorly characterized; 10—Membrane transport; 11—Lipid metabolism; 12—Signal transduction; 13—Metabolism of terpenoids and polyketides; 14—Biosynthesis of other secondary metabolites, 15—Nucleotide metabolism; 16—Xenobiotics biodegradation and metabolism; 17—Translation; 18—Cellular community—prokaryotes; 19—Metabolism of other amino acids; 20—Replication and repair; 21—Folding, sorting and degradation; 22—Glycan biosynthesis and metabolism; 23—Unclassified: signalling and cellular processes; 24—Unclassified: genetic information processing; 25—Cell growth and death; 26—Transport and catabolism; 27—Drug resistance: antimicrobial; 28—Environmental adaptation; 29—Transcription. For further details on this dataset, please see Supplementary Data 1. (D) Distribution of the number of reactions per KEGG metabolic pathway across the three organisms. 
(a) Carbohydrate metabolism; (b) Amino acid metabolism; (c) Metabolism of cofactors and vitamins, (d) Lipid metabolism, (e) Nucleotide metabolism, (f) Metabolism of terpenoids and polyketides, (g) Energy metabolism, (h) Biosynthesis of other secondary metabolites, (i) Xenobiotics biodegradation and metabolism, (j) Metabolism of other amino acids, (k) Glycan biosynthesis and metabolism, (l) Translation, (m) Not included in regular maps, (n) Signal transduction. (E) Heatmap with activity index of different KEGG nutrients (y-axis) against the KEGG pathway maps (x-axis). (F) Confusion matrix representing a genome-scale model based prediction and the observed growth phenotypes of three organisms. (B) was generated using DuctApe36, whereas the remaining figures were generated in this study using Python scripts and assembled using Adobe Illustrator 23 https://www.adobe.com/products/illustrator.html.

Phenotype characterization

Comparison of physiological features of S. griseofuscus, S. coelicolor and S. venezuelae using BioLog microarrays

To characterize the phenotype of S. griseofuscus and its ability to utilize different substrates, we conducted multiple parallel cultivations using BioLog microarrays. This technology is not easily applicable to actinobacterial strains because they form “clumps” of mycelia. However, the simple growth characteristics of S. griseofuscus make such studies possible. As a direct comparison, we used the well-studied heterologous hosts S. coelicolor and S. venezuelae. Previously, parallel micro-scale cultivations were used for the characterization of the industrially important S. lividans TK2437.

We tested a total of 379 substrates, including 190 different carbon sources (PM1 and PM2), 95 nitrogen sources (PM3), and 94 phosphate and sulphur sources (PM4). The kinetic growth data from Biolog were analyzed together with the genomes using the DuctApe software36, which correlates genomic and phenomic data based on the KEGG metabolic pathways. An activity index from 0 to 9 was used to represent the growth on each substrate, and an activity index higher than 3 was used as the cutoff to define growth. We found that 171, 172 and 117 of the 379 substrates were utilized by S. griseofuscus, S. coelicolor and S. venezuelae, respectively (Fig. 2B, Supplementary Data 2). Comparing their growth, we found that 90 substrates were utilized by all three strains, whereas 14, 19 and 7 substrates were utilized uniquely by S. griseofuscus, S. coelicolor and S. venezuelae, respectively. The substrates uniquely utilized by S. griseofuscus include ethanolamine, 2-aminoethanol, cytidine, thymidine, d-serine and d-threonine. Additionally, we found that S. griseofuscus shared a total of 145 growth substrates with S. coelicolor, indicating high metabolic similarity between the two strains.
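The growth call and the shared and unique substrate counts described above can be sketched in a few lines of Python. The activity values below are hypothetical and are not taken from Supplementary Data 2; only the cutoff (activity index higher than 3) comes from the text.

```python
GROWTH_CUTOFF = 3  # growth is called when the activity index is higher than 3

# Hypothetical activity indices (0 to 9) per substrate; not values from the study.
activity = {
    "sgri": {"D-glucose": 7, "D-serine": 5, "L-arabinose": 2, "thymidine": 4},
    "scoe": {"D-glucose": 8, "D-serine": 1, "L-arabinose": 6, "thymidine": 3},
    "sven": {"D-glucose": 6, "D-serine": 0, "L-arabinose": 1, "thymidine": 2},
}


def utilized(indices, cutoff=GROWTH_CUTOFF):
    """Return the set of substrates whose activity index exceeds the cutoff."""
    return {substrate for substrate, value in indices.items() if value > cutoff}


used = {strain: utilized(indices) for strain, indices in activity.items()}

# Substrates utilized by all strains.
common = set.intersection(*used.values())

# Substrates utilized by exactly one strain.
unique = {}
for strain, substrates in used.items():
    others = set().union(*(s for name, s in used.items() if name != strain))
    unique[strain] = substrates - others

print(sorted(common))
print({strain: sorted(subs) for strain, subs in unique.items()})
```

Note that an index of exactly 3 (thymidine in the toy scoe data) does not count as growth under this cutoff.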

Next, we analyzed the growth on substrates from different nutrient source categories. We observed that a total of 72 carbon sources were utilized by S. coelicolor, which was higher than the number of carbon sources used by S. griseofuscus (64) and S. venezuelae (61). In particular, S. coelicolor could utilize more substrates involved in carbohydrate metabolism. Carbon sources uniquely utilized by S. griseofuscus include 2-aminoethanol, alpha-keto-valeric acid and D-malic acid. In contrast, the number of nitrogen sources utilized was higher in S. griseofuscus (60) than in S. coelicolor (52) and S. venezuelae (49). This could be primarily attributed to the categories of amino acid metabolism and other non-defined classes of metabolism. Unique nitrogen sources utilized by S. griseofuscus include l-phenylalanine, d-serine and ethanolamine. We found that S. venezuelae could only use 6 of the phosphate sources, a number that was substantially lower than in both S. griseofuscus (47) and S. coelicolor (46). Phosphate sources uniquely utilized by S. griseofuscus included 2-aminoethyl phosphonic acid and dithiophosphate. In general, we observe that the capability of S. griseofuscus to utilize different nutrient source categories is much higher than that of S. venezuelae, and is similar to, or even higher than, that of S. coelicolor. Knowledge of the ability of S. griseofuscus to grow on different nutrient sources can guide the design of growth media and thus help to optimize growth and metabolite production.

To investigate the connection between these growth activity profiles and the genomic diversity of the strains, a matrix was generated using the dape module of DuctApe, where the activity on different nutrients (rows) that are part of different KEGG pathways (columns) is highlighted (Fig. 2E, Supplementary Data 3). For example, the average growth activity indices of all the nitrogen source nutrients belonging to the KEGG pathway of the biosynthesis of amino acids (map:01230) were 6.51, 5.96 and 4.18 for S. griseofuscus, S. coelicolor and S. venezuelae, respectively. In particular, the thiamine metabolism pathway (map:00730) showed a higher average growth index on nitrogen nutrients in S. griseofuscus (7.11) than in S. coelicolor (5.89) and S. venezuelae (5.78). Overall, the growth activity heatmaps of nutrients vs the KEGG pathways were similar in S. griseofuscus and S. coelicolor, whereas S. venezuelae was found to have lower growth activity across nutrients from different pathways. The higher genomic similarity between S. griseofuscus and S. coelicolor that was observed in the previous section is consistent with this phenomic similarity. In addition to this genome to phenome comparison based on the KEGG pathways, we used genome-scale metabolic models to compare the in-silico predicted growth against observed phenotypes across different substrates (Fig. 2F). We reconstructed draft genome-scale metabolic models for S. griseofuscus and S. venezuelae based on homology comparison against the genome-scale model of S. coelicolor (Supplementary Data 4). The models also predicted growth on a larger number of nutrients for S. griseofuscus and S. coelicolor than for S. venezuelae (Supplementary Data 5). Thus, we conclude that S. griseofuscus possesses metabolic capabilities that are very similar, or even superior, to those of well-studied Streptomyces strains.
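A confusion matrix of the kind shown in Fig. 2F compares predicted and observed growth substrate by substrate. The sketch below is a minimal illustration with invented predictions and observations; it is not the study's script, and the substrate names are placeholders.

```python
from collections import Counter
from typing import Dict


def confusion_counts(predicted: Dict[str, bool], observed: Dict[str, bool]) -> Counter:
    """Count TP, FP, FN and TN over substrates present in both inputs."""
    counts = Counter({"TP": 0, "FP": 0, "FN": 0, "TN": 0})
    for substrate in predicted.keys() & observed.keys():
        p, o = predicted[substrate], observed[substrate]
        if p and o:
            counts["TP"] += 1   # model predicts growth, growth observed
        elif p and not o:
            counts["FP"] += 1   # model predicts growth, none observed
        elif not p and o:
            counts["FN"] += 1   # model predicts no growth, growth observed
        else:
            counts["TN"] += 1   # model and experiment agree on no growth
    return counts


# Hypothetical model predictions and phenotype calls for one strain.
predicted_growth = {"D-glucose": True, "D-serine": False, "L-arabinose": True, "thymidine": False}
observed_growth = {"D-glucose": True, "D-serine": True, "L-arabinose": False, "thymidine": False}

counts = confusion_counts(predicted_growth, observed_growth)
accuracy = (counts["TP"] + counts["TN"]) / sum(counts.values())
print(dict(counts), accuracy)
```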

Secondary metabolite potential of Streptomyces griseofuscus

Analysis of the genome using antiSMASH and BiG-SCAPE

To estimate the capability of the strain to synthesize secondary metabolites, it is important to characterize the BGCs present in the genome. We therefore carried out a genome mining analysis using antiSMASH3. We detected 35 BGC regions encoding different types of secondary metabolites on the chromosome. These regions can be split into 53 candidate clusters. The megaplasmid pSGRIFU1 (CP051007) includes one antiSMASH-predicted region with seven candidate clusters. No BGCs were detected on pSGRIFU2 and pSGRIFU3. We observed that the genome of S. griseofuscus harbored 4 NRPS, 3 PKSI, 5 PKS-NRPS hybrids, 4 other PKS types, 4 terpenes, 4 RiPPs and 11 other types of BGCs as defined by antiSMASH (Supplementary Table S1). Only a few of these BGCs putatively code for known secondary metabolites, such as hopene, geosmin, spore pigment, desferrioxamine B, ectoine and pentamycin. Further, two of the candidate clusters from the plasmid showed similarity to the known BGCs encoding lankamycin and lankacidin C. The remaining 29 BGCs code for unknown and potentially novel secondary metabolites (Supplementary Table S1).

To investigate whether the BGCs in S. griseofuscus are also detected in other Streptomyces genomes, we carried out a BGC similarity analysis involving a dataset of 212 publicly available complete high-quality Streptomyces genomes. In total, 6380 BGCs of different types were detected across this dataset of genomes. We generated a similarity network of 35 regions, 12 manually selected candidate clusters from S. griseofuscus, 6380 BGCs from public genomes and 1808 known BGCs from the MIBIG database40 using BiG-SCAPE38. The network with a raw_distance cutoff of 0.3 was further analyzed using Cytoscape39 (Fig. 3). All BGC families that did not include one of the BGCs from S. griseofuscus were ignored in the subsequent analyses (Supplementary Data 6). We found that only one BGC (region 14) of the NRPS-like type was a singleton in the network, uniquely observed in S. griseofuscus. We observed that 8 of the BGCs were exclusively present in one other genome, namely S. rochei 7434AN4 (GenBank Accn.: AP018517). A further 9 BGCs were present in both S. rochei 7434AN4 and Streptomyces sp. endophyte_N2 (GenBank Accn.: CP028719). This suggests that these 17 BGCs from S. griseofuscus are rarely observed across streptomycetes. Among the BGCs that are relatively common across the dataset, we found that candidate cluster 50 of region 33 had similar BGCs across 18 other streptomycetes, including S. collinus Tü 365 (Supplementary Fig. S3), whereas candidate cluster 51 was similar to the known cluster encoding pentamycin. Overall, we established that most of the BGCs (33) were also present in S. rochei 7434AN4, indicating that the two genomes have highly similar content.

Figure 3

Similarity network of BGCs in S. griseofuscus against 212 public genomes and the MIBIG database of known BGCs. Comparison of all the regions and a few selected candidate clusters against BGCs of known compounds from the MIBIG database and 212 public genomes. Different colors denote different types of BGCs as shown in the legend. BGCs from S. griseofuscus, S. rochei 7434AN4, Streptomyces sp. endophyte_N2 and the MIBIG database are shown with different shapes and sizes. All regions, selected candidate clusters and known BGCs are annotated by text. The similarity network was generated using BiG-SCAPE38 and visualised using Cytoscape39. A detailed comparison of selected BGCs using CORASON38 can be found in Supplementary Fig. S3.

This similarity led us to examine the relation of S. griseofuscus to other strains of its species and to S. rochei strains. Currently, there are 4 complete assemblies of S. griseofuscus and 3 of S. rochei genomes available in the NCBI database. Among these are the aforementioned S. rochei 7434AN4 and the type strain S. rochei NRRL B-2410. By calculating the pairwise Average Nucleotide Identity (ANI) between all genomes41, we found the ANI between S. griseofuscus DSM 40191 and S. rochei 7434AN4 to be 99.54%, while the ANI between S. rochei 7434AN4 and the type strain S. rochei NRRL B-2410 is 84.03%, very close to the value between S. rochei NRRL B-2410 and S. griseofuscus DSM 40191 (83.95%). This strongly suggests that S. rochei 7434AN4 was misclassified and is in fact a S. griseofuscus strain. This may explain the large number of similar BGCs shared between S. griseofuscus DSM 40191 and S. rochei 7434AN4 and the high level of similarity between the largest plasmids of the two strains, pSGRIFU1 and pSLA2-L.
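The reasoning behind the reclassification can be made explicit with a short Python sketch that applies a species-level ANI cutoff to the three values quoted above. The 95% cutoff is an assumption here: it is a commonly used boundary for prokaryotic species, but the text does not state which threshold, if any, was applied.

```python
# Assumption: 95% ANI as the species boundary; this value is not given in the text.
SPECIES_ANI_CUTOFF = 95.0

# Pairwise ANI values (%) quoted in the text.
ani = {
    ("S. griseofuscus DSM 40191", "S. rochei 7434AN4"): 99.54,
    ("S. rochei 7434AN4", "S. rochei NRRL B-2410"): 84.03,
    ("S. rochei NRRL B-2410", "S. griseofuscus DSM 40191"): 83.95,
}


def same_species(value: float, cutoff: float = SPECIES_ANI_CUTOFF) -> bool:
    """Return True when the ANI value meets or exceeds the species cutoff."""
    return value >= cutoff


calls = {pair: same_species(value) for pair, value in ani.items()}
for (genome_a, genome_b), is_same in calls.items():
    label = "same species" if is_same else "different species"
    print(f"{genome_a} vs {genome_b}: {label}")
```

Under this assumed cutoff, only the pair S. griseofuscus DSM 40191 and S. rochei 7434AN4 falls within one species, which matches the conclusion drawn in the text.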

Characterization of secondary metabolites produced by S. griseofuscus

In the genome mining study, we identified several known BGCs that were studied in other strains. Among these are the lankacidin and lankamycin BGCs, encoded on plasmid pSGRIFU1 and previously studied in S. rochei42,43, as well as the pentamycin BGC3344. Because these BGCs are well preserved in the genome, we expected to detect the production of lankamycin, lankacidin-related compounds and pentamycin. In addition, it was reported previously that some strains of S. griseofuscus are able to produce azinomycins A and B45, the acetylcholine esterase inhibitor physostigmine46, ϵ-poly-l-lysine28,29 and lankacidin C and A47. To check the production in S. griseofuscus DSM 40191, we performed exploratory cultivations in 5 different liquid media (ISP2, MAM and CDMZ medium46, minimal medium (MM)48, and medium 65)49, which were described in the literature for the production of the compounds mentioned above, and attempted to identify these compounds in the extracts. Lankacidin A, C, and lankamycin were tentatively identified by HR-MS. The production of pentamycin was identified by HR-MS and confirmed using the pentamycin standard (Supplementary Fig. S4). Physostigmine was not detected in any of the conditions (Supplementary Table S2).

We noted the production of hydrophobic extracellular vesicles by S. griseofuscus, a widespread but poorly studied phenomenon among Actinobacteria50. It is known that extracellular vesicles might contain secondary metabolites51,52. To study the profile of the extracellular vesicles of S. griseofuscus, they were collected and directly injected for LC–MS measurements. Among many compounds, we tentatively identified lankamycin, which was previously detected in the culture extracts.

Development of genetic engineering methods

Even though transformation, conjugation and protoplast generation were established for S. griseofuscus, including attempts at genetic engineering31,32,53, the strain was never systematically tested with different vectors and engineering methods. When generating a heterologous host strain, it is important to have access to techniques that yield knockouts quickly and cause the fewest off-target modifications.

Transfer of integrative and replicative GusA-based vectors

As a first step, we tested whether S. griseofuscus is compatible with the gusA reporter system plasmids54: pSETGUS, an integrative phiC31-based plasmid, and pKG1139, a replicative plasmid. Both plasmids were successfully conjugated into S. griseofuscus and allowed for visual screening of the exconjugant colonies (Supplementary Fig. S5). To determine the position of the pSETGUS integration site, which is important for rationally using it to integrate desirable elements, we randomly picked three independent S. griseofuscus pSETGUS colonies and sequenced them using Oxford Nanopore sequencing, similarly to Gren et al.55. The exact location of the integration site is at 4,242,328 bp in the HEP81_03793 gene, coding for a putative chromosome condensation protein. The determined attB site of S. griseofuscus contains the conserved core “TT” sequence56.

CRISPR-Cas9 mediated gene knockout

CRISPR-Cas9-based molecular tools offer precision and ease of handling in comparison to other techniques. In recent years, CRISPR tools have been adapted for use in streptomycetes57. As the introduction of double strand breaks can lead to rearrangements and off-target effects in the genome, we validated various CRISPR-Cas9-based engineering methods by targeting genes on the chromosome and on one of the plasmids. For this purpose, we used a pGM1190-based CRISPR-Cas9 plasmid58, which carries a temperature-sensitive replicon that was shown to be functional in S. griseofuscus by using the GusA-based pKG1139.

As a first target, we wanted to eliminate plasmid pSGRIFU1, which harbours 4 BGCs: the lankacidin, the lankamycin, a cryptic polyketide and the carotenoid BGC. This plasmid has a very high similarity to the plasmid pSLA2-L of S. rochei, where these clusters were characterized43. A sgRNA was designed to target the DNA primase/helicase-coding region, which is essential for plasmid replication. Three random colonies were selected after the CRISPR procedure and sequenced via Illumina whole genome sequencing. Surprisingly, in all clones, both the targeted pSGRIFU1 and pSGRIFU2 were lost, leaving only plasmid pSGRIFU3 present in the genome. To estimate the number of changes in the plasmid-cured strains in comparison to the wild type genome, the WGS data were analyzed with breseq, which identified 11 mutations (six SNVs, three insertions, and two deletions) (Fig. 1). One of the colonies was selected for further work and named DEL1.

In parallel, we attempted to knock out the chromosomally located BGC region number 33, which encodes a putative pentamycin BGC and an uncharacterized NRPS BGC (Supplementary Fig. S3). The conjugation of the knockout plasmid resulted in fewer than 10 colonies, 2 of which were selected for Illumina MiSeq sequencing. Sequencing revealed that even though both clones accumulated several mutations, they did not contain the intended mutation (data not shown). Even after the experiment was repeated, we were not able to select knockout-carrying colonies.

To verify whether the deletion of the pentamycin-NRPS clusters is possible in the plasmid-cured background, the knockout plasmid was transferred to DEL1. In contrast to the experiments with the wild type, a large number of exconjugants was obtained. After plasmid curing, three independently obtained colonies were sequenced with Illumina NextSeq and one of them was additionally sequenced using Nanopore technology. This clone, further referred to as DEL2, was confirmed to contain a full deletion of the pentamycin-NRPS cluster region and contained a comparatively small number of mutations (Fig. 1).

In the strain S. rochei 7434AN4, which is closely related to S. griseofuscus, curing of all three plasmids has been reported to change the topology of the chromosome from linear to circular43,59. It is believed that the tap-tpg gene pair, which encodes a telomere-associated protein and a terminal protein for end patching and is located on both the pSLA2-L and pSLA2-M plasmids, is responsible for maintaining the linear architecture of the chromosome. Because both the genomes and the associated plasmids in S. rochei and S. griseofuscus are similar, we investigated whether the chromosome of S. griseofuscus had circularized during the plasmid curing. We therefore sequenced the strain DEL2 with the Nanopore technology. The assembly graph clearly showed a chromosome with inverted repeats, consistent with a linear chromosome. To verify the presence of tap-tpg homologues in the genome of S. griseofuscus, a BLAST search was performed against each gene pair from pSLA2-L and pSLA2-M. The homologues of tapR1-tpgR1 and tapRM-tpgRM were found on all three plasmids of S. griseofuscus, but not on its chromosome (Supplementary Table S4). This could explain the preserved linear topology of the DEL2 chromosome: the removal of the putative tap/tpg homologues on pSGRIFU1 and pSGRIFU2 may be complemented by the remaining homologous genes present on pSGRIFU3.

Neither the DEL1 nor the DEL2 strain showed any significant changes in morphology, growth or sporulation (Supplementary Figs. S1, S2, Supplementary Table S3). To verify the influence of the genetic manipulations on the metabolites produced by DEL1 and DEL2, parallel cultivations in ISP2 medium were performed. In comparison to the wild type, strain DEL2 lost the ability to produce pentamycin, lankacidins and lankamycin, as expected (Supplementary Fig. S4).

To test whether S. griseofuscus is suitable for the expression of heterologous BGCs, we expressed the S. coelicolor actinorhodin BGC in the wild type and DEL2. As evident from the formation of a dark-blue halo, the wild type and DEL2 strains both appear to be able to produce actinorhodin heterologously; however, further tests are required to prove this unequivocally (Supplementary Fig. S6).

CRISPR-cBEST mediated knockouts

The CRISPR-cBEST system60 utilizes a cytidine deaminase fused to dCas9 and allows for the introduction of STOP-codons by converting C:G base pairs to T:A. Recently, we have reported the use of this system in S. griseofuscus60. To test the usability of CRISPR-cBEST for engineering of S. griseofuscus, the targeted BGCs were selected on the so-called “arm” regions of the chromosome60. It is known that the introduction of DNA double strand breaks by Cas9 might lead to multiple unwanted consequences and is particularly dangerous at the ends of the chromosome61. Therefore, BGCs in these regions are particularly difficult to engineer. To verify whether the CRISPR-cBEST system would help to overcome these limitations, the targets were selected in 4 different BGC-containing regions, numbers 4, 30, 31 and 34, on the right and left arms of the chromosome. The pCRISPR-CBE plasmids were constructed according to the protocol62, sequenced and transferred to S. griseofuscus via conjugation. Correct clones with the STOP-codons in BGC regions 4, 30, 31 and 34 were confirmed via Sanger sequencing of the region of interest60. To determine the outcomes of each mutation, the morphology, growth and metabolite production were assessed and individually described (Supplementary Figs. S1, S2, Supplementary Table S3). We grew all of the CRISPR-cBEST generated mutants in ISP2 liquid medium and compared their production profiles to the wild type (data not shown). In the initial tests, we were not able to identify specific metabolites produced from each of these BGCs, possibly because the production conditions for these metabolites were not met, or because the BGCs are cryptic.
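The way a C to T conversion creates a STOP-codon can be shown with a small Python sketch that scans a coding sequence for convertible codons. It is a simplification under stated assumptions: it ignores PAM availability, the editing window and sequence context, all of which constrain real sgRNA design, and the sequence below is a made-up toy example.

```python
# Codons that a C-to-T change on the coding strand turns into a stop codon.
CODING_STRAND_TARGETS = {"CAA": ("TAA",), "CAG": ("TAG",), "CGA": ("TGA",)}

# TGG (Trp) becomes a stop when a C on the opposite strand is deaminated,
# which reads as a G-to-A change on the coding strand.
OPPOSITE_STRAND_TARGETS = {"TGG": ("TAG", "TGA", "TAA")}


def stop_candidates(cds: str):
    """Return (codon_number, codon, possible_stops) for each convertible codon."""
    cds = cds.upper()
    candidates = []
    usable_length = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    for start in range(0, usable_length, 3):
        codon = cds[start:start + 3]
        number = start // 3 + 1  # 1-based codon position
        if codon in CODING_STRAND_TARGETS:
            candidates.append((number, codon, CODING_STRAND_TARGETS[codon]))
        elif codon in OPPOSITE_STRAND_TARGETS:
            candidates.append((number, codon, OPPOSITE_STRAND_TARGETS[codon]))
    return candidates


# Hypothetical toy coding sequence: ATG TGG CAA GGC CGA TAA.
toy_cds = "ATGTGGCAAGGCCGATAA"
hits = stop_candidates(toy_cds)
for number, codon, stops in hits:
    print(number, codon, "->", "/".join(stops))
```

The Trp221Stop mutation reported for HEP81_06602 is an instance of the second case, in which a tryptophan codon is converted to a stop codon.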

It has been shown that multiplexed CRISPR-cBEST plasmids can be used to target multiple genes from different BGCs in S. coelicolor60. Therefore, our next aim was to verify this possibility in S. griseofuscus. For this purpose, a multiplex plasmid was constructed, targeting 4 BGCs on the left arm of the chromosome. The sgRNA guides selected earlier were used, yielding plasmid pCRISPR-MCBE-1-2-4-6, targeting BGC region 1 (gene HEP81_00133), BGC region 2 (gene HEP81_00319), BGC region 4 (gene HEP81_00378) and BGC region 6 (gene HEP81_00485). The plasmid was verified via Sanger sequencing and transferred to S. griseofuscus via conjugation. Up to 24 exconjugant colonies were tested via PCR. Each of the targeted regions was amplified using a selected set of primers, and the fragments were purified and sequenced by Sanger sequencing. As a result of the screening, at least one successful editing event was detected for each of the targeted regions. We were able to select a colony of S. griseofuscus pCRISPR-MCBE-1-2-4-6 with a total of three edited targets (E3I2) (Supplementary Table S5). Strain E3I2 exhibited signs of sporulation deficiencies and changes in morphology, which might be related to the specific combination of the mutations that were introduced (Supplementary Fig. S1). However, the growth of this strain was clearly not inhibited in liquid cultures (Supplementary Fig. S2, Supplementary Table S3). In addition, the metabolite biosynthesis profile of E3I2 was examined in ISP2 liquid medium (data not shown). We were not able to identify specific metabolites linked to the inactivated BGCs, probably because the conditions for the production of these metabolites were not met or because these particular BGCs were not expressed.

One of the significant problems of CRISPR-Cas9-mediated targeting is unwanted off-target effects. Similar problems exist when using CRISPR-BEST systems. It was shown that when CRISPR-BEST is used for the generation of knockouts in S. coelicolor, a relatively small number of mutations can be observed60. However, the influence of the presence of a CRISPR-BEST plasmid on the accumulation of mutations over time during continuous cultivation was never studied.

To study these effects, we performed a long-term cultivation experiment with the CRISPR-cBEST generated mutant strain S. griseofuscus HEP81_06602 (p057). The initial and resulting strains were sequenced using Illumina NextSeq and compared to the wild type strain using breseq analysis (Figs. 1, 4). Notably, the introduced Trp221Stop mutation in the putative gene HEP81_06602 (BGC 30) was maintained even after 20 transfers without antibiotic pressure (Fig. 1).

Figure 4

Genome-wide off-target evaluation of CRISPR-BEST mediated mutations in the strain preserved and sequenced after the introduction of the STOP-codon (p057_0D; 2) in comparison to the same strain after 20 consecutive passages in liquid culture (p057_20D; 3). Mutations predicted for the WT Illumina dataset (1), which was used to produce the reference, are at a level that can be considered technical noise. This figure was made using the online illustrator draw.io (https://app.diagrams.net/).

The breseq analysis of the Illumina-generated reads of the wild type genome revealed 3 single-base pair mutations, which can be considered a technical baseline. In the case of strain p057_0D, this number increased to 33 mutations altogether, the majority of them being C to T exchanges, which can be putatively attributed to nonspecific activity of the CRISPR-BEST cytidine deaminase. After 20 consecutive transfers, this number increased to 50 mutations, the majority of them being C to T and A to G exchanges. Both numbers fall within the range previously reported for S. coelicolor60 and are promising for the engineering of S. griseofuscus.
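Classifying the detected substitutions by type, as done above for the C to T exchanges, amounts to a simple tally. The sketch below uses an invented list of (reference, alternative) base pairs, not the study's breseq output, and counts G to A together with C to T because a deamination on the opposite strand appears as G to A on the reference strand.

```python
from collections import Counter
from typing import Iterable, Tuple


def substitution_spectrum(snvs: Iterable[Tuple[str, str]]) -> Counter:
    """Tally single-nucleotide substitutions as 'REF>ALT' strings."""
    return Counter(f"{ref.upper()}>{alt.upper()}" for ref, alt in snvs)


# Hypothetical substitutions as (reference base, alternative base) pairs.
toy_snvs = [("C", "T"), ("C", "T"), ("G", "A"), ("A", "G"), ("C", "T")]

spectrum = substitution_spectrum(toy_snvs)

# C>T on either strand: C>T on the reference strand or G>A (C>T on the opposite strand).
deaminase_like = spectrum["C>T"] + spectrum["G>A"]
fraction = deaminase_like / sum(spectrum.values())
print(dict(spectrum), deaminase_like, fraction)
```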
