Synthetic defects in NAD+ regeneration abolishing anaerobic growth can be rescued by metabolically isolated foreign enzyme-substrate pairs
To implement the artificial-selection system, a genetically-modified E. coli strain (strain AL) unable to regenerate NAD+ during anaerobic fermentation was constructed by inactivating the genes encoding the ethanol pathway (bifunctional aldehyde-alcohol dehydrogenase, adhE) and, to eliminate a potential mutational escape route7, the lactate pathway (lactate dehydrogenase, ldhA). When cultured aerobically in a minimal medium, AL cells grew similarly to WT cells. However, they were unable to grow anaerobically in the same medium, indicating the expected impairment of anaerobic fermentation (Fig. 1c). We validated this strain by genetic complementation, through a transformation with a plasmid bearing the WT adhE gene, which allowed anaerobic growth similar to WT cells, reaching a maximum optical density at 600 nm (OD600) of 0.8 by 24−36 h (Fig. 1d). 1H-NMR spectra of the fermentation broth confirmed that the metabolite profile was similar to that of the parental strain (except for the expected absence of lactate, Supplementary Table 1).
We attempted to restore anaerobic growth of AL cells using exogenous enzyme-substrate pairs. In each case, the substrate was not expected to be natively produced or consumed by E. coli, so the main link to the E. coli metabolic network was through NAD. Three NADH-dependent oxidoreductases were tested: acetoin reductases from Bacillus subtilis (BDHA)16 and Klebsiella pneumoniae (BUDC)17, and alcohol dehydrogenase from Thermus sp. ATN1 (TADH)18 (Fig. 1e). AL cells transformed with any of the exogenous reductases were able to grow anaerobically in minimal medium only if also supplemented with the corresponding substrate (acetoin for BDHA and BUDC, and cyclohexanone or 3-methylcyclohexanone for TADH) (Fig. 1f−h). The substrates alone did not restore growth except for acetoin, with which growth was eventually observed, probably due to the activity of an endogenous acetoin reductase11,15. The consumption of substrates and generation of corresponding reduced products was confirmed in all three cases using 1H-NMR (Supplementary Table 1). The results with cyclohexanone and 3-methylcyclohexanone indicate that NAD+ regeneration alone, without any effect on carbon flux through metabolism, is enough to rescue AL cells from their metabolic defect.
High-throughput artificial selection outperforms computational design of cofactor specificity reversal
NAD(P)-dependent oxidoreductases have been extensively engineered to make them more suitable for specific applications, targeting properties such as kinetic parameters, substrate specificity, or cofactor preference19,20,21,22. Switching the cofactor preference of NADP-dependent enzymes to NAD is of particular interest due to the lower cost and higher stability of NAD, and the higher efficiency of cell-free NAD recycling systems23,24, for which low-cost co-substrates are available. Substrate selectivities, and modifications to them, are not generally computationally predictable. NAD/NADP selectivity is a partial exception, as the many known examples and a common structural motif enabled the development of a computational design tool, CSR-SALAD, which represents the state of the art in cofactor-selectivity reversal25,26.
Alcohol dehydrogenases (ADHs), also known as ketoreductases (KREDs), are of great industrial interest primarily due to their ability to perform an asymmetric reduction of aldehydes and ketones. Thanks to their regioselectivity and stereoselectivity, ADHs can be used not only to produce enantiomerically pure alcohols, but also other types of compounds, such as γ- and δ-lactones through the desymmetrisation of meso-diols27,28. The primary-secondary alcohol dehydrogenase of Clostridium beijerinckii (CBADH) reduces this organism’s waste acetone to isopropanol, but is strictly NADP-dependent, which is unusual in anaerobic fermentation29 (Fig. 2a). An efficient NAD-utilizing acetone reductase variant of CBADH would be useful in industrial biotechnology and cofactor recycling, so we set out to evolve one.


a CBADH (Clostridium beijerinckii alcohol dehydrogenase) and TBADH (Thermoanaerobacter brockii alcohol dehydrogenase) catalyze the reduction of acetone to isopropanol exclusively using NADPH, as well as the reverse reaction. b Cofactor binding site of CBADHWT with NADPH bound (PDB code 1KEV). The S199, R200, and Y218 residues interact with the 2′ phosphate of NADPH. Hydrogen bonds are shown in green. c Anaerobic culture of AL cells transformed with library CBADHLib (pLS10) in M9 glucose medium supplemented with acetone. Controls are as in Fig. 1. CBADHWT (pLS6) was able to restore slow anaerobic growth of AL cells due to the activity of transhydrogenases. d Anaerobic growth of AL cells transformed with CBADHS (with substitutions G198D, S199Y, and Y218P; pLS10_3) in medium with acetone. e Cofactor binding site of TBADHWT with NADPH bound (PDB code 1YKF). A similar set of residues as in CBADH establishes interactions with the 2′ phosphate of NADPH. Hydrogen bonds are shown in green. f Anaerobic culture of ALPS cells in medium supplemented with acetone. CBADHS, which can use NAD, supported anaerobic growth of ALPS. Unlike AL cells transformed with CBADHWT, ALPS cells transformed with CBADHWT did not grow anaerobically, due to the absence of any transhydrogenase activity to regenerate NAD+ from NADP+. g Anaerobic culture of ALPS cells transformed with CBADHS, TBADHS1 (with substitutions G198S, S199K, R200P, and Y218V; pLS73_2), TBADHS2 (with a duplication of residues 191−241 and substitutions G198H, S199R, R200A, Y218M, G198’A, and R200’K; pLS73_1) or TBADHWT (pLS69) in medium supplemented with acetone. All NAD-utilizing variants supported the anaerobic growth of ALPS. TBADHWT, like CBADHWT, could not support the anaerobic growth of ALPS cells. Data points of growth curves represent mean values, with error bars showing standard deviation; n = 3 biologically independent cultures for all timepoints of growth curves. Source data are provided as a Source Data file.
CSR-SALAD was used to analyze CBADH, predict residues critical for cofactor preference, and design a cofactor preference reversal strategy. The software suggested 567 protein variants encoded by a library of 648 genetic variants of positions G198 (substitutions to DEGKNRS), S199 (to DGHILNRSV), and Y218 (to ADFINSTVY), located at the cofactor binding pocket close to the 2′ phosphate group of NADP (Fig. 2b). CSR-SALAD assumes reversal would cause substantial loss of activity, and proposed subsequent rounds of site-saturation mutagenesis and screening to attempt to recover activity, initially prioritizing I175 and R200. A previous manual analysis of the CBADH crystal structure proposed substitutions of residues G198, S199, R200, and Y218 to reverse cofactor preference29, but neither the specific substitutions proposed nor saturation mutagenesis of S199 yielded any NADH-dependent variants of the closely-related Clostridium autoethanogenum ADH30. Here we applied the proposed substitutions to CBADH, constructing variants CBADHR1 (S199G, R200G, and Y218F) and CBADHR2 (G198D, S199G, R200G, and Y218F), but plasmids encoding these variants did not restore anaerobic growth of AL cells in minimal medium supplemented with 15 mM acetone, indicating low or absent NADH-dependent activity.
We performed full combinatorial saturation mutagenesis of all four target residues of CBADH (G198, S199, R200, and Y218) replacing each codon with NNN to yield a library (CBADHLib) of 16.8 million unique genetic variants encoding 160,000 unique protein variants. Every variant proposed by CSR-SALAD was included, comprising only 0.4% of the library. By including R200, the activity loss that CSR-SALAD envisages as an emergent problem to be solved by subsequent recovery is addressed simultaneously during cofactor preference reversal.
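The library sizes quoted above follow from simple counting, which can be sketched as below. This is an illustrative check rather than part of the original analysis: the names `CODON_TABLE` and `library_sizes` are hypothetical, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code in TCAG order (assumed; '*' marks stop codons).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def library_sizes(n_positions):
    """Return (genetic variants, full-length protein variants) for NNN at n positions."""
    n_codons = len(CODON_TABLE)                           # 64 codons per NNN position
    n_residues = len(set(CODON_TABLE.values()) - {"*"})   # 20 amino acids, stops excluded
    return n_codons ** n_positions, n_residues ** n_positions

genetic, protein = library_sizes(4)

# CSR-SALAD proposal: allowed residues at G198, S199, and Y218, as listed in the text.
csr_salad = len("DEGKNRS") * len("DGHILNRSV") * len("ADFINSTVY")
fraction = csr_salad / protein

print(genetic, protein, csr_salad, f"{100 * fraction:.2f}%")
```

Under these assumptions the counts are 64^4 genetic variants and 20^4 protein variants, and the CSR-SALAD proposals make up roughly 0.35% of the protein variants, which the text rounds to 0.4%.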
To select for CBADH variants able to accept NAD, AL cells were transformed with the library, and pools of variants were incubated anaerobically in a minimal medium supplemented with 15 mM acetone, so that any variant with NADH-dependent acetone-reducing activity would lead to a regeneration of NAD+, restoring growth. For this library and each similar case in this study, multiple independent transformations (six for this library) were used to obtain complete library coverage (approximately 18 million transformant clones in this case) and samples were sequenced to confirm diverse codon incorporation. Clones from each transformation were pooled and subjected to selection separately. Growth was observed in three of the variant pools (Fig. 2c) so these were each subcultured under the same conditions once, then plasmid DNA was extracted and sequenced, revealing the presence of the same CBADH variant in each case (CBADHS) with G198D, S199Y, and Y218P substitutions. Interestingly, NAD-dependent mouse class II alcohol dehydrogenase (ADH2)31, which is distantly related to CBADH, contains D227 and P247 which are equivalent positions to 198 and 218 in CBADH. CBADHS was not among the 567 variants proposed by CSR-SALAD. The selection of the same variant in three independent experiments indicates the superiority of this variant, the strength of the artificial selection pressure, and its utility to isolate variants. Transforming AL cells with isolated plasmid encoding CBADHS enabled fast anaerobic growth in a medium containing acetone (Fig. 2d). Acetone consumption and corresponding isopropanol generation were confirmed by 1H-NMR (Supplementary Table 1). Unlike the parent enzyme CBADHWT, purified CBADHS was unable to oxidize isopropanol using oxidized NADP (NADP+) as the cofactor, but could do so with NAD+ (Km = 17.49 mM, kcat = 333 min−1 and kcat/Km = 316.67 M−1 s−1, Supplementary Fig. 1 and Supplementary Table 2).
Eliminating transhydrogenase activities provides strict cofactor selection allowing exceptional reversal of cofactor preference
Unexpectedly, we observed that AL cells were able to slowly grow anaerobically in a medium with acetone when transformed with the parent enzyme CBADHWT, despite its strict NADP-dependence (Fig. 2c, d). We hypothesized that acetone-dependent anaerobic growth recovery by CBADHWT could be mediated by the activity of one or both of E. coli’s two transhydrogenases, PNTA and STHA, which could use the NADP+ produced by CBADHWT to oxidize NADH. To test this hypothesis, we generated E. coli strains with knockout mutations of transhydrogenase genes pntA (ALP strain), sthA (ALS strain), or both (ALPS strain) in addition to knockout mutations of adhE and ldhA. We then tested the ability of CBADHWT and CBADHS to restore anaerobic growth of ALP, ALS, and ALPS cells. The ALP and ALS strains, each with one intact transhydrogenase gene, were able to grow anaerobically in a medium with acetone when transformed with either NADP-dependent CBADHWT or NAD-dependent CBADHS (Supplementary Fig. 2). However, only NAD-dependent CBADHS could support acetone-dependent anaerobic growth of ALPS cells, which lack both transhydrogenase genes (Fig. 2f and Supplementary Fig. 2). This demonstrates that transhydrogenases are indeed responsible for the acetone-dependent recovery of anaerobic growth by CBADHWT, and that either STHA or PNTA alone is sufficient to generate the required NAD+ to sustain anaerobic growth. Interestingly, this is thought to be the non-physiological direction for the membrane-bound, ‘energy-linked’ transhydrogenase PNTA32. Because it cannot grow anaerobically when transformed with an NADP-dependent oxidoreductase, the transhydrogenase-free ALPS strain provides a more stringent selection system than the AL, ALP, or ALS strains, strictly requiring NAD-dependent oxidoreductase activity to restore growth.
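The selection logic of the four strains can be summarized as a toy Boolean model. This is only a sketch of the reasoning above, not a quantitative model of growth: the names `STRAINS`, `ENZYME_COFACTOR`, and `predicts_growth` are hypothetical, and acetone is assumed to be present in the medium.

```python
# Transhydrogenase genes left intact in each selection strain (from the text).
STRAINS = {
    "AL": {"pntA", "sthA"},
    "ALP": {"sthA"},    # pntA knocked out
    "ALS": {"pntA"},    # sthA knocked out
    "ALPS": set(),      # both knocked out
}

# Cofactor used by each enzyme for acetone reduction (from the text).
ENZYME_COFACTOR = {"CBADHWT": "NADP", "CBADHS": "NAD"}

def predicts_growth(strain, enzyme):
    """Predict anaerobic growth with acetone: NAD+ must be regenerated from NADH."""
    if ENZYME_COFACTOR[enzyme] == "NAD":
        return True  # NADH is oxidized directly by the enzyme
    # An NADP-dependent enzyme needs at least one transhydrogenase to couple
    # the NADP+ it produces to NADH oxidation.
    return len(STRAINS[strain]) > 0

for strain in STRAINS:
    for enzyme in ENZYME_COFACTOR:
        print(strain, enzyme, predicts_growth(strain, enzyme))
```

In this simplified picture the only non-growing combination is ALPS with CBADHWT, which is the property that makes ALPS the more stringent selection host.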
To validate the ALPS strain as a selection host, we used it to evolve NAD-dependent variants of TBADH, an NADP-dependent alcohol dehydrogenase from Thermoanaerobacter brockii closely related to CBADH29. Similarly to CBADHWT, WT TBADH was able to support the anaerobic growth of AL cells in a medium supplemented with acetone, but not of ALPS cells (Fig. 2g). We performed full combinatorial saturation mutagenesis of residues G198, S199, R200 and Y218 of TBADH to generate a library (TBADHLib) of 16.8 million unique genetic variants (Fig. 2e). A similar selection procedure as for CBADH was followed but using the ALPS strain, and two different variants were identified, TBADHS1 and TBADHS2. TBADHS1 had substitutions G198S, S199K, R200P, and Y218V. Surprisingly, TBADHS2 contained a duplication of residues 191−241, in addition to substitutions in the targeted residues, both in the positions of the original sequence and in the corresponding positions of the duplication (Supplementary Fig. 3). The substitutions were G198H, S199R, R200A, Y218M, G198’A, and R200’K (198’ and 200’ denote positions 249 and 251 of TBADHS2, the positions in the duplication equivalent to the original 198 and 200 residues). Since the duplication was not intentionally introduced into the variant library by design, it presumably arose through a rare event during the PCR, ligation or within cells, highlighting the potential of the selection system to isolate variants with desired properties even when these are rare and outside the intended design space.
TBADHS1 and TBADHS2 were both able to restore the anaerobic growth of ALPS cells in a medium with acetone (Fig. 2g), and 1H-NMR of the fermentation broth confirmed the production of isopropanol (Supplementary Table 1). When the enzymatic activity was assayed, TBADHS1 could oxidize isopropanol only with NAD+, confirming the reversal of cofactor preference (Supplementary Fig. 1 and Supplementary Table 2). The kcat of TBADHS1 for the oxidation of isopropanol with NAD+ was 112 ± 5.7 min−1, 4.5 times lower than the kcat of TBADHWT for the same reaction with NADP+. However, the Km of TBADHS1 for isopropanol was 3.74 ± 0.54 mM, a 32-fold decrease (improvement) compared to the Km of TBADHWT for isopropanol in the presence of NADP+. Overall, the catalytic efficiency (kcat/Km) of TBADHS1 for the oxidation of isopropanol with NAD+ (kcat/Km = 496.67 M−1 s−1) was more than seven times greater than that of TBADHWT with NADP+ (kcat/Km = 70 M−1 s−1). This is, to our knowledge, the highest relative catalytic efficiency obtained in any case of cofactor specificity reversal for an alcohol dehydrogenase, and the best for the reversal of preference from NADP to NAD for any enzyme. In contrast, TBADHS2 was able to oxidize isopropanol both with NAD+ and NADP+ (in the presence of NAD+, Km = 22.07 mM, kcat = 238.5 min−1 and kcat/Km = 180 M−1 s−1; in the presence of NADP+, Km = 55.15 mM, kcat = 231.4 min−1 and kcat/Km = 70 M−1 s−1; Supplementary Fig. 1, Supplementary Table 2). A decrease in kcat compared to TBADHWT was observed, but the Km for isopropanol decreased (improved) in the presence of both cofactors.
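The catalytic efficiencies above combine a kcat reported in min−1 with a Km reported in mM, so a unit conversion is needed to reach M−1 s−1. A minimal sketch of that conversion follows. The function name `catalytic_efficiency` is hypothetical, and small differences from the reported values are expected because the inputs here are the rounded figures quoted in the text.

```python
def catalytic_efficiency(kcat_per_min, km_mM):
    """Return kcat/Km in M^-1 s^-1 from kcat in min^-1 and Km in mM."""
    kcat_per_s = kcat_per_min / 60.0   # min^-1 -> s^-1
    km_M = km_mM / 1000.0              # mM -> M
    return kcat_per_s / km_M

# Rounded kinetic parameters quoted in the text for isopropanol oxidation.
tbadhs1_nad = catalytic_efficiency(112, 3.74)
tbadhs2_nad = catalytic_efficiency(238.5, 22.07)
tbadhs2_nadp = catalytic_efficiency(231.4, 55.15)

# Fold change relative to the reported TBADHWT efficiency with NADP+ (70 M^-1 s^-1).
fold_vs_wt = tbadhs1_nad / 70.0

print(round(tbadhs1_nad), round(tbadhs2_nad), round(tbadhs2_nadp), round(fold_vs_wt, 1))
```

Recomputed this way, the TBADHS1 efficiency agrees with the reported 496.67 M−1 s−1 to within about 1%, and the ratio to TBADHWT remains slightly above seven.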
We crystallized CBADHS and TBADHS1 in order to elucidate the structural basis for the observed reversal of cofactor preference. The resulting maps showed clear density for NAD+ in the case of CBADHS, but only partial occupancy for TBADHS1 (Supplementary Fig. 4). In both cases, the size of the cofactor-binding pocket is reduced due to the substitution of residues 198 and 199 by others with bulkier side chains (Fig. 3), which sterically prevents the binding of the 2′ phosphate of NADP. In the case of CBADHS, this effect is further enhanced by the presence of an aspartate residue at position 198, which would also prevent NADP binding through electrostatic repulsion between the side-chain carboxylate group and the 2′ phosphate of NADP. Another common feature of both CBADHS and TBADHS1 is the substitution of Y218. The side chain of Y218 is known to undergo a 120° rotation in the WT enzymes to allow stacking to the adenine moiety of NADP and the formation of a hydrogen bond with the 2′ phosphate through its hydroxyl group29. As shown by the CBADHS and TBADHS1 structures, the side chains of the substituted residues are not close enough to the 2′ phosphate to interact with it, and would only be able to form hydrophobic interactions with the adenine moiety, if any (Fig. 3).


a Cofactor binding site of CBADHS with NAD+ bound. The substitutions identified in CBADHS allow for the binding of NAD+ but would prevent the binding of NADP+ by steric impediments and electrostatic repulsion with the side chains of D198 and Y199. Additionally, the stacked-ring interaction of the adenine moiety with Y218 observed in CBADHWT cannot be established due to the Y218P substitution, possibly enabling a more flexible binding of the cofactor. b Cofactor binding site of TBADHS1, with NAD+ placed at the same position as NADP+ in the structure of TBADHWT. P200−P201 is modelled as a cis peptide bond. While TBADHS1 can accommodate NAD+ in its cofactor binding pocket, the binding of NADP+ would be prevented by steric impediments caused by the side chains of S198 and K199. As in CBADHS, the substitution of Y218 prevents the formation of a stacked-ring interaction with the adenine ring of the cofactor, possibly enabling a more flexible binding of NAD+.
Simultaneous optimization of multiple kinetic parameters by high-throughput artificial selection
Next, we applied the artificial selection system to an imine reductase (IRED). IREDs are of great industrial interest thanks to their ability to catalyze the asymmetric reduction of imines and the reductive amination of ketones, both of which yield chiral amines, fundamental building blocks in the pharmaceutical and agrichemical industries33,34. All known natural IREDs are NADP dependent35, so there is great interest in developing NAD-dependent variants due to the lower cost and higher efficiency of NAD-regeneration systems. The most active NAD-dependent IRED thus far is MsIREDC, a variant of Myxococcus stipitatus IRED (MsIRED), which is able to reduce 2-methyl-1-pyrroline amongst other substrates (Fig. 4a). MsIREDC was obtained through several rounds of mutagenesis and screening36. We aimed to obtain a superior NAD-dependent variant of MsIRED through a faster and simpler workflow by applying the artificial selection system.


a Reaction catalyzed by MsIRED (Myxococcus stipitatus imine reductase). MsIRED can reduce several imines to the corresponding amine using NADPH as cofactor. b Homology model of the cofactor binding site of MsIREDWT with NADPH bound. The model was generated by threading the MsIRED sequence into PDB code 3ZHB. Residues 32, 33, 34, and 37 were predicted to be close to the 2′ phosphate of NADPH. Hydrogen bonds are shown in green. c Anaerobic culture of AL cells transformed with MsIREDWT (pLS130), MsIREDC (with substitutions N32E, R33Y, T34E, K37R, L67I, and T71V; pLS131) and library MsIREDLib (pLS133) in medium with 2-methyl-1-pyrroline. Controls are as in Fig. 1. d Anaerobic culture of AL cells transformed with MsIREDWT, MsIREDC, and MsIREDS (with substitutions N32D, R33V, T34R, and K37R) in M9 glucose medium with 2-methyl-1-pyrroline. MsIREDS supported the best anaerobic growth of AL cells. e Comparison of enzymatic activity at different concentrations of 2-methyl-1-pyrroline for MsIREDC and MsIREDS. Data points of panels c−e represent mean values, with error bars showing standard deviation. For panels c and d, n = 3 biologically independent cultures for all timepoints of growth curves. For panel e, n = 3 biologically independent assays for all substrate concentrations with both enzymes. Source data are provided as a Source Data file.
CSR-SALAD predicted residues N32, R33, T34, and K37 to be critical determinants of cofactor specificity (Fig. 4b). Thus, we performed full combinatorial saturation mutagenesis of these residues to generate a library (MsIREDLib) of 16.8 million unique genetic variants. AL cells transformed with the library were able to grow anaerobically in medium with 2-methyl-1-pyrroline (Fig. 4c), and the same MsIRED variant (MsIREDS) was identified from three independent transformations of the library, comprising residue substitutions N32D, R33V, T34R, and K37R. We analyzed the fermentation broth of AL cells transformed with MsIREDS (Fig. 4d) by 1H-NMR, confirming the consumption of 2-methyl-1-pyrroline and the production of the corresponding amine, 2-methylpyrrolidine (Supplementary Table 1). The enzymatic activities of MsIREDC and MsIREDS were compared (for MsIREDC, Km = 34.06 mM, kcat = 161.17 min−1 and Ki = 4.94 mM; for MsIREDS, Km = 19.57 mM, kcat = 78.1 min−1 and Ki = 11.42 mM; Supplementary Fig. 5, Supplementary Table 2). Both variants were able to reduce 2-methyl-1-pyrroline only with NADH and showed substrate inhibition, as shown by the decrease in activity at the highest concentrations of substrate. MsIREDS performed better than MsIREDC at all tested substrate concentrations (Fig. 4e), partly due to the higher value of the substrate inhibition constant Ki, which indicates a relief of substrate inhibition in MsIREDS compared to MsIREDC. To our knowledge, MsIREDS has the highest NAD-dependent IRED activity yet reported. These results highlight the ability of the artificial selection system to obtain variant enzymes where multiple kinetic parameters are enhanced simultaneously, resulting in enzymes with optimal activity towards the desired substrate. Furthermore, while identification of MsIREDC by Borlinghaus and Nestl36 required multiple rounds of mutagenesis and screening, we obtained the superior NAD-dependent variant MsIREDS in a single round.
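The role of Ki in substrate inhibition can be illustrated with the classical uncompetitive substrate-inhibition (Haldane) rate law, v = kcat·[S] / (Km + [S] + [S]²/Ki). This is a sketch under an assumption: the text does not state which rate equation was fitted, so this form, and the names `haldane_rate` and `optimal_substrate`, are illustrative only and are not guaranteed to reproduce the measured curves in Fig. 4e.

```python
import math

def haldane_rate(s_mM, kcat, km_mM, ki_mM):
    """Turnover rate under the assumed substrate-inhibition model (same units as kcat)."""
    return kcat * s_mM / (km_mM + s_mM + s_mM ** 2 / ki_mM)

def optimal_substrate(km_mM, ki_mM):
    """Substrate concentration maximizing the rate: dv/d[S] = 0 gives [S] = sqrt(Km * Ki)."""
    return math.sqrt(km_mM * ki_mM)

# Rounded parameters quoted in the text (kcat in min^-1, Km and Ki in mM).
PARAMS = {
    "MsIREDC": dict(kcat=161.17, km_mM=34.06, ki_mM=4.94),
    "MsIREDS": dict(kcat=78.1, km_mM=19.57, ki_mM=11.42),
}

peaks = {name: optimal_substrate(p["km_mM"], p["ki_mM"]) for name, p in PARAMS.items()}

for name, peak in peaks.items():
    print(name, f"rate peaks near {peak:.1f} mM under the assumed model")
```

In this model a larger Ki shifts the rate maximum to higher substrate concentrations and slows the decline beyond it, which is the sense in which a higher Ki relieves substrate inhibition.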
Chemically-directed evolution of enzyme variants with modified chemoselectivity and positional selectivity
We aimed to apply the artificial selection pressure to target the evolution of enzyme activity towards non-native chemicals. Nitroreductases can synthesize aromatic hydroxylamines or amines (pharmaceutical and agrichemical precursors) from low-cost and readily available nitroaromatics37, can activate nitroaromatic anticancer prodrugs38,39, and can be used for bioremediation of soils contaminated with explosives such as TNT40,41. However, these applications often require the tailoring of natural nitroreductases to improve their catalytic properties. We targeted the classic Enterobacter cloacae nitroreductase NfsB (EntNfsB), a type-I nitroreductase able to reduce several nitroaromatic compounds, including 4-nitrobenzoic acid (4-NBA), and which can accept NAD and NADP as cofactors42. We characterized the activity of the WT enzyme, EntNfsBWT, with 4-NBA, 4-nitrobenzyl alcohol (4-NBALC), and 2-nitrobenzoic acid (2-NBA), finding that it was able to reduce 4-NBA and 4-NBALC but not 2-NBA (Fig. 5a, Supplementary Fig. 6, and Supplementary Table 2). The activity towards 4-NBALC was, however, considerably lower than towards 4-NBA, which prevented the accurate determination of kinetic parameters.


a Catalytic activities of EntNfsB (Enterobacter cloacae nitroreductase NfsB). EntNfsB is able to reduce several nitroaromatic compounds, such as 4-nitrobenzoic acid (4-NBA), using NADH or NADPH as cofactors. EntNfsBWT is also able to reduce 4-nitrobenzyl alcohol (4-NBALC) less efficiently, but does not display any activity towards 2-nitrobenzoic acid (2-NBA). b Substrate binding site of EntNfsBWT with 4-NBA bound (PDB code 5J8G). Residues 40, 41, 68, and 124 were close to the substrate, and did not contact the FMN group essential for catalysis. Anaerobic culture of AL cells transformed with the library EntNfsBLib (pLS169) in M9 glucose medium with 2-NBA (c) or 4-NBALC (e). Controls are as in Fig. 1. d Anaerobic culture of AL cells transformed with EntNfsBS1 (with substitutions S40A, T41I, and F124A; pLS169_1), the variant with activity towards 2-NBA, in M9 glucose medium supplemented with 2-NBA. f Anaerobic culture of AL cells transformed with EntNfsBS2 (with substitutions T41L, Y68L, and F124L; pLS169_3), the variant with improved activity towards 4-NBALC, in M9 glucose medium supplemented with 4-NBALC. Data points of growth curves represent mean values, with error bars showing standard deviation; n = 3 biologically independent cultures for all timepoints of growth curves. Source data are provided as a Source Data file.
We performed full combinatorial saturation mutagenesis of residues S40, T41, Y68, and F124 of EntNfsB, generating a library (EntNfsBLib) of 16.8 million unique genetic variants. These residues were chosen based on their proximity to the substrate-binding pocket and their lack of direct contact with the essential FMN43 (Fig. 5b). AL cells were transformed with the library and cultured anaerobically in a minimal medium supplemented with 2-NBA or 4-NBALC (Fig. 5c, e). We identified a different variant from the cells grown in the presence of each of the substrates. The variant selected in the cultures with 2-NBA (EntNfsBS1) contained substitutions S40A, T41I, and F124A, while the variant selected in the cultures with 4-NBALC (EntNfsBS2) contained T41L, Y68L, and F124L substitutions. To determine the product generated by the EntNfsB variants, the fermentation broth of AL cells transformed with plasmids expressing either EntNfsBS1 or EntNfsBS2 and grown anaerobically in minimal medium with 2-NBA or 4-NBALC, respectively (Fig. 5d, f), was characterized by 1H-NMR (Supplementary Table 1). Interestingly, the observed product signals matched those of the aromatic amines corresponding to 2-NBA or 4-NBALC, despite a previous report indicating that EntNfsBWT can only reduce 4-NBA to 4-hydroxylaminobenzoic acid, and not all the way to the amine 4-aminobenzoic acid42. That report only assayed the in vivo activity up to 24 h. It is possible that the highly reducing conditions inside AL cells, combined with longer periods of incubation under anaerobic conditions, allow for the formation of the amine products. Enzymatic assays confirmed that EntNfsBS1 was able to reduce 2-NBA (Km = 0.808 mM, kcat = 28.4 min−1 and kcat/Km = 585 M−1 s−1), and that EntNfsBS2 could reduce 4-NBALC more efficiently than WT EntNfsB, allowing determination of kinetic parameters (Km = 1.111 mM, kcat = 205.0 min−1 and kcat/Km = 3075 M−1 s−1, Supplementary Fig. 6, Supplementary Table 2).
Both variants retained the ability to reduce 4-NBA. Therefore, EntNfsBS1 is active on both 2-NBA and 4-NBA, unlike any previously reported nitroreductase. The ability to engineer nitroreductase selectivity and promiscuity in this way could prove useful in applications including activation of anticancer prodrugs with multiple nitro groups, and bioremediation of soils contaminated with nitroaromatics, where a complex mixture of different compounds is often found, and therefore the ability to act on multiple isomers is desirable. The isolation of EntNfsBS1 and EntNfsBS2 demonstrates chemically-directed evolution as a powerful application of the artificial selection system, using an external supply of target substrates to direct both improvement of an existing poor activity, and acquisition of an activity with a non-native substrate.
Coupling artificial selection to genetic design optimization yields the best-performing synthetic isopropanol pathway
We anticipated that the artificial selection system should be readily applicable to more complex systems than individual enzymes, such as metabolic pathways. Furthermore, the selection pressure should act not only on the sequences of enzymes, but in general on any genetically-encoded trait that can be linked to the generation of NAD+, such as regulatory sequences controlling gene expression. These sequences, particularly promoters and ribosome binding sites (RBSs), are important in the design and optimization of heterologous/synthetic metabolic pathways, as they control the amount of each enzyme that is produced. To maximize production, high yet balanced flux across the pathway is required, while avoiding the accumulation of intermediates44,45 and expression-associated metabolic burden46. Finding the best combination of regulatory elements for a specific pathway is not trivial, often requiring multiple rounds of laborious and time-consuming screening.
Isopropanol is a widely used solvent, additive, and platform chemical with a large market, conventionally manufactured by petrochemical routes. We sought to develop an isopropanol production pathway and prototype whole-cell biocatalyst superior to the natural and engineered examples reported previously47 (Fig. 6a). A library (MPLib) of variants of pathway-encoding plasmids, differing in the regulatory elements (promoters and RBSs) controlling each gene, was constructed by combinatorial DNA assembly using the Start-Stop Assembly48 system (Fig. 6b), giving a library of 60.4 million different variants. While only one enzyme in the pathway (CBADH) directly generates NADP+ (converted to NAD+ by transhydrogenases), the rate depends on the flux through the entire pathway, which in turn depends on a combination of regulatory genetic elements giving optimal expression of all enzymes, without causing problematic overexpression. This coupling allows the artificial selection pressure to be applied to the identification of optimal genetic designs.


a A synthetic isopropanol production pathway for E. coli. Thiolase (acetyl-CoA acetyltransferase) THL (encoded by atoB) catalyzes the condensation of two acetyl-CoA molecules to one acetoacetyl-CoA, releasing one free CoA molecule. Acetoacetyl-CoA transferase ACoAT (comprising two subunits encoded by atoA and atoD) transfers the CoA group from acetoacetyl-CoA to acetate, yielding acetoacetate and acetyl-CoA. Acetoacetate is decarboxylated by acetoacetate decarboxylase ADC (encoded by adc), resulting in acetone and CO2. Finally, acetone is reduced to isopropanol by CBADH. The acetate consumed by ACoAT can be regenerated from the resulting acetyl-CoA by phosphate acetyltransferase PTA and acetate kinase ACK. The NADP+ generated during the reduction of acetone can be employed by transhydrogenases STHA and PNTA to regenerate the NAD+ required to sustain anaerobic growth. b A library of isopropanol pathway-encoding plasmids was combinatorially assembled using mixtures of promoters and RBSs, expected to cause a diversity of performance in terms of flux, accumulation of intermediates, and expression-associated metabolic burden. 10 ‘S’ pathway variants were isolated by redox selection in E. coli AL cells and 10 ‘R’ pathway variants were isolated at random. Sequencing was used to determine the parts present in each case, which are shown aligned below the corresponding part in the design. Only two different permutations of parts were identified among the S variants, whereas all 10 R variants were unique. Two R variants were defective, lacking one or more expected parts, whereas all S variants were complete. For comparison of isopropanol production, AL cells containing the S and R variants were cultured in M9 glucose medium for 17 h, then individual and average isopropanol titers in culture broths of both groups were determined. For both variants obtained by selection and variants picked at random, n = 10 biologically independent samples. In each set, black bars represent individual biological replicates, and the red bar represents the mean value with standard deviation indicated by the error bar obtained from the corresponding set of individual biological replicates. Source data are provided as a Source Data file.
AL cells were transformed with MPLib and grown anaerobically on minimal medium agar plates, with gluconate as a carbon source instead of glucose, because the maximum theoretical yield of isopropanol from gluconate is greater. Colonies of transformed AL cells were visible after 65 h of anaerobic incubation. We verified the ability of 10 colonies to grow anaerobically after inoculating them into minimal medium liquid cultures, and confirmed the presence of isopropanol in the fermentation broth by 1H-NMR (Supplementary Table 1).
We compared the combinations of regulatory elements and the isopropanol titer of 10 ‘S’ variants obtained by artificial selection with those of 10 ‘R’ variants picked from the library at random without artificial selection. We found only two different combinations of regulatory elements among the S variants (MPS1 and MPS2), whereas the 10 R variants were all different, and included defective variants lacking one or more of the genes, which can arise during DNA assembly (Fig. 6b). In these minimal culture conditions, the S variants gave an average isopropanol titer of 4.97 mM after 17 h of incubation, which was significantly greater (t(9) = 9.22 and p = 1.54 × 10−8, or excluding defective variants t(9) = 6.82 and p = 4.19 × 10−6) than the average of 0.60 mM produced by the R variants (Fig. 6b). This indicates that the artificial selection pressure had acted both to eliminate defective variants and to favour specific combinations of genetic elements leading to maximized production of isopropanol.
We compared the performance of the best isopropanol pathway variant MPS1 with the best isopropanol pathway reported previously, by Hanai and coworkers47, using the same growth medium and culture conditions. The previous best 43.5% of maximum theoretical isopropanol yield during the production phase (with 45 mM titer) was surpassed here by WT E. coli cells transformed with MPS1, which achieved 56% at the same point (with 62 mM titer), the highest isopropanol yield reported so far for any organism, including both native producers and engineered strains.

