The concept of the reaction graph
Our reaction graph concept provides simple representations of functional hairpin-based nucleic acid isothermal amplification mechanisms, enabling scientists to quickly establish the main reactions and the amplicons, and to design strategies for situations that are currently challenging with existing methods. To achieve this goal and demonstrate the amplification methods for developing basic DNA nodes and circuits, the process was simplified into five components (Fig. 1a): the functional hairpin motif self-folding assembly process (solid arrows); the primer extension process (dashed arrows); the functional hairpin motifs generated through amplification (blue box); the numbered primers; and the functional hairpin motif self-folding products (green triangle for forward strands and green square for reverse strands).


Programming a nucleic acid isothermal amplification pathway as a reaction graph in which the hairpin complex acts as the initiator and encodes two priming sites in the loop region. a A schematic for generating the secondary structure mechanism and reaction graph from the functional hairpin motif-mediated isothermal amplification. The letters a/s denote the complementarity of the nucleic acid sequence and the numbers represent specific fragments in the functional hairpin structure (e.g. 2a is complementary to 2s). During amplification, the functional hairpin motif double strand product (blue box) disassembles (solid arrows) to form sense (green triangle) and anti-sense (green square) strand products. The formation of self-folding products exposes the priming sites, which leads to the hybridisation of the primers and extension of the sequence by the strand-displacement DNA polymerase (dashed arrows) to generate a double strand amplicon. The amplicon then becomes the target (or initiator) for further amplification cycles via circuits S and A. b The reaction kinetics of the mechanism described in (a) with loop fragments of different lengths: 40 nt (black squares), 50 nt (red disks) and 60 nt (blue triangles). Data are presented as mean values; error bars are standard deviation, N = 3 independent experiments. c Sequencing of the product (3a+1s+2s+3s), where the amplification products were digested (lane 1 in (d)), cloned and sequenced, and identified as tandem repeats. d Agarose gel electrophoresis of the amplification products. Lane L: 50 bp ladder; Lane 1: amplification product digested by the restriction enzyme Xba I; Lane 2: amplification product (with initiator); Lane 3: amplification product (without initiator). The gel is representative of experiments repeated independently three times with similar results. Source data are provided as a Source Data file.
We built the basic nodes and circuits around two core isothermal amplification reaction schemes performed using a ssDNA containing a hairpin motif (green triangle), with one priming site at the loop domain, as illustrated in Fig. 1a. The amplification process is described simultaneously via the state of each port in a reaction graph12 and via the secondary structure formation mechanism, both shown in Fig. 1a. As the primer that encodes the complementary sequence of the loop region (2a in circuit S and 1s in circuit A of Fig. 1a) hybridises to the loop domain, it initiates an extension reaction, in which a strand-displacing DNA polymerase opens the hairpin structure (dashed arrows). The newly exposed domain (3s) at the stem serves as an initiation site that permits the generation of a double strand product, represented by the functional hairpin motif in the blue box.
Subsequently, an autonomous self-folding disassembly reaction of the functional hairpin motif occurs (solid arrows of Fig. 1a) to form two self-folding products (green triangle and green square). The formation of these intramolecular structures then separates the dsDNA product into two appropriate hairpin structures, exposing the priming sites. When primers bind to these exposed sites, the two complementary self-folding molecules serve as targets to start a new round of catalytic amplification and thus achieve signal amplification. The closure of the circuit between the blue box and triangle/square, Fig. 1, defines the continuation of the amplification process.
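The loop closure described above can be illustrated with a small directed graph. The following is a minimal sketch rather than the software used in this work: the node names (`duplex`, `sense`, `antisense`) and the edge labels are assumptions chosen for readability, with the duplex standing for the blue box and the two self-folding products for the triangle and square of Fig. 1a.

```python
# Hypothetical encoding of the Fig. 1a reaction graph (names are illustrative).
# Edge kinds: "self_folding" (solid arrows) and "primer_extension" (dashed arrows).
REACTION_GRAPH = {
    "duplex": [("sense", "self_folding"), ("antisense", "self_folding")],
    "sense": [("duplex", "primer_extension")],      # circuit S
    "antisense": [("duplex", "primer_extension")],  # circuit A
}


def closes_circuit(graph, start):
    """Return True if a directed path leads from `start` back to `start`."""
    stack = [nxt for nxt, _ in graph.get(start, [])]
    seen = set()
    while stack:
        node = stack.pop()
        if node == start:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(nxt for nxt, _ in graph.get(node, []))
    return False


def without_node(graph, removed):
    """Copy of the graph with one node, and all edges into it, removed."""
    return {
        node: [(nxt, kind) for nxt, kind in edges if nxt != removed]
        for node, edges in graph.items()
        if node != removed
    }
```

In this sketch, `closes_circuit(REACTION_GRAPH, "sense")` is true because the sense product is extended back into the duplex, whereas removing the `duplex` node leaves no closed circuit, so no amplification cycle is represented.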
Mechanistic studies
To understand the mechanism and thermodynamics of the self-folding of hairpin structures and the competing folding of DNA oligonucleotides with partially complementary sequences, studies were performed using melting analysis to characterise the dependence of each phenomenon on the concentration of the constituent molecules13. To illustrate this behaviour in our system, we explored the thermodynamics of the amplification reactions, monitoring the melting temperature, Tm, of reactions involving oligos H1 and H2, which can both bind to each other and self-fold as hairpins individually (see Supplementary Fig. S1). Building upon our general design rules, established previously14, we are able to use the hairpins as a fuel for the amplification reaction (noting improved stability when the neck is long, Supplementary Fig. S2).
The reaction graph, described in Fig. 1a, allows the generation of size-limited amplicons, which was confirmed by the smaller bands in Fig. 1d, showing the different structures predicted by the mechanisms as designed (and confirmed by sequencing in Fig. 1c). However, the longer structures are not predicted directly, as is often the case in isothermal amplifications (including LAMP, for example). The longer structures arise as a consequence of interactions of Bst polymerase, primers and amplification product, as described recently for LAMP using high resolution melting analysis15, and demonstrated here by restriction digest sequencing in Supplementary Fig. S3.
Computational design tools
The design of functionalities into an isothermal amplification system requires significant expertise16 and is often an iterative approach. Our reaction graph abstraction method provides the means to enable the design of the sequences necessary for an assay, based on relationships between different sequences of building blocks, enabling us to automate the compilation of the functional motifs for hairpin-mediated isothermal amplification. We were able to demonstrate an automated primer design system which only requires as its input specific target sequence information, but which provides as an output ranked sets of primers and an isothermal amplification process with specified dynamic behaviours, Fig. 2.


Flow chart of the primer design software, including (1) screening target sequences, (2) randomly designing and screening US sequences, and (3) filtering and outputting the final primers according to the four different schemes (see details in Supplementary Text).
We developed such a program (described in detail in the Supplementary Text) for a user to input a target sequence (e.g. from a candidate pathogen in a diagnostic question) and obtain primer sequences to enable detection through amplification. The program is extensible, designed as a flexible tool for reaction graph primer design and can meet various design requirements in a high-throughput informatics environment. It is implemented in Golang, which can be deployed on major operating systems (Windows, Linux, Mac), to achieve high throughput analysis based on multithreading, high running efficiency, native high concurrency and powerful fault-tolerance.
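The "randomly designing and screening US sequences" step of the flow chart can be sketched as follows. This is only an illustration in Python, not the Golang program described above: the function names, the tail length, the GC window and the homopolymer limit are all assumptions, and the actual screening criteria are those given in the Supplementary Text.

```python
import random


def gc_fraction(seq):
    """Fraction of G/C bases in a DNA sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


def has_homopolymer(seq, run=4):
    """True if any base repeats `run` or more times in a row."""
    return any(base * run in seq for base in "ACGT")


def candidate_us_sequences(n, length=20, seed=0):
    """Randomly design `n` candidate tail sequences (seeded for repeatability)."""
    rng = random.Random(seed)
    return ["".join(rng.choice("ACGT") for _ in range(length)) for _ in range(n)]


def screen(candidates, gc_min=0.4, gc_max=0.6):
    """Keep candidates passing the assumed GC-content and homopolymer filters."""
    return [
        seq
        for seq in candidates
        if gc_min <= gc_fraction(seq) <= gc_max and not has_homopolymer(seq)
    ]
```

A call such as `screen(candidate_us_sequences(200))` returns the subset of random candidates that pass both filters; a real pipeline would add further checks (for example secondary structure and cross-reactivity with the target) before ranking primer sets.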
To simplify the primer design discussion, we refer to different primer fragments as sequences (such as 1S/2S/US), as shown in Fig. 1. As an example of the flexibility of the approach, Supplementary Fig. S4 shows how a similar reaction graph to that used in Fig. 1 can lead to a different amplification mechanism by simply removing a priming site in between the hairpin structures. The basic model of a hairpin structure-based isothermal amplification mechanism described in Fig. 1 and Supplementary Fig. S4 encompasses methods depicted for both CPA4 and LAMP5, as illustrated in Supplementary Fig. S5.
We now demonstrate the utility of these basic models by experimentally executing two different nucleic acid isothermal amplification strategies, each of increasing complexity, and each illustrating the different mechanisms by which the interactions within the functional motifs catalyse the amplification reactions.
Strategy 1. Generic tail strategy
Our first strategy involves the design of a functional hairpin motif through sense and anti-sense primers that encode a synthetic generic tail (termed “us”, where “u” is for “universal”) with the same sequence at their 5′ ends, while their 3′ ends are complementary to the specific target sequence, Fig. 3a. The amplification pathways specified in the reaction graph of Fig. 3a were translated into the secondary structure-based molecular implementation, Fig. 3b. These primers coexist in the absence of the target. In the reaction graph, this property was programmed by the absence of a starting point (i.e. the blue box was removed from the graph). The introduction of the target led to the generation of the amplicon containing the functional hairpin motifs (us+1s+2s+ua), followed by serial self-folding and primer extension by DNA polymerase. Although not mechanistically required, we also added two outer primers (F3, B3). As in conventional LAMP systems, they accelerate the reactions by providing an additional route to the functional amplification products, which then enable the exponential amplification phase.
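The construction of the generic tail primers can be written out explicitly. The sketch below assumes a target whose sense strand reads 5′-1s-2s-3′, builds the two primers as 5′-us-1s-3′ and 5′-us-2a-3′, and assembles the expected sense-strand amplicon us+1s+2s+ua. The function names are hypothetical, and any sequences passed in are placeholders rather than the sequences used in this work.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def generic_tail_primers(seg_1s, seg_2s, us):
    """Return (sense, anti-sense) primers sharing the same 5' tail `us`."""
    sense = us + seg_1s                          # 5'-us-1s-3'
    antisense = us + reverse_complement(seg_2s)  # 5'-us-2a-3'
    return sense, antisense


def expected_sense_amplicon(seg_1s, seg_2s, us):
    """us+1s+2s+ua written 5'->3', where ua is the reverse complement of us."""
    return us + seg_1s + seg_2s + reverse_complement(us)
```

Because the two ends of the amplicon (us and ua) are complementary, each strand can fold back on itself, and the reverse complement of the amplicon begins with the anti-sense primer, as expected for a product primed from both ends.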


a The functional hairpin structure (us+1s+2s+ua) is formed after the target is added, starting the amplification process. b The secondary structure mechanism of the process described in (a). The primers with the generic tails are 1s+Us and 2a+Us. c Sequencing of the functional hairpin structure (us+1s+2s+ua). (TCTAGA) and (TTCGAA) were inserted into the primers as markers. d Effect of template concentration on amplification. Agarose gel electrophoresis of the products as a function of target concentration. L, ladder; lane 1, 40 M. tuberculosis cells; lane 2, 4 M. tuberculosis cells; lane 3, 0.4 M. tuberculosis cells; lane 4, buffer (45 min reaction). The gel is representative of experiments repeated independently three times with similar results. e, f Application to the detection of miR21 microRNA (sequence 1s2s). e Real-time amplification curves showing concentration-dependent detection for 10× serial dilutions from 20 pM (orange) to 2 fM (light blue). Negative control shown in grey. Inset is the average threshold time (Tt) for different concentrations of miR21 (n = 3 independent experiments, error bars are standard deviation). f Confirmation by gel electrophoresis: lane 1 is 20 pM, lane 2 is blank (1 h reaction). The gel is representative of experiments repeated independently three times with similar results. g Specificity of the detection of miR-7a against other members of the family (concentration: 100 fM): miR-7a (black squares, three replicates); miR-7b (red); miR-7c (grey); miR-7d (blue). Left y-axis indicates the normalised amplification signal of miR-7a, while the right y-axis applies to the amplification signal of other microRNAs. Source data are provided as a Source Data file.
Gel electrophoresis, Fig. 3d, confirmed that no amplicon was generated in the absence of the target, whilst, on addition of the target, the amplification was initiated. DNA sequencing of the amplicon revealed the expected functional motifs, Fig. 3c. This system can be used to amplify sequences from pathogens (as templates) with short, highly conserved sequences between 40 and 60 bp.
Developing assays for shorter nucleic acid sequences (down to ca. 20 nt)17 has been challenging using existing assay mechanisms10, often requiring additional steps to extend the target sequence, as is the case in Exponential Isothermal Amplification (EXPAR18), ligation-LAMP11 or RCA19. To this end, we demonstrate that our generic tail strategy (Fig. 3) can be directly applied to the detection of such small sequences and illustrate this capability with the detection of miR21 microRNA in a concentration-dependent manner, down to fM concentrations (Fig. 3e), confirmed by gel electrophoresis analysis (Fig. 3f). The limit of detection was estimated to lie between 1 and 2 fM: with 8 replicates per concentration, all were detected at 2 fM (54.6 ± 1.2 min), whilst only half (4/8) were detected at 1 fM (56.2 ± 0.6 min) (Supplementary Table S2).
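The bracketing of the limit of detection from replicate hit rates can be reproduced with a few lines. This is a simple sketch using only the two replicate counts quoted above; the function names are hypothetical and no statistical model (for example probit regression) is implied.

```python
def detection_rates(hits):
    """Map each concentration to the fraction of replicates detected."""
    return {conc: detected / total for conc, (detected, total) in hits.items()}


def lod_bracket(hits):
    """Return (highest conc. with incomplete detection, lowest conc. fully detected)."""
    rates = detection_rates(hits)
    full = [conc for conc, rate in rates.items() if rate == 1.0]
    partial = [conc for conc, rate in rates.items() if rate < 1.0]
    return (max(partial) if partial else None, min(full) if full else None)


# Concentration in fM -> (replicates detected, replicates run), from the text.
REPLICATES_FM = {2.0: (8, 8), 1.0: (4, 8)}
```

With these counts, `lod_bracket(REPLICATES_FM)` returns `(1.0, 2.0)`, i.e. the limit of detection lies between 1 and 2 fM.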
Part of the challenge is that microRNAs generally belong to families of closely related sequences with different biological functions20. Our mechanism was able to differentiate miR21 from miR21-A, which differs by only 2 bases (Supplementary Fig. S6), a feat that is usually challenging using standard isothermal assays such as LAMP or RCA21. We also used the wider miR-7 family and demonstrated that the assay was able to distinguish miR-7-a from other members of the same family at low concentrations, as often required for detection in clinical samples (Fig. 3g).
Expanding further on the capabilities of our assay design to detect small miRNA sequences, we also demonstrate that this can be carried out in a multiplexed fashion for different (but closely related) targets (Supplementary Fig. S7). The universal tail strategy can be extended with different recognition sequences, coding for different target molecules (Supplementary Fig. S7b). Existing isothermal systems have been modified to tackle this complex challenge (such as RPA22 or EXPAR23); however, here we show that our strategy, which uses a universal tail, enables this design quickly and efficiently. We also demonstrate that detection can be incorporated into lateral flow strips for easy visual output (as could be required, for example, in resource-limited settings; Supplementary Fig. S7d).
To further illustrate the effectiveness of multiplexing strategies, we also explored the design and implementation of multiplexed reaction graphs to detect different sequences in a single, long, nucleic acid sequence24 (Fig. 4), for example different regions of the same genome, in this case HIV. The design of such assays, particularly for the detection of such RNA viruses, requires overcoming their high frequency of mutations, which often results in limited conserved regions25. We provide a strategy whereby two sequences initiate the same analytical output, so improving the efficacy of the assay, with the amplification and diagnostic output proceeding when either one or both sequences are introduced in the assay (as might be needed for a multiplex assay when two or more variants are present).


a Reaction graph. The primers P1-4 allow cross-amplification of 2 pathways (see Supplementary Fig. S8 for detailed functional motifs). b Agarose gel electrophoresis of the amplified products. Lane L—25 bp ladder; Lane 1—target 1, lane 2—target 2; lane 3—both targets 1 and 2; lane 4—no target. The gel is representative of experiments repeated independently three times with similar results. Supplementary Fig. S8 also shows the results of the sequencing of the amplification. Source data are provided as a Source Data file.
Figure 4a shows such a design of cross primers, which contain three different constructs: (i) the 3′-end (1s in the example of P1) is complementary to the conserved region of the target (1); (ii) a sequence downstream, towards 5′ (2a of P1), complementary to a priming site on the opposite strand of the same target (2s, the priming site for P2); and (iii) an incorporated sequence (3s in P1), which is the same as the priming site of another cross primer (3s of P3) that can amplify another conserved region (2) from the target. During the amplification cycling steps, the products from the cross primers formed the functional motifs. The functional motifs from Region 1 exposed the priming sites common with Region 2, which enabled the primers for Region 2 to bind and be extended by DNA polymerase, and vice versa.
The reaction graph also enabled us to predict the sequences of the amplification product and functional motifs. In the example of HIV, we used conserved regions in the gag and pol genes. Figure 4 demonstrates the ability of the reaction to generate amplification when one, or the other, or both targets are present. The predicted sequences of the amplicons were confirmed with DNA sequencing (Supplementary Fig. S8), although high molecular weight structures similar to those described in Fig. 1c are also observed.
Strategy 2. Progressive model
Our second strategy sought to overcome the increase in complexity in a reaction pathway that arises when linking an additional reaction pathway onto a primary pathway, for example where one product of an amplification pathway serves as an input into a second reaction (resulting in two amplification reaction systems, as in EXPAR, the first one being linear and the second one exponential26). Figure 5 shows the implementation of such a step change in the complexity of design by using the intermediates as the links between different reaction pathways. In these cases, the primer extension generated many products containing functional motifs which were used as targets in the further amplification process. All pathways described here start with the addition of genomic DNA.


a Reaction graph. Multiple primer extension arrows enter the same circuit, mediated by (3a+1s+2s+3s, 3a+1s+2s+3s+4s, and 3a+1s+2s+3s+4s+5s). b Real-time amplification curves of the progressive model with tenfold serially diluted target: 1. 400 M. tuberculosis cells (red square); 2. 40 M. tuberculosis cells (blue circle); 3. 4 M. tuberculosis cells (yellow up triangle); 4. Negative (black down triangle); 1–3 are normalised fluorescence intensity ([0–1], left axis), whilst 4 is raw fluorescence intensity (right axis). The samples were extracted genomic DNA purchased from a reference laboratory and characterised by digital PCR (see Methods). c System kinetics with different numbers of anti-sense primers examined by real-time amplification: 1. 3a+4a+5a (blue dot); 2. 3a+4a (red solid); 3. 3a only (black dash). Threshold time (Tt) is the time corresponding to 20% of the maximum fluorescence intensity in the reaction. Data are the average of three technical replicates and error bars show the standard deviation. d Agarose gel electrophoresis demonstrating the effect of different numbers of anti-sense primers added to the reaction. 1/2. 3a+4a+5a; 3/4. 3a+4a; 5/6. 3a only. 1/3/5 were performed with 40 M. tuberculosis cells; 2/4/6 were negative. The gel is representative of experiments repeated independently three times with similar results. Supplementary Fig. S9 shows the secondary structure mechanism and the sequencing results of the product. Source data are provided as a Source Data file.
By programming this progressive model into the reaction graph abstraction of Fig. 5a, we obtained an isothermal amplification system with a sense primer (3a+1s), whose 3′ end is complementary to the target sequence and whose 5′ end is complementary to a sequence that lies downstream on the same strand of the target, together with anti-sense primers (2a, 3a, 4a and 5a) complementary to the individual sequences of the target, Fig. 5a. In the corresponding molecular implementation, all the primers coexisted in the absence of target molecules. The target catalyses the generation of three different functional hairpin motifs (3a+1s+2s+3s, 3a+1s+2s+3s+4s and 3a+1s+2s+3s+4s+5s) which all contained the same functional hairpin motif (3a+1s+2s+3s).
Such functional hairpin motifs tend to form intramolecular hairpin structures after self-folding and are stabilised by the paired, helical structures that form between 3a and 3s (at the 3′ end). The antisense strand product can therefore be extended after self-folding from the 3a site by adding a 4a or 4a+5a sequence at the loop region. During the primer extension, primers 2a/3a+1s bound to the loop region of sense and anti-sense strand, respectively, formed functional motifs, and led to the opening of the hairpin structure after extension, exposing the priming sites for other primers.
Critically, the formation of intermediate structures (such as 3a+1s+2s+3s+4s and the hairpin motif 3a+1s+2s+3s+4s+5s) drives an exponential amplification process, powered by this progressive model. The hairpin motif (3a+1s+2s+3s+4s+5s) has two functions, namely: (i) with primers 2a/4a and 2a/3a, it catalyses the generation of 3a+1s+2s+3s+4s and 3a+1s+2s+3s in circuit S1 (and similarly in A1 with the respective primers); and (ii) with primers 2a/5a, it produces a new copy of itself in both circuits S1 and A1.
Similarly, the motif 3a+1s+2s+3s+4s leads to the production of 3a+1s+2s+3s with the primers 2a/3a in circuit S2 (or in circuit A2 with primer 3a+1s), and a new copy of itself with the primers 2a/4a and 3a+1s/4s in circuit S2 and A2, respectively. Hence, 3a+1s+2s+3s+4s and 3a+1s+2s+3s+4s+5s form a progressive process capable of inputting another copy of the target motif to parallel amplification processes thus increasing the amplification efficiency, Fig. 5c. The amplification pathways specified in the reaction graph of Fig. 5a are thus translated into the secondary structure-based molecular implementation of Supplementary Fig. S9.
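The way additional anti-sense primers open additional pathways can be made concrete with a deliberately simplified model. In this sketch, which is an interpretation of the text above and not a description of the full mechanism in Supplementary Fig. S9, each motif is identified by the number of its last segment (3 for 3a+1s+2s+3s, 4 for 3a+1s+2s+3s+4s, 5 for 3a+1s+2s+3s+4s+5s), and a motif ending at segment k is assumed to template any motif ending at segment j ≤ k for which the anti-sense primer ja is present. The function names are hypothetical.

```python
def products(motif_end, antisense_primers):
    """Motifs (by last segment number) that a motif ending at `motif_end` can template."""
    return sorted(j for j in antisense_primers if 3 <= j <= motif_end)


def pathways(antisense_primers):
    """All (template, product) pairs enabled by a set of anti-sense primers (3a, 4a, 5a)."""
    return [
        (k, j)
        for k in sorted(antisense_primers)
        for j in products(k, antisense_primers)
    ]
```

Under these assumptions, `pathways({3})` contains a single self-copying pathway, `pathways({3, 4})` contains three, and `pathways({3, 4, 5})` contains six, which matches the qualitative picture of more primers enabling more parallel amplification routes.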
The sensitivity of this progressive model was evaluated in the design of an in vitro diagnostic assay using tenfold serially diluted target of the genomic DNA from M. tuberculosis cells (400 to 4 cells per reaction). Evagreen fluorescent dye was used to monitor the formation of double-stranded structures and thus characterise the kinetics of the reactions, Fig. 5b, showing faster reactions as the target concentration was increased.
The implementation of the different pathways simultaneously was gated by the presence of different anti-sense primers. Using the same real-time approach as in Fig. 5b, we defined a threshold time (Tt) and compared different reaction velocities by using a measure of the time when the fluorescence signal reaches 20% of its maximum (by analogy to the cycle threshold of real-time PCR). Figure 5c confirms that as more anti-sense primer sequences were introduced, different amplification pathways were enabled, leading to exponential reaction kinetics. Systems with 5a+4a and 4a were 13 ± 6% and 10 ± 3% faster, respectively, than when using 3a only, as their presence led to more opportunities to form functional hairpin motifs. We confirmed the formation of specific functional hairpin motifs (3a+1s+2s+3s) and (3a+1s+2s+3s+4s) in the reaction graph (Fig. 5a) by agarose gel electrophoresis (Fig. 5d) and sequencing (Supplementary Fig. S9). Subsequent analysis of the amplification products by sequencing shows that these large amplification products are composed of tandem repeats of the expected functional domains (the full mechanism is presented in Supplementary Fig. S9). This system can be used to amplify the DNA of pathogens with long, highly conserved sequences.
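The threshold time defined above (the time at which the fluorescence reaches 20% of its maximum) can be computed from a real-time trace as in the following sketch. The linear interpolation between sampling points, the absence of baseline subtraction and the definition of "percent faster" relative to a reference Tt are assumptions made for illustration; the function names are hypothetical.

```python
def threshold_time(times, signal, fraction=0.2):
    """Time at which `signal` first reaches `fraction` of its maximum.

    Uses linear interpolation between the two samples bracketing the
    threshold. Returns None if the signal has no positive maximum.
    """
    if len(times) != len(signal) or not times:
        raise ValueError("times and signal must be non-empty and of equal length")
    peak = max(signal)
    if peak <= 0:
        return None
    threshold = fraction * peak
    for i, value in enumerate(signal):
        if value >= threshold:
            if i == 0:
                return times[0]
            t0, t1 = times[i - 1], times[i]
            s0, s1 = signal[i - 1], value
            # s0 < threshold <= s1 here, so the denominator is positive.
            return t0 + (threshold - s0) * (t1 - t0) / (s1 - s0)
    return None


def percent_faster(tt_reference, tt):
    """Relative reduction in threshold time compared with a reference reaction."""
    return 100 * (tt_reference - tt) / tt_reference
```

For a trace sampled at 0, 10, 20 and 30 min with signal 0, 0, 50 and 100, the threshold is 20 and the interpolated Tt is 14 min.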
We further explored the applicability of our two de novo designs (Figs. 3a, 5a) by investigating their translation into medical diagnostics, processing 54 clinical samples from Hepatitis B virus patients, with the results shown in Table 1 (individual values are presented in Supplementary Table S3). Our methods showed high concordance (96.3%) with the real-time PCR gold standard. Only two samples were missed by these systems, with copy numbers of 6.35 and 14 respectively, below the PCR clinical sensitivity threshold. As with many isothermal amplification methods27, our mechanisms have shown kinetics that are concentration-dependent in well-controlled systems (e.g. artificial sequences). However, clinical samples contain large amounts of inhibitors and potentially more complex sequences, which further limit the applicability of the techniques as quantitative methods, for example for viral load measurements in HBV patients. In these applications qPCR should be used as the gold standard method.
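The agreement figure quoted above follows directly from the sample counts, as the short check below shows. It assumes that the two missed samples are the only results discordant with real-time PCR among the 54 samples; the function name is hypothetical.

```python
def percent_agreement(n_samples, n_discordant):
    """Percentage of samples on which two methods give the same call."""
    if n_samples <= 0 or not 0 <= n_discordant <= n_samples:
        raise ValueError("invalid sample counts")
    return 100 * (n_samples - n_discordant) / n_samples


# 54 clinical samples, 2 missed relative to real-time PCR (from the text).
AGREEMENT = round(percent_agreement(54, 2), 1)
```

Here `AGREEMENT` evaluates to 96.3, i.e. 52 of 54 samples agree with the reference method.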

