A high-throughput assay to globally identify PUF variants with new specificities
To quantify the interaction between a PUF domain and its RNA target, we applied the yeast three-hybrid assay37 (Fig. 1B). In this assay, the interaction of the PUF domain with an RNA leads to the activation of the yeast reporter gene HIS3, such that the cells survive in selection media without histidine and containing 3-amino-1,2,4-triazole (3-AT), a competitive inhibitor of His3. In order to facilitate the identification by next-generation sequencing of both a protein variant and its RNA target, we encoded both components on a single plasmid. This “all-in-one” plasmid includes both a protein module encoding the human Pumilio-1 PUF domain fused to the Gal4 activation domain (PUF-AD) and an RNA module encoding the Pumilio-1 RNA recognition sequence, the nanos response element (NRE)38, fused to the MS2 coat protein binding sequence (Fig. S1A). We chose the NRE element UGUAAAUA as the starting recognition sequence rather than a sequence with U or C at position 5 (ref. 26), as UGUAAAUA is the most common Pumilio-1 binding motif based on a comparative analysis of mRNA targets for the human PUF family proteins39, a structure exists for human Pumilio-1 bound to this sequence30, and A was one of three preferred bases at that position when Pumilio-1 binding was tested in the yeast three-hybrid assay (Fig. S2).
The use of an “all-in-one” plasmid also simplified plasmid recovery from yeast containing a library of PUF domain variants. We performed six tests of the plasmid with combinations of RNA and PUF domain in the yeast strain YBZ-1, which constitutively expresses the LexA–MS2 coat protein fusion and carries the HIS3 reporter gene under the control of multiple LexA operators. We found that the all-in-one plasmid performed similarly to the two plasmids of the original yeast three-hybrid system (Fig. S1B; 1 vs. 2), and that high copy and low copy versions of the plasmid also worked similarly (Fig. S1B; 2 vs. 3). In the last three tests, we swapped the TRM combination in two repeats, or we swapped two RNA bases, or both. Either swap should eliminate RNA–PUF domain binding, which was what we observed (Fig. S1B; 4, 5, and 6).
We tested the PUF domain against RNA targets that sequentially contained each of the four RNA bases, with the other seven positions being the wild-type base. The base-specific binding pattern for each PUF repeat could be recapitulated in this system (Fig. S2). Only base 5, an adenine in the Pumilio-1 target sequence, showed a broader specificity, generating a three-hybrid signal when either adenine, cytosine or uracil was present. This broader specificity has been observed previously39,40,41. Overall, these results confirmed that the yeast three-hybrid assay can be used to analyze a PUF domain binding to its RNA target.
To elucidate the RNA-binding preferences of a large number of PUF variants in a single culture, we combined the yeast three-hybrid system with next-generation sequencing. For each of the eight repeats of the PUF domain, we generated a library of all possible TRM domains. Each library was encoded on a plasmid that also carried the Pumilio-1 target RNA sequence with any of the four possible RNA bases present at the cognate position of the 8-base binding site. Thus, each of eight separate three-hybrid selections tested a single TRM library of the PUF domain against a target RNA sequence with a single base varied. To identify protein-RNA interactions by single short reads of Illumina sequencing, we designed the TRM libraries to carry synonymous changes in codons adjacent to the randomized TRM codons. For each TRM library we used four sets of synonymous changes, which informed the identity of the cognate RNA base that was varied (see “Methods”); the synonymous changes were likely to have a negligible effect on protein function.
The PUF domain variants were designed to contain the 8000 possible combinations of amino acid substitutions at residues 12, 13 and 16 through NNK libraries at each position (N = A/C/G/T and K = G/T, Fig. 1C). We selected for the ability of each repeat to interact with RNA by carrying out the histidine selection on plates with SC-Leu-His + 0.5 mM 3-AT media, a 3-AT concentration chosen based on the pilot selection (Fig. S1B). We retrieved the library from both input and post-selection pools, and determined the frequency of each variant in both pools by high-throughput sequencing. The log2 change in the frequency from input to selection pool serves as a measure of binding activity for each PUF variant, designated as a PUF domain–RNA interaction score in this assay.
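Two quantities in this paragraph can be checked with a short sketch: the coding capacity of an NNK codon and the log2 frequency change used as the interaction score. The code below is illustrative only; the read counts are hypothetical, and the `pseudocount` parameter is an assumption added to avoid division by zero, not something stated in the text.

```python
import math
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
# (bases cycle through T, C, A, G at each position, third position fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def nnk_codons():
    """All codons matching NNK (N = A/C/G/T, K = G/T)."""
    return [a + b + c for a, b, c in product("ACGT", "ACGT", "GT")]


def interaction_score(input_count, input_total, selected_count, selected_total,
                      pseudocount=0.5):
    """log2 change in variant frequency from input pool to post-selection pool.

    The pseudocount is an assumption of this sketch, not taken from the paper.
    """
    f_in = (input_count + pseudocount) / input_total
    f_sel = (selected_count + pseudocount) / selected_total
    return math.log2(f_sel / f_in)


codons = nnk_codons()
encoded = {CODON_TABLE[c] for c in codons}
stops = [c for c in codons if CODON_TABLE[c] == "*"]
print(len(codons), len(encoded - {"*"}), stops)  # 32 codons, 20 amino acids, ['TAG']
print(20 ** 3)  # 8000 amino acid combinations over three randomized positions

# Hypothetical counts: a variant rising from 100 to 3200 reads per million.
print(interaction_score(100, 1_000_000, 3200, 1_000_000, pseudocount=0))
```

With these hypothetical counts the variant is enriched 32-fold, which corresponds to a score of about 5, the interaction threshold used later in the text.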
Based on enrichments in the post-selection pool, we scored the RNA-binding activities of 169,587 PUF domain variants. This dataset contains 24,751 nonsense variants and 144,836 missense variants from the eight repeats. The interaction score distribution of all variants revealed that, in general, nonsense PUF variants were deleterious for interaction with any RNA sequence, and missense PUF variants showed a bimodal distribution (Fig. 1D). Some nonsense variants had scores that indicated they were enriched, which may result from experimental noise, as routinely seen in other deep mutational scanning experiments42; the nonsense variants with these enrichment scores had significantly lower input reads than other nonsense variants. The use of these scores allowed us to calculate a false positive rate for loss-of-function missense mutations. We found that 1.4% (45/3193) of nonsense variants had an enrichment score >0, providing an estimate of the fraction of loss-of-function missense variants that would be falsely scored as enriched.
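The false positive estimate is a simple proportion: nonsense variants are treated as known non-binders, so any of them scoring above zero counts as a false positive. A minimal sketch follows; the function name and the example score list are hypothetical, and only the 45/3193 figure comes from the text.

```python
def false_positive_rate(nonsense_scores, threshold=0.0):
    """Fraction of nonsense-variant scores that exceed the threshold."""
    scores = list(nonsense_scores)
    if not scores:
        raise ValueError("need at least one nonsense variant")
    return sum(s > threshold for s in scores) / len(scores)


# Counts reported in the text: 45 of 3193 nonsense variants scored above 0.
print(f"{45 / 3193:.1%}")  # 1.4%

# Hypothetical scores for four nonsense variants, one of which is above 0.
print(false_positive_rate([-6.1, -5.4, 0.3, -7.0]))  # 0.25
```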
The PUF domain–RNA interaction score for each PUF variant was highly correlated between two experimental replicates (Fig. S3; Pearson correlation coefficients ranged from R = 0.952 to 0.982). We assigned a specificity score for each PUF variant as the difference between its highest and second-highest interaction score. Using thresholds of interaction score >5 and specificity score >4, we identified many PUF variants with highly specific interactions (Fig. S4; the number of enriched PUF variants ranged from 5 to 79 across the eight repeats). For example, in repeat 1 (Fig. 1E), we found nine PUF variants specific for uracil (e.g., NWS, NFS), 11 PUF variants specific for cytosine (e.g., TFR, QFR), one PUF variant specific for guanine (SGD), and one PUF variant specific for adenine (IFV). For uracil recognition in repeat 1, we found that asparagine was the most preferred amino acid in position 12, and a polar uncharged amino acid such as glutamine, serine or threonine was the most preferred in position 16. Position 13 was enriched in the aromatic amino acids tryptophan and phenylalanine. While this pattern differs slightly from the optimal TRMs for uracil recognition (NXQ with X denoting T/H/F/Y)43, it recapitulates the general trend. Similarly, for cytosine recognition in repeat 1, arginine was the most preferred amino acid in position 16 and a polar uncharged amino acid (e.g., threonine, glutamine, asparagine) was preferred in position 12 (refs. 33, 34).
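The specificity score and the two thresholds can be written out as a short sketch. It assumes that a variant has an interaction score for every base it was tested against; the function names and the example scores are hypothetical.

```python
def specificity(scores_by_base):
    """Return (best_base, best_score, specificity_score) for one variant.

    scores_by_base maps an RNA base to the variant's interaction score.
    The specificity score is the highest score minus the second highest.
    """
    if len(scores_by_base) < 2:
        raise ValueError("need scores for at least two bases")
    ranked = sorted(scores_by_base.items(), key=lambda kv: kv[1], reverse=True)
    (best_base, best), (_, second) = ranked[0], ranked[1]
    return best_base, best, best - second


def is_highly_specific(scores_by_base, min_interaction=5.0, min_specificity=4.0):
    """Apply the thresholds from the text: interaction >5 and specificity >4."""
    _, best, spec = specificity(scores_by_base)
    return best > min_interaction and spec > min_specificity


# Hypothetical scores for two variants of one repeat.
u_specific = {"A": -2.0, "C": 0.5, "G": -3.0, "U": 6.5}
promiscuous = {"A": 5.5, "C": 5.0, "G": -1.0, "U": 4.0}
print(specificity(u_specific), is_highly_specific(u_specific))
print(specificity(promiscuous), is_highly_specific(promiscuous))
```

The second variant binds well but fails the specificity threshold, which is the situation described later for combinations that bind more than one base.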
Targeted screening of candidate PUF variants
Due to the large size of the libraries of randomized PUF variants, for many TRM variants, the initial yeast three-hybrid screen did not comprehensively recover a binding activity score against all four RNA bases and across all eight repeats. We thus conducted a targeted three-hybrid screen of promising candidate PUF variants. Using a threshold of interaction score >5, as well as specificity score >4, we chose for targeted oligonucleotide synthesis about 250 candidate PUF variants (along with negative controls of nonsense and missense variants) for each repeat (Supplementary Data 1; the number of variants ranged between 181 and 299) (Fig. 2A). We cloned each oligonucleotide pool into one of the eight PUF repeats to comprehensively survey the interaction of the candidate variants. For this experiment, we carried out 32 separate three-hybrid selections, consisting of the ~250 variants of a PUF repeat against one of four RNAs with a single cognate base varied. For each selection, to compare the binding of the wild-type Pumilio-1 domain across the four RNA bases, we spiked in the wild-type domain for normalization. We again collected plasmids from both input and post-selection pools and measured the change in frequency of each PUF variant by high-throughput sequencing.


A Workflow for the targeted screening experiment. The left panel shows examples of oligos designed for synthesis; the middle panel shows one example incorporated into the library repeat 1; the right panel shows a schematic of incorporation at all repeat locations. B A bar plot showing the fraction of targeted PUF variants recovered in each repeat. Purple indicates the number of synthesized PUF variants and green indicates the number of recovered PUF variants in each repeat. C A density plot showing the frequency of PUF variants in input and selected pool. Yellow indicates input library and blue indicates post-selection library. D The frequency distribution of nonsense and missense PUF variants for all repeat locations. The X-axis is a measure of PUF-RNA interaction score. Black indicates nonsense variants and red indicates missense variants.
For each repeat, we recovered between 64% and 95% of the synthesized PUF variants in the input pool (Fig. 2B), with each variant having a frequency centered on 0.1% (Fig. 2C). PUF variants were assigned an interaction score based on their enrichment in the post-selection pool. The distribution of interaction scores for all nonsense variants indicates that they were mostly deleterious, with an interaction score < −5. Consistent with the initial screen, targeted missense variants were enriched in the post-selection pool (Fig. 2D).
Based on the interaction scores against the four RNA bases, for each repeat, we clustered promising PUF variants with interaction scores >0. Of these, we identified variants with highly base-specific interactions for each of the eight repeats, and generated sequence logos for those PUF variants that had specificity scores >4 (Fig. 3). For clusters with more than ten variants, we subclustered the variants based on the properties of the amino acids across the three positions (e.g., positively or negatively charged or neutral) and generated separate sequence logos for each subcluster to summarize the base-specific recognition patterns. The top TRM combinations for each PUF repeat and each base in the cognate position are shown in Fig. 4. Comparing the specificity of each PUF variant across the eight repeats, we found that many base-specific recognition codes are not generic for all repeat locations, as previously reported34.


The color intensity represents the relative interaction score normalized by the maximal value for each row. Yellow indicates a high interaction score and blue indicates a low interaction score. Sequence logos summarizing the base-specific recognition patterns are shown next to the heatmap for each repeat.


These TRM combinations follow one of three criteria: (1) used in the wild-type Pumilio-1 PUF domain, indicated by * in the figure; (2) highly specific in both the random screen and targeted screen; or (3) best represent the pattern of the sequence logo in Fig. 3. The rows indicate each repeat position of the Pumilio-1 PUF domain, and the columns indicate the four RNA bases at the cognate positions. Red, TRM combinations used in the PUF domain designs; —no highly base-specific TRM combination detected.
For G-specific binding, SNE is the natural code in repeat 7 (ref. 43). We found that this code specified guanine in repeat 3 as well (the interaction score for the other three bases was <40% of the score for G). If, instead, tryptophan was present in position 13 (SWD), G-specific recognition could be achieved in repeats 6 and 7. Moreover, if position 13 was glycine (SGD), G-specific recognition could be achieved in repeat 1, where SNE and SWD were not found (Fig. 5). These results suggest that the combination of serine and a negatively charged amino acid (aspartate or glutamate) in positions 12 and 16, respectively, was a trend for G-specific binding across the majority of the repeats, with an aromatic amino acid (W/Y/F) in position 13 also affecting recognition specificity. In addition, in some repeats such as repeat 3 and repeat 6, a combination of threonine in position 12 and a negatively charged amino acid in position 16 (e.g., THE, Fig. 5) achieved guanine-specific binding.


The four TRMs shown for each base are representative of TRMs with different behavior in different repeats. The heatmap shows the normalized interaction score for each base. The color intensity represents the relative interaction score normalized by the maximal value for each row. Yellow, high interaction score; dark blue, low interaction score; white, missing data. The box above each heatmap panel indicates the base that the TRM combinations prefer.
For A-specific binding, the natural base-specific combinations (C/S)RQ (ref. 28) were recapitulated as SRQ across many repeats (Fig. 5). However, in repeats 1, 2, 6, and 8, the combination of a valine or cysteine in position 12 and phenylalanine in position 13 (CFP, VFQ) was an alternative way to achieve A-specific binding (Fig. 5). While NHQ is a natural TRM combination that specifies uracil, replacing asparagine with proline in position 12 (PHQ) resulted in adenine specificity for repeats 3, 5, and 7 (Fig. 5). These results further indicate that base-specific combinations other than canonical codes can be identified.
For U-specific binding, the natural TRM combination NYQ was found to be highly specific in repeats 1, 3, and 7 (Fig. 5). Asparagine was preferred in position 12 (NHQ, NWP) across the majority of repeats (Fig. 5). For the middle repeats, a positively charged amino acid was preferred over a polar residue in position 12 or 16 (e.g., RAN; Fig. 5). For position 13, aromatic amino acids such as phenylalanine or tyrosine were preferred. Even for canonical base-specific combinations, each repeat had its own preferences. For example, repeat 1 preferred NHQ rather than NWP, while the opposite was the case for repeat 2 (Fig. 5).
For C-specific binding, a polar, uncharged amino acid (e.g., glutamine or threonine) in position 12 and a positively charged amino acid (e.g., arginine) in position 16 (TFR, QFR, QWR) were the preferred combinations (Fig. 5). However, this preference was not uniform across the eight repeats. High specificity for cytosine was found only in the more N-terminal repeats, such as repeat 1 or 3, and was markedly reduced in more C-terminal repeats (Fig. 5). The same pattern was seen for the previously identified C-specific codes (e.g., SYR)33,34 as well, which potentially explains why C-specific combinations identified from repeat 6 in human Pumilio-1 did not confer this recognition to repeat 7 of C. elegans FBF-2 (refs. 34, 43).
Many combinations showed non-specific binding. For example, repeat 1 combinations with positively charged amino acids in position 12 and 16, repeat 3 combinations with negatively charged amino acids in position 12, and repeat 6 combinations with an aromatic in position 13 and an arginine in position 16 bound to more than a single base, with some of these combinations previously characterized as capable of non-specific RNA recognition27.
Binding of designed PUF domains against target RNA sequences
Given this new set of TRM combinations specific for each of the eight Pumilio-1 PUF repeats, we sought to determine their utility to bind in combination to RNA sequences possessing multiple changes to the UGUAAAUA wild-type binding site. Toward this end, we generated 16 8-base target RNA sequences that differ from the wild-type site by either one base (one target), two bases (two targets), three bases (eight targets), four bases (three targets), five bases (one target) or six bases (one target) (Fig. 6A). RNA targets with successively larger numbers of changes were generally devised to include changes put into the less heavily substituted targets. We tested the effects of RNA changes that require recognition by N-terminal and C-terminal repeats; single and consecutive (up to four base) changes; and changes that resulted in substitutions to A, C, G, and U.


A Target RNA designs, showing the substituted bases and designed TRM combinations in gray. B Sixteen three-hybrid selections were carried out in duplicate and interaction scores for each RNA in both replicates were plotted. The upper panel of each experiment indicates the locations and substitutions made in the PUF domain and target RNA. The lower panel is a correlation plot between the two replicates, with each dot an RNA sequence. Green indicates the RNA sequence recognized by the wild-type PUF domain; red indicates the target RNA sequence; gray indicates the top two enriched sequences based on the averaged interaction score from the two replicates, if these interaction scores are higher than the target RNA sequence. The X and Y axes indicate two replicates of the experiment. The sequence UCCGACUA was highly enriched in some of the replicates and was removed, and then the interaction scores of the remaining RNA sequences were plotted, which resulted in the different scales of the X and Y axes.
To assay binding to these targets, we generated an array of 1900 RNA elements (Supplementary Data 2). The array contained 20 copies of the wild-type sequence and 55 copies of each of the 16 target sequences (totalling 900), with the remaining 1000 sequences containing different percentages of the four bases in each location of the 8-base RNA sequences. For each base, we calculated these percentages to ensure that the oligo pool included related off-target sequences (see “Methods”). To synthesize the PUF domains, we chose TRM combinations identified as specific to each repeat, based on the results from both screens. For example, we chose TRM combinations SWD in repeat 6 and THE in repeat 3, as they contain the canonical guanine-specific recognition pattern of polar, uncharged amino acids in position 12 and a negatively charged amino acid in position 16, but they differ from the exact combinations in natural proteins. We chose other TRM combinations that were highly specific, but differed from canonical recognition patterns. For example, VFQ was the best combination for A-specific recognition in repeats 1, 2, 4, and 6; here we chose it for repeat 2. For cytosine recognition, we chose the most specific combinations QFR in repeat 1 and TFR in repeat 4.
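The composition of the array can be tallied directly, and the idea of position-wise base percentages can be sketched as weighted sampling. The tally uses the numbers in the text; the sampling function, its 70% retention value, and the uniform split among the other three bases are hypothetical, since the actual percentages are given only in the Methods.

```python
import random

WILD_TYPE_COPIES = 20
N_TARGETS = 16
COPIES_PER_TARGET = 55
N_VARIABLE = 1000

fixed_elements = WILD_TYPE_COPIES + N_TARGETS * COPIES_PER_TARGET
total_elements = fixed_elements + N_VARIABLE
print(fixed_elements, total_elements)  # 900 1900


def doped_sequence(reference, keep_fraction, rng):
    """Draw one sequence, keeping each reference base with probability
    keep_fraction and otherwise picking one of the other three bases uniformly.

    This weighting scheme is an illustrative assumption, not the paper's design.
    """
    out = []
    for ref_base in reference:
        others = [b for b in "ACGU" if b != ref_base]
        weights = [keep_fraction] + [(1 - keep_fraction) / 3] * 3
        out.append(rng.choices([ref_base] + others, weights=weights, k=1)[0])
    return "".join(out)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
print([doped_sequence("UGUAAAUA", 0.7, rng) for _ in range(5)])
```

Sampling around a reference in this way yields mostly sequences that differ from it at a few positions, which is one way an oligo pool could be biased toward related off-target sequences.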
We carried out 16 three-hybrid selections in duplicate corresponding to the 16 designed PUF domains, and calculated interaction scores for each RNA found in the input and selection pools. The sequence UCCGACUA was highly enriched in many of the selections independent of the TRM combinations that were substituted, suggesting that it resulted in reporter gene activation not due to a three-hybrid interaction. We thus removed this sequence and plotted the interaction scores of the remaining RNA sequences (Fig. 6B), with specific points labeled to show the target RNA, the wild-type RNA, RNAs differing from the target by one or two bases, and other RNAs that scored high for interaction.
For the three double PUF variants (including design 1, which has two TRM combinations changed but only one base changed in the target RNA), the target sequence was the most enriched RNA sequence (Fig. 6B, designs 1–3). For example, when repeat 3 was replaced with THE and repeat 4 with TFR, the target UGUACGUA was the most enriched RNA in both replicates (Fig. 6B, design 2). For the triple PUF variants (Fig. 6B, designs 4–11), the target sequences were enriched in half (four of eight) of the designs. For example, when repeats 1, 2, and 7 were replaced with QFR, VFQ, and NPG, respectively, the most enriched RNA sequence in both replicates was the target sequence UUUAAAAC (Fig. 6B, design 8). However, when repeats 1, 6 and 7 were replaced in a triple variant with QFR, SWD, and NPG, respectively, the target sequence UUGAAAUC had a low interaction score (Fig. 6B, design 11). For the quadruple PUF variants, none of them identified their target sequence as highly enriched (Fig. 6B, designs 12–14). Similarly, for the pentuple and sextuple variants, their targets were not among the top enriched RNA sequences (Fig. 6B, designs 15–16). However, in some cases, these highly mutated targets had higher interaction scores compared to the wild-type or many of the other RNA sequences (for example, Fig. 6B, designs 15–16).
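The relationship between substituted repeats and target sequences in these examples can be reproduced with a small sketch. It assumes that repeat r reads base 9 − r of the 8-base site (repeat 1 pairs with base 8, repeat 8 with base 1), which is consistent with the designs quoted above; the function names are hypothetical.

```python
WILD_TYPE_RNA = "UGUAAAUA"


def base_index_for_repeat(repeat):
    """0-based index of the RNA base read by a repeat (assumes repeat r <-> base 9 - r)."""
    if not 1 <= repeat <= 8:
        raise ValueError("repeat must be between 1 and 8")
    return 8 - repeat


def design_target(substitutions, wild_type=WILD_TYPE_RNA):
    """Build the target RNA for a design given {repeat: new_base}."""
    rna = list(wild_type)
    for repeat, base in substitutions.items():
        rna[base_index_for_repeat(repeat)] = base
    return "".join(rna)


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


# Base preferences of the TRM combinations named in the text.
designs = {
    2: {3: "G", 4: "C"},           # THE in repeat 3, TFR in repeat 4
    8: {1: "C", 2: "A", 7: "U"},   # QFR, VFQ, NPG
    11: {1: "C", 6: "G", 7: "U"},  # QFR, SWD, NPG
}
for number, subs in designs.items():
    target = design_target(subs)
    print(number, target, hamming(target, WILD_TYPE_RNA))
```

Under this mapping the three designs give UGUACGUA, UUUAAAAC, and UUGAAAUC, matching the target sequences stated in the text.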
In some cases, highly enriched RNA sequences that were not the targets matched part of the target sequence in a pattern that suggests the substituted repeats of the designed PUF domain were binding as designed. For example, in a triple variant with repeats 3, 4, and 7 replaced (Fig. 6B, design 7), the most enriched sequence matches six of eight bases of the target, including the three bases that were changed. For a quadruple variant with the TRM combinations substituted in repeats 1–4 (Fig. 6B, design 12), one of the most enriched RNA sequences was UUGACGAC. This sequence includes the four bases CGAC, which are the target sequence for the combination of four substitutions in repeats 1 to 4. However, the 5′-most four bases match only two of four bases of the target. Thus, this PUF design recognized all the substitutions in the RNA but no longer bound to all the remaining wild-type bases. For the sextuple variant (in repeats 1, 2, 3, 4, 6, and 7; Fig. 6B, design 16), the most enriched RNA sequence (UAGACGAA) includes five consecutive bases, GACGA, that match repeats 2, 3, 4, 5, and 6, corresponding to four of the substitutions in the RNA.
To determine whether flanking RNA bases beyond the 8-mer core provoked a register shift along a repeat that influenced the binding of the designed PUF domains, we compared the enrichment score of the target RNA to RNAs containing possible mismatched bases (Fig. S5). For example, for any design, if a 1-base 5′ shift occurred in recognition, then the enrichment score of the designed 8-mer target would be similar to the three 8-mers that have the same seven 5′ bases and a different final base than the target in position 8; if a 2-base shift occurred, then the enrichment score of the designed 8-mer target would be similar to the nine 8-mers that have the same six 5′ bases and different bases than the target in positions 7 and 8. Similar considerations would apply at the other end of the 8-mer if 3′ shifts occurred. We plotted the enrichment scores of these alternative 8-mers and found no evidence that shifting occurred for designs that bound to their target sequences; shifting may have occurred for some designs, such as designs 13, 15, and 16, that did not bind to their target sequences (Fig. S5).
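The comparison sets for the 5′ shift case can be enumerated with a short sketch. It assumes that the count of nine 8-mers corresponds to sequences differing from the target at both of the last two positions (3 × 3 choices); the function name is hypothetical and the example target is the design 2 sequence.

```python
from itertools import product

RNA_BASES = "ACGU"


def shift_neighbors(target, shift):
    """Sequences that share the first len(target) - shift bases with the target
    and differ from it at every one of the last `shift` positions."""
    if not 1 <= shift < len(target):
        raise ValueError("shift must be between 1 and len(target) - 1")
    prefix, tail = target[:-shift], target[-shift:]
    options = [[b for b in RNA_BASES if b != t] for t in tail]
    return [prefix + "".join(combo) for combo in product(*options)]


target = "UGUACGUA"
one_base = shift_neighbors(target, 1)
two_base = shift_neighbors(target, 2)
print(len(one_base), one_base)
print(len(two_base))
```

If recognition had shifted, the target's enrichment score would be expected to resemble the scores of these neighbors rather than stand out from them.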
Results from these targeted RNAs and cognate substitutions in the PUF domain designs suggest two features that may hold more generally. First, N-terminal repeats (1–4) appeared to tolerate combinatorial variation better than C-terminal repeats (5–8). For example, target RNA sequences were highly enriched in several triple variants with two N-terminal substitutions, whereas they were not in triple variants with two C-terminal substitutions (Fig. 6B, designs 4–11). This finding is consistent with reports that UGUA, the cognate sequence for the four C-terminal PUF repeats, is a conserved binding motif for PUF domains from different species44,45. Conservation of the UGUA sequence may limit the ability of the C-terminal repeats to recognize alternative bases.
Second, these data suggest that N-terminal repeats engage in crosstalk with C-terminal repeats. For double PUF variants with substitutions only in N-terminal repeats (e.g., 3 and 4; Fig. 6B, design 2) or only in C-terminal repeats (e.g., 6 and 7; Fig. 6B, design 3), the most enriched sequences were the target RNA sequences. However, the addition of another substitution on the other side of the PUF domain resulted in triple variants that did not bind to their target RNA sequences (e.g., substitution in repeat 6 added to substitutions in 3 and 4 (Fig. 6B, design 6), or substitution in repeat 1 added to those in repeats 6 and 7 (Fig. 6B, design 11)). Similarly, for quadruple variants, consecutive substitutions at a single terminus (e.g., repeats 1, 2, 3, and 4; Fig. 6B, design 12) functioned better than separate pairs of substitutions at both termini (e.g., repeats 1, 2, 6, and 7; Fig. 6B, design 14). Mutations present at both termini may inhibit folding of the PUF domain.
In vitro binding of purified variant PUF domains
We sought to use electrophoretic mobility shift assays with purified PUF variants to quantify their binding properties and compare the results to the yeast three-hybrid results. HIS3 activity in the yeast assay correlates with biochemically measured protein-RNA affinity, but relatively small changes in Kd can cause substantial differences in 3-AT resistance46. We purified GST-fusions of PUF domains from E. coli using glutathione chromatography and removed the GST domain by protease cleavage. Initially, we examined the binding of our starting constructs, the wild-type Pumilio-1 PUF domain binding to the wild-type nanos response element UGUAAAUA (Fig. 7A, upper panel). The Kd estimated for this pair, based on half-maximal binding, was ~80 nM (95% CI 54–125 nM). This value is considerably higher than other estimates of Pumilio-1 binding, of around 1 nM (ref. 41), and may reflect loss of activity during purification. Nonetheless, the assay using the wild-type protein represents a baseline for comparison to PUF domains with variant TRM combinations. The wild-type domain showed no binding to either of two RNAs with changes to two bases.
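Estimating Kd from half-maximal binding can be illustrated with a minimal sketch. It is not the fitting procedure used for the reported values, which is not described in this section. The sketch assumes a single-site binding model with protein in excess over RNA, a hypothetical twofold dilution series, and simulated rather than measured fraction-bound values.

```python
import math


def fraction_bound(protein_nM, kd_nM):
    """Single-site binding model: fraction of RNA bound at a given protein concentration."""
    return protein_nM / (protein_nM + kd_nM)


def half_maximal_concentration(concs, fractions):
    """Interpolate on a log-concentration axis to find where binding crosses 50%."""
    points = sorted(zip(concs, fractions))
    for (c_lo, f_lo), (c_hi, f_hi) in zip(points, points[1:]):
        if f_lo <= 0.5 <= f_hi and f_hi > f_lo:
            t = (0.5 - f_lo) / (f_hi - f_lo)
            return math.exp(math.log(c_lo) + t * (math.log(c_hi) - math.log(c_lo)))
    raise ValueError("binding does not cross 50% in the tested range")


# Hypothetical twofold dilution series from 1000 nM down to about 15.6 nM.
concs = [1000 / 2 ** i for i in range(7)]
# Simulated fractions for a Kd of 80 nM (not measured data).
simulated = [fraction_bound(c, 80.0) for c in concs]
estimate = half_maximal_concentration(concs, simulated)
print(f"estimated Kd ~ {estimate:.0f} nM")
```

Because the estimate is interpolated between two neighboring concentrations, its precision is limited by the spacing of the dilution series, which is one reason gel-shift Kd values are usually reported with wide confidence intervals.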


A Electrophoretic mobility shift assays show the in vitro binding for the wild-type PUF domain, the QFR/VFQ PUF variant, and the design 2 variant to the wild-type RNA sequence or an RNA containing mutated bases. Results are representative of two biological replicates. B Yeast three-hybrid assays show the binding of the wild-type PUF domain and QFR/VFQ variant. The negative control is a wild-type PUF domain that has stop codons in the TRM locations (repeat 1) paired with wild-type RNA. SC-L, synthetic complete media minus leucine; SC-L-H, synthetic complete media minus leucine and histidine. C Spot dilution plate assay indicates the binding of the design 1 and design 2 PUF domains to their target RNAs. The negative control is as in (B). The starting OD600 that was spotted was 0.05, with three sequential 10-fold serial dilutions shown.
We generated and purified a PUF variant (designated QFR/VFQ) with QFR (specific to C) in repeat 1 and VFQ (specific to A) in repeat 2. QFR is a newly identified TRM code for binding to C, with the QFR-C pair highly enriched and specific in our screens with TRM mutations in repeats 1 and 3; similarly, the VFQ-A pair was enriched and specific for repeat 2 (Figs. 3, 4). We cloned the QFR/VFQ variant and tested its binding via a yeast three-hybrid spotting assay, finding that it bound to its target in this assay similarly to the wild-type PUF domain binding to the wild-type RNA sequence (Fig. 7B). In the electrophoretic mobility shift assay, the QFR/VFQ variant bound to its target RNA beginning at the 15.6 nM concentration, with an estimated Kd of ~110 nM (95% CI 70–177 nM). While this binding is somewhat weaker compared to the value we determined for the pair of the wild-type PUF domain with the wild-type RNA sequence, it was completely sequence specific: the wild-type PUF domain did not bind to the variant-specific target RNA, and the QFR/VFQ PUF variant did not bind to the wild-type RNA (Fig. 7A, middle panel). These results validate that the TRM combinations identified in the three-hybrid assay reflect changes in binding specificity that can be detected biochemically in gel shift experiments.
We purified the design 1 variant (with changes to two TRM combinations but only one change in RNA base recognition) and the design 2 variant (with two TRM combinations changed and two RNA bases changed in the target) in order to test their in vitro binding in the electrophoretic mobility shift assay. Design 2 showed sequence specificity, binding to its target RNA sequence but not to the wild-type RNA sequence. Specific binding occurred with an approximate Kd for this pair of ~45 nM (95% CI 33–62 nM). Heterogeneity of the shifted bands may indicate dissociation during electrophoresis or non-specific interactions. Design 1 did not bind to either its target or the wild-type RNA sequence in the gel shift assay. Both the design 1 and design 2 PUF variant proteins showed binding in a yeast three-hybrid plate assay, although the results of the dilutions indicated that the design 1 variant had about a 10-fold weaker signal than the design 2 variant (Fig. 7C). For design 1, the yeast three-hybrid assay may be more sensitive than the in vitro binding assay in detecting a protein-RNA interaction.
We attempted to purify PUF variants with three or more mutated repeats (designs 8, 15, and 16), but these proteins, containing nine or more altered residues, were insoluble. Others have also observed that variant PUF proteins have been difficult to obtain in soluble form from E. coli30,47,48. For example, Cheong and Hall30 were unable to produce the soluble protein of human Pumilio-1 when they mutated the residue in repeat 7 or 3 that forms a stacking interaction with the base. Though future studies would be needed to examine the correlation between the three-hybrid selections and affinity, the values we determined in the biochemical assays generally corroborated behavior in the yeast selections.

