Cloning, expression, and purification of MS2 VLPs
The MS2 coat protein (CP) gene was designed as a tandem dimer, with the second half of the dimer bearing a KpnI restriction site that allows foreign insertions, as reported previously39,40. A synthetic gene coding for the MS2 CP dimer with a SpyTag insertion (Supplementary Methods), cloned into pET28a, was purchased from BioCat GmbH (Germany). Variants with longer linkers flanking the SpyTag insertion and the random peptide insertion were generated by PCR using appropriate primers (Supplementary Table 1) and then subcloned into pET28a harboring the MS2 CP dimer gene, using the KpnI restriction site.
For expression, E. coli BL21(DE3) cells were transformed with the appropriate plasmids and grown with shaking at 37 °C until OD600 = 0.6, induced with 1 M IPTG, and then shaken for a further 16 h at 18 °C. Cells were harvested by centrifugation, resuspended in 50 mM Tris-HCl, pH 7.9, 50 mM NaCl, 5 mM MgCl2, 5 mM CaCl2, and lysed by sonication at 4 °C in the presence of protease inhibitors (Thermo Scientific). Lysates were clarified by centrifugation, and Viscolase (10,000 U per 1 L of culture; AA Biotechnology, Poland) was added to the supernatant fraction, which was incubated for 20 min at 37 °C and then for 10 min at 50 °C. The supernatant fraction was then centrifuged again and mixed 1:1 (v/v) with 3.7 M (NH4)2SO4, and the mixture was incubated overnight at 4 °C. Precipitated proteins were harvested by centrifugation for 10 min at 11,000 × g and 4 °C and resuspended in PBS. The solution was then filtered through 0.2 µm membrane filters (VWR) and passed through an Amicon 100 kDa MWCO filtering device (Millipore) to remove residual (NH4)2SO4 and low-molecular-mass proteins. Protein concentration was adjusted to 2.5–5 mg mL−1, as measured by Nanodrop (A280), and the sample was purified by SEC in PBS buffer on a Superose 6 Increase column (GE Healthcare) connected to an AKTA FPLC system.
SDS-PAGE and native PAGE
MS2 VLP variants were analyzed by electrophoresis under both denaturing and native conditions. For SDS-PAGE, samples were separated on 12% Tris/Glycine gels using the standard Laemmli protocol, whereas for non-denaturing electrophoresis 3–12% Bis-Tris gradient gels (Life Technologies) were used, following the manufacturer’s recommendations. A Chemidoc detector (BioRad) was used for fluorescence detection with excitation at 546 nm. Gels were stained in InstantBlue (Expedeon).
Dynamic light scattering
Dynamic light scattering (DLS) was carried out using a Zetasizer Nano ZS (Malvern). Samples of purified MS2 VLPs were diluted to 0.05 mg mL−1 (A280), centrifuged at 12,045 × g for 5 min, and transferred to a plastic/quartz cuvette (ZEN 2112). Measurements were performed in triplicate (15 runs for each measurement). Only measurements meeting the Malvern software quality criteria were used for analysis.
Transmission electron microscopy
Samples of purified MS2 VLPs were diluted to 0.05 mg/mL, centrifuged at maximum speed for 15 min, and additionally filtered through 0.1 µm membrane filters (VWR). Samples were then applied onto hydrophilized carbon-coated copper grids (STEM Co.), negatively stained with 1% uranyl acetate, and visualized using a JEOL JEM-1230 transmission electron microscope (TEM) at 80 kV.
SpyCatcher–mCherry production and interaction with MS2 VLP
The His-tagged SpyCatcher–mCherry construct was created by PCR amplification of the mCherry gene from a pACYC Duet plasmid (a kind gift from Yusuke Azuma) and subcloning it into pET28a harboring a His-tagged SpyCatcher fragment (synthetic construct, BioCat, Germany). The final construct was verified by sequencing.
E. coli BL21(DE3) cells were transformed with the above plasmid, and protein expression and extraction were conducted as described above for MS2 CP. The protein was purified on Ni-NTA resin following a standard purification protocol. Briefly, the cell extract was incubated with agarose beads coupled with Ni2+-bound nitrilotriacetic acid (His-Pur Ni-NTA, Thermo Fisher Scientific) pre-equilibrated in 50 mM Tris, pH 7.9, 150 mM NaCl, 20 mM imidazole (Buffer A). After three washes of the resin with Buffer A, the protein was eluted with 50 mM Tris, pH 7.9, 150 mM NaCl, 300 mM imidazole (Buffer B). Fractions containing the protein of interest were pooled and passed through Sephadex 25 (Millipore) columns to remove imidazole. The final protein concentration was measured by Nanodrop at 280 nm.
Attachment of purified SpyCatcher–mCherry to MS2 VLPs was conducted by mixing the two at a range of molar ratios, followed by 90 min incubation at room temperature. The interaction efficiency was evaluated by SDS-PAGE, whereas efficient particle decoration was confirmed by size exclusion chromatography (as described above), preceded by filtration on an Amicon column with a 100 kDa molecular weight cut-off (Merck). The optimized ratio used in the presented results (Fig. 3b) was 4:1 (SpyCatcher–mCherry:MS2 VLP).
Cryo-electron microscopy
Purified samples of MS2 VLPs at ~1 mg mL−1 were flash-frozen in liquid ethane using an FEI Vitrobot (sample volume 4 µL, blot force 0, blot time 4 s) on previously glow-discharged copper grids (Quantifoil, Cu 1.2/1.3, mesh 400). All grids were imaged at a 300 kV acceleration voltage using a Titan Krios microscope equipped with a Gatan K3 camera (0.86 Å/px, 40-frame movies). Raw micrographs were motion corrected using WARP45, with all further steps carried out using the CryoSPARC v2.15.0 software package46. CTF values were calculated in patch mode using Patch CTF. Micrographs were accepted for particle picking when the CTF fit was better than 8 Å (CTF fit ≤ 8 Å). All reported resolution values result from independent half-map analysis with the gold-standard FSC criterion (FSC = 0.143). All figures containing cryo-EM maps were prepared using either UCSF Chimera47 or ChimeraX48.
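As an illustration of how the FSC = 0.143 criterion turns a half-map correlation curve into a resolution value, the following minimal sketch finds the first crossing of the threshold by linear interpolation. It is not the CryoSPARC implementation; the function name `resolution_at_threshold` and the example curve are hypothetical and serve only to show the arithmetic.

```python
def resolution_at_threshold(freqs, fsc, threshold=0.143):
    """Resolution (in Å) at which an FSC curve first drops below `threshold`.

    freqs: spatial frequencies in 1/Å, in ascending order (illustrative input).
    fsc:   FSC values at those frequencies.
    Returns None if the curve never crosses the threshold.
    """
    for k in range(1, len(freqs)):
        if fsc[k] < threshold <= fsc[k - 1]:
            # Linear interpolation between the two samples around the crossing.
            frac = (fsc[k - 1] - threshold) / (fsc[k - 1] - fsc[k])
            f_cross = freqs[k - 1] + frac * (freqs[k] - freqs[k - 1])
            return 1.0 / f_cross
    return None


# Made-up curve, for illustration only.
example_freqs = [0.1, 0.2, 0.3, 0.4]
example_fsc = [1.0, 0.8, 0.2, 0.0]
example_resolution = resolution_at_threshold(example_freqs, example_fsc)
```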
MS2 VLP assembly model
Particle assembly is modeled via reaction kinetics that encode the interconversion between C/C and A/B dimers with forward rate f and backward rate b (Eq. 1):
$$\mathrm{C}/\mathrm{C}\mathop{\rightleftharpoons}\limits_{b}^{f}\mathrm{A}/\mathrm{B} \tag{1}$$
Assembly starts with formation of particle 5-fold axes according to the following reactions (Eqs. 2 and 3):
$$i\,\mathrm{A}/\mathrm{B}+\mathrm{A}/\mathrm{B}\mathop{\rightleftharpoons}\limits_{b_{1}}^{f_{1}}(i+1)\,\mathrm{A}/\mathrm{B},\quad 1\le i\le 3, \tag{2}$$
$$4\,\mathrm{A}/\mathrm{B}+\mathrm{A}/\mathrm{B}\mathop{\rightleftharpoons}\limits_{b_{2}}^{f_{2}}5\,\mathrm{A}/\mathrm{B}, \tag{3}$$
where \(\frac{b_{1}}{f_{1}}=e^{\frac{\Delta G}{K_{B}T}}\) and \(K_{B}\) is the Boltzmann constant, T is temperature, and \(\Delta G\) is the binding free energy, which is \(-2.7\,\mathrm{kcal}\,M^{-1}\)49. For the last reaction, the binding free energy is \(-5.4\,\mathrm{kcal}\,M^{-1}\), as there are two binding sites for the fifth A/B. This is followed by the acquisition of five C/C dimers around the 5 A/B complex (Eq. 4):
$$5\,\mathrm{A}/\mathrm{B}:i\,\mathrm{C}/\mathrm{C}+\mathrm{C}/\mathrm{C}\mathop{\rightleftharpoons}\limits_{b_{2}}^{f_{2}}5\,\mathrm{A}/\mathrm{B}:(i+1)\,\mathrm{C}/\mathrm{C},\quad 0\le i\le 4. \tag{4}$$
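The relation between backward rate, forward rate and binding free energy given above can be evaluated numerically. The sketch below does so under stated assumptions: the temperature (298.15 K) is not specified in the text, the free energies are treated as per-mole quantities so that the molar gas constant plays the role of the Boltzmann constant, and the same relation is assumed to hold for \(b_{2}/f_{2}\) with the doubled binding energy. The forward rates are the values quoted for the model (10^3 and 10^6 M−1 s−1); the function name `backward_rate` is hypothetical.

```python
import math

# Molar gas constant in kcal mol^-1 K^-1 (the Boltzmann constant per mole).
R_KCAL_PER_MOL_K = 0.0019872041


def backward_rate(forward_rate, delta_g_kcal, temperature_k=298.15):
    """Backward rate from b / f = exp(dG / (k_B T)).

    temperature_k is an assumed value; the text does not state the temperature.
    """
    return forward_rate * math.exp(delta_g_kcal / (R_KCAL_PER_MOL_K * temperature_k))


b1 = backward_rate(1e3, -2.7)  # pentamer growth steps, Eq. 2
b2 = backward_rate(1e6, -5.4)  # closing step with two contacts, Eq. 3 (assumed same relation)
```

Because the free energy of the closing step is twice that of the earlier steps, the ratio \(b_{2}/f_{2}\) in this sketch is the square of \(b_{1}/f_{1}\).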
As this early assembly intermediate is shared by all particles, we assume that the first branching of the assembly pathways, resulting in the observed particle geometries, occurs at this point (cf. split 1 in Supplementary Fig. 10a, b). At this stage, we assume that A/B and C/C dimers bind with rates \(f_{\mathrm{elong}}^{\mathrm{ab}}\) and \(f_{\mathrm{elong}}^{\mathrm{cc}}\), respectively (\(f_{1}=10^{3}\,\mathrm{M}^{-1}\,\mathrm{s}^{-1}\), \(f_{2}=f_{\mathrm{elong}}^{\mathrm{ab}}=f_{\mathrm{elong}}^{\mathrm{cc}}=10^{6}\,\mathrm{M}^{-1}\,\mathrm{s}^{-1}\)50,51), to the 5 A/B:5 C/C intermediate. These additions are based on a tree that indicates bifurcations in the assembly pathways whenever the addition of an A/B or C/C dimer commits the intermediate to the assembly of a distinct particle type (Supplementary Fig. 10a). To move towards the formation of T = 3 particles, the intermediate 5 A/B + 5 C/C must bind to an A/B dimer, whilst recruitment of a C/C dimer will result in the formation of T = 4 particles (Supplementary Fig. 10b). This has been modeled as follows (Eqs. 5 and 6):
$$5\,\mathrm{A}/\mathrm{B}:5\,\mathrm{C}/\mathrm{C}+\mathrm{A}/\mathrm{B}\mathop{\rightleftharpoons}\limits^{f_{\mathrm{elong}}^{\mathrm{ab}}}_{b_{2}}6\,\mathrm{A}/\mathrm{B}:5\,\mathrm{C}/\mathrm{C}, \tag{5}$$
$$5\,\mathrm{A}/\mathrm{B}:5\,\mathrm{C}/\mathrm{C}+\mathrm{C}/\mathrm{C}\mathop{\rightleftharpoons}\limits^{(\mathrm{split}\,1)\times f_{\mathrm{elong}}^{\mathrm{cc}}}_{b_{2}}5\,\mathrm{A}/\mathrm{B}:6\,\mathrm{C}/\mathrm{C}. \tag{6}$$
We assume that the forward rate of the second reaction is reduced by the factor split 1 (Supplementary Table 5) to reflect the fact that it is a bifurcation from the wild type (T=3) pathway. Supplementary Fig. 10b shows that in the assembly pathway of T = 3 (T = 4) particles, the intermediates 15 A/B:8 C/C (15 A/B:11 C/C) must acquire an A/B dimer to continue towards a T = 3 (T = 4) particle geometry. However, if they acquire a C/C dimer, they will continue towards the formation of D3 particles (cf. Supplementary Fig. 10b). Thus, in the model, we assume that 15 A/B:8 C/C (15 A/B:11 C/C) can bifurcate towards the formation of D3 particles by binding to a C/C dimer. Similarly, the rate of this split is reduced by the factor split 2 (split 3) as D3 particles have a lower symmetry compared with T = 3 (T = 4) particles, and we model these splits as for split 1. The assembly pathways of T = 3 and D5 particles are similar until 30 A/B:20 C/C (Supplementary Fig. 10c), where recruitment of an A/B (C/C) dimer biases particle formation towards a T = 3 (D5) particle type (cf. split 4 in Supplementary Fig. 10a). Supplementary Fig. 10d illustrates that the assembly pathways of D3-A and D3-B particles are similar until 44 A/B:27 C/C, where recruitment of A/B dimer results in the formation of D3-A particles, and that of C/C dimers in the formation of D3-B particles (cf. split 5 in Supplementary Fig. 10a, d).
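To make the effect of a split factor concrete, the sketch below computes the probability that the next binding event at a bifurcation point is the C/C addition rather than the A/B addition, when only the two competing forward reactions are considered. This is a simplification for illustration: backward reactions are ignored, and the split value and dimer counts used in the example are hypothetical rather than taken from Supplementary Table 5.

```python
def branch_probability(split, f_ab, f_cc, n_ab, n_cc):
    """Probability that the next forward event at a split is the C/C branch.

    split:      reduction factor applied to the C/C forward rate at this split
    f_ab, f_cc: elongation rates for A/B and C/C dimers
    n_ab, n_cc: numbers of free A/B and C/C dimers (illustrative inputs)
    """
    a_ab = f_ab * n_ab          # propensity of staying on the main branch
    a_cc = split * f_cc * n_cc  # propensity of the bifurcating branch
    total = a_ab + a_cc
    if total == 0:
        raise ValueError("no free dimers available for either branch")
    return a_cc / total


# Hypothetical numbers: equal rates and pools, split factor 0.1.
p_example = branch_probability(0.1, 1e6, 1e6, 100, 100)
```

With equal elongation rates and equal pools of free dimers, the branching probability in this simplified picture reduces to split / (1 + split), so a small split factor translates almost directly into the fraction of intermediates that leave the main pathway.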
In the absence of a split in the assembly tree, the transition from assembly intermediate \(n_{1}\mathrm{A}/\mathrm{B}:m_{1}\mathrm{C}/\mathrm{C}\) to \(n_{2}\mathrm{A}/\mathrm{B}:m_{2}\mathrm{C}/\mathrm{C}\) is modeled as the random binding of \((n_{2}-n_{1})\) A/B and \((m_{2}-m_{1})\) C/C dimers according to the following matrix, modeling the successive recruitment of individual A/B and C/C dimers in an \(n_{2}\mathrm{A}/\mathrm{B}:m_{2}\mathrm{C}/\mathrm{C}\) array:
$$\left(\begin{array}{ccccccc}
n_{1}\mathrm{A}/\mathrm{B}:m_{1}\mathrm{C}/\mathrm{C} & \mathop{\to}\limits^{f_{\mathrm{elong}}^{\mathrm{cc}}} & n_{1}\mathrm{A}/\mathrm{B}:(m_{1}+1)\mathrm{C}/\mathrm{C} & \mathop{\to}\limits^{f_{\mathrm{elong}}^{\mathrm{cc}}} & \cdots & \mathop{\to}\limits^{f_{\mathrm{elong}}^{\mathrm{cc}}} & n_{1}\mathrm{A}/\mathrm{B}:m_{2}\mathrm{C}/\mathrm{C} \\
f_{\mathrm{elong}}^{\mathrm{ab}}\downarrow & & f_{\mathrm{elong}}^{\mathrm{ab}}\downarrow & & \cdots & & f_{\mathrm{elong}}^{\mathrm{ab}}\downarrow \\
(n_{1}+1)\mathrm{A}/\mathrm{B}:m_{1}\mathrm{C}/\mathrm{C} & \mathop{\to}\limits^{f_{\mathrm{elong}}^{\mathrm{cc}}} & (n_{1}+1)\mathrm{A}/\mathrm{B}:(m_{1}+1)\mathrm{C}/\mathrm{C} & \mathop{\to}\limits^{f_{\mathrm{elong}}^{\mathrm{cc}}} & \cdots & \mathop{\to}\limits^{f_{\mathrm{elong}}^{\mathrm{cc}}} & (n_{1}+1)\mathrm{A}/\mathrm{B}:m_{2}\mathrm{C}/\mathrm{C} \\
f_{\mathrm{elong}}^{\mathrm{ab}}\downarrow & & f_{\mathrm{elong}}^{\mathrm{ab}}\downarrow & & \cdots & & f_{\mathrm{elong}}^{\mathrm{ab}}\downarrow \\
\vdots & & \vdots & & \vdots & & \vdots \\
n_{2}\mathrm{A}/\mathrm{B}:m_{1}\mathrm{C}/\mathrm{C} & \mathop{\to}\limits^{f_{\mathrm{elong}}^{\mathrm{cc}}} & n_{2}\mathrm{A}/\mathrm{B}:(m_{1}+1)\mathrm{C}/\mathrm{C} & \mathop{\to}\limits^{f_{\mathrm{elong}}^{\mathrm{cc}}} & \cdots & \mathop{\to}\limits^{f_{\mathrm{elong}}^{\mathrm{cc}}} & n_{2}\mathrm{A}/\mathrm{B}:m_{2}\mathrm{C}/\mathrm{C}
\end{array}\right).$$
These kinetic equations are the basis of stochastic simulations performed with the Gillespie algorithm52 implemented in Fortran.
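For readers unfamiliar with the Gillespie algorithm, the following minimal sketch applies the direct method to the first three reactions only (Eqs. 1 to 3): interconversion of C/C and A/B dimers and stepwise formation of the 5 A/B complex. It is written in Python rather than Fortran and is not the authors' implementation. The species labels (`AB1` to `AB5` for clusters of one to five A/B dimers), the helper names, and all rate values are hypothetical; the rates are stochastic (per-molecule) constants chosen only so the example runs quickly.

```python
import random


def unimolecular(rate, species):
    # Propensity of a first-order step: rate times copy number.
    return lambda s: rate * s[species]


def bimolecular(rate, a, b):
    # Propensity of a second-order step; identical partners count as unordered pairs.
    if a == b:
        return lambda s: rate * s[a] * (s[a] - 1) / 2
    return lambda s: rate * s[a] * s[b]


def build_reactions(f, b, f1, b1, f2, b2):
    """Reactions of Eqs. 1-3 as (propensity function, stoichiometric change) pairs."""
    rx = [
        (unimolecular(f, "CC"), {"CC": -1, "AB1": +1}),   # C/C -> A/B
        (unimolecular(b, "AB1"), {"AB1": -1, "CC": +1}),  # A/B -> C/C
        (bimolecular(f1, "AB1", "AB1"), {"AB1": -2, "AB2": +1}),
        (unimolecular(b1, "AB2"), {"AB2": -1, "AB1": +2}),
    ]
    for i in range(2, 5):
        fwd, bwd = (f1, b1) if i <= 3 else (f2, b2)  # last step uses f2, b2 (Eq. 3)
        small, big = f"AB{i}", f"AB{i + 1}"
        rx.append((bimolecular(fwd, small, "AB1"), {small: -1, "AB1": -1, big: +1}))
        rx.append((unimolecular(bwd, big), {big: -1, small: +1, "AB1": +1}))
    return rx


def gillespie(state, reactions, t_end, rng, max_steps=1_000_000):
    """Direct-method stochastic simulation; returns the state at time t_end."""
    state = dict(state)
    t = 0.0
    for _step in range(max_steps):
        props = [propensity(state) for propensity, _change in reactions]
        total = sum(props)
        if total <= 0.0:
            break  # nothing can react any more
        t += rng.expovariate(total)  # exponentially distributed waiting time
        if t > t_end:
            break
        (idx,) = rng.choices(range(len(reactions)), weights=props)
        for species, delta in reactions[idx][1].items():
            state[species] += delta
    return state


# Hypothetical starting pool and rate constants, for illustration only.
initial = {"CC": 100, "AB1": 0, "AB2": 0, "AB3": 0, "AB4": 0, "AB5": 0}
rates = dict(f=1.0, b=0.5, f1=0.01, b1=0.1, f2=0.1, b2=0.001)
final = gillespie(initial, build_reactions(**rates), t_end=5.0, rng=random.Random(0))
```

Extending such a sketch to the full assembly tree would mean adding one reaction pair per edge of the recruitment array, with the split factors multiplying the forward rates at the bifurcation points.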
Parameter values
We note that our model depends on five parameters, one for each split in the assembly tree, that set the likelihood that assembly proceeds along a different branch (split rates). These parameters were fitted with respect to data for one scenario and then kept the same for the other scenarios in order to make the results comparable (Supplementary Table 5). The default rate for the splits is first chosen for all but the SpyTag4 scenario, as the latter leads to a much higher yield of T = 4 particles than the others. Split rates are chosen to reflect the symmetry of the particles, as there are more equivalent contact points for particles with higher symmetry. Consistent with this, split 2 is the lowest, as it leads to the formation of D3 particles with the lowest symmetry on the T = 3 (wild type) branch. Split 3 is slightly larger, although it leads to D3 particles, as it occurs on the T = 4 branch of the assembly tree. Split 1 is the largest as it occurs at the start of the assembly process and can lead to T = 4 particles whose assembly intermediates offer the largest number of symmetry-equivalent positions for incoming subunits. Split 4 is smaller than split 1, as D5 is of lower symmetry than T = 4. Split 5 is slightly smaller than split 1 because it occurs at the end of the assembly process, and since D3-A contains fewer C/C dimers than D3-B52. At that point, only the conversion rate from the symmetric C/C to the asymmetric A/B dimer (f) remains a free parameter in the model, and it is identified for each scenario based on the experimentally observed relative particle numbers (Supplementary Table 7). For the case of SpyTag4, for which the level of T = 4 particles is much higher than for the other cases, variation of f alone is not sufficient to account for the data. We note that the best fit is obtained when f is smaller than in all other cases, implying that C/C dimers resist conversion into A/B in this case.
This is likely due to the dynamic properties of the dimer as a result of the SpyTag4 insert and may also affect the C/C binding rate to the assembly intermediates. We reflect this by reducing the value of the elongation rate of C/C dimers (\(f_{\mathrm{elong}}^{\mathrm{cc}}\)). This also implies that C/C dimers are more likely to occupy positions that require less dynamic flexibility, i.e., positions with lower curvature where C/C dimers need to bend less in order to attach. Consistent with this, C/C recruitment is higher at split 1 and split 4 (Supplementary Table 5) as they lead to particles with lower curvatures, as T = 4 is larger than T = 3, and D5 has a cylindrical shape.
Elastic properties of different particle morphologies
The elastic energy per subunit for each particle type has been determined with reference to the tiling by counting dimers in equivalent positions, i.e., in groups with comparable stretching and bending, for each structure. We assume that the elastic energy for each A/B and C/C dimer in a T = 3 particle is \(\varepsilon_{0}^{T=3}\) and \(\varepsilon_{1}^{T=3}\), respectively, and \(\varepsilon_{0}^{T=4}\) and \(\varepsilon_{1}^{T=4}\) for a T = 4 particle. In D5, D3-A, and D3-B particles there are C/C dimers that are bounded by only C/C dimers (Supplementary Fig. 10c, d). As we do not have a dimer in T = 3 and T = 4 particles with this behavior, we introduce the additional elastic energy \(\varepsilon_{2}\). The elastic energy per subunit (i.e. per dimer) is thus:
$$\begin{aligned}
E_{T=3} &= \frac{2}{3}\varepsilon_{0}^{T=3}+\frac{1}{3}\varepsilon_{1}^{T=3},\\
E_{T=4} &= \frac{1}{2}\varepsilon_{0}^{T=4}+\frac{1}{2}\varepsilon_{1}^{T=4},\\
E_{D5} &= \frac{2}{7}\left(\varepsilon_{0}^{T=3}+\varepsilon_{0}^{T=4}\right)+\frac{4}{21}\left(\varepsilon_{1}^{T=3}+\varepsilon_{1}^{T=4}\right)+\frac{1}{21}\varepsilon_{2},\\
E_{D3-A} &= \frac{10}{37}\left(\varepsilon_{0}^{T=3}+\varepsilon_{0}^{T=4}\right)+\frac{5}{37}\varepsilon_{1}^{T=3}+\frac{10}{37}\varepsilon_{1}^{T=4}+\frac{2}{37}\varepsilon_{2},\\
E_{D3-B} &= \frac{8}{39}\varepsilon_{0}^{T=3}+\frac{4}{13}\varepsilon_{0}^{T=4}+\frac{1}{13}\varepsilon_{1}^{T=3}+\frac{14}{39}\varepsilon_{1}^{T=4}+\frac{2}{39}\varepsilon_{2}.
\end{aligned}$$
As T = 3 and T = 4 particles are similar, for simplicity we assume that, to a good approximation, \(\varepsilon_{0}^{T=3}=\varepsilon_{0}^{T=4}=\varepsilon_{0}\) and \(\varepsilon_{1}^{T=3}=\varepsilon_{1}^{T=4}=\varepsilon_{1}\). Introducing the dimensionless parameters \(k_{1}=\frac{\varepsilon_{1}}{\varepsilon_{0}}\) and \(k_{2}=\frac{\varepsilon_{2}}{\varepsilon_{0}}\) then reduces these equations to
$$\begin{aligned}
E_{T=3}/\varepsilon_{0} &= \frac{2}{3}+\frac{1}{3}k_{1},\\
E_{T=4}/\varepsilon_{0} &= \frac{1}{2}+\frac{1}{2}k_{1},\\
E_{D5}/\varepsilon_{0} &= \frac{4}{7}+\frac{8}{21}k_{1}+\frac{1}{21}k_{2},\\
E_{D3-A}/\varepsilon_{0} &= \frac{20}{37}+\frac{15}{37}k_{1}+\frac{2}{37}k_{2},\\
E_{D3-B}/\varepsilon_{0} &= \frac{20}{39}+\frac{17}{39}k_{1}+\frac{2}{39}k_{2}.
\end{aligned}$$
These define different areas in parameter space given the relative stretching and properties of C/C dimers in different positions. The red region in Supplementary Fig. 11 indicates the area for which \(E_{T=3} < E_{T=4} < E_{D5} < E_{D3-A} < E_{D3-B}\). It is worth noticing that in this area the difference between \(k_{2}\) and \(k_{1}\) is bigger than the difference between \(k_{1}\) and 1, i.e. the jump in the level of elastic energy from \(\varepsilon_{1}\) to \(\varepsilon_{2}\) is bigger than the jump from \(\varepsilon_{0}\) to \(\varepsilon_{1}\). This is consistent with the fact that the C/C dimer that is only bound to other C/C dimers (\(\varepsilon_{2}\)) is in a flatter position. It also reflects the order in particle numbers seen in the experiment, T = 3 > T = 4 > D5 > D3-A > D3-B (Supplementary Table 7). This demonstrates that elastic properties are important for the assembly outcome and can account for the rank order of the particle types. However, in order to determine the precise values and differences for each SpyTag option, we refer to the kinetic model above.
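The reduced energy expressions can be checked for any candidate pair of parameters with a few lines of code. The sketch below encodes the coefficients of the reduced equations as exact fractions and tests whether a given (k1, k2) reproduces the ordering T = 3 < T = 4 < D5 < D3-A < D3-B. The function names and the example parameter values are hypothetical; they are not fitted values from this work.

```python
from fractions import Fraction as F

# Coefficients (constant term, k1 term, k2 term) of E / epsilon_0 for each particle type.
COEFFS = {
    "T=3": (F(2, 3), F(1, 3), F(0)),
    "T=4": (F(1, 2), F(1, 2), F(0)),
    "D5": (F(4, 7), F(8, 21), F(1, 21)),
    "D3-A": (F(20, 37), F(15, 37), F(2, 37)),
    "D3-B": (F(20, 39), F(17, 39), F(2, 39)),
}
ORDER = ["T=3", "T=4", "D5", "D3-A", "D3-B"]


def reduced_energies(k1, k2):
    """Elastic energy per dimer in units of epsilon_0 for every particle type."""
    return {name: c0 + c1 * k1 + c2 * k2 for name, (c0, c1, c2) in COEFFS.items()}


def has_observed_order(k1, k2):
    """True if E(T=3) < E(T=4) < E(D5) < E(D3-A) < E(D3-B) for this (k1, k2)."""
    e = reduced_energies(k1, k2)
    return all(e[a] < e[b] for a, b in zip(ORDER, ORDER[1:]))


# Hypothetical parameter choices, for illustration only.
inside = has_observed_order(F(3, 2), F(3))   # k2 - k1 exceeds k1 - 1
outside = has_observed_order(F(3, 2), F(2))  # k2 too close to k1
```

Working through the pairwise inequalities by hand suggests that the ordering holds when k1 > 1 and (5 k1 − 3)/2 < k2 < 11 k1 − 10, which agrees with the remark that k2 − k1 exceeds k1 − 1 in this region. The coefficients of each expression sum to one, so all five energies coincide when k1 = k2 = 1.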

