Summary of DFE procedure
The calculation of DFE of a complex is composed of the following steps. (1) A first set of independent metadynamics runs are launched. Each run shares the same starting point while using a different random seed, and typically proceeds for 10–40 ns. (2) After completion of these runs, the FES of each run (namely the primitive FES) is calculated (see Fig. 1b). (3) The averaged FES is calculated from an ensemble of primitive FES (see Fig. 1c). (4) DFE is calculated from an averaged FES. And (5) the convergence analysis is performed (see Fig. 1e). If convergence is not reached, another set of runs are launched and the previous steps are iterated; otherwise, the final DFE is obtained. After step 2, each run is examined to determine if it behaves as a one-way trip. A run that does not fit the one-way trip description is removed from the ensemble for averaging. A run that needs more simulation time to exit from BS is extended. This is called “the correction process” and will be detailed in the “Methods” section. The calculation of the primitive FES of an individual run is part of the standard metadynamics16,19,20. The calculation of DFE and the convergence analysis will be illustrated in more detail below.
The primitive FES from three of the runs of a PPC (PDB entry 3A4S) are given in Fig. 1b as examples. While their differences reflect the randomness of different runs, they all feature a major peak around the bound state region and a flatter variable curve in the free state region. The averaged FES from 50 runs for the same system is given in Fig. 1c, indicating a smoother surface, a clear bound state minimum centered at r0 and a boundary rb between the bound state region and the free state region which is a maximum or saddle point.
Calculation of DFE from averaged FES
Denoting the averaged FES using symbol g(D), a nominal partition function Q can be calculated using the following equation:
$$Q={(b-a)}^{-1}{int}_{a}^{b}text{exp}left(-frac{g(D)}{kT}right)dD$$
(1)
where a and b are the beginning and end of the bound state region, respectively; k, T and D are the Boltzmann constant, temperature and CV distance, respectively.
It is important to point out that (text{exp}left(-frac{g(D)}{kT}right)) decreases exponentially when g(D) increases. As shown in the example in Fig. 1d, this function is narrowly concentrated around r0. Only the region around r0 with energy higher than the minimum point by 4 kcal/mol or less has significant contributions to Q. Therefore, the integration can be done over the entire D region sampled, which would have no significant difference compared to integration over the bound state region. In other words, practically, a can be set to zero and b can be set to the distance where the wall is placed. (A wall is used to stop the monomers from moving too far from each other). One does not have to determine where rb is to calculate Q.
DFE is calculated from the nominal partition function as,
$$DFE= -kTtext{ln}Q$$
(2)
Convergence analysis based on DFE-N plot
DFE is calculated based on the averaged FES which depends on how many runs are used in the averaging process. To examine if DFE converges into a stable value for a given system, DFE is first calculated using only one run, and then recalculated iteratively by incorporating additional runs into the averaging. An example of a plot of DFE as a function of the number of runs (N) used in the averaging is shown in Fig. 1e. Large fluctuations in the calculated DFE are observed initially for small values of N and convergence to a stable value occurs as N is increased. When the fluctuations among the last five runs are smaller than 1 kcal/mol, the sampling is considered converged and the final DFE value is obtained.
Calculation of DFE for a diverse set of PPCs
Nineteen PPCs were selected from the Protein–Protein Interaction Database which provides a collection of crystal structures of PPCs with the corresponding experimental binding affinities21,22,23,24,25,26,27. The details of the selection are given in the “Methods” section. The DFE procedure was applied to each complex.
The derived averaged FES and convergence plots for these PPCs are given in Supplementary Fig. S1. Columns 1–4 of Table 1 summarize each PPC’s PDB code, complex type, definition of CV and DFE value obtained, respectively. Columns 6–7 give the experimental binding free energy ∆Ge and dissociation constant Kd of each PPC. The plot of ∆Ge against calculated DFE for the 19 complexes (Fig. 2a) indicates a strong linear correlation between DFE and ∆Ge, though one point (open circle) appears to be an outlier. If this point is excluded from the Least-Square-Fitting, the coefficient of determination R2 and the coefficient of correlation R are 0.84 and 0.92, respectively, indicating a very high correlation. The relationship between DFE and ∆Ge from the Least-Square-Fitting was as follows, with a standard error of 1.61 kcal/mol.


Correlation plots between calculated DFE and experimental binding free energies ∆Ge. (a) Correlation plot for 19 PPCs. The open circle is an outlier (corresponding to complex 3SGB in Table 1). The indicated R2 and SE (Standard Error) correspond to the Least-Square-Fitting after excluding the outlier. (b) Correlation plot for PLCs for target CDK2.
$$Delta {G}_{e}=0.4512 DFE-1.02$$
(3)
If the outlier is not excluded, R2, R and the standard error change to 0.74, 0.86 and 2.06, respectively, which still indicates a high correlation.
The results indicate that DFE is linearly correlated with ∆Ge, but the individual DFE values differ from the corresponding individual ∆Ge values. This is consistent with the concept that DFE is not the binding free energy by definition, but it is closely coupled with it. The major difference between the concept of DFE and that of binding free energy is that DFE quantifies the work required to dissociate a complex while binding free energy is the free energy difference between the bound and free states. DFE does not require the calculation of the free state explicitly, neither does it require the calculation of the contribution of the standard state concentration5,28. The ignored terms in DFE may be mostly the constant terms of the binding free energy. More analysis about what are missing in DFE is given in the “Discussion” section.
The above relationship between ∆Ge and DFE was used to convert the DFE values into calculated binding free energies ∆Gc (Column 5, Table 1). ∆Gc can be compared with ∆Ge individually. The errors given in the DFE column of Table 1 are due to the sampling completeness. The differences between ∆Gc and ∆Ge are typically larger than these errors because they include everything that could cause discrepancies between calculations and experimental measurements including theoretical assumptions, force-field errors and experimental errors.
Comparison with and without the correction process
The above results were obtained after applying the correction process. Table 2 gives a comparison of DFE and ∆Gc with and without the correction process (columns 5–8) as well as the total number of runs, chemical time in each run before any extension, and corrections made if any, for each PPC, respectively. Overall, about 13% of the runs could not be considered as completed one-way trips. These runs were either extended to longer simulation times or removed from the calculations according to the process detailed in the “Methods” section. Interestingly, the correction process did not change ∆Gc significantly for any of the complexes; the changes due to the correction are either close to or smaller than 0.5 kcal/mol. In terms of DFE, only three of the 19 complexes showed a difference more than 1 kcal/mol (1.2, 1.3 and 2.2) before and after the correction. The correlation indexes R2 (0.84) and R (0.91) without the correction remained practically the same as those with the correction.
Calculation of DFE for eight protein targets with different sets of ligands
To investigate the performance of the DFE procedure in predicting protein–ligand binding free energies, we used a published dataset of PLC structures with experimental binding affinities, which had previously been used to benchmark the FEP method10.
The derived averaged FES and convergence plots of all the PLCs are given in Supplementary Figs. S2 and S3. Columns 1–5 in Table 3 give the protein targets, number of ligands for each target, CV definition, number of runs and chemical time in each run, respectively. The obtained DFE were compared with experimental ∆Ge for each target separately, with the Least-Square-Fitting analysis. The derived R2 and standard error for each target are included in Columns 6–7 of Table 3. R2 varied from 0.67 for CDK2 to 0.32 for BACE. R2 for Thrombin was exceptionally low (0.01), however, exclusion of two outliers brought R2 to 0.62. The average R2 across all the eight targets was 0.45 without excluding any points and 0.53 after excluding the two outliers in Thrombin data. Interestingly, the standard error did not follow the trend of R2, but distributed flatly between 0.57 and 1.09 kcal/mol. Thrombin had the smallest standard error. These observations suggest that the prediction errors for different targets are similar and reasonably small, and that the variations of R2 among different targets are mainly due to the different binding free energy spans of the ligand sets. For example, the span for CDK2 is 4.2 kcal/mol while the span for Thrombin is 1.7, causing a higher R2 for the former than for the latter. Likewise, the R2 for these PLCs are generally smaller than the R2 for the PPCs because the latter has a much larger binding free energy span (13 kcal/mol).
The linear relationship between DFE and ∆Ge derived from Least-Square-Fitting for each target is given in Column 8, Table 3. The correlation plots are given in Fig. 2b for CDK2 and Supplementary Fig. S4 for all the targets.
The overall predictive power of the DFE method was similar to that of the FEP+ method10 for these same sets of PLCs. A comparison of the R2 and standard error between the two methods in terms of predicting experimental binding free energies is summarized in Supplementary Table S1 and Figs. S4 and S5.
Definition of appropriate CVs and assessment of variability due to the different CVs
The intermolecular distance used as the CV in the calculation of DFE of a complex needs to be defined such that its forced increase directly leads to the dissociation of the complex without significant conformational changes unrelated to the dissociation. Although it is difficult to differentiate which conformational changes are related to the dissociation and which are not, we took the following precautions in the choice of CVs: (1) we avoided using any residues from flexible loops to define a CV because the Gaussian forces acting upon such a CV could cause conformational changes of the corresponding loops which might not originate from the dissociation process. (2) The CV for a PPC was defined as the distance between the centroids of the backbone heavy atoms of a pair of residues, each from different proteins in the complex. The CV for a PLC was defined as the distance between the centroid of the heavy atoms of a residue of the protein and the centroid of the heavy atoms of the whole ligand in the complex. And (3) the protein residues used to define a CV were tightly packed within the respective proteins and were located within either an α-helix or a β-sheet of multiple strands when possible.
We used one PPC (3A4S) to explore the effect of the change of CV definition on DFE, in which we tried seven different CVs by changing the residues defining the CV. For each of the CVs, DFE was respectively calculated with and without the correction process (see Supplementary Table S2). We found that the standard deviation of DFE stemming from the changes of CV definition was about 1.08 kcal/mol without the correction process and 1.03 with the correction process. The standard error of the mean was 0.41 and 0.39 kcal/mol, respectively. The maximum difference among the set of DFE values was about 2.73 without the correction and 2.96 with the correction. Substituting these values into Eq. (3), the estimated uncertainty in the predicted ∆Gc due to the different choices of CV was less than 1.3 kcal/mol, or 0.5 based on the standard deviation of DFE. Given that the standard error of Eq. (3) was about 1.6 kcal/mol, the uncertainty due to the different choices of CV would not alter the described quality of correlation largely, and it might be one of the sources of error already included in the standard error of Eq. (3).

