Evaluating priors

Background

Hyesu evaluated suitable priors for use in fitting for OpenFF 1.3.0: v1.3.0 release note | 4. Determination of priors width of valence parameters

However, now that we fit to the initial guesses obtained through the MSM method, Pavan has found these priors to be unsuitable. OpenFF 2.1.0 used the following priors:

priors Angles/Angle/k : 100.0 Angles/Angle/angle : 5.0 Bonds/Bond/k : 100.0 Bonds/Bond/length : 0.1 ProperTorsions/Proper/k : 5.0 ImproperTorsions/Improper/k : 5.0 /priors

These were a bit too loose (pasted from Slack):

Trevor & I were looking at the parameter changes wrt the initial MSM starting point for 2.1.0 and the use of larger priors shifted some force constants by a larger value since the mval changes during a FB run are not penalized as much if the priors are large. Especially, the k value of a32 , the hypervalent sulfur angle, changed by 44% from 241 units to 134 units, which is almost 1*mval change with a prior of 100 units on angle force constant. So, the angle was not as stiff as the initial point. It would be counter intuitive to use larger priors if our intention is to be as close to the QM informed starting point as possible. So I tried two more iterations with tighter priors and the one I picked for 2.2.0 is a good one with no degradation in benchmarks.

He evaluated multiple sets of priors, finding the below to be too tight:

priors Angles/Angle/k : 5.0 Angles/Angle/angle : 1.0 Bonds/Bond/k : 10.0 Bonds/Bond/length : 0.01 ProperTorsions/Proper/k : 5.0 ImproperTorsions/Improper/k : 5.0 /priors

But the below set gave good results, and was used in :

priors Angles/Angle/k : 20.0 Angles/Angle/angle : 1.0 Bonds/Bond/k : 20.0 Bonds/Bond/length : 0.01 ProperTorsions/Proper/k : 5.0 ImproperTorsions/Improper/k : 5.0 /priors

Goal

Repeating Hyesu’s analysis with the new MSM-derived initial values may be worthwhile for determining the most optimal set of priors, or at least for systematic evidence that this set of priors is the best.

Results

Hyesu’s analysis used SMIRNOFF99Frosst parameters for analysis. I repeated Hyesu’s analysis with the parameters from Sage 2.1.0. I followed the same methodology. The general idea is to use a measure of the spread of initial parameters to determine what the prior (tolerance) should be during optimization. The distribution of each type of parameter is shown below, because the distributions are generally not Gaussian, I used the IQR instead of the standard deviation to characterize them.

Bonds

For bonds, I tried analyzing them all together (90 bonds total), but found in agreement with the previous result that there was a large spread due to type of bond and element composition. To try to account for this, I repeated the normalization procedure that was done previously. I sorted bond parameters by bond order and the elements involved (e.g. C-C single bond, C-N double bond, etc). For any bond type with more than 1 parameter value (e.g. there are several types of C-C single bonds), I subtracted the mean for that type from each parameter, and repeated the analysis on this dataset of normalized parameters (54 bonds total). The reduction in number of datapoints comes from removing points where there was only one type of bond (e.g. N-N triple bond), where subtracting the mean would not be meaningful.

 

Spread of bond force constants, k (kcal/mol/A2), from Sage 2.1.0. Left: all bonds, raw data. Right: only bonds where there were multiple different entries for the bond type (e.g. C-C single, C-H single), normalized by bond type.
Spread of bond lengths, in A, from Sage 2.1.0. Left: all bonds, raw data. Right: only bonds where there were multiple different entries for the bond type (e.g. C-C single, C-H single), normalized by bond type.

Parameter

Parsley 1.3.0 Prior

Sage 2.1.0 prior

Sage 2.2 prior

SMIRNOFF99Frosst StDev

SMIRNOFF99Frosst IQR

Sage 2.1.0 StDev

Sage 2.1.0 IQR

Parameter

Parsley 1.3.0 Prior

Sage 2.1.0 prior

Sage 2.2 prior

SMIRNOFF99Frosst StDev

SMIRNOFF99Frosst IQR

Sage 2.1.0 StDev

Sage 2.1.0 IQR

Bond length (all bonds) (A)

0.1

0.1

0.01

0.33

0.4

0.34

0.44

Bond length (normalized) (A)

0.1

0.1

0.01

--

0.1

0.04

0.06

Bond k (all bonds) (kcal/mol/A2)

100

100

20

249.30

338.0

552.0

517.9

Bond k (normalized) (kcal/mol/A2)

100

100

20

--

87.5

220.8

184.0

For bond lengths, the all bond dataset shows similar standard deviation and IQR to the SMIRNOFF99Frosst values. However, the normalized dataset has a lower spread, which would support a lower prior.

In both the dataset with all bonds and the normalized dataset, the spread of bond force constant parameters is much larger in Sage 2.1.0 vs in SMIRNOFF99Frosst. This makes sense as Sage 2.1.0 has more bond types that correspond to diverse chemistries, which would lead to more different values. The types of bonds with the largest IQR are C-C double bonds (2 bonds, IQR 219), C-N double bonds (2 bonds, IQR 439), C-O bonds of unspecified order (2 bonds, IQR 291), N-N double bonds (2 bonds, IQR 675), and O-P bonds of unspecified type (2 bonds, IQR 403). Excluding bonds of unspecified type only reduces the IQR to 173.

This methodology would suggest a much larger prior should be used for optimizing bond force constants, on the order of 150-200 kcal/mol/A2. Pavan’s testing showed that a lower prior was needed for force constants, suggesting that this methodology may not yield useful suggestions for priors when we start from the MSM and QM-derived values.

Angles

Following the previous analysis, for angles I analyzed all the parameters together in one dataset, not accounting for variations due to different geometries. The results are below.

Parameter

Parsley 1.3.0 Prior

Sage 2.1.0 prior

Sage 2.2 prior

SMIRNOFF99Frosst StDev

SMIRNOFF99Frosst IQR

Sage 2.1.0 StDev

Sage 2.1.0 IQR

Parameter

Parsley 1.3.0 Prior

Sage 2.1.0 prior

Sage 2.2 prior

SMIRNOFF99Frosst StDev

SMIRNOFF99Frosst IQR

Sage 2.1.0 StDev

Sage 2.1.0 IQR

Angle (deg)

10

5

1

24.24

10.5

23.1

12.5

Force constant (kcal/mol/rad2)

50

100

20

95.15

40.0

76.6

121.9

Here, the spread of the angle parameter is relatively similar between SMIRNOFF99Frosst and Sage 2.1.0, and this analysis would support use of the Parsley 1.3.0 prior, but suggest that the Sage 2.1.0 and 2.2 priors are too small. For the force constant, once again the spread is much larger than it was in SMIRNOFF99Frosst, which makes sense due to increased chemical diversity in the parameters. This analysis would suggest that the Sage 2.1.0 prior of 100 would be appropriate, but that the Sage 2.2 would be too small. This is at odds with the actual performance of the force fields, where the Sage 2.1.0 priors were found to be too loose and the Sage 2.2 were found to be more appropriate. This suggests again that using the spread to determine the priors may not be appropriate for the MSM and QM-derived starting parameters.

Proper torsions

Proper torsions require some consideration of how to analyze them, as different SMIRKs patterns have different number of torsion force constants. The previous analysis for Parsley 1.3.0 uses the sum of all force constants for each SMIRKs pattern. In addition to this, I consider all of the force constants together, regardless of if it’s k1, k2, etc, as well as just looking at k1 for each SMIRKs. Results did not strongly depend on the method of pre-processing. Results are shown below.

Parameter

Parsley 1.3.0 Prior

Sage 2.1.0 prior

Sage 2.2 prior

SMIRNOFF99Frosst StDev

SMIRNOFF99Frosst IQR

Sage 2.1.0 StDev

Sage 2.1.0 IQR

Parameter

Parsley 1.3.0 Prior

Sage 2.1.0 prior

Sage 2.2 prior

SMIRNOFF99Frosst StDev

SMIRNOFF99Frosst IQR

Sage 2.1.0 StDev

Sage 2.1.0 IQR

Sum of k’s (kcal/mol)

2

5

5

--

2.3

2.9

1.7

All k’s (kcal/mol)

2

5

5

1.32

1.2

2.3

1.1

k1 only (kcal/mol)

2

5

5

--

--

2.8

1.2

This analysis suggests that Parsley 1.3.0’s prior of 2 was appropriate, and that 5 would be too loose.

 

Conclusion

Here, I repeated Hyesu’s analysis of the SMIRNOFF99Frosst parameters for Sage 2.1.0 parameters, to determine new priors for future fitting efforts. However, large spreads in the Sage parameters and Pavan’s previous results for Sage 2.2 suggest that these would not be useful values for priors using the MSM starting point.

The priors Pavan identified for Bond and Angle parameters in Sage 2.2 seem to work well, and follow the logic that we should not allow the parameters to stray far from their QM-inspired starting point. For future efforts, I will focus on determining both starting values and prior (tolerances) for the torsion parameters, as these were not modified for Sage 2.2 and require further investigation due to the starting point not being derived from QM.