Sage differences between FB 1.9.3, FB 1.9.5, and 1.9.6

 

Background

As our software stack updates, our goal is to update from the previous software stack for fitting (ForceBalance 1.9.3 + OpenFF Toolkit 0.10.x) to versions that support Interchange (ForceBalance 1.9.5 + OpenFF 0.13.x, to begin with). This page documents differences between fits using these two stack and diagnoses the causes. Please see Reproducibility for more information.

All files were uploaded to . As this is not intended to be a public work, however, documentation is minimal.

Summary

I attempted to figure out the reasons for all differences between FB 1.9.3, FB 1.9.5, and FB 1.9.6. While I refer to these environments under those names throughout this investigation, they are shorthand for the environments described in the section below, and most differences originate outside ForceBalance. These differences can be summed up below:

  • The biggest and most influential change was Difference 1: a bug where Interchange’s charge caching was mis-applying charges to subsequent conformers of the same molecule when the atom order changed.

  • In Difference 2, the geometries optimize to the same or very similar structures, but the objective functions are very different. This suggests that we could improve the ForceBalance target by using the best RMSD measure instead of just RMSD.

  • In Difference 3, re-ordering the forces in OpenMM systems can result in surprisingly large changes in geometry. From the OpenFF toolkit 0.10.x to Interchange we changed the order in which forces are output.

  • In Difference 4, differences in precision explain the remaining differences between the FB 1.9.3 and FB 1.9.5 stacks.

  • In Difference 5, all remaining differences between 1.9.5 and 1.9.6 can be explained by a minor change in tolerance in the ForceBalance geometry optimization.

Investigation

This next section below is not comprehensive in detail, and is partly a working document I used to keep track of what other differences to look at. This is meant to be largely a one-off investigation, although it suggests that having a suite of fitting comparisons for each new software stack may be a good idea. Likely this will be most use in the future in copying Python scripts and looking at individual targets for an idea of what their objective terms used to be.

Details

The inputs used are those available at .

The environments used are specified in the attached input files. For each zipped experiment, the full environment is included.

Environments:

  • fb-193: fb-193-tk-010-oe-2022 environment

    • ForceBalance 1.9.3

    • OpenFF Toolkit 0.10.4

  • fb-193-reordered: fb-193-tk-010-oe-2022-reordered environment

    • The Toolkit code was edited to output forces in the same order as Interchange

    • create_openmm_systems was edited here:

    • return_topology = kwargs.pop("return_topology", False) # 2023-11-22 edit original = self._parameter_handlers order = [ 'vdW', 'Electrostatics', 'LibraryCharges', 'ToolkitAM1BCC', 'ProperTorsions', 'ImproperTorsions', 'Angles', 'Bonds', 'Constraints' ] self._parameter_handlers = {} for key in order: self._parameter_handlers[key] = original[key] for key in original: if key not in order: self._parameter_handlers[key] = original[key] # end 2023-11-22 # Make a deep copy of the topology so we don't accidentally modify it topology = copy.deepcopy(topology)
  • fb-195: fb-195-tk-013-oe-2022-interchange-replace-cache environment

    • Interchange 0.3.10, edited:

      • to modify the charge caching

      • to turn off the switching function, as this was a systematic difference between environments ( )

  • fb-195-switching: basically the fb-195-tk-013-oe-2022-interchange-replace-cache environment

    • Interchange 0.3.10, edited:

      • to modify the charge caching

      • to turn switching back on

  • fb-196: fb-196-ic-0318-oe-2022-crit environment

    • ForceBalance 1.9.6

    • Interchange 0.3.18

  • fb-196-crit: fb-196-ic-0318-oe-2022-crit environment

    • ForceBalance 1.9.6, with crit modified back to earlier 1e-4 version.

    • Everything else the same as fb-196

 

ForceBalance 1.9.3 vs 1.9.5

Difference 1: Switching to Interchange with cached atom charges

The vast majority of differences originated from the opt-geo-target fits and were because of this bug:

Running a single-point calculation of the entire target set brought the objective function differences down substantially:

ForceBalance

OpenFF Toolkit

Interchange

Total objective function

Time for calculation

ForceBalance

OpenFF Toolkit

Interchange

Total objective function

Time for calculation

1.9.3

0.10.4

N/A

10926.15026275

1462.2 s

1.9.5

0.13.0

0.3.10 + charge caching modification

10921.48679230

920.6 s

1.9.6

0.14.5

0.3.18

10909.6117273

2915.5 s

1.9.6 (crit switched to 1.9.5 version)

0.14.5

0.3.18

10921.48679230

963.2 s

Remaining differences after Difference 1

Investigating individual targets highlighted these differences, still all geometry optimization targets:

Batch

Target name

Term_193

Term_195

Diff

opt-geo-batch-109

19095414-4

973.288

678.565

-294.723

opt-geo-batch-113

19095588-8

800.755

488.803

-311.952

opt-geo-batch-123

19095890-3

432.754

424.808

-7.946

opt-geo-batch-136

110312530-8

796.962

1886.142

1089.18

opt-geo-batch-19

18434757-17

1352.305

1744.044

391.739

opt-geo-batch-20

18434815-6

83.565

126.754

43.189

opt-geo-batch-20

18434816-9

71.022

169.281

98.259

opt-geo-batch-48

18437974-17

2097.572

442.121

-1655.451

opt-geo-batch-71

19094129-20

470.715

117.418

-353.297

opt-geo-batch-78

19094395-8

327.768

862.083

534.315

opt-geo-batch-83

95602606-12

352.848

609.256

256.408

opt-geo-batch-83

19094564-15

662.573

411.628

-250.945

(also attached: , where single_point_all compares all targets, and single_point_differences highlights all differences over 1. These CSVs also include the associated SMILES which were too long to paste here).

Pasted below is the RMSD comparisons of all targets with different objective functions. The first three values columns show atom-to-atom RMSD (via MDAnalysis’s rmsd); the last three atoms show symmetry-corrected RMSDs (via rdkit's AllChem.GetBestRMS).

Opt batch

Target

1.9.3-QM

1.9.5-QM

1.9.3-1.9.5

1.9.3-QM Best

1.9.5-QM Best

1.9.3–1.9.5 Best

Opt batch

Target

1.9.3-QM

1.9.5-QM

1.9.3-1.9.5

1.9.3-QM Best

1.9.5-QM Best

1.9.3–1.9.5 Best

48

18437974-17

0.9

0.75

0.55

0.69

0.69

0

71

19094129-20

1.31

0.91

0.79

1.26

0.91

0.7

113

19095588-8

3.02

1.85

2.25

2.98

1.83

2.18

109

19095414-4

1.64

1.12

1.19

1.64

1.11

1.18

83

19094564-15

1.69

1.69

0.21

1.68

1.69

0.21

123

19095890-3

2.26

2.26

0.27

2.26

2.26

0.27

20

18434815-6

0.96

0.99

0.26

0.96

0.99

0.26

20

18434816-9

0.96

0.97

0.26

0.96

0.97

0.26

83

95602606-12

1.56

1.55

0.22

1.56

1.55

0.22

19

18434757-17

1.73

1.78

0.31

1.67

1.68

0.3

78

19094395-8

4

4

0.37

4

4

0

136

110312530-8

1.09

1.13

0.69

1.09

1.09

0

Difference 2: Different interpretations of internal coordinate RMSDs

The yellow rows highlighted apparently optimize to structures with symmetry-corrected RMSDs of 0, but with atom-wise RMSD values that are somewhat high. I did not look into this further as the objectives weren’t systematically better or worse. However, this suggests that the FB target should do its best to look for the best symmetry-corrected RMSD and do its best.

Difference 3: Section ordering

In several cases, re-ordering the output of the XML changed the optimization and objective function substantially. These are highlighted in blue above.

Case study: opt-geo-batch-113: 19095588-8

  • Wrote intermediate OpenMM XMLs out of the OpenMM system (by modifying ForceBalance to do so)

  • Running through the ForceBalance target, the OpenMM XML written out by FB 1.9.3 minimizes to the 1.9.3 structure, in the 1.9.5 environment; the XML written by FB 1.9.5/Interchange minimizes to the 1.9.5 structure, in the 1.9.3 environment

  • In Interchange, the forces are ordered: NBForce, Torsions, Angles, Bonds

  • In 1.9.3, the forces are ordered: Torsions, NBForce, Bonds, Angles

  • I can get the 1.9.3 environment to reproduce the 1.9.5 objective by reordering the OpenMM system creation using the Interchange order, i.e. using the fb-193-reordered environment to write forces out like Interchange.

 

Comparing remaining differences with reordering

I have highlighted in gray the differences between these four systems where the more modern stacks (FB 1.9.5, 1.9.6) align with either fb-193 or fb-193-reordered. As the order of the forces is the only difference between the 193 stacks, to me this implies that these systems are prone to different optimization pathways given the slightest difference, so they weren’t investigated further.

Batch

Target

fb-193

fb-193-reordered

fb-195

fb-196

Batch

Target

fb-193

fb-193-reordered

fb-195

fb-196

108

19095393-10

85.713

85.713

85.713

46.043

109

19095414-4

973.288

678.567

678.565

678.593

113

19095588-8

800.755

488.803

488.803

488.806

115

19095648-2

241.176

592.293

241.176

241.209

123

19095890-3

432.754

859.172

424.808

424.811

125

19095964-29

557.615

557.896

557.836

527.51

127

95602811-22

828.404

371.009

828.402

828.471

131

95602486-3

169.287

395.607

169.287

169.288

132

110312272-7

44.947

44.947

44.947

19.051

132

110312279-12

19.782

19.782

19.782

11.507

134

110312364-2

35.291

35.291

35.291

24.732

134

110312437-22

536.681

831.782

536.684

541.105

136

110312530-8

796.962

1886.145

1886.142

1885.838

142

1760118-25

32.291

32.291

32.291

5.104

142

1760122-29

37.563

37.565

37.563

10.582

158

1760623-13

52.588

52.588

52.588

10.515

160

1760689-13

10.233

10.233

10.233

2.921

162

1760764-24

14.742

14.742

14.742

6.441

162

1760765-26

16.341

16.341

16.341

4.872

165

1760859-25

184.378

184.378

184.378

8.331

166

1760895-28

16.898

16.898

16.898

5.847

181

36784452-26

85.997

85.997

85.997

43.28

19

18434757-17

1352.305

2589.068

1744.044

1743.914

20

18434815-6

83.565

83.565

126.754

126.789

20

18434816-9

71.022

71.022

169.281

169.253

21

18434921-7

1805.074

1389.509

1805.075

1805.138

28

18435721-28

333.911

333.91

333.91

320.96

36

18436551-7

20.981

20.981

20.981

9.047

48

18437974-17

2097.572

442.1

442.121

441.091

5

18433236-22

654.976

819.474

654.976

654.977

51

18438135-12

794.575

1409.209

794.575

794.554

51

18438149-18

100.882

108.452

100.882

100.882

58

18438954-4

661.058

661.059

661.06

9.408

64

18439704-15

17.706

17.706

17.706

7.98

71

19094129-20

470.715

470.715

117.418

117.415

75

19094250-10

480.939

150.779

480.94

480.97

78

19094395-8

327.768

327.768

862.083

862.083

79

19094430-15

119.851

119.701

120.066

101.136

79

19094441-17

84.345

84.096

84.232

69.599

83

19094564-15

662.573

411.628

411.628

411.699

83

95602606-12

352.848

609.252

609.256

609.277

91

19094843-7

523.362

1019.271

523.364

523.564

96

19095011-20

1561.274

2592.095

1561.278

1561.667

96

19095014-23

239.718

1177.13

239.723

239.679

 

Determining thresholds for investigation

It is common for structures to optimize to very slightly different geometries, especially due to differences in precision. I found the following trends comparing the 1.9.3 environment with 1.9.5:

Difference in objective term

RMSD difference

Best RMSD difference

Difference in objective term

RMSD difference

Best RMSD difference

0.000-0.100

8e-08 to 6e-04

0e+00 to 6e-04

0.100-0.200

6e-05 to 8e-04

6e-05 to 8e-04

0.200-0.300

5e-04 to 8e-04

5e-04 to 8e-04

0.300-0.400

4e-04 to 4e-04

4e-04 to 4e-04

7.500-7.600

1e-01 to 1e-01

1e-01 to 1e-01

 

ForceBalance 1.9.5 remaining differences

Batch

Target

fb-193

fb-193-reordered

fb-195

fb-196

Batch

Target

fb-193

fb-193-reordered

fb-195

fb-196

123

19095890-3

432.754

859.172

424.808

424.811

19

18434757-17

1352.305

2589.068

1744.044

1743.914

20

18434815-6

83.565

83.565

126.754

126.789

20

18434816-9

71.022

71.022

169.281

169.253

71

19094129-20

470.715

470.715

117.418

117.415

78

19094395-8

327.768

327.768

862.083

862.083

Difference 4: differences in precision

This difference, as detailed in the case study below, originates from minor errors in precision between the systems. Every difference in the above remaining table is explained by this difference. An example of the differences is on GitHub at . This file compares the case study system below (opt-geo-batch-20, 18434815-6) generated from fb-193-reordered and compared to fb-196. (fb-195 and fb-196 generate identical systems; in the 1.9.6 diff, however, I had modified the switching function manually to be turned off for another experiment).

Case study: opt-geo-batch-20, 18434815-6

  • The main differences between the 1.9.3 and 1.9.5 systems here are again at the level of precision

  • Minimizing the 1.9.5 system in the 1.9.3 environments, minimized to the 1.9.5 geometry

  • Minimizing the 1.9.3 systems in the 1.9.5 environment, minimized to the 1.9.3 geometry

Broad diagnosis:

  • I copied the ForceBalance opt-geo target to run-through one-by-one, and minimized each system (as output by an edited version of ForceBalance) in each environment.

Scripts:

  • Python:

  • Submission scripts:

  • Outputs:

 

ForceBalance 1.9.6 remaining differences

Batch

Target

fb-193

fb-193-reordered

fb-195

fb-196

fb-196-crit

Batch

Target

fb-193

fb-193-reordered

fb-195

fb-196

fb-196-crit

108

19095393-10

85.713

85.713

85.713

46.043

85.713

125

19095964-29

557.615

557.896

557.836

527.51

557.836

127

95602811-22

828.404

371.009

828.402

828.471

828.402

132

110312272-7

44.947

44.947

44.947

19.051

44.947

132

110312279-12

19.782

19.782

19.782

11.507

19.782

134

110312364-2

35.291

35.291

35.291

24.732

35.291

134

110312437-22

536.681

831.782

536.684

541.105

536.684

136

110312530-8

796.962

1886.145

1886.142

1885.838

1886.142

142

1760118-25

32.291

32.291

32.291

5.104

32.291

142

1760122-29

37.563

37.565

37.563

10.582

37.563

158

1760623-13

52.588

52.588

52.588

10.515

52.588

160

1760689-13

10.233

10.233

10.233

2.921

10.233

162

1760764-24

14.742

14.742

14.742

6.441

14.742

162

1760765-26

16.341

16.341

16.341

4.872

16.341

165

1760859-25

184.378

184.378

184.378

8.331

184.378

166

1760895-28

16.898

16.898

16.898

5.847

16.898

181

36784452-26

85.997

85.997

85.997

43.28

85.997

19

18434757-17

1352.305

2589.068

1744.044

1743.914

1744.044

20

18434815-6

83.565

83.565

126.754

126.789

126.754

20

18434816-9

71.022

71.022

169.281

169.253

169.281

21

18434921-7

1805.074

1389.509

1805.075

1805.138

1805.075

28

18435721-28

333.911

333.91

333.91

320.96

333.91

36

18436551-7

20.981

20.981

20.981

9.047

20.981

48

18437974-17

2097.572

442.1

442.121

441.091

442.121

5

18433236-22

654.976

819.474

654.976

654.977

654.976

51

18438135-12

794.575

1409.209

794.575

794.554

794.575

58

18438954-4

661.058

661.059

661.06

9.408

661.06

64

18439704-15

17.706

17.706

17.706

7.98

17.706

75

19094250-10

480.939

150.779

480.94

480.97

480.94

79

19094430-15

119.851

119.701

120.066

101.136

120.066

79

19094441-17

84.345

84.096

84.232

69.599

84.232

83

19094564-15

662.573

411.628

411.628

411.699

411.628

83

95602606-12

352.848

609.252

609.256

609.277

609.256

91

19094843-7

523.362

1019.271

523.364

523.564

523.364

96

19095011-20

1561.274

2592.095

1561.278

1561.667

1561.278

96

19095014-23

239.718

1177.13

239.723

239.679

239.723

Difference 5: optimization energy tolerance

In ForceBalance 1.9.6, the energy tolerance in geometry optimization was tweaked from a default of 1e-4 to 1e-2 :

# The unit of force is small, so using the same criterion as for energy minimization can lead to convergence failure. if 'force' in LocalEnergyMinimizer.minimize.__doc__: crit = max(crit, 1e-2) else: crit = max(crit, 1e-8)

Reverting this change (in fb-196-crit) gave back all the 1.9.5 results.

Case study: opt-geo-batch-58, 18438954-4

  • Optimizing the 1.9.5 system in 1.9.6 (with 1.9.5 energy tolerance) re-optimized to the 1.9.5 geometry

  • Optimizing the 1.9.5 system in 1.9.6 (with 1.9.6 energy tolerance) re-optimized to the 1.9.6 geometry

  • Optimizing the 1.9.6 system in 1.9.5 (with 1.9.5 energy tolerance) re-optimized to the 1.9.5 geometry

  • Optimizing the 1.9.6 system in 1.9.5 (with 1.9.6 energy tolerance) re-optimized to the 1.9.6 geometry

  • This change covered all cases in the 1.9.6 table above.