* Note: This post contains preliminary valence parameter fitting results, obtained with the QM data currently available from the second-generation training data sets.

Description

This post contains a benchmark of the preliminary valence parameter fitting (v1.2.0-preliminary). The openff-1.2.0-pre.offxml file:

...

...

openff-1.2.0-pre.offxml

Benchmark results for the initial force field, the previously released force fields (v1.0.0, v1.1.0), and the re-fitted force field are provided here for performance comparison.

Fitting Data and Results

  • Fitting targets: 581 1-D torsion profiles; 2,974 optimized geometries; 278 vibrational frequencies

  • Input force field: the same initial force field used in the v1.1.0 fitting

  • The objective function decreased from 1.02809e+04 to 3.21676e+03 in 28 steps.
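As a quick check on the size of that decrease, the fractional reduction can be computed directly from the two objective function values quoted above. This is only an arithmetic sketch; the helper name `relative_reduction` is introduced here for illustration and is not part of any fitting tool.

```python
# Objective function values quoted in the fitting summary above.
initial_objective = 1.02809e+04
final_objective = 3.21676e+03


def relative_reduction(before: float, after: float) -> float:
    """Return the fractional decrease going from `before` to `after`."""
    return (before - after) / before


reduction = relative_reduction(initial_objective, final_objective)
print(f"Objective function reduced by {reduction:.1%}")  # prints 68.7%
```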

...

Force field            χ² for primary (neighboring) set    χ² for full (diverse) set
Initial force field    1,435                               29,469
v1.0.0                 948                                 20,672
v1.1.0                 936                                 20,097
v1.2.0-preliminary     766                                 16,939
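To put the tabulated values on a common scale, the sketch below computes the percent change of v1.2.0-preliminary relative to v1.1.0 for both sets. The numbers are copied from the table above; the dictionary layout and the `percent_change` helper are illustrative choices, not part of the benchmark workflow.

```python
# Values copied from the table above: (primary set, full set).
chi2 = {
    "Initial force field": (1435, 29469),
    "v1.0.0": (948, 20672),
    "v1.1.0": (936, 20097),
    "v1.2.0-preliminary": (766, 16939),
}


def percent_change(reference: float, value: float) -> float:
    """Percent change of `value` relative to `reference` (negative = lower)."""
    return 100.0 * (value - reference) / reference


ref_primary, ref_full = chi2["v1.1.0"]
new_primary, new_full = chi2["v1.2.0-preliminary"]

primary_change = percent_change(ref_primary, new_primary)
full_change = percent_change(ref_full, new_full)

print(f"primary set: {primary_change:.1f}%")  # about -18.2%
print(f"full set:    {full_change:.1f}%")     # about -15.7%
```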

To give a more intuitive view of the benchmark results, we aggregated the resulting data and made the following plots.

...

The y values in the plots (ΔWRMSE) are the difference in WRMSE between v1.2.0-pre and v1.1.0; a negative y value indicates better reproduction by v1.2.0-pre than by v1.1.0. The average change in WRMSE is -1.248, indicating that overall v1.2.0-pre reproduces QM optimized geometries better than v1.1.0.
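The ΔWRMSE bookkeeping described above can be sketched as follows. The per-molecule WRMSE values and molecule names are hypothetical placeholders, not benchmark data; only the sign convention (v1.2.0-pre minus v1.1.0) and the -30 threshold for a "significant" improvement are taken from this post.

```python
from statistics import mean

# Hypothetical per-molecule WRMSE values, for illustration only.
wrmse_v110 = {"mol_a": 12.0, "mol_b": 8.5, "mol_c": 40.0}
wrmse_v120pre = {"mol_a": 11.5, "mol_b": 9.0, "mol_c": 6.0}


def delta_wrmse(new: dict, old: dict) -> dict:
    """Return new - old per molecule; negative means `new` is better."""
    return {name: new[name] - old[name] for name in old}


deltas = delta_wrmse(wrmse_v120pre, wrmse_v110)
average_delta = mean(deltas.values())

# Molecules whose WRMSE dropped by more than 30 (threshold used in this post).
strongly_improved = [name for name, d in deltas.items() if d < -30]

print(deltas)
print(f"average delta WRMSE: {average_delta:.3f}")
print(strongly_improved)
```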

The main contributor to the significant improvement in reproducing QM optimized geometries appears to be the inclusion of the eMolecules discrepancy set (a set of molecules whose geometries differ substantially in smirnoff99Frosst relative to the other force fields) in the generation of the QM training set.

All geometries showing significant improvement with v1.2.0-pre (ΔWRMSE < -30, circled in blue) are deprotonated phosphonates, RP(=O)(OH)(O^-). The input molecule sets (Roche set, Coverage set) used to generate the first-generation optimization dataset for valence parameter fitting did not contain the phosphono group. Using the eMolecules discrepancy set when generating the second-generation optimization dataset added C=C(C(=O)O)OP(=O)(O)O to the new dataset, which made it possible to fit the parameters related to the phosphono group properly.

Here is one example of the improved performance on phosphonates.

...

QM optimized geometry of ([P@@](=O)(O)[O-])[P@](=O)(O)[O-] (transparent red: MM optimized geometry with v1.1.0; transparent green: v1.2.0-pre).

v1.1.0 places the hydroxyl hydrogens midway between the hydroxyl oxygen and the negatively charged oxygen, as if they formed an internal hydrogen bond with the negatively charged oxygen. This does not agree with the hydroxyl hydrogen positions in the QM optimized geometry.

2. Ab Initio Targets

To investigate how well the new parameter set reproduces QM relative energies between conformers, QM and MM relative energies between conformers were calculated at the QM optimized geometries.
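A minimal sketch of such a comparison is shown below. The energies are hypothetical, and the choice of the lowest-QM-energy conformer as the reference is an assumption made here for illustration; the post does not state which reference conformer was used.

```python
def relative_energies(energies: list, reference_index: int) -> list:
    """Return energies relative to the conformer at `reference_index`."""
    reference = energies[reference_index]
    return [e - reference for e in energies]


# Hypothetical single-point energies (kcal/mol) for three conformers,
# both evaluated at the QM optimized geometries.
qm_energies = [-100.0, -98.5, -97.0]
mm_energies = [5.0, 7.0, 7.5]

# Assumption: use the lowest-energy QM conformer as the common reference.
ref = qm_energies.index(min(qm_energies))

qm_relative = relative_energies(qm_energies, ref)
mm_relative = relative_energies(mm_energies, ref)

# Signed error in relative energy (MM minus QM) for each conformer.
relative_energy_errors = [m - q for m, q in zip(mm_relative, qm_relative)]
print(relative_energy_errors)
```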

...

The distribution from v1.2.0-pre is more centered on zero, and its mean absolute deviation (MAD) is smaller than those of v1.1.0 and v1.0.0, indicating that overall v1.2.0-pre reproduces QM energetics better than the older versions.
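The two summary statistics mentioned above can be sketched as follows. The error lists are hypothetical, and MAD is taken here as the mean absolute deviation about the mean; the post does not spell out which definition was used, so treat that as an assumption.

```python
from statistics import mean


def mean_absolute_deviation(values: list) -> float:
    """Mean of |x - mean(x)|; one common definition of MAD (assumed here)."""
    center = mean(values)
    return mean(abs(v - center) for v in values)


# Hypothetical MM - QM relative energy errors (kcal/mol), illustration only.
errors_old = [-2.0, 1.0, 4.0, 1.0]
errors_new = [-0.5, 0.5, 1.0, -1.0]

for label, errors in (("old", errors_old), ("new", errors_new)):
    print(label, "mean:", mean(errors), "MAD:", mean_absolute_deviation(errors))
```

In this toy data the "new" errors have a mean closer to zero and a smaller MAD, which is the pattern the comparison above describes.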

v1.2.0-pre force field file:

openff-1.2.0-pre.offxml