Currently, our Sage force fields do not incorporate energetics into the fitting targets, except in the form of torsion drives. We wanted to explore the effect of fitting primarily to ab initio energy targets. Here we fit both relative conformer energies and forces.
Dataset preparation
To start with, I converted the Sage optimized geometry training set to ab initio targets.
...
Additionally, ForceBalance filters out conformers that are too high-energy during the fitting process. If there are any molecules that have only one conformer that is below this cutoff and you are using the default FB weighting, these molecules need to be treated as single-conformer molecules and either filtered out or set to fit forces only. Using our dataset, this led to 2 additional molecules being filtered out (though results are not reported).
Fitting
The bulk of this project involved determining the correct input parameters to fit the targets.
Initially the fit was started from the Sage 2.2 Force Balance input file, with weights of 1.0 for energy and 0.01 for forces, with force_rms_override 10
and energy_rms_override 1
. The ab initio fitting targets have very different magnitudes and balances than the optimized geometry targets, so this yielded unphysical results with enormous changes compared to the starting value.
In an attempt to reduce the variation compared to the starting values of the force field parameters, we increased the penalty term to penalize large deviations. This led to smaller deviations from the starting values, but did not address the underlying problem.
Next, I attempted to balance the contribution of the different targets by adjusting the weights and energy_rms_override
and force_rms_override
keywords, reducing the overall penalty, and decreasing the prior on angle and bond k’s which were deviating the most from the starting values. The guiding principle was to choose values of the priors that would keep the mvals
for each step below ~0.5, keep the objective function contribution for each target between 1-10, and balance the contribution to each target from the energy and force. The bond and angle k priors were set to 10, and the parameters chosen were
Code Block |
---|
w_energy 1.0
w_force 1.0
energy_rms_override 20.0
force_rms_override 100.0 |
This led to all the targets being between ~0.1 - 3. More tuning could be done to achieve the 1-10 range.
Results
Tuning the penalty function
Force Balance’s penalty function is:
Penalty = (1 + X)L + A
Where
A = 1
L = loss function
X = 0.5, 0.25, 0.1
By tuning X to be larger, we can keep the result closer to the MSM starting point. Using X = 0.5 gives parameters that have barely changed from the MSM starting value.
View file | ||
---|---|---|
|
View file | ||
---|---|---|
|
View file | ||
---|---|---|
|
Unfortunately, all values of X in the penalty function lead to degradation in performance across all 3 benchmarking metrics, even ddE.
Balancing the target weights
Balancing the target weights as described above is a more physically motivated way to approach the problem. Unfortunately, it gave even worse performance than changing the penalty function.
View file | ||
---|---|---|
|
View file | ||
---|---|---|
|
View file | ||
---|---|---|
|
Discussion/conclusion
While degradation is expected on the structural metrics since we are no longer fitting geometries, we would expect the ddE metric to improve. Because it consistently performs worse than Sage, we tabled the project.
Future work
Given the lackluster results for even the ddE targets, we decided not to move forward with this direction. However, other considerations that could lead to improvements would be:
Tuning the weights to balance the targets better
Setting single-conformer molecules to fit forces only, rather than filtering them out
Using the QM minimum geometry and some off-equilibrium geometries, rather than other minimum-energy conformers, as this would lead to much larger/easier to fit energy differences and represent more of the PES
Ensure there is a large enough training set with more chemical diversity. Our training set has some molecules with 10 conformers, and others with 1 or 2. This will over-weight molecules/moieties that have more conformers.
Fit energies and forces in conjunction with other targets