Simpler Fits
Simple_fit1
simple_fit1 replaces t43, t44 and t45 with a single interpolated torsion parameter for the SMARTS pattern "[*:1]~[#6X3:2]~[#6X3:3]~[*:4]". This FF is fitted to 170 targets, drawn from 14 datasets (listed below under Fit 0), that contain a dihedral matching this pattern. The objective function value is compared with the zeroth iteration of several other FFs and fits:
FF | X2 (obj. fn value) |
---|---|
simple_fit1 | 1.656500e+02 |
openff_unconstrained-1.3.0 | 1.97777e+02 |
openff_unconstrained-1.2.0 | 2.13936e+02 |
fit4 | 1.71601e+02 |
fit4.1 | 1.83293e+02 |
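An interpolated torsion parameter replaces a fixed barrier height with one that varies with the Wiberg bond order (WBO) of the central bond. The following is a minimal sketch, assuming linear interpolation between barrier heights defined at bond orders 1 and 2. The function name and the numeric values are hypothetical and are not the fitted values from simple_fit1.

```python
def interpolated_k(wbo, k_bondorder1, k_bondorder2):
    """Linearly interpolate (or extrapolate) a torsion barrier height.

    wbo           -- Wiberg bond order of the central bond
    k_bondorder1  -- barrier height assigned at bond order 1
    k_bondorder2  -- barrier height assigned at bond order 2
    """
    slope = k_bondorder2 - k_bondorder1  # change in k per unit bond order
    return k_bondorder1 + slope * (wbo - 1.0)


# Hypothetical barrier heights (kcal/mol), for illustration only.
k1, k2 = 1.0, 5.0
for wbo in (1.0, 1.25, 1.5, 2.0):
    print(f"WBO {wbo:.2f} -> k = {interpolated_k(wbo, k1, k2):.2f} kcal/mol")
```

Under this assumption a single parameter covers single, aromatic and double central bonds, because the barrier height follows the bond order instead of being tied to one bond type.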
Fit7
Fit7 makes slight changes to Fit4: TIG0 is converted to an interpolated parameter, and TIG1a and TIG1b are removed. The number of targets is 928. Starting parameters are taken from the simple_fit1 and Fit4 values. The parameters optimized are:
['TIG0', 'TIG1c', 'TIG1d', 'TIG2', 'TIG3', 'TIG4', 'TIG5a', 'TIG5b', 'TIG6', 'TIG7', 'TIG8'] - interpolated
FF | X2 (obj. fn value) |
---|---|
Fit7 | 1.547049e+03 |
Fit4 (zeroth iter) | 1.65230e+03 |
simple_fit1 (zeroth iter) | 1.96850e+03 |
openff_1.3.0 | 2.00101e+03 |
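To put the Fit7 table above on a relative scale, the short sketch below computes how much lower the Fit7 objective function value is than each of the other entries. The values are copied from the table; the helper name `percent_reduction` is only illustrative.

```python
# Objective function values copied from the Fit7 table above.
objective = {
    "Fit7": 1.547049e+03,
    "Fit4 (zeroth iter)": 1.65230e+03,
    "simple_fit1 (zeroth iter)": 1.96850e+03,
    "openff_1.3.0": 2.00101e+03,
}


def percent_reduction(baseline, value):
    """Percentage by which `value` is lower than `baseline`."""
    return 100.0 * (baseline - value) / baseline


# Compare Fit7 against every other force field in the table.
reductions = {
    name: percent_reduction(x2, objective["Fit7"])
    for name, x2 in objective.items()
    if name != "Fit7"
}

for name, pct in reductions.items():
    print(f"Fit7 is {pct:.1f}% lower than {name}")
```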
Fit4: Parameters optimized
['TIG0', 'TIG1a', 'TIG1b'] - General torsions
['TIG1c', 'TIG1d', 'TIG2', 'TIG3', 'TIG4', 'TIG5a', 'TIG5b', 'TIG6', 'TIG7', 'TIG8'] - interpolated
On the training set the objective function values are:
FF | Obj. function value |
---|---|
Fit 4 (interpolated) | 881.18 |
Fit 4.1 (non-interpolated) | 899.60 |
Openff-1.3.0 (iteration 0) | 1292.62 |
Fit 7 (iteration 0) | 909.87 |
Fit4.1: For each interpolated parameter, general torsion parameters are created in which the central bond can be a single, aromatic or double bond (denoted by the letters p, q, r at the end of the parameter id). Because too little training data matches those patterns, only a subset of them is trained.
Parameters optimized
['TIG0', 'TIG1a', 'TIG1b', 'TIG3p', 'TIG3r', 'TIG4p', 'TIG5ap', 'TIG5bp', 'TIG1cp', 'TIG6p', 'TIG7p', 'TIG8p', 'TIG2p', 'TIG2r', 'TIG1dp'] - General torsions
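The split described for Fit4.1 can be sketched as follows: every interpolated Fit4 parameter yields three candidate general torsions, and only those listed above are trained. The mapping of p, q, r to single, aromatic and double bonds is assumed from the order in which they are named, and the variable names are illustrative.

```python
# Interpolated parameters from Fit4.
interpolated = ['TIG1c', 'TIG1d', 'TIG2', 'TIG3', 'TIG4',
                'TIG5a', 'TIG5b', 'TIG6', 'TIG7', 'TIG8']

# Assumed suffix meaning, inferred from the order given in the text.
suffixes = {'p': 'single', 'q': 'aromatic', 'r': 'double'}

# Every interpolated parameter gives one candidate per central-bond type.
candidates = [pid + suffix for pid in interpolated for suffix in suffixes]

# Split parameters actually optimized in Fit4.1 (from the list above).
trained = {'TIG3p', 'TIG3r', 'TIG4p', 'TIG5ap', 'TIG5bp', 'TIG1cp',
           'TIG6p', 'TIG7p', 'TIG8p', 'TIG2p', 'TIG2r', 'TIG1dp'}

# Candidates left untrained for lack of matching training data.
untrained = [pid for pid in candidates if pid not in trained]

print(f"{len(candidates)} candidates, {len(trained)} trained, "
      f"{len(untrained)} untrained")
```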
Fit 0
Input FF:
...
['TIG0', 'TIG1a', 'TIG1b'] - General torsions
['TIG1c', 'TIG1d', 'TIG2', 'TIG3', 'TIG4', 'TIG5a', 'TIG5b', 'TIG6', 'TIG7', 'TIG8'] - interpolated
Targets:
'Fragment Stability Benchmark'
'OpenFF Gen 2 Torsion Set 1 Roche 2'
'OpenFF Gen 2 Torsion Set 2 Coverage 2'
'OpenFF Gen 2 Torsion Set 3 Pfizer Discrepancy 2'
'OpenFF Gen 2 Torsion Set 4 eMolecules Discrepancy 2'
'OpenFF Gen 2 Torsion Set 5 Bayer 2'
'OpenFF Gen 2 Torsion Set 6 Supplemental 2'
'OpenFF Group1 Torsions'
'OpenFF Group1 Torsions 2'
'OpenFF Group1 Torsions 3'
'OpenFF Rowley Biaryl v1.0'
'OpenFF Substituted Phenyl Set 1'
'OpenFF-benchmark-ligand-fragments-v1.0'
'SMIRNOFF Coverage Torsion Set 1'
...
Fit0 performs better than 1.3.0 according to the objective function values in the table above. Of the CN and CC central bonds, CN has the lower objective function value, so the CC torsions dominate the overall objective function.
Comparing MM Fits 0, 3 and 1.3.0 with QM
Fit 0 uses all the TIG* parameters, and fit 3 is the non-interpolated version, i.e. the interpolated TIG parameters are split into single, double and aromatic terms.
Some of the better looking TD curves are:
SMILES | QM Vs MM | Structure |
---|---|---|
COc1cccnc1-n1cccn1 | ... | ... |
CC(=O)Nc1cccs1 | ... | ... |
CN(C)c1ccccc1-c1ccccn1 | ... | ... |
Comparing fit 0 (interpolated TIG params), fit 3 (TIG params split into single, double and aromatic terms) and 1.3.0_unconstrained with QM data.
The comparison is done on the training set of molecules, after removing those with in-ring torsions. The table is sorted by the average absolute difference in conformer energies between QM and MM_fit0. The full list of molecules, sorted in ascending order of (QM - MM_fit0), can be seen at https://github.com/MobleyLab/wbointerpolation/blob/main/compare_forcefields.ipynb
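The ranking metric can be sketched as below. This is an illustration and not the code from the linked notebook: it assumes each torsion scan's energies are first made relative to that scan's own minimum, and the scan data and torsion ids are hypothetical.

```python
from statistics import mean


def relative(energies):
    """Shift a scan so that its lowest energy is zero."""
    lowest = min(energies)
    return [e - lowest for e in energies]


def avg_abs_diff(qm, mm):
    """Average absolute difference between relative QM and MM energies."""
    return mean(abs(q - m) for q, m in zip(relative(qm), relative(mm)))


# Hypothetical torsion scans (kcal/mol), one energy per grid point.
scans = {
    "tid_a": {"qm": [0.0, 1.0, 2.0, 1.0], "mm": [0.0, 1.5, 2.5, 0.5]},
    "tid_b": {"qm": [0.0, 2.0, 4.0, 2.0], "mm": [0.1, 2.1, 4.1, 2.1]},
}

# Sort torsions from best to worst agreement with QM.
ranked = sorted(
    scans, key=lambda tid: avg_abs_diff(scans[tid]["qm"], scans[tid]["mm"])
)

for tid in ranked:
    diff = avg_abs_diff(scans[tid]["qm"], scans[tid]["mm"])
    print(f"{tid}: {diff:.3f} kcal/mol")
```

Taking the absolute value before averaging keeps errors of opposite sign at different grid points from cancelling each other out.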
Here are the top 5 molecules whose MM energies from the fit0 (interpolated parameters) FF agree most closely with the QM energies:
Index | Torsion ID | Avg. abs(QM - MM_fit0) kcal/mol | Avg. abs(QM - MM_fit3) kcal/mol | Avg. abs(QM - MM_1.3.0) kcal/mol | Chemical Structure | QM-MM relative energies |
---|---|---|---|---|---|---|
491 | {'tid': '1762178', 'assigned_params': {'fit0': 'TIG3', 'fit3': 'TIG3p', 'openff_unconstrained-1.3.0': 't47'}} | 0.023835 | 0.405296 | 0.866851 | | |
6 | {'tid': '21272427', 'assigned_params': {'fit0': 'TIG4', 'fit3': 'TIG4p', 'openff_unconstrained-1.3.0': 't43'}} | 0.051240 | 0.397015 | 0.125842 | | |
76 | {'tid': '21272438', 'assigned_params': {'fit0': 'TIG5b', 'fit3': 'TIG5bp', 'openff_unconstrained-1.3.0': 't43'}} | 0.062916 | 0.274251 | 0.597345 | | |
628 | {'tid': '21272422', 'assigned_params': {'fit0': 'TIG5b', 'fit3': 'TIG5bp', 'openff_unconstrained-1.3.0': 't43'}} | 0.070763 | 9.416926 | 0.761913 | | |
626 | {'tid': '21540566', 'assigned_params': {'fit0': 'TIG4', 'fit3': 'TIG4p', 'openff_unconstrained-1.3.0': 't43'}} | 0.075898 | 0.410239 | 0.109622 | | |
Here are the last 5 molecules, which have the largest average absolute difference between the QM energies and the fit0 MM energies:
Index | Torsion ID | Avg. abs(QM - MM_fit0) kcal/mol | Avg. abs(QM - MM_fit3) kcal/mol | Avg. abs(QM - MM_1.3.0) kcal/mol | Chemical Structure | QM-MM relative energies |
---|---|---|---|---|---|---|
573 | {'tid': '2703638', 'assigned_params': {'fit0': 'TIG3', 'fit3': 'TIG3p', 'openff_unconstrained-1.3.0': 't48'}} | 5.264862 | 4.419863 | 4.908392 | | |
121 | {'tid': '2703078', 'assigned_params': {'fit0': 'TIG2', 'fit3': 'TIG2r', 'openff_unconstrained-1.3.0': 't77'}} | 5.694254 | 6.126509 | 6.142370 | | |
832 | {'tid': '4269709', 'assigned_params': {'fit0': 'TIG3', 'fit3': 'TIG3p', 'openff_unconstrained-1.3.0': 't43'}} | 6.086503 | 8.061331 | 6.024023 | | |
812 | {'tid': '21272420', 'assigned_params': {'fit0': 'TIG4', 'fit3': 'TIG4p', 'openff_unconstrained-1.3.0': 't47'}} | 6.263980 | 6.102699 | 6.772570 | | |
532 | {'tid': '19953581', 'assigned_params': {'fit0': 'TIG3', 'fit3': 'TIG3p', 'openff_unconstrained-1.3.0': 't43'}} | 6.369817 | 7.591619 | 5.529416 | | |