Fitting TIG* parameters
Simpler Fits
Simple_fit1
Replaces t43, t44, and t45 with a single interpolated torsion parameter for the SMARTS pattern "[*:1]~[#6X3:2]~[#6X3:3]~[*:4]". This FF is fit to the 170 targets, drawn from the 14 datasets listed below under Fit 0, that have a dihedral matching this pattern. The objective function value is compared to the zeroth iteration of various other FFs and fits:
FF | X2 (obj. fn value) |
---|---|
simple_fit1 | 1.656500e+02 |
openff_unconstrained-1.3.0 | 1.97777e+02 |
openff_unconstrained-1.2.0 | 2.13936e+02 |
fit4 | 1.71601e+02 |
fit4.1 | 1.83293e+02 |
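As background on what "interpolated" means here, the following minimal sketch assumes the usual SMIRNOFF convention: the torsion force constant is linearly interpolated (or extrapolated) between values defined at bond orders 1 and 2, using the Wiberg bond order (WBO) of the central bond. The function name and the k values are hypothetical and are not taken from any fit on this page.

```python
def interpolated_k(wbo, k_bondorder1, k_bondorder2):
    """Linearly interpolate a torsion force constant from the central-bond WBO.

    k_bondorder1 and k_bondorder2 are the force constants at bond orders
    1 and 2; values outside that range are extrapolated along the same line.
    """
    slope = k_bondorder2 - k_bondorder1  # change in k per unit of bond order
    return k_bondorder1 + slope * (wbo - 1.0)


# Hypothetical force constants (kcal/mol), for illustration only.
k_single, k_double = 1.0, 10.0
for wbo in (1.0, 1.25, 1.5, 2.0):
    print(f"WBO {wbo:.2f} -> k = {interpolated_k(wbo, k_single, k_double):.3f}")
```

With a single interpolated parameter, torsions with different central-bond conjugation get different barrier heights without needing separate SMARTS patterns such as t43, t44, and t45.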
Fit7
A slight variation of Fit4: TIG0 is converted to an interpolated parameter, and TIG1a and TIG1b are removed. The fit uses 928 targets, with starting parameters taken from simple_fit1 and Fit4. Parameters optimized:
['TIG0', 'TIG1c', 'TIG1d', 'TIG2', 'TIG3', 'TIG4', 'TIG5a', 'TIG5b', 'TIG6', 'TIG7', 'TIG8'] - interpolated
FF | X2 (obj. fn value) |
---|---|
Fit7 | 1.547049e+03 |
Fit4 (zeroth iter) | 1.65230e+03 |
simple_fit1 (zeroth iter) | 1.96850e+03 |
openff_1.3.0 | 2.00101e+03 |
Fit4: Parameters optimized
['TIG0', 'TIG1a', 'TIG1b'] - General torsions
['TIG1c', 'TIG1d', 'TIG2', 'TIG3', 'TIG4', 'TIG5a', 'TIG5b', 'TIG6', 'TIG7', 'TIG8'] - interpolated
On the training set the objective function values are:
FF | Obj. function value |
---|---|
Fit 4 (interpolated) | 881.18 |
Fit 4.1 (non-interpolated) | 899.60 |
Openff-1.3.0 (iteration 0) | 1292.62 |
Fit 7 (iteration 0) | 909.87 |
Fit4.1: For each interpolated parameter, a general torsion parameter is created in which the central bond can be a single, aromatic, or double bond (denoted by the letters p, q, r at the end of the parameter id). Because too little training data matches those patterns, only a subset of these parameters is trained.
Parameters optimized
['TIG0', 'TIG1a', 'TIG1b',
'TIG3p', 'TIG3r', 'TIG4p', 'TIG5ap', 'TIG5bp', 'TIG1cp', 'TIG6p', 'TIG7p', 'TIG8p', 'TIG2p', 'TIG2r', 'TIG1dp'] - General torsions
Fit 0
Input FF:
Parameters to optimize:
['TIG0', 'TIG1a', 'TIG1b'] - General torsions
['TIG1c', 'TIG1d', 'TIG2', 'TIG3', 'TIG4', 'TIG5a', 'TIG5b', 'TIG6', 'TIG7', 'TIG8'] - interpolated
Targets:
'Fragment Stability Benchmark'
'OpenFF Gen 2 Torsion Set 1 Roche 2'
'OpenFF Gen 2 Torsion Set 2 Coverage 2'
'OpenFF Gen 2 Torsion Set 3 Pfizer Discrepancy 2'
'OpenFF Gen 2 Torsion Set 4 eMolecules Discrepancy 2'
'OpenFF Gen 2 Torsion Set 5 Bayer 2'
'OpenFF Gen 2 Torsion Set 6 Supplemental 2'
'OpenFF Group1 Torsions'
'OpenFF Group1 Torsions 2'
'OpenFF Group1 Torsions 3'
'OpenFF Rowley Biaryl v1.0'
'OpenFF Substituted Phenyl Set 1'
'OpenFF-benchmark-ligand-fragments-v1.0'
'SMIRNOFF Coverage Torsion Set 1'
Total number of targets excluding Lim Mobley benchmarks = 2746
QCA tdr_objects to exclude are in this file
Fit 1
Input FF:
Without excluding the in-ring torsions
Parameters to optimize:
['TIG0'] - General torsion
['TIG1c', 'TIG1d', 'TIG2', 'TIG3', 'TIG4', 'TIG5a', 'TIG5b', 'TIG6', 'TIG7', 'TIG8'] - interpolated
Targets: same as in Fit 0
Fit 2
Breaks up the interpolated parameters into single, aromatic, and double (wherever possible) bond general torsion terms. These are named as extensions of the earlier TIG parameters, with p, q, or r appended for single, aromatic, and double bonds respectively. Wherever a carbonyl carbon is implied on the central bond there are no central double bonds, so not all parameters have an 'r' extension. The high torsion barrier filters TIG1a and TIG1b are excluded so that double and aromatic bonds won't get filtered.
Input FF:
Parameters to optimize:
['TIG0', 'TIG1cp', 'TIG1cq', 'TIG1dp', 'TIG1dq', 'TIG1dr', 'TIG2p', 'TIG2q', 'TIG2r', 'TIG3p', 'TIG3q', 'TIG3r', 'TIG4p', 'TIG4q', 'TIG4r', 'TIG5ap', 'TIG5aq', 'TIG5bp', 'TIG5bq', 'TIG5br', 'TIG6p', 'TIG6q', 'TIG6r', 'TIG7p', 'TIG7q', 'TIG7r', 'TIG8p', 'TIG8q']
Targets: same as in Fit 0
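The naming scheme for the split parameters can be sketched in a few lines of Python. This is only an illustration of the p/q/r convention described above. The set of parameters without a double-bond ('r') term is an assumption inferred from the Fit 2 parameter list, and the helper name is hypothetical.

```python
# Suffix convention from the Fit 2 description.
BOND_SUFFIXES = {"p": "single", "q": "aromatic", "r": "double"}

INTERPOLATED = ["TIG1c", "TIG1d", "TIG2", "TIG3", "TIG4",
                "TIG5a", "TIG5b", "TIG6", "TIG7", "TIG8"]

# Assumption: inferred from the Fit 2 list, where these ids have no 'r' term
# (a carbonyl carbon on the central bond rules out a central double bond).
NO_DOUBLE_BOND = {"TIG1c", "TIG5a", "TIG8"}


def split_parameter_ids(param_ids, no_double):
    """Return the p/q/r general-torsion ids for each interpolated id."""
    split = []
    for pid in param_ids:
        for suffix in BOND_SUFFIXES:
            if suffix == "r" and pid in no_double:
                continue  # no central double bond possible for this pattern
            split.append(pid + suffix)
    return split


fit2_ids = ["TIG0"] + split_parameter_ids(INTERPOLATED, NO_DOUBLE_BOND)
print(len(fit2_ids), fit2_ids)
```

Generating the ids this way makes it easy to check that the hand-written list is complete and that no 'r' term was added for a pattern that cannot have one.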
Fit 3
Corrected the phase of non-interpolated parameters (from Fit 2)
Results of fits
Objective fn. | Full |
---|---|
Fit 0: TIG* | 5.4766E+03 |
Fit 1: TIG* without filtering ring-torsions | 5.4807E+03 |
Fit 2: non-interpolated with 2 phases | 3.0181E+05 |
Fit 3: non-interpolated with 1 phase | 6.6621E+03 |
Chaya's dataset only using fit0-FF | 3.2455E+02 |
OpenFF_1.3.0 (Iter 0 on TIG dataset) | 5.9620E+03 |
Iter 0 with CN or CC central bonds only | |
CN only TIGs [1a, 1c, 1d, 2, 6, 7, 8] + [t43, 44, 45] | 5.4844E+03 |
CC only TIGs [0, 1b, 3, 4, 5a, 5b] + [t69, 69a, 76, 77, 78] | 5.8771E+03 |
Based on the objective function values in the table above, Fit 0 performs better than OpenFF 1.3.0 on the TIG dataset. Of the CN-only and CC-only central-bond cases, CN has the lower objective function value, so the CC terms have the more dominant effect on the overall objective function.
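To put the table in relative terms, the short sketch below ranks the fits by objective function value and reports the percent change against the OpenFF 1.3.0 iteration-0 value. The numbers are copied from the "Full" column above; the helper name is hypothetical. The comparison is only a rough guide, since objective values from fits with different parameter sets may not be strictly comparable.

```python
# Objective function values copied from the "Full" column of the results table.
BASELINE_130 = 5.9620e3  # OpenFF_1.3.0 (Iter 0 on TIG dataset)
objective = {
    "Fit 0": 5.4766e3,
    "Fit 1": 5.4807e3,
    "Fit 2": 3.0181e5,
    "Fit 3": 6.6621e3,
}


def percent_change(value, reference):
    """Signed percent change of value relative to reference."""
    return 100.0 * (value - reference) / reference


# Rank fits from lowest (best) to highest objective function value.
ranked = sorted(objective.items(), key=lambda item: item[1])
for name, value in ranked:
    change = percent_change(value, BASELINE_130)
    print(f"{name}: {value:.4e} ({change:+.1f}% vs. OpenFF 1.3.0)")
```

A negative percentage means the fit has a lower objective function value than the 1.3.0 baseline.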
Comparing MM Fits 0, 3 and 1.3.0 with QM
Fit 0 uses all the TIG* parameters; Fit 3 is the non-interpolated version, i.e., the interpolated TIG parameters split into single, double, and aromatic terms. Both are compared with openff_unconstrained-1.3.0 and with QM data.
The comparison is done on the training set of molecules after removing those with in-ring torsions. The table is sorted by the average absolute difference in conformer energies between QM and MM_fit0. A full list of molecules sorted in ascending order of (QM - MM_fit0) can be seen in compare_forcefields.ipynb in the MobleyLab/wbointerpolation repository (main branch).
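The sorting metric can be sketched as follows. This is not the notebook's code. It only illustrates the average absolute difference between QM and MM conformer energies along a torsion scan, using made-up relative energies and hypothetical torsion ids.

```python
from statistics import mean


def mean_abs_diff(qm_energies, mm_energies):
    """Average of |QM - MM| over the conformers of one torsion scan."""
    if len(qm_energies) != len(mm_energies):
        raise ValueError("QM and MM scans must have the same number of points")
    return mean(abs(q - m) for q, m in zip(qm_energies, mm_energies))


# Made-up relative energies (kcal/mol), for illustration only.
torsions = {
    "tid_A": {"qm": [0.0, 1.2, 3.4, 1.0], "mm_fit0": [0.0, 1.0, 3.0, 1.1]},
    "tid_B": {"qm": [0.0, 2.0, 6.0, 2.5], "mm_fit0": [0.0, 4.0, 1.0, 0.5]},
}

# Sort torsion ids from best to worst agreement with QM.
ranked = sorted(
    torsions,
    key=lambda tid: mean_abs_diff(torsions[tid]["qm"], torsions[tid]["mm_fit0"]),
)
for tid in ranked:
    diff = mean_abs_diff(torsions[tid]["qm"], torsions[tid]["mm_fit0"])
    print(f"{tid}: {diff:.3f} kcal/mol")
```

The same function can be applied per force field (fit0, fit3, 1.3.0) to fill the three "Avg. abs" columns of the tables.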
Here are the top 5 molecules, i.e., those whose MM energies from the Fit 0 interpolated-parameter FF agree best with the QM energies:
Index | Torsion ID | Avg. abs(QM - MM_fit0) kcal/mol | Avg. abs(QM - MM_fit3) kcal/mol | Avg. abs(QM - MM_1.3.0) kcal/mol | Chemical Structure | QM-MM relative energies |
---|---|---|---|---|---|---|
491 | {'tid': '1762178', 'assigned_params': {'fit0': 'TIG3', 'fit3': 'TIG3p', 'openff_unconstrained-1.3.0': 't47'}} | 0.023835 | 0.405296 | 0.866851 | ||
6 | {'tid': '21272427', 'assigned_params': {'fit0': 'TIG4', 'fit3': 'TIG4p', 'openff_unconstrained-1.3.0': 't43'}} | 0.051240 | 0.397015 | 0.125842 | ||
76 | {'tid': '21272438', 'assigned_params': {'fit0': 'TIG5b', 'fit3': 'TIG5bp', 'openff_unconstrained-1.3.0': 't43'}} | 0.062916 | 0.274251 | 0.597345 | ||
628 | {'tid': '21272422', 'assigned_params': {'fit0': 'TIG5b', 'fit3': 'TIG5bp', 'openff_unconstrained-1.3.0': 't43'}} | 0.070763 | 9.416926 | 0.761913 | ||
626 | {'tid': '21540566', 'assigned_params': {'fit0': 'TIG4', 'fit3': 'TIG4p', 'openff_unconstrained-1.3.0': 't43'}} | 0.075898 | 0.410239 | 0.109622 |
Here are the last 5 molecules, i.e., those with the largest average absolute difference between the Fit 0 MM energies and the QM energies:
Index | Torsion ID | Avg. abs(QM - MM_fit0) kcal/mol | Avg. abs(QM - MM_fit3) kcal/mol | Avg. abs(QM - MM_1.3.0) kcal/mol | Chemical Structure | QM-MM relative energies |
---|---|---|---|---|---|---|
573 | {'tid': '2703638', 'assigned_params': {'fit0': 'TIG3', 'fit3': 'TIG3p', 'openff_unconstrained-1.3.0': 't48'}} | 5.264862 | 4.419863 | 4.908392 | ||
121 | {'tid': '2703078', 'assigned_params': {'fit0': 'TIG2', 'fit3': 'TIG2r', 'openff_unconstrained-1.3.0': 't77'}} | 5.694254 | 6.126509 | 6.142370 | ||
832 | {'tid': '4269709', 'assigned_params': {'fit0': 'TIG3', 'fit3': 'TIG3p', 'openff_unconstrained-1.3.0': 't43'}} | 6.086503 | 8.061331 | 6.024023 | ||
812 | {'tid': '21272420', 'assigned_params': {'fit0': 'TIG4', 'fit3': 'TIG4p', 'openff_unconstrained-1.3.0': 't47'}} | 6.263980 | 6.102699 | 6.772570 | ||
532 | {'tid': '19953581', 'assigned_params': {'fit0': 'TIG3', 'fit3': 'TIG3p', 'openff_unconstrained-1.3.0': 't43'}} | 6.369817 | 7.591619 | 5.529416 |