TYK2 Refit Validation
After using the bespoke workflow to optimize the torsions of some of the TYK2 ligands we want to establish how much the parameters have improved the overall description of the molecule. What's more, we also want to see how well the parameters fit molecule fragments transfer back to the parent molecule. In these tests, we hope to establish how the optimization has affected the force field before running FEP calculations.
Bespoke-fit
Each of the TYK2 molecules was fragmented using fragmenter. The WBO threshold was set to 0.03
and all non-rotor ring substituents were removed. A new smirks pattern was then made for each unique combination of 4 atoms running through the central bond fragmented around. 1D Torsion drives were then ran for each rotatable central bond on the public QCArchive using our default specification.
The smirks pattern included atoms up to 1 bond away from the 4 atoms in the dihedral to ensure that the patterns were transferable between fragments but were unique enough to not be miss applied. This is important as it ensures that smirks patterns are optimized in the presence of the parameters they will be used within production and allows self-consistent optimisation of interdependent torsions.
The Forcebalance target TorsionProfile was used with the L1 penalty. The upper energy limit was changed to 10 from 5 and the k value prior was set to 6. This is important as it allows the parameters to move sufficiently away from the starting values in cases where the original guess is poor. What's more due to the introduction of many more torsion terms (parsley uses just 4 k values compared to the 68 made in the bespoke workflow) this also stops the regularization term dominating the error and allows for a much better fit.
Molecule 7
This molecule is a particularly challenging case for the bespoke torsion workflow due to the use of fragmentation resulting in two very similar molecules.
The difficulty here is that the bespoke smirks made for the fragments can easily be misapplied between the torsions. In this case, the bespoke torsion for the O=C-N-H
torsion even with an extra layer of atoms is the same for both fragments. For example, the pattern is [#8AH0X1x0!r+0:1]=!@[#6AH0X3x0!r+0:2](-!@[#6aH0X3x2r6+0])-!@[#7AH1X3x0!r+0:3](-!@[#6aH0X3x2r6+0])-!@[#1AH0X1x0!r+0:4]
and the the atoms matched by this in the fragments are shown below.
However, due to each unique dihedral passing through a central bond getting its own smirks the other dihedral terms are unique to the fragments and offer the extra degrees of freedom required to get a good bespoke fit for each fragment. In this case one of the other smirks is [#6aH0X3x2r6+0:1](:@[#6aH1X3x2r6+0])(:@[#6aH1X3x2r6+0])-!@[#6AH0X3x0!r+0:2](=!@[#8AH0X1x0!r+0])-!@[#7AH1X3x0!r+0:3](-!@[#1AH0X1x0!r+0])-!@[#6aH0X3x2r6+0:4](:@[#6aH1X3x2r6+0]):@[#7aH0X2x2r6+0]
which hits the following atoms.
Validation
In order to validate any performance changes due to the refitting, a simple conformer energy ranking test was used. After running the bespoke workflow high-temperature MD was run on molecule 7 and each of the fitting fragments (using openff_unconstrained-1.0.0
). 1000 confirmations were then extracted and the single point energies were calculated for all 3 molecules on a local QCArchive instance using multiple specifications (default(B3LYP-D3BJ/DZVP), openff-1.0.0/1.2.1/1.3.0 gaff-2.11 and ani2x, all mm specs were unconstrained.) The single point energies were then also recalculated using the bespoke forcefield output at the end of the bespoke workflow.
The Improvement of Openff
Here we see that OpenFF has consistently improved with each release in this test however Gaff-2.11 has the best performance across all of the molecules.
All force fields agree with QM for the parent molecule on the lowest energy structure.
All force fields disagree with QM on fragment 1 on the lowest energy structure
OpenFF-1.0.0 is the only force field to disagree with QM on the lowest energy structure this is however corrected in 1.2.1 and 1.30.
Molecule | Openff-1.0.0 | Openff-1.2.1 | Openff-1.3.0 | Gaff-2.11 |
---|---|---|---|---|
Slope 0.546
RMSE 8.575 kcal/mol
Intercept 7.772
Std Error 0.014 kcal/mol | Slope 0.594
|