Sage rc1 refit including dihedral_rmsd in optgeo target
Including the dihedral RMSD in the optgeo target’s objective function, by setting dihedral_denom to a non-zero value (10.0 degrees here), slightly improved the overall RMSD and TFD relative to Sage rc1. There is a slight degradation in ddE.
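To illustrate how a denominator switches a class of internal coordinates on or off, here is a minimal sketch. It is not ForceBalance's actual implementation. It assumes each class contributes the mean of (deviation / denom)**2 and that a denominator of zero drops the class. The function names, the deviations, and the bond, angle and improper denominators are hypothetical; only the dihedral value of 10.0 degrees comes from this refit.

```python
def class_term(deviations, denom):
    """Mean squared scaled deviation for one class of internal coordinates.

    Assumption: a denominator of zero removes the class from the objective.
    """
    if denom == 0.0 or not deviations:
        return 0.0
    return sum((d / denom) ** 2 for d in deviations) / len(deviations)


def optgeo_objective(deviations, denoms):
    """Sum the per-class terms (simplified, unweighted)."""
    return sum(class_term(deviations[name], denoms[name]) for name in denoms)


# Hypothetical MM - QM deviations for one molecule
# (bond lengths in Angstrom, all angles in degrees).
deviations = {"bond": [0.05], "angle": [8.0], "dihedral": [20.0], "improper": [0.0]}

# Hypothetical denominators; dihedral_denom = 0 mimics the rc1 setup.
rc1_denoms = {"bond": 0.05, "angle": 8.0, "dihedral": 0.0, "improper": 20.0}
# Same settings, but with the dihedral term switched on at 10.0 degrees.
refit_denoms = dict(rc1_denoms, dihedral=10.0)

rc1_value = optgeo_objective(deviations, rc1_denoms)
refit_value = optgeo_objective(deviations, refit_denoms)
print(rc1_value, refit_value)
```

With the dihedral denominator at zero, the 20 degree dihedral deviation is invisible to the fit. Once it is non-zero, that deviation dominates the objective for this toy molecule.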
Here are the benchmark plots on the Lim-Hahn optimization set:
Differences in parameters
Δ_Angles:
The maximum change in equilibrium angle is 2.96 degrees and the maximum change in force constant k is 37.06 kcal/(mol rad**2). The corresponding parameters are:
a36 [#6X3:1]-[#16X2:2]-[#6X4:3] 2.96 deg
a38 [*:1]~[#15:2]~[*:3] 37.06 kcal/(mol rad**2)
Percentage difference in values greater than 10%:
Parameter | Δ angle (degrees) | Δ angle (%) | Δ k (kcal/(mol rad**2)) | Δ k (%) |
---|---|---|---|---|
a7 | 2.30 | 2 | 12.35 | 16 |
a8 | -2.57 | -2 | -17.33 | -17 |
a9 | -0.82 | -1 | -17.51 | -19 |
a12 | -2.47 | -2 | 10.98 | 24 |
a13 | 0.34 | 0 | 3.54 | 12 |
a33 | 1.66 | 2 | -24.70 | -29 |
a36 | 2.96 | 3 | 12.29 | 14 |
a37 | -1.30 | -1 | 19.24 | 16 |
a38 | -0.87 | -1 | -37.06 | -29 |
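A filter like the one used for the 10% list can be sketched as follows. This assumes the percentage is taken relative to the Sage rc1 value. The function name, parameter ids and numbers are hypothetical placeholders, not values from either force field.

```python
def significant_changes(old, new, threshold_pct=10.0):
    """Return {param_id: (delta, percent)} for changes above the threshold.

    Assumption: percent change is measured relative to the old value.
    """
    flagged = {}
    for param_id, old_value in old.items():
        delta = new[param_id] - old_value
        percent = 100.0 * delta / old_value
        if abs(percent) > threshold_pct:
            flagged[param_id] = (delta, percent)
    return flagged


# Hypothetical force constants keyed by parameter id.
old_k = {"aX": 100.0, "aY": 200.0}
new_k = {"aX": 125.0, "aY": 205.0}

flagged = significant_changes(old_k, new_k)
print(flagged)
```

The same function works for equilibrium angles or bond lengths by passing those values instead of the force constants.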
Δ_Bonds:
The bond parameters changed little: the maximum change in bond length is 0.05 Angstrom and the maximum change in force constant k is 23.30 kcal/(A**2 mol). The corresponding parameters are:
b55 [#16X4,#16X3:1]-[#8X2:2] 0.052 A
b62 [#15:1]~[#8X2:2] 23.30 kcal/(A**2 mol) ---> corresponds to 4% change in value
All of the bond parameter changes are also insignificant in percentage terms.
Δ_Torsions:
The torsion parameters show noticeable differences. Parameters with a change of at least 0.5 kcal/mol in any k value are listed in this format:
[torsion_id, smirks_pattern, difference in k_values list_in_units_of_kcal_per_mol]
Percentage difference in any of the k-values is greater than 50%:
Inspecting some sulfonamide molecules
1. Here is a molecule that improves noticeably when dihedral_rmsd is included: in the refit, the hydrogens on the left NH2 group are aligned correctly.
1a) 421_C6H7N2O5S2 in 1.2.0 training set
assigned parameters: ['a10', 'a11', 'a18', 'a30', 'a31',
'b14a', 'b5', 'b53', 'b54', 'b56', 'b84', 'b86',
'c1', 'i1', 'i2a', 'n11', 'n14', 'n17', 'n20', 'n21', 'n7',
't112', 't136', 't139', 't44']
Green - QM, Cyan - Sage rc1
Green - QM, Magenta - Sage refit w/ dihedral_rmsd
1b) 937_C15H20N2O3S in 1.2.0 training set
assigned parameters: ['a1', 'a10', 'a11', 'a17', 'a18', 'a19', 'a2', 'a30', 'a31',
'b1', 'b10', 'b2', 'b20', 'b3', 'b5', 'b53', 'b54', 'b56', 'b7', 'b83', 'b84', 'b86', 'b9',
'c1', 'i1', 'i2a', 'i3', 'n11', 'n14', 'n16', 'n17', 'n2', 'n20', 'n21', 'n3', 'n7',
't1', 't109', 't135', 't137', 't139', 't140', 't17', 't18', 't2', 't20', 't3', 't4', 't44', 't51', 't59', 't62', 't69a', 't70b']
Green - QM, Cyan - Sage rc1
Green - QM, Magenta - Sage refit w/ dihedral_rmsd
1c) 101_C4H10N2OS in 1.2.0 training set
assigned parameters: ['a1', 'a17', 'a18', 'a2', 'a23', 'a3', 'a30', 'a31', 'a4', 'a6',
'b1', 'b53', 'b54', 'b56', 'b7', 'b83', 'b86',
'c1', 'i2a', 'n11', 'n16', 'n17', 'n2', 'n20', 'n21', 'n3',
't109', 't134', 't135', 't137', 't139', 't140', 't145', 't15', 't16', 't54', 't55', 't56', 't57']
Green - QM, Cyan - Sage rc1
Green - QM, Magenta - Sage refit w/ dihedral_rmsd
Some sulfonamide systems (some are conformers of the same molecule) from the 1.2.0 targets that had a dihedral RMSD above 40 degrees with Sage rc1 and improved with the refit:
Force field | System | Bond RMSD | Angle RMSD | Dihedral RMSD | Improper RMSD |
---|---|---|---|---|---|
Sage rc1 | 592_C6H7N2O5S2 | 0.033 | 2.95 | 41.93 | 0.81 |
refit w/ dih_rmsd | 592_C6H7N2O5S2 | 0.036 | 3.08 | 7.54 | 0.77 |
Sage rc1 | 937_C15H20N2O3S | 0.010 | 2.00 | 40.13 | 0.76 |
refit w/ dih_rmsd | 937_C15H20N2O3S | 0.011 | 1.79 | 8.67 | 0.93 |
Sage rc1 | 415_C6H7N2O5S2 | 0.033 | 2.95 | 42.09 | 0.79 |
refit w/ dih_rmsd | 415_C6H7N2O5S2 | 0.036 | 3.08 | 7.69 | 0.75 |
Sage rc1 | 911_C15H20N2O3S | 0.010 | 2.00 | 40.17 | 0.76 |
refit w/ dih_rmsd | 911_C15H20N2O3S | 0.011 | 1.79 | 8.73 | 0.93 |
Sage rc1 | 421_C6H7N2O5S2 | 0.040 | 7.86 | 44.56 | 1.54 |
refit w/ dih_rmsd | 421_C6H7N2O5S2 | 0.043 | 7.41 | 16.30 | 1.42 |
Sage rc1 | 101_C4H10N2OS | 0.032 | 4.23 | 42.64 | 1.32 |
refit w/ dih_rmsd | 101_C4H10N2OS | 0.029 | 3.88 | 11.41 | 1.14 |
Sage rc1 | 842_C6H7N2O5S2 | 0.033 | 2.97 | 41.68 | 0.72 |
refit w/ dih_rmsd | 842_C6H7N2O5S2 | 0.035 | 3.08 | 7.69 | 0.73 |
Sage rc1 | 720_C6H7N2O5S2 | 0.033 | 2.97 | 41.69 | 0.72 |
refit w/ dih_rmsd | 720_C6H7N2O5S2 | 0.035 | 3.08 | 7.69 | 0.73 |
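For reference, a dihedral RMSD of the kind tabulated above can be sketched as follows. This assumes the QM and MM dihedrals are given in degrees in matching order and that each difference is wrapped into [-180, 180) before squaring. The function name and the angles are hypothetical, and the benchmark scripts may differ in detail.

```python
import math


def dihedral_rmsd(qm_degrees, mm_degrees):
    """RMSD between two lists of dihedral angles, with periodic wrapping."""
    if len(qm_degrees) != len(mm_degrees) or not qm_degrees:
        raise ValueError("need two non-empty lists of equal length")
    squared = 0.0
    for qm, mm in zip(qm_degrees, mm_degrees):
        # Wrap the difference so that 179 and -179 are 2 degrees apart, not 358.
        diff = (mm - qm + 180.0) % 360.0 - 180.0
        squared += diff * diff
    return math.sqrt(squared / len(qm_degrees))


# Hypothetical dihedrals (degrees) for one molecule.
qm = [179.0, -60.0]
mm = [-179.0, -50.0]

rmsd = dihedral_rmsd(qm, mm)
print(round(rmsd, 2))
```

Without the wrapping step, the first pair alone would contribute a spurious 358 degree deviation.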
2. Here is a molecule that shows a slight degradation:
1676_C7Cl2H5N3O2S3 in 1.2.0 training set
assigned parameters: ['a1', 'a10', 'a14', 'a19', 'a2', 'a20', 'a22', 'a30', 'a31', 'a35',
'b13', 'b2', 'b34', 'b4', 'b49', 'b53', 'b54', 'b56', 'b6', 'b69', 'b8', 'b83', 'b84', 'b86',
'c1', 'i1', 'i3', 'i4', 'n11', 'n14', 'n16', 'n17', 'n2', 'n20', 'n21', 'n24', 'n7',
't106', 't108', 't112', 't129', 't136', 't139', 't141', 't143', 't20', 't24', 't43', 't45', 't69a', 't70', 't77']
Green - QM, Cyan - Sage rc1
Green - QM, Magenta - Sage refit w/ dihedral_rmsd
The molecule above and another of its conformers show a higher dihedral RMSD with the refit:
Force field | System | Bond RMSD | Angle RMSD | Dihedral RMSD | Improper RMSD |
---|---|---|---|---|---|
Sage rc1 | 1676_C7Cl2H5N3O2S3 | 0.019 | 2.70 | 28.16 | 1.48 |
refit w/ dih_rmsd | 1676_C7Cl2H5N3O2S3 | 0.020 | 3.08 | 40.47 | 1.24 |
Sage rc1 | 1659_C7Cl2H5N3O2S3 | 0.019 | 2.70 | 28.31 | 1.49 |
refit w/ dih_rmsd | 1659_C7Cl2H5N3O2S3 | 0.020 | 3.08 | 40.38 | 1.25 |
Torsiondrives in Sage training set matching
[t26, t31, t64, t118, t135, t139, t154]
For the torsions listed above, which change by more than 0.5 kcal/mol in the refit, the torsion profiles were plotted for the matching molecules in the Sage training set. The residuals overlap in both cases, and the full MM torsion profiles and their RMSDs show little improvement or degradation, except in one or two cases below. This looks like a case of multiple minima: including the dihedral RMSD in the optgeo target’s objective function improves the optimized geometries, but the resulting change in torsion parameter values produces no significant change in the profiles.
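The "RMSDs wrt QM" column can be understood with a sketch like the one below. This assumes both profiles are shifted to zero at the grid point where the QM energy is lowest, which may not match the plotting scripts exactly. The function names and energies are hypothetical.

```python
import math


def relative_to_qm_minimum(qm_energies, mm_energies):
    """Shift both profiles so they are zero at the QM minimum grid point."""
    ref = min(range(len(qm_energies)), key=lambda i: qm_energies[i])
    qm_rel = [e - qm_energies[ref] for e in qm_energies]
    mm_rel = [e - mm_energies[ref] for e in mm_energies]
    return qm_rel, mm_rel


def profile_rmsd(qm_energies, mm_energies):
    """RMSD between the relative MM and QM torsion profiles (same units in, same out)."""
    qm_rel, mm_rel = relative_to_qm_minimum(qm_energies, mm_energies)
    squared = sum((m - q) ** 2 for q, m in zip(qm_rel, mm_rel))
    return math.sqrt(squared / len(qm_rel))


# Hypothetical energies (kcal/mol) on a four-point torsion grid.
qm_profile = [0.0, 2.0, 5.0, 2.0]
mm_profile = [1.0, 2.0, 7.0, 4.0]

rmsd = profile_rmsd(qm_profile, mm_profile)
print(round(rmsd, 3))
```

Running this once per force field for the same torsiondrive gives two RMSD values that can be compared directly.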
Id | param | chemical structure | Residuals | Relative energies | RMSDs wrt QM |
---|---|---|---|---|---|
18536987 | t26 | | | | |
18886218 | t26 | | | | |
18886217 | t26 | | | | |
18886214 | t26 | | | | |
18536986 | t26 | | | | |
18886215 | t26 | | | | |
18536998 | t31 | | | | |
18886228 | t31 | | | | |
18536052 | t31 | | | | |
18535855 | t64 | | | | |
18886290 | t64 | | | | |
18537049 | t64 | | | | |
18886280 | t64 | | | | |
18536071 | t64 | | | | |
18045648 | t64 | | | | |
18535856 | t64 | | | | |
18886474 | t118 | | | | |
6098576 | t118 | | | | |
18537139 | t118 | | | | |
18537138 | t118 | | | | |
18886475 | t118 | | | | |
18535893 | t135 | | | | |
18886517 | t135 | | | | |
18886515 | t135 | | | | |
2703135 | t135 | | | | |
18886530 | t139 | | | | |
18536128 | t154 | | | | |