Sage rc1 refit including dihedral_rmsd in optgeo target

Including the dihedral RMSD in the optgeo target’s objective function, by setting dihedral_denom to a non-zero value (10.0 degrees here), slightly improved the overall RMSD and TFD compared to Sage rc1. There is a slight degradation in ddE.
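How a non-zero denominator switches a term on can be illustrated with a minimal sketch. It assumes the objective is a sum of squared internal-coordinate deviations, each divided by its denominator, with a zero denominator dropping that term. The function names, the bond denominator and the coordinate values are hypothetical; only the 10.0-degree dihedral denominator comes from the text above.

```python
def wrap_degrees(delta):
    """Map an angular difference onto [-180, 180), so 179 vs -171 counts as 10."""
    return (delta + 180.0) % 360.0 - 180.0


def optgeo_objective(mm, qm, denoms):
    """Sum of squared, denominator-scaled deviations between MM and QM geometries.

    mm, qm: dicts mapping a coordinate type to a list of values.
    denoms: dict mapping the same types to a denominator; 0 excludes that type.
    This is an assumed functional form, not the fitting code itself.
    """
    total = 0.0
    for kind, denom in denoms.items():
        if denom == 0:
            continue  # zero denominator: this term does not contribute
        for mm_value, qm_value in zip(mm[kind], qm[kind]):
            diff = mm_value - qm_value
            if kind == "dihedrals":
                diff = wrap_degrees(diff)  # dihedrals are periodic
            total += (diff / denom) ** 2
    return total


# Hypothetical geometry: one bond (Angstrom) and one dihedral (degrees).
MM = {"bonds": [1.50], "dihedrals": [179.0]}
QM = {"bonds": [1.55], "dihedrals": [-171.0]}

without_dihedrals = optgeo_objective(MM, QM, {"bonds": 0.05, "dihedrals": 0.0})
with_dihedrals = optgeo_objective(MM, QM, {"bonds": 0.05, "dihedrals": 10.0})
```

With the dihedral denominator at zero, only the bond deviation is penalized. Setting it to 10.0 adds a penalty of one unit for each 10 degrees of dihedral deviation, which is what lets the fit respond to misaligned torsions in the optimized geometries.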

Here are the benchmark plots on the Lim-Hahn optimization set:

Differences in parameters

Δ_Angles:

The maximum change in equilibrium angle is 2.96 degrees, and the maximum change in k value is 37.06 kcal/(mol rad**2) in magnitude. These are the parameters:

a36 [#6X3:1]-[#16X2:2]-[#6X4:3] 2.96 deg
a38 [*:1]~[#15:2]~[*:3] 37.06 kcal/(mol rad**2)

Parameters whose equilibrium angle or k value changed by more than 10%:

Parameter  Δangle (degrees)  Δangle (%)  Δk (kilocalorie/(mole*radian**2))  Δk (%)
a7         2.30              2%          12.35                              16%
a8         -2.57             -2%         -17.33                             -17%
a9         -0.82             -1%         -17.51                             -19%
a12        -2.47             -2%         10.98                              24%
a13        0.34              0%          3.54                               12%
a33        1.66              2%          -24.70                             -29%
a36        2.96              3%          12.29                              14%
a37        -1.30             -1%         19.24                              16%
a38        -0.87             -1%         -37.06                             -29%
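The percentage filter used for this list can be sketched as below. This is a minimal illustration, assuming each force field is available as a plain dictionary of parameter values. The parameter ids `aX` and `aY` and their numbers are hypothetical placeholders, not values from either force field.

```python
def percent_change(old, new):
    """Signed change of `new` relative to `old`, in percent."""
    return 100.0 * (new - old) / old


def flag_changes(old_params, new_params, threshold=10.0):
    """Return parameters where any quantity changed by more than `threshold` percent."""
    flagged = {}
    for pid, old in old_params.items():
        new = new_params[pid]
        changes = {name: percent_change(old[name], new[name]) for name in old}
        if any(abs(change) > threshold for change in changes.values()):
            flagged[pid] = changes
    return flagged


# Hypothetical values: equilibrium angle in degrees, k in kcal/(mol rad**2).
OLD = {"aX": {"angle": 100.0, "k": 100.0}, "aY": {"angle": 110.0, "k": 200.0}}
NEW = {"aX": {"angle": 102.0, "k": 120.0}, "aY": {"angle": 111.1, "k": 190.0}}

flagged = flag_changes(OLD, NEW)
```

Here `aX` is flagged because its k value changes by 20%, even though its angle changes by only 2%. This mirrors the list above, where every flagged parameter is there because of its k value.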

 

Δ_Bonds:

The bond parameters changed little. The maximum change in bond length is 0.05 Angstrom, and the maximum change in k value is 23 units. These are the parameters:

b55 [#16X4,#16X3:1]-[#8X2:2] 0.052 A
b62 [#15:1]~[#8X2:2] 23.30 kcal/(A**2 mol) ---> corresponds to a 4% change in value

All of the bond parameter changes are insignificant, even as percentages.

Δ_Torsions:

The torsion parameters changed noticeably. Here is a list of the parameters that show a difference of at least 0.5 kcal/mol in any of their k-values:

[torsion_id, smirks_pattern, difference in k_values list_in_units_of_kcal_per_mol]

Parameters for which the percentage difference in any of the k-values is greater than 50%:
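The two torsion filters in this section (at least 0.5 kcal/mol absolute difference, and more than 50% relative difference, in any k-value) can be sketched as follows. This is a minimal illustration that emits rows in the [torsion_id, smirks_pattern, differences] layout given above. The ids `tA`, `tB`, `tC`, the generic SMIRKS and all k-values are hypothetical.

```python
def k_differences(old_k, new_k):
    """Per-term change in k (new minus old), in kcal/mol."""
    return [new - old for old, new in zip(old_k, new_k)]


def exceeds_abs(old_k, new_k, min_diff=0.5):
    """True if any k-value moved by at least `min_diff` kcal/mol."""
    return any(abs(d) >= min_diff for d in k_differences(old_k, new_k))


def exceeds_percent(old_k, new_k, min_percent=50.0):
    """True if any non-zero k-value changed by more than `min_percent` percent."""
    return any(
        old != 0 and abs(new - old) / abs(old) * 100.0 > min_percent
        for old, new in zip(old_k, new_k)
    )


def report(old_ff, new_ff, keep):
    """Rows of [torsion_id, smirks_pattern, k differences] passing the `keep` filter."""
    rows = []
    for tid, (smirks, old_k) in old_ff.items():
        new_k = new_ff[tid][1]
        if keep(old_k, new_k):
            rows.append([tid, smirks, k_differences(old_k, new_k)])
    return rows


SMIRKS = "[*:1]~[*:2]~[*:3]~[*:4]"  # generic placeholder pattern
OLD = {"tA": (SMIRKS, [1.0, 0.25]), "tB": (SMIRKS, [0.5]), "tC": (SMIRKS, [4.0])}
NEW = {"tA": (SMIRKS, [1.75, 0.25]), "tB": (SMIRKS, [0.125]), "tC": (SMIRKS, [4.5])}

by_abs = report(OLD, NEW, exceeds_abs)
by_percent = report(OLD, NEW, exceeds_percent)
```

The two filters select different parameters: a small k can change by a large percentage without reaching 0.5 kcal/mol (`tB`), and a large k can move by 0.5 kcal/mol while changing by only a few percent (`tC`).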

 

Inspecting some sulfonamide molecules

  1. Here is one molecule that improves noticeably when dihedral_rmsd is included: in the refit, the hydrogens on the left NH2 are aligned properly.

    1a) 421_C6H7N2O5S2 in 1.2.0 training set
    assigned parameters: ['a10', 'a11', 'a18', 'a30', 'a31',
    'b14a', 'b5', 'b53', 'b54', 'b56', 'b84', 'b86',
    'c1', 'i1', 'i2a', 'n11', 'n14', 'n17', 'n20', 'n21', 'n7',
    't112', 't136', 't139', 't44']
    Green - QM, Cyan - Sage rc1

 

Green - QM, Magenta - Sage refit w/ dihedral_rmsd

 

 

1b) 937_C15H20N2O3S in 1.2.0 training set

assigned parameters: ['a1', 'a10', 'a11', 'a17', 'a18', 'a19', 'a2', 'a30', 'a31',
'b1', 'b10', 'b2', 'b20', 'b3', 'b5', 'b53', 'b54', 'b56', 'b7', 'b83', 'b84', 'b86', 'b9',
'c1', 'i1', 'i2a', 'i3', 'n11', 'n14', 'n16', 'n17', 'n2', 'n20', 'n21', 'n3', 'n7',
't1', 't109', 't135', 't137', 't139', 't140', 't17', 't18', 't2', 't20', 't3', 't4', 't44', 't51', 't59', 't62', 't69a', 't70b']

 

Green - QM, Cyan - Sage rc1

Green - QM, Magenta - Sage refit w/ dihedral_rmsd

 

1c) 101_C4H10N2OS in 1.2.0 training set

assigned parameters: ['a1', 'a17', 'a18', 'a2', 'a23', 'a3', 'a30', 'a31', 'a4', 'a6',
'b1', 'b53', 'b54', 'b56', 'b7', 'b83', 'b86',
'c1', 'i2a', 'n11', 'n16', 'n17', 'n2', 'n20', 'n21', 'n3',
't109', 't134', 't135', 't137', 't139', 't140', 't145', 't15', 't16', 't54', 't55', 't56', 't57']

 

Green - QM, Cyan - Sage rc1

Green - QM, Magenta - Sage refit w/ dihedral_rmsd

 

Some sulfonamide systems from the 1.2.0 targets (some are conformers of the same molecule) that have a high dihedral RMSD of > 40 degrees with Sage rc1 and showed an improvement with the refit:



Force field        System           Bonds RMSD  Angles RMSD  Dihedrals RMSD  Impropers RMSD
Sage rc1           592_C6H7N2O5S2   0.033       2.95         41.93           0.81
refit w/ dih_rmsd  592_C6H7N2O5S2   0.036       3.08         7.54            0.77
Sage rc1           937_C15H20N2O3S  0.010       2.00         40.13           0.76
refit w/ dih_rmsd  937_C15H20N2O3S  0.011       1.79         8.67            0.93
Sage rc1           415_C6H7N2O5S2   0.033       2.95         42.09           0.79
refit w/ dih_rmsd  415_C6H7N2O5S2   0.036       3.08         7.69            0.75
Sage rc1           911_C15H20N2O3S  0.010       2.00         40.17           0.76
refit w/ dih_rmsd  911_C15H20N2O3S  0.011       1.79         8.73            0.93
Sage rc1           421_C6H7N2O5S2   0.040       7.86         44.56           1.54
refit w/ dih_rmsd  421_C6H7N2O5S2   0.043       7.41         16.30           1.42
Sage rc1           101_C4H10N2OS    0.032       4.23         42.64           1.32
refit w/ dih_rmsd  101_C4H10N2OS    0.029       3.88         11.41           1.14
Sage rc1           842_C6H7N2O5S2   0.033       2.97         41.68           0.72
refit w/ dih_rmsd  842_C6H7N2O5S2   0.035       3.08         7.69            0.73
Sage rc1           720_C6H7N2O5S2   0.033       2.97         41.69           0.72
refit w/ dih_rmsd  720_C6H7N2O5S2   0.035       3.08         7.69            0.73
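The selection behind this table (dihedral RMSD above 40 degrees with Sage rc1, lower with the refit) can be reproduced with a short sketch. The dihedral RMSD values are copied from the table above. The function name and the tuple layout are illustrative choices, not part of the original analysis scripts.

```python
# (system, Sage rc1 dihedral RMSD, refit dihedral RMSD), values from the table above
DIHEDRAL_RMSD = [
    ("592_C6H7N2O5S2", 41.93, 7.54),
    ("937_C15H20N2O3S", 40.13, 8.67),
    ("415_C6H7N2O5S2", 42.09, 7.69),
    ("911_C15H20N2O3S", 40.17, 8.73),
    ("421_C6H7N2O5S2", 44.56, 16.30),
    ("101_C4H10N2OS", 42.64, 11.41),
    ("842_C6H7N2O5S2", 41.68, 7.69),
    ("720_C6H7N2O5S2", 41.69, 7.69),
]


def improved_systems(rows, threshold=40.0):
    """Systems above `threshold` with rc1 whose dihedral RMSD dropped with the refit.

    Returns (system, reduction) pairs, largest reduction first.
    """
    selected = [
        (name, rc1 - refit)
        for name, rc1, refit in rows
        if rc1 > threshold and refit < rc1
    ]
    return sorted(selected, key=lambda item: item[1], reverse=True)


ranking = improved_systems(DIHEDRAL_RMSD)
smallest_gain = ranking[-1]  # the system that improved least
```

Sorting by the reduction makes it easy to see that 421_C6H7N2O5S2, the molecule inspected in 1a, improves the least of this group: it is the only one whose dihedral RMSD stays above 12 degrees after the refit.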

 

2. Here is a molecule that shows a slight degradation:

1676_C7Cl2H5N3O2S3 in 1.2.0 training set
assigned parameters: ['a1', 'a10', 'a14', 'a19', 'a2', 'a20', 'a22', 'a30', 'a31', 'a35',
'b13', 'b2', 'b34', 'b4', 'b49', 'b53', 'b54', 'b56', 'b6', 'b69', 'b8', 'b83', 'b84', 'b86',
'c1', 'i1', 'i3', 'i4', 'n11', 'n14', 'n16', 'n17', 'n2', 'n20', 'n21', 'n24', 'n7',
't106', 't108', 't112', 't129', 't136', 't139', 't141', 't143', 't20', 't24', 't43', 't45', 't69a', 't70', 't77']

 

Green - QM, Cyan - Sage rc1

 

Green - QM, Magenta - Sage refit w/ dihedral_rmsd

The above molecule and another conformer of it showed a higher dihedral RMSD with the refit:



Force field        System              Bonds RMSD  Angles RMSD  Dihedrals RMSD  Impropers RMSD
Sage rc1           1676_C7Cl2H5N3O2S3  0.019       2.70         28.16           1.48
refit w/ dih_rmsd  1676_C7Cl2H5N3O2S3  0.020       3.08         40.47           1.24
Sage rc1           1659_C7Cl2H5N3O2S3  0.019       2.70         28.31           1.49
refit w/ dih_rmsd  1659_C7Cl2H5N3O2S3  0.020       3.08         40.38           1.25

 

Torsiondrives in the Sage training set matching
[t26, t31, t64, t118, t135, t139, t154]

For the torsions listed above, which show a difference greater than 0.5 kcal/mol with the refit, the torsion profiles were plotted for the matching molecules in the Sage training set. The residuals overlap in both cases, and the full MM torsion profiles and their RMSDs show little improvement or detriment, except in one or two cases below. This looks like a case of multiple minima: including the dihedral RMSD in the optgeo target’s objective function improves the optimized geometries, but the accompanying change in torsion parameter values produces no significant change in the torsion profiles.

Id        param  chemical structure  Residuals  Relative energies  RMSDs wrt QM
18536987  t26
18886218  t26
18886217  t26
18886214  t26
18536986  t26
18886215  t26
18536998  t31
18886228  t31
18536052  t31
18535855  t64
18886290  t64
18537049  t64
18886280  t64
18536071  t64
18045648  t64
18535856  t64
18886474  t118
6098576   t118
18537139  t118
18537138  t118
18886475  t118
18535893  t135
18886517  t135
18886515  t135
2703135   t135
18886530  t139
18536128  t154
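The "Relative energies" and "RMSDs wrt QM" columns compare MM and QM torsion profiles. One common way to compute such a comparison is sketched below. It assumes each profile is shifted so that its own minimum is zero before the RMSD is taken; the convention actually used for the plots is not stated here, and the energies in the example are hypothetical.

```python
import math


def relative_energies(energies):
    """Shift a profile so that its lowest grid point is zero."""
    lowest = min(energies)
    return [energy - lowest for energy in energies]


def profile_rmsd(qm_energies, mm_energies):
    """RMSD between MM and QM relative energies over the same torsion grid."""
    if len(qm_energies) != len(mm_energies):
        raise ValueError("profiles must be evaluated on the same grid")
    qm_rel = relative_energies(qm_energies)
    mm_rel = relative_energies(mm_energies)
    squared = [(mm - qm) ** 2 for qm, mm in zip(qm_rel, mm_rel)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical three-point profiles in kcal/mol.
QM_PROFILE = [0.0, 1.0, 3.0]
MM_PROFILE = [10.0, 11.0, 15.0]

rmsd = profile_rmsd(QM_PROFILE, MM_PROFILE)
```

Because both profiles are shifted to their own minimum, a constant energy offset between MM and QM does not contribute to the RMSD. Only differences in the shape of the profile do, which is the quantity a change in torsion k-values would be expected to affect.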