Bridgehead Nitrogens (Vehicle set)
Heterocycles that contain bridgehead nitrogens with adjacent heteroatoms have a pyramidal structure that the current parameter set does not capture. For example, here is one molecule:
ID   Indices          QM (deg)    Sage (deg)
i2   (0, 4, 12, 13)   -172.474    -179.686
i1   (1, 2, 6, 10)     176.581     179.584
i1   (1, 3, 7, 11)    -174.699     179.182
i1   (2, 1, 3, 9)     -178.002     179.364
i4   (2, 6, 8, 15)     172.991    -156.088
i4   (3, 7, 8, 13)     131         179.318
i1   (4, 0, 5, 14)    -171.789    -179.605
i5   (5, 8, 6, 7)     -128.377    -166.444
Figure: QM structure on the left, Sage-optimized structure on the right.
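Because these angles are periodic, a plain subtraction overstates some of the disagreements in the table (for example 172.991 against -156.088). A minimal sketch of a wrapped comparison follows; the function name `wrapped_deviation` is an illustrative choice, not something taken from the fitting scripts, and the rows are copied from the table.

```python
def wrapped_deviation(mm_deg: float, qm_deg: float) -> float:
    """Signed MM - QM difference in degrees, wrapped into [-180, 180)."""
    return (mm_deg - qm_deg + 180.0) % 360.0 - 180.0


# (ID, atom indices, QM, Sage) rows copied from the table
rows = [
    ("i2", (0, 4, 12, 13), -172.474, -179.686),
    ("i1", (1, 2, 6, 10), 176.581, 179.584),
    ("i1", (1, 3, 7, 11), -174.699, 179.182),
    ("i1", (2, 1, 3, 9), -178.002, 179.364),
    ("i4", (2, 6, 8, 15), 172.991, -156.088),
    ("i4", (3, 7, 8, 13), 131.0, 179.318),
    ("i1", (4, 0, 5, 14), -171.789, -179.605),
    ("i5", (5, 8, 6, 7), -128.377, -166.444),
]

# Largest absolute wrapped deviation first
ranked = sorted(rows, key=lambda r: abs(wrapped_deviation(r[3], r[2])), reverse=True)
for pid, indices, qm, mm in ranked:
    print(f"{pid} {indices}: {wrapped_deviation(mm, qm):+.3f} deg")
```

Ranked this way, the two i4 entries and the i5 entry stand out, which agrees with the impropers singled out later on this page.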
Re-optimizing Sage against this single target with the regular process does not entirely fix the problem.
If the dihedral deviations are included in the opt_geo target (Vehicle is an optimization dataset), the overlap is good, as shown below (yellow is MM, green is QM).
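To illustrate why including the torsional deviations matters, here is a rough sketch of a geometry objective in which each class of internal coordinate is scaled by a denominator, and a denominator of zero drops that class entirely. This is not the actual opt_geo implementation: the function name, the dictionary layout, and all denominator values are assumptions made for the example. The single value used is the i4 (3, 7, 8, 13) entry from the table.

```python
from typing import Dict, List

# Assumed, illustrative scaling denominators; 0.0 means "class excluded".
DENOMS: Dict[str, float] = {"bond": 0.05, "angle": 8.0, "dihedral": 0.0, "improper": 0.0}
PERIODIC = {"dihedral", "improper"}


def wrap(delta_deg: float) -> float:
    """Wrap an angular difference in degrees into [-180, 180)."""
    return (delta_deg + 180.0) % 360.0 - 180.0


def geometry_objective(
    qm: Dict[str, List[float]],
    mm: Dict[str, List[float]],
    denoms: Dict[str, float],
) -> float:
    """Sum over coordinate classes of the mean squared scaled deviation."""
    total = 0.0
    for kind, denom in denoms.items():
        if denom == 0.0 or not qm.get(kind):
            continue  # class excluded, or no data for it
        diffs = [m - q for q, m in zip(qm[kind], mm[kind])]
        if kind in PERIODIC:
            diffs = [wrap(d) for d in diffs]
        total += sum((d / denom) ** 2 for d in diffs) / len(diffs)
    return total


qm = {"improper": [131.0]}    # QM value from the table
mm = {"improper": [179.318]}  # Sage value from the table

excluded = geometry_objective(qm, mm, DENOMS)
included = geometry_objective(qm, mm, {**DENOMS, "dihedral": 20.0, "improper": 20.0})
print(excluded, included)
```

With the torsional classes excluded, the optimizer sees no penalty for the planar geometry; once they are included, the same deviation contributes a sizeable term.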
The changes in force field parameters for the above (yellow) optimization are:
Bonds (length in angstroms; force constant k in kilocalorie/(angstrom**2*mole)):
ID    Change in length   Change (%)   Change in k   Change (%)
b4    0.04               2            0.45          0
b10   0.00               0            0.12          0
b13   0.01               0            0.12          0
b20   0.01               1            0.51          0
b21   0.02               1            0.09          0
b34   0.01               1            0.28          0
b35   0.01               1            0.13          0
b41   0.01               1            0.84          0
b85   0.00               0            0.00          0
b87   0.00               0            0.00          0

Angles (angle in degrees; force constant k in kilocalorie/(mole*radian**2)):
ID    Change in angle    Change (%)   Change in k   Change (%)
a10   -5.96              -5           -4.10         -4
a11   -16.45             -13          -3.06         -4
a15   -2.50              -2           -0.02         -0
a20   14.45              12           -7.12         -6
a21   -15.18             -13          -1.72         -1
a22   5.43               5            -0.21         -0
a29   14.96              13           -0.63         -0
Torsions (ff_1 and ff_2 parameter values, with the percent change for each value):
t47   [*:1]~[#6X3:2]-[#6X3$(*=[#8,#16,#7]):3]~[*:4]
      ff_1: 1.00; ff_2: 0.04; change: -96%
t75   [*:1]-[#7X3:2]-[#6X3$(*=[#8,#16,#7]):3]~[*:4]
      ff_1: 1.81, -0.05; ff_2: 0.18, -0.21; change: -90%, 342%
t76   [#1:1]-[#7X3:2]-[#6X3:3]=[#8,#16,#7:4]
      ff_1: 0.49, 0.89; ff_2: 0.23, 0.96; change: -52%, 7%
t127  [*:1]~[#8X2:2]-[#7:3]~[*:4]
      ff_1: 1.02; ff_2: -0.21; change: -121%
t134  [*:1]-[#7X4,#7X3:2]-[#7X3$(*~[#6X3,#6X2]):3]~[*:4]
      ff_1: 0.02; ff_2: 3.29; change: 15769%
t138  [*:1]~[#7X2:2]-[#7X3:3]~[*:4]
      ff_1: -0.10, 1.55; ff_2: -1.36, 1.93; change: 1325%, 24%
Impropers:
i1    1.1 to -0.48 (43%)
i2    10.5 to 0.14 (-1%)
i4    1.0 to -0.38 (38%)
i5    1.1 to -0.37 (34%)
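The torsion percentages are consistent with a plain relative change between the ff_1 and ff_2 values. A minimal sketch, assuming that definition (the helper name `percent_change` is illustrative):

```python
def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent of the old value."""
    if old == 0.0:
        raise ValueError("percent change is undefined for a zero starting value")
    return 100.0 * (new - old) / old


# Values as printed in the torsion list: t47 and t127
print(round(percent_change(1.00, 0.04)))   # t47
print(round(percent_change(1.02, -0.21)))  # t127
```

The printed k values are rounded to two decimals, so entries that start from a very small value (t134, t138) will not reproduce the listed percentage exactly from the rounded numbers; presumably the percentages were computed from the unrounded parameters.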
Other top offending cases are in this file
The subset of impropers where this behavior is most pronounced can be characterized by the following improper SMARTS patterns:
i4b: "[#6:1]~[#7X3H1:2](~[#6$(*~[!#8X1]),#7:3])~[*:4]"
i5a: "[!#6:1]~[#7x3$(*-[#7x3,#6x3]~[#7,#8,#16]):2](~[#7,#8,#16:3])~[*:4]"
A corresponding general torsion term, placed after the nitrogen-nitrogen central bond torsions, would also help.
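Placement matters because, in a SMIRNOFF-style parameter list, the last parameter that matches takes precedence. The toy sketch below shows only that ordering rule: the parameter IDs, the dictionary describing a torsion, and the lambda "matchers" are all hypothetical stand-ins for real SMARTS matching.

```python
from typing import Callable, Dict, List, Optional, Tuple

Torsion = Dict[str, object]
Parameter = Tuple[str, Callable[[Torsion], bool]]


def assign(torsion: Torsion, parameters: List[Parameter]) -> Optional[str]:
    """Return the ID of the last parameter in the list that matches."""
    matched = None
    for pid, matches in parameters:
        if matches(torsion):
            matched = pid  # later matches override earlier ones
    return matched


# Hypothetical parameters, ordered from general to specific
parameters: List[Parameter] = [
    ("t_generic", lambda t: True),
    ("t_NN", lambda t: t["central"] == ("N", "N")),
    ("t_new_general", lambda t: "N" in t["central"] and bool(t["bridgehead"])),
]

bridgehead_nn: Torsion = {"central": ("N", "N"), "bridgehead": True}
plain_nn: Torsion = {"central": ("N", "N"), "bridgehead": False}

print(assign(bridgehead_nn, parameters))  # new term wins when listed last
print(assign(plain_nn, parameters))       # ordinary N-N torsion is unaffected
```

If the new term were listed before the nitrogen-nitrogen torsions instead, the bridgehead case would fall back to the existing N-N parameter and the new term would never be applied to it.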
Iteration 1: Adding these extra parameters and optimizing helped a little, but the ring puckering is still not captured well enough.
The changes in parameters are
Geometry overlap (Green - QM; Yellow - MM with optimized new params) which still shows a slightly planar left ring and
Iteration 4: Just adding two new general torsion parameters brings it within acceptable thresholds. There are still large disagreements in the dihedral angles, but at least the structure is no longer planar. Strangely, parameters started from their parent values did not optimize at all, even though there are geometry deviations (iterations 2 and 3), whereas parameters started from zero k-values optimized well. The impropers contribute nothing to the changes.
Optimized values are
Since this chemistry is not very common, we can just include these general parameters and not worry about complete agreement.