v1.3.0 release note
Current known issues of v1.2
Amide issue
Torsional energy profile of N-methyl acetamide amide bond generated using v1.2.0 failed to reproduce the QM torsional energy profile, forming a small hump at the minimum energy point, which doesn’t appear in the QM profile. Energy decoupling showed that torsion term contributes the most to the formation of the hump, which indicated re-fitting the related torsion parameters was needed.
From a closer inspection of the issue, we have found that there were some targets who are not planar at a point where amide bond is planar due to their large steric hindrance or other chemical interactions, having a small peak at the point in their QM torsional energy profiles. This led the parameter set fitted to the targets to force the formation of a small peak at around their planar geometries even for simple molecules which don't have strong steric hindrance.
So to avoid this problem, we carefully selected a training set for the amide torsion parameters with simple molecules which don’t have strong chemical interactions.
Dialkyl amide issue
Before this amide issue was raised, I found that there was a missing torsion term for dialkyl amides in the current parameter set and we’ve decided to add new amide-specific torsion parameters(child parameters of t69
and t70
). The preliminary study showed that the addition of the new torsion terms improved the performance of reproducing QM optimized geometries of dialkyl amides.
Newly added amide-specific torsion parameters
t69
[*:1]~[#7X3,#7X2-1:2]-!@[#6X3:3]~[*:4]
t69a
[*:1]-[#7X3:2]-[#6X3$(*=[#8,#16,#7]):3]~[*:4]
t70
[#1:1]-[#7X3:2]-[#6X3:3]=[#8,#16,#7:4]
t70b
[*:1]-[#7X3:2]-!@[#6X3:3](=[#8,#16,#7:4])-[#6,#1]
t70c
[#1:1]-[#7X3:2]-!@[#6X3:3](=[#8,#16,#7:4])-[#6,#1]
t70d
[*:1]-[#7X3:2]-!@[#6X3:3](=[#8,#16,#7:4])-[#7X3]
1. Training of the new amide parameters
Torsiondrive target selection scheme: planar geometries (C-center improper dihedral angle, N-center improper dihedral angle between -5.0, 5.0 degree at the minimum energy point) + geometries which have local minima at 0 and 180 degree.
Optimized geometry, vibrational frequencies targets selection: molecules having t69s or/and t70s
Fitting procedure:
run #1: fitted parameters to amide torsion training targets
run #2: (1) fitted parameters to amide torsion training targets, then (2) fitted to optimized geometries, vibrational frequencies with the torsion targets used in the first round fitting with a small prior width for the torsion parameters (1/10 of the default prior width).
run #3: (1) fitting parameters to amide torsion training targets with small prior widths for bond and angle parameters (for fitting mainly torsion parameters), then (2) fitted to optimized geometries, vibrational frequencies with the torsion targets used in the first round fitting with a small prior width for the torsion parameters (1/10 of the default prior width).
|
| run #1 | run #2 | run #3 |
---|---|---|---|---|
1st stg fitting (td targets) | priors | default | default | 1/10 of default for bond and angle parameters |
X2 | 2.16e+02 → 3.36e+01 | 2.16e+02 → 3.36e+01 | 2.16e+02 → 6.29e+01 | |
2nd stg fitting (opt. geo., vib. freq., td targets) | priors | - | 1/10 of default for torsion k (0.2 kcal/mol) | 1/10 of default for torsion k (0.2 kcal/mol) |
X2 | - | 6.12e+03 → 1.59e+03 | 1.72e+03 → 1.58e+03 |
2. Result
List of test sets:
Test set 1: (1) the simplest molecule, N-methyl acetamide torsional energy profile
Test set 2: all amide torsional energy profiles in the 2nd generation training dataset
Optimized geometry, relative conformational energy benchmark full set
torsional energy profile benchmark primary set
The following table lists the single point calculation objective function value of different parameter sets for each test set, which gives a rough measure of an overall quality of the calculation.
| Test 1 | Test 2 | optgeo, abinitio benchmark full set | torsional energy profile benchmark primary set |
---|---|---|---|---|
v1.2.0 | 2.69 | 24.99 | 16681.14 | 384.88 |
run #1 | 0.75 | 16.43 | 65098.87 | 329.21 |
run #2 | 1.77 | 27.32 | 16847.53 | 391.29 |
run #3 | 1.63 | 23.15 | 16436.44 | 404.51 |
while run #1, which only fitted to torsion targets, improves the performance in reproducing the QM torsional energy profiles(low objective function in test1,2 and torsion benchmark primary set), the performance in reproducing QM optimized geometries and relative conformational energies becomes worse compared to v1.2 (about 4 times higher objective function, marked in red).
To resolve this, in run #2 and run #3, two-stage fittings were done. In first stage, parameter set was fitted to the amide torsion targets(in run#3, torsional parameters were mainly fitted in the first stage), then in the second stage, fitting of the resulting parameter set to QM optimized geometries, vibrational frequencies along with the torsion training set(used in the first round) was done. This could reduce the high objective function value for the optgeo, abinitio benchmark full set.
(1) N-methyl acetamide
Torsional energy profile of N-methyl acetamide using re-fitted parameters and v1.2.1
Clear improvement in reproducing a torsional energy profile of N-methyl acetamide
(2) Dialkyl amides
Torsional energy profile of dialkyl amide using re-fitted parameters and v1.2.1
QM optimized geometry of Cc1ccc(=O)n(n1)CC(=O)N2CCCc3c2cccc3-20 (Magenta: MM optimized geometry with v1.2.0. Green: MM optimized geometry with run #2, Cyan:MM optimized geometry with run #3)
It also showed better reproduction of torsional energy profiles and optimized geometries of dialkyl amides compared to v1.2.0.
3. Conclusion
Although run #2 and #3 don’t show as drastic decrease of the objective function value in test1, test2 and torsion benchmark calculation as shown in run #1, since it resolves our main issues; (1) bad MM torsional energy profile of N-methyl acetamide amide bond and (2) a wrong description of optimized geometries and torsional energy profiles of dialkyl amides, while not hampering the performance in reproducing QM optimized geometries and relative conformational energies, run #2, #3 are better candidates compared to run #1.
run #3-> v1.3.0
4. Determination of priors width of valence parameters
The regularization scales(priors) for valence parameters used were determined based on the chemical intuition. Data-driven approach to determining the new regularization scales was carried out and the new regularization scales were used in the re-fitting procedure.
Distribution parameters in smirnoff99Frosst for each parameter type was first plotted.
Since the distributions are not bell-shaped, we decided to use IQR(interquartile range) values instead of standard deviation of the distributions.
For bond parameter, IQR was more than three times higher than the current prior(regularization scale) values. The large dispersions of bond k’s and lengths are largely due to the diversity of bond orders and element components of bond parameters.
So to eliminate the dependencies, bond parameters were classified by bond type and element compositions, subtracted the mean for each and the subtracted values were used in calculating IQR for the bond parameters. Also for the proper torsion parameters, distribution of sums of all the k’s of each torsion parameter was used in calculating IQR.
So the new priors based on the IQRs of the modified distributions were determined. (IQRs were rounded to multiples of 2 or 5.)
| Current Priors | stdev | IQR | New priors |
---|---|---|---|---|
Bond length (Å) | 0.1 | 0.33 | 0.1 | 0.1 |
Bond k (kcal/mol/Å2) | 100 | 249.30 | 87.5 | 100 |
Angle angle (degree) | 20 | 24.24 | 10.5 | 10 |
Angle k (kcal/mol/rad2) | 100 | 95.15 | 40.0 | 50 |
Torsion k (kcal/mol) | 1 | 1.32 | 2.3 | 2 |