Excerpt |
---|
Initial plan and approach: unconstrained fit to existing data |
👀 Overview
Summary | This initial approach started from the scripts and data left by Simon. Unfortunately, re-training the vdW parameters gave worse performance for Br-containing molecules and pyridines. Further experimentation showed that splitting the N vdW term into two improved performance. Examining the charges showed that the pyridine virtual site had a charge greater than 1 in magnitude. On further investigation, the documentation left by Simon turned out to be slightly incomplete, which meant that training used only part of the data, so a new re-fit was started in Approach II. | ||||||
---|---|---|---|---|---|---|---|
GitHub repo/branch | |||||||
Status |
|
🚩 Milestones and metrics
Stage | Milestone/Benchmark | Contributors | Deadline | Status
---|---|---|---|---
Train virtual sites and BCCs to existing HF/6-31G* data | Re-fit 2.1.0 BCCs to ESP data | | July 2023 |
Train virtual sites and BCCs to existing HF/6-31G* data | Re-fit 2.1.0 BCCs and virtual sites to ESP data | | July 2023 |
Re-fit valence and vdW parameters to condensed phase properties | Re-fit FF terms for no-vsites-candidate | | Aug 2023 |
Re-fit valence and vdW parameters to condensed phase properties | Re-fit FF terms for vsites-candidate | | Aug 2023 |
Benchmark | Improved or equivalent performance for molecules with virtual sites added (Cl, Br, pyridines) on training data | | Sept 2023 |
Benchmark | Experiment with vdW site splitting to see if that improves benchmarks | | Sept 2023 |
Benchmark | Experiment with fitting to dimer energies | | Sept 2023 |
Benchmark | Improved or equivalent performance for molecules with virtual sites added (Cl, Br, pyridines) on training data | | Sept 2023 |
📊 Progress and findings
Training data
The training data for each virtual site is as follows:
C-Cl (3280 training ESPs)
"[#6A:2]-[#17:1]"
"[#6a:2]-[#17:1]"
C-Br (2149 training ESPs)
"[#6A:2]-[#35:1]"
"[#6a:2]-[#35:1]"
Lone pair off N (796 training ESPs)
"[#6X3H1a:2]1:[#7X2a:1]:[#6X3H1a:3]:[#6X3a]:[#6X3a]:[#6X3a]1"
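For reference, the training sets above can be written down as a small data structure to see how the ESPs are distributed across the three virtual site types. This is only an illustrative sketch: the dictionary layout, the key names, and the `esp_fractions` helper are assumptions and are not taken from the training scripts. The counts and SMARTS patterns are copied from the list above.

```python
# ESP counts and SMARTS patterns copied from the list above.
# The layout and key names are illustrative assumptions, not the
# format used by the actual training scripts.
TRAINING_DATA = {
    "C-Cl": {
        "n_esps": 3280,
        "smarts": ["[#6A:2]-[#17:1]", "[#6a:2]-[#17:1]"],
    },
    "C-Br": {
        "n_esps": 2149,
        "smarts": ["[#6A:2]-[#35:1]", "[#6a:2]-[#35:1]"],
    },
    "Lone pair off N": {
        "n_esps": 796,
        "smarts": [
            "[#6X3H1a:2]1:[#7X2a:1]:[#6X3H1a:3]:[#6X3a]:[#6X3a]:[#6X3a]1"
        ],
    },
}


def esp_fractions(data):
    """Return the fraction of all training ESPs belonging to each site type."""
    total = sum(entry["n_esps"] for entry in data.values())
    return {name: entry["n_esps"] / total for name, entry in data.items()}


if __name__ == "__main__":
    for name, fraction in esp_fractions(TRAINING_DATA).items():
        print(f"{name}: {fraction:.1%} of training ESPs")
```

The pyridine-type lone pair site has by far the smallest training set of the three.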
Training to QM ESPs
...
Training appears to converge after roughly 1000 epochs.
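One way to make "appears to converge" less subjective is to compare the mean loss over two consecutive windows of epochs. The sketch below is a hypothetical check, not the criterion used in this work: the function name `converged_at`, the window size, and the tolerance are all assumptions, and the input is assumed to be a plain list of per-epoch loss values.

```python
def converged_at(losses, window=50, rel_tol=1e-3):
    """Return the first epoch count at which the loss has plateaued, or None.

    The loss is treated as converged once the mean over the most recent
    `window` epochs differs from the mean over the preceding `window`
    epochs by no more than `rel_tol`, relative to the preceding mean.
    Both defaults are illustrative assumptions.
    """
    for epoch in range(2 * window, len(losses) + 1):
        previous = sum(losses[epoch - 2 * window : epoch - window]) / window
        recent = sum(losses[epoch - window : epoch]) / window
        if abs(previous - recent) <= rel_tol * abs(previous):
            return epoch
    return None


if __name__ == "__main__":
    import math

    # Synthetic loss curve for demonstration only; not real training output.
    synthetic = [math.exp(-epoch / 100) + 0.1 for epoch in range(2000)]
    print("Plateau detected after epoch:", converged_at(synthetic))
```

A windowed mean is used rather than an epoch-to-epoch difference so that noise in individual epochs does not trigger a false plateau.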
Training to condensed phase properties
Properties of Cl-containing molecules improve after retraining to QM ESPs, and improve slightly further with the addition of virtual sites.
...
Unfortunately, performance on properties of molecules with bromine and pyridine virtual sites decreases.
Splitting out vdW terms
Splitting the N vdW term into two and fitting to dimer energies improved performance on the training data.
...