Time | Item | Presenter | Notes |
---|---|---|---|
Paper progress | Joshua Horton | JH – Narrative is “there are lots of FFs and tools to do torsion parameter assignment, but none of them are compatible with SMIRNOFF. So we’re making BespokeFit.” DC – Also, in order to cover chemical space, most FFs end up with a parameter explosion. Sage has very few parameters, but this can lead to some inaccurate torsions. DC – The paper then reviews the tools available for bespoke torsion scanning and explains that none are compatible with the SMIRNOFF format. So the first advantage is that BespokeFit is compatible with SMIRNOFF. DC – The second advantage discussed in the paper is that things like ANI can be used to speed up bespoke torsion fitting. DC – The third advantage is that this can be done at a large scale, and the QM data can be published and reused.
JH – Results will have dG plots for JACS ligands. Evaluation of QM vs MM torsion profiles. (bar chart with plots for error between QM and MM energy surfaces for Sage, Sage+bespoke, and maybe GAFF)
DC – Venkat, anything we should include with ANI? V – I shared results with Josh a while ago, some ANI RMSE data from ForceBalance. JH – That would be similar to the JACS stuff. We could include that. DC – Full JACS data would be great. V – I will discuss this offline with you.
|
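The QM-versus-MM torsion-profile comparison mentioned in the notes above could be sketched roughly as follows. This is a minimal illustration, not BespokeFit's or ForceBalance's implementation: the function name `torsion_profile_rmse`, the choice to shift each profile so that its own minimum is zero, and all energies in the example are assumptions made up for illustration.

```python
import math


def torsion_profile_rmse(qm_energies, mm_energies):
    """RMSE between two torsion profiles sampled on the same dihedral grid.

    Each profile is shifted so its own minimum is zero before comparison
    (an assumed convention; other alignments are possible).
    """
    if len(qm_energies) != len(mm_energies):
        raise ValueError("profiles must be sampled on the same grid")
    if not qm_energies:
        raise ValueError("profiles must not be empty")
    qm_min = min(qm_energies)
    mm_min = min(mm_energies)
    squared_errors = [
        ((qm - qm_min) - (mm - mm_min)) ** 2
        for qm, mm in zip(qm_energies, mm_energies)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up relative energies (kcal/mol) on a four-point dihedral grid.
qm_profile = [0.0, 1.0, 3.0, 1.0]
mm_profiles = {
    "Sage": [0.0, 2.0, 5.0, 2.0],
    "Sage+bespoke": [0.5, 1.5, 3.5, 1.5],
}

# One RMSE per force field, i.e. the numbers a bar chart would show.
rmse_by_ff = {
    name: torsion_profile_rmse(qm_profile, mm)
    for name, mm in mm_profiles.items()
}
```

One RMSE value per force field and per torsion is the kind of quantity the proposed bar chart would summarize.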
Bespokefit release | Joshua Horton
| JH – 0.1.1 release is out! Good news is that we’ve got other people at the Cole lab using it, and it’s installed on a few PCs. Had to add a few pins to the recipe for the ForceBalance break and Toolkit releases. But the environment seems solvable. Docs are up and look good, theory docs are almost ready to go. JH – I’ve reached out to BSwope but he was busy at ACS. I’ll reach out again. JW – Tried to follow the quick start instructions on Mac and had an issue, I’ll bring this up on the repo issue tracker. DC – Other potential testers? DM – Could recruit a small number of volunteers on the ad board channel. SB – It may be good to give BSwope a few more days. He’s got the right level of expertise and patience to be our ideal tester. DM – Xavier Lucas could be good as well.
|
A slightly tangential science remark about low-hanging fruit | | DM – As you get more data from BespokeFit, it would be great to record how the bespoke parameters differ from their parent parameters. We could keep a running tally of how much each parent parameter gets changed by bespoke fitting, and then aggregate this to identify parameters that need modification.
|
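The running tally DM suggests could be kept along these lines. This is only a sketch under stated assumptions: the record layout `(parent_id, parent_value, bespoke_value)`, the function name `tally_parameter_shifts`, and the parameter ids and values in the example are hypothetical and do not come from BespokeFit output.

```python
from collections import defaultdict
from statistics import mean


def tally_parameter_shifts(fits):
    """Aggregate how far bespoke values move from their parent parameter.

    `fits` is an iterable of (parent_id, parent_value, bespoke_value)
    tuples, one per bespoke parameter (a hypothetical record layout).
    """
    shifts = defaultdict(list)
    for parent_id, parent_value, bespoke_value in fits:
        shifts[parent_id].append(bespoke_value - parent_value)
    return {
        parent_id: {
            "n_fits": len(deltas),
            "mean_shift": mean(deltas),
            "mean_abs_shift": mean(abs(d) for d in deltas),
        }
        for parent_id, deltas in shifts.items()
    }


# Hypothetical parent ids and force constants, for illustration only.
records = [
    ("t1", 1.0, 1.5),
    ("t1", 1.0, 2.5),
    ("t2", 0.5, 0.5),
]
summary = tally_parameter_shifts(records)

# Parents whose bespoke children move the most are listed first.
ranked = sorted(summary, key=lambda pid: summary[pid]["mean_abs_shift"], reverse=True)
```

A parent parameter whose bespoke fits consistently shift in the same direction would be a natural candidate for refitting or splitting.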
B/Si update | Daniel Cole | DC – We don’t really have transferable FF parameters for compounds that contain B or Si. But the Reaxys database has allowed us to make public ~100 measurements of compounds involving B and Si. So they sent me their database of compounds with HVap data, and I’m reviewing which ones we want. My thinking is that we could use QM to come up with initial guesses for parameters, and then use their phys prop data for fitting. DC – So, I’ve looked at this dataset, and unfortunately I’m not great at chemistry. But I asked an organic chemist, who helped me look through the compounds. He said that SILICON is pretty straightforward, and selected a few compounds. DC – Many of the BORON compounds, on the other hand, are really reactive. B-with-all-bonds-to-carbon compounds are quite reactive, but B-with-at-least-one-bond-to-a-heteroatom compounds are way better behaved. Many of the stable Boron compounds are stabilized by an intermolecular interaction. So we may need to carefully communicate the domain of applicability for Boron parameters, since many molecules are unstable. DM – Many of the rings are weird (like, all B+O or B+N). Is that a stability issue or could we find “less weird” rings?
DM – “Some molecules I found on PubChem that are presumably stable”: `B1=C(C(=C(C2=CC=CC=C21)C)C(=O)O)Cl`. I queried PubChem for `[#5X3]:[*]`. SB – There are some issues with HVap measurements when actually taken in practice. We encountered this issue while curating previous datasets. Will they have notes about experimental issues? SB – Will there be enough data for both testing and training? DC – We usually do 10-15 molecules per fit. We may be able to split that between test and train, but it would be narrow. Not sure how strict we need to be about separation of test and training sets in this case. DM – Reviewers would probably be happy with a sizable test/train set. DC – I’ll aim to do 10-15 in each of test+train.
PB – Just dropping a link to the NCI 250K QM dataset: https://github.com/openforcefield/qca-dataset-submission/blob/master/submissions/2019-07-05%20OpenFF%20NCI250K%20Boron%201/optimization_inputs.pdf DC – Are these optimized? JH – These are optimized using “gaussian” convergence criteria. To get them to optimize well locally we had to use a bigger basis set and switch to “gaussian-tight” convergence criteria.
|
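The test/train separation discussed in the B/Si update could be made reproducible with a seeded split along these lines. This is a sketch only: the function name `split_train_test`, the fixed seed, and the placeholder molecule labels are assumptions, and a purely random split is just one option (a chemist may prefer to balance functional groups by hand).

```python
import random


def split_train_test(molecules, n_train, n_test, seed=0):
    """Return disjoint (train, test) lists drawn from `molecules`.

    Duplicates are dropped and the input is sorted first, so the result
    depends only on the set of molecules and the seed.
    """
    unique = sorted(set(molecules))
    if n_train + n_test > len(unique):
        raise ValueError("not enough unique molecules for the requested split")
    random.Random(seed).shuffle(unique)
    train = unique[:n_train]
    test = unique[n_train:n_train + n_test]
    return train, test


# Placeholder labels standing in for 30 curated B/Si compounds.
candidates = [f"mol_{i:02d}" for i in range(30)]
train_set, test_set = split_train_test(candidates, n_train=15, n_test=15)
```

Keeping the seed alongside the dataset makes it possible to show that no test molecule was used in fitting.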