1.2.0 fixes | Hyesu Jang | Insert slides link
CIB – Amides and ureas are very different. A urea with planar substituents will have nonplanar nitrogen. The opposite holds for amides.
CIB – Some of the structures in the torsion scans may show the 0 degree “hill” due to steric effects.
DM – The nitrogen at low dihedral values may be out of plane because of sterics.
(General) – What’s the problem with the hill in the middle and nonplanar amide Ns?
DM – It looks like the important thing is to get ureas assigned a different torsion.
DM – Then we probably want to include more canonical amides in the training data.
CB – So we see that we need to make a change to the prior width to get this right at all, but we’ll still have problems with the sterics.
CB – Do we still need to keep the cutoff at 2 kcal/mol?
HJ – I’d think we can raise it back to 5.
DM – Agree.
LPW – Agree. The cutoff change was mostly for debugging. I know some folks want to raise the cutoff further. But the point of fitting is to have the average point be right, so we know that many will under/overshoot. Our goal is to make the under/overshoots fall in energy ranges that we don’t care about. So we could consider doing something like penalizing underestimates more than overestimates.
JW – Separate prior for impropers and propers?
CIB – Excited about including smaller structures.
CIB – I’m worried about interference between fitting nonbonded to liquid properties and valence to QM. Liquid properties will focus on attraction, not repulsion. So we must fit them simultaneously.
LPW – Agree that we need a good strategy for co-optimization. Liquid property optimization and QM optimization are very different beasts. But I think we have a simple solution – Simon should mix in QM data. Could use workqueue on 10 nodes for QM data, and dask on lots of GPU nodes for nonbonded data. If this isn’t possible, then we could cut down the QM dataset so that Simon can include it on the local node.
CIB – Two ideas: 1) If we trim down the dataset for Simon, we could pick ones with high steric energies. 2) We could allow the QM optimization to report the gradients and make the liquid optimization consider these.
CIB – I have code that will calculate pairwise energy contributions – will upload to Slack.
LPW – I think that idea 1 is good. I’m less certain about idea 2.
CIB – I have a sterically congested training set that may be helpful here – they should be in the Mobley lab somewhere.
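A minimal sketch of LPW's suggestion to penalize underestimates more than overestimates, assuming a simple weighted squared-error objective over torsion-scan energies. The function name, the 2:1 weighting, and the toy energies are illustrative assumptions, not what the fitting software actually does.

```python
def asymmetric_sq_error(e_mm, e_qm, under_weight=2.0, over_weight=1.0):
    """Sum of squared residuals, weighting MM underestimates more heavily.

    e_mm, e_qm: sequences of relative energies (kcal/mol) for the same
    conformers. The weights are hypothetical tuning knobs.
    """
    total = 0.0
    for mm, qm in zip(e_mm, e_qm):
        resid = mm - qm
        # resid < 0 means the force field underestimates the QM energy
        weight = under_weight if resid < 0.0 else over_weight
        total += weight * resid * resid
    return total


# Toy data: the same 0.5 kcal/mol error, once below and once above QM
e_qm = [0.0, 1.0, 3.0]
under = asymmetric_sq_error([0.0, 0.5, 3.0], e_qm)
over = asymmetric_sq_error([0.0, 1.5, 3.0], e_qm)
```

With these weights an underestimate costs twice as much as an overestimate of the same size, which would push the optimizer toward erring on the high side.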
|
Propyne issue | Chris Bayly |
CIB – If it were up to me, I’d switch the values manually and do a release ASAP.
DM – This would go against our philosophy of not modifying things by hand.
LPW – I’d be fine with doing this. Which specific changes?
CIB – I put some proposals on striketeam.
(General) – b24 should have 8.0e2, b27 should be 1.3e3.
CIB – In the Aug 11 message, I’d looked at the OFFXML file for openff-1.2.0. What I recall seeing was that b24 has a high constant, like that of a triple bond. So I took the single bond and
LPW – Could we do 1.0e3 for b27?
CIB – What are other triple bond values?
HJ – In 1.0.0, b24 was 700ish, and b27 was also 700ish.
LPW – In S99F, both of these are 700.0.
CIB – I’m mostly pinning my expectations to the IR spectrum, where the peak is at 2100 cm⁻¹.
JW – What is our process for deciding on these point releases?
DM – So, when a major striketeam issue arises, we’ll begin a discussion in #internal, and assign it to a FF-release meeting for discussion. Anyone who shows up to that meeting gets to vote.
LPW – My take on this is that we’ve only done it a few times, and our FFs are relatively new. For now, I’m OK with making quick fixes. In another year or two, I’d like a more formal procedure.
CIB – My current assumption is that changes in the range of values that we’re looking at are insignificant. But does FB have a covariance matrix that would show us whether these changes are sensitive?
LPW – When you say “low sensitivity”, I think “high prior width”. We’ll want to start discussing which numbers we’re using. So it’ll be good to present some ideas for how we could decide these upfront. When I think about the covariance matrix, I’d expect that we’d check out the variance of the parameters drawn from a distribution. But this distribution is influenced by the prior widths, so it’s circular. I’m thinking that a difference between a prior width of 100 and 120 isn’t a huge deal, but 100 and 200 might be, and 100 and 1000 definitely is. But I’d like to decide on this value in a data-driven way. So I think the bond length prior width should be 0.1 angstrom – this is how much I expect different bond orders between the same elements to affect things. But now that we’re including vibrational frequency, we might need a more careful look at how we set priors. If we had smaller fragments that covered some of these moieties, they’d likely wash out these issues.
LPW – We could make good initial guesses at these numbers from IR spectra.
CIB – IR was used to make the initial guesses for several FFs.
LPW – I think these would need to be done together (changing our fitting approach and comparing to IR).
DM – We could have lots of people getting trained at the same time as PB on FF fitting.
CB – We could harvest initial guesses for k from an undergrad textbook on IR spectroscopy.
LPW – For bond force constants, we could do this in an afternoon. For angles it’ll be trickier.
(General) – We could take a Kollman-like approach by including really small molecules in the training set.
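A rough sketch of how an initial guess for a bond force constant could be read off an IR peak, assuming the textbook diatomic harmonic-oscillator relation k = μ(2πcν̃)². Treating the stretch as an isolated two-atom oscillator is an assumption that ignores coupling to the rest of the molecule, and the function name and carbon mass used here are illustrative.

```python
import math

C_CM_PER_S = 2.99792458e10   # speed of light in cm/s
AMU_KG = 1.66053906660e-27   # atomic mass unit in kg
AVOGADRO = 6.02214076e23     # 1/mol
J_PER_KCAL = 4184.0          # joules per kcal


def force_constant_from_wavenumber(wavenumber_cm, mass_a_amu, mass_b_amu):
    """Estimate k (kcal/mol/A^2) in E = 0.5 * k * (r - r0)^2 from an IR peak.

    Uses the diatomic approximation k = mu * (2*pi*c*nu)^2.
    """
    # Reduced mass of the two bonded atoms, in kg
    mu = mass_a_amu * mass_b_amu / (mass_a_amu + mass_b_amu) * AMU_KG
    # Angular frequency in rad/s from the wavenumber in cm^-1
    omega = 2.0 * math.pi * C_CM_PER_S * wavenumber_cm
    k_si = mu * omega ** 2  # N/m, i.e. J/m^2
    # Convert J/m^2 per molecule to kcal/mol/A^2 (1 m^2 = 1e20 A^2)
    return k_si * 1e-20 * AVOGADRO / J_PER_KCAL


# Carbon-carbon stretch at the 2100 cm^-1 peak mentioned in the discussion
k_cc = force_constant_from_wavenumber(2100.0, 12.011, 12.011)
print(f"k ~ {k_cc:.0f} kcal/mol/A^2")
```

The result is for the convention E = ½k(r − r0)². Force fields that write E = K(r − r0)² would use half this number, so the convention needs checking before comparing against any parameter file.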
Decisions:
We will modify 1.2.0 by hand to assign b24’s k to 800 and b27’s k to 1000 → rename to b24-MAN and b27-MAN → test against a propyne-containing molecule (jax 15 beta-secretase case from DH) → release 1.2.1.
JW will make a reproducing case – BACE inhibitor 24 from DH dataset, with HMR and 4 fs timestep.
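The manual edit in the decision above could be scripted along these lines, using only the Python standard library. This is a sketch: the inline XML is a hypothetical stand-in for the real openff-1.2.0 file, its SMIRKS patterns, lengths and starting k values are placeholders, and `set_bond_k` is an illustrative helper rather than part of any toolkit.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment in the style of an OFFXML <Bonds> section.
# SMIRKS, lengths and starting k values are placeholders, not real parameters.
OFFXML_SNIPPET = """
<SMIRNOFF version="0.3">
  <Bonds version="0.3" potential="harmonic">
    <Bond smirks="[#6:1]-[#6:2]" id="b24" length="1.0 * angstrom" k="1.0 * angstrom**-2 * mole**-1 * kilocalorie"/>
    <Bond smirks="[#6:1]#[#6:2]" id="b27" length="1.0 * angstrom" k="1.0 * angstrom**-2 * mole**-1 * kilocalorie"/>
  </Bonds>
</SMIRNOFF>
"""


def set_bond_k(root, bond_id, new_k, suffix="-MAN"):
    """Overwrite k for one bond parameter and tag its id as manually edited."""
    for bond in root.iter("Bond"):
        if bond.get("id") == bond_id:
            # Keep the unit string, replace only the numeric value
            _, _, unit = bond.get("k").partition(" * ")
            bond.set("k", f"{new_k} * {unit}")
            bond.set("id", bond_id + suffix)
            return bond
    raise KeyError(f"no Bond with id {bond_id!r}")


root = ET.fromstring(OFFXML_SNIPPET)
set_bond_k(root, "b24", 800.0)
set_bond_k(root, "b27", 1000.0)
edited_xml = ET.tostring(root, encoding="unicode")
```

Renaming the ids in the same step keeps the hand-set values easy to spot in later releases.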
|