2020-03-19 Force Field Release meeting notes
Date
Mar 19, 2020
Participants
@Simon Boothroyd
@David Mobley
@Lee-Ping Wang
@Hyesu Jang
Xavier Lucas
@Jeffrey Wagner
@Jessica Maat
@Christopher Bayly
Victoria Lim
Goals
Discussion topics
Time | Item | Presenter | Notes |
---|---|---|---|
| update on physical property data | SB and OM | |
| Designing 2nd generation QM training set update | VL | |
Note Taking
(1) Simon
Add link for Simon’s slides here
Slide 3
DLM – The OH hydrogen radius parameter is generally zero in other FFs. We are willing to consider changing it, but just remember the history.
SB – We let it change from zero in Parsley, but it stayed very small.
Slide 7
LPW – Was there a way to deal with the high torsional barriers in acids and esters, and their effect on thermodynamic properties?
SB – We might add enhanced sampling to overcome those barriers, but we don’t do it yet.
SB – Relative contributions to the objective function from different properties – I set it so that the initial contributions from all the different property types are about equal in magnitude.
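A minimal sketch of the equal-contribution weighting SB describes. The notes do not record the actual formula, so this assumes each weight is simply the inverse of that property type's unweighted contribution at the starting parameters. The function name and all numbers are hypothetical.

```python
def equalizing_weights(initial_contributions):
    """Return one weight per property type so that every weighted
    initial contribution equals 1.0 (assumed inverse scaling)."""
    weights = {}
    for name, value in initial_contributions.items():
        if value <= 0.0:
            raise ValueError(f"contribution for {name!r} must be positive")
        weights[name] = 1.0 / value
    return weights


# Made-up unweighted contributions evaluated at the starting parameters.
initial = {
    "pure_density": 4.0,
    "binary_mass_density": 2.0,
    "enthalpy_of_mixing": 0.5,
}

weights = equalizing_weights(initial)

# After weighting, every property type starts with the same magnitude.
weighted = {name: weights[name] * initial[name] for name in initial}
print(weights)
print(weighted)
```

With this choice no single property type dominates the first optimization step, although the balance can drift as the parameters change.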
Slide 8
SB – Excess molar volume is SO NOISY that it screws up optimizations. We should NOT fit to excess molar volume.
Summary
Should just fit against enthalpy of mixing, binary mass density, and maybe pure density. Should expand to more than just alcohols and esters.
CIB – Using only alcohols and esters may bias our decision here. I can think of three categories of liquids
Alcohols are a 1:1 mix of donors and acceptors, esters have no donor
Dimethyl and trimethyl amines would have different characteristics like this.
A liquid like dimethylamine is a mix of donors and acceptors, versus acetone which is purely acceptor. I wonder if our dataset is biased to using more diverse mixed systems in terms of the fraction/characteristics of donors and acceptors.
Might also try liquids with no dipole, like pyrazine (where the Ns are opposite each other in the ring). The pure liquid there doesn’t have a dipole, so the liquid properties reflect vdW and quadrupole interactions. Then, mixing those with something like dimethylamine would investigate an interesting set of liquid interactions.
SB – Completely agree
JRW – Do we know how we’ll operationally do the refit for this generation?
LPW – Probably same as the last generation – valence → nonbond → valence cleanup
JRW – How about “The valence team sends Simon a reasonable-looking valence-optimized FF on April 1 using incomplete data, so he can start test runs. The final QM calculations will be done April 20, and the final valence refit FF will be sent to Simon on April 25.”
(General) – We’ll decide on this later
LPW – How did computational cost of these calculations break down?
SB – Pure properties were cheap. Mixtures were expensive, but all about the same as each other. Thankfully they could share some parameter sets
CIB – In the objective function, do some properties systematically pull parameters in a single direction? Like, does hvap always pull radii smaller?
SB – Good point. I’d like to look closer at the gradients of the properties. That would be really interesting.
CIB – The gradient approach worked well when I did BCCs.
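One way to check CIB's question is to look at the sign of each data point's gradient with respect to a single parameter. The sketch below assumes we already have d(objective contribution)/d(radius) for every data point of a property type. The helper name and gradient values are made up for illustration and are not results from the fit.

```python
def fraction_pulling_smaller(per_datum_gradients):
    """Fraction of data points whose gradient is positive.

    A positive gradient means the objective decreases when the parameter
    decreases, i.e. that data point pulls the parameter smaller.
    Returns None when every gradient is exactly zero.
    """
    nonzero = [g for g in per_datum_gradients if g != 0.0]
    if not nonzero:
        return None
    return sum(1 for g in nonzero if g > 0.0) / len(nonzero)


# Hypothetical gradients with respect to one radius parameter.
gradients = {
    "hvap": [0.8, 0.3, 1.1, 0.2],
    "pure_density": [0.5, -0.4, -0.2, 0.1],
}

summary = {prop: fraction_pulling_smaller(g) for prop, g in gradients.items()}
print(summary)
```

A fraction near 1.0 or 0.0 would indicate a systematic pull in one direction, while a value near 0.5 would indicate no consistent direction.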
(2) JM and HJ
Slide 2
DLM – Clarify – DBSCAN clustering based on fingerprint similarity?
JM – Yes
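A rough sketch of density-based clustering on fingerprint similarity, for readers unfamiliar with the approach. This is not the code JM used. The fingerprints are made-up sets of "on" bits, the distance is assumed to be 1 minus Tanimoto similarity, and the DBSCAN here is a small hand-written version to keep the example self-contained. In practice scikit-learn's `DBSCAN` with `metric="precomputed"` would be the usual choice.

```python
def tanimoto_distance(a, b):
    """1 - |a & b| / |a | b| for two sets of on-bits."""
    union = len(a | b)
    if union == 0:
        return 0.0
    return 1.0 - len(a & b) / union


def dbscan(dist, eps, min_samples):
    """Minimal DBSCAN on a precomputed distance matrix.

    Returns one label per point; -1 marks noise. A point counts itself
    among its neighbours, as in scikit-learn.
    """
    n = len(dist)
    neighbors = [[j for j in range(n) if dist[i][j] <= eps] for i in range(n)]
    labels = [None] * n  # None = not yet visited
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_samples:
            labels[i] = -1  # provisional noise, may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_samples:
                queue.extend(neighbors[j])  # j is a core point, keep expanding
    return labels


# Hypothetical fingerprints (sets of on-bit indices).
fingerprints = [
    {1, 2, 3, 4},
    {1, 2, 3, 5},
    {1, 2, 3, 4, 5},
    {10, 11, 12},
    {10, 11, 13},
    {20, 21},
]
dist = [[tanimoto_distance(a, b) for b in fingerprints] for a in fingerprints]

labels_tight = dbscan(dist, eps=0.45, min_samples=2)
labels_loose = dbscan(dist, eps=0.55, min_samples=2)
print(labels_tight)
print(labels_loose)
```

The two runs differ only in `eps`, which illustrates how sensitive the cluster assignment can be to that one setting.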
Slide 3
XL – How do you generate tautomeric states?
HJ – Fragmenter/CMILES functions
DLM – Can send code later
CIB – How will we find whether these results are representative/what are benchmarking plans?
LPW – We have plans for benchmarking, but they’re not totally operational yet.
LPW – Since a single central bond can host many unique torsion parameters, we want to ensure that our dataset doesn’t always put certain torsions together, since then they’ll always be fit together, and it will be unclear which contributions come from which torsions.
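A small sketch of how the co-occurrence concern LPW raises could be checked. It assumes each central bond in the dataset has already been labeled with the torsion parameter IDs that apply to it. The IDs and records below are hypothetical placeholders, not real dataset contents.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(records):
    """Count how often each torsion ID, and each pair of IDs, appears.

    `records` holds one collection of torsion parameter IDs per central bond.
    """
    single = Counter()
    pair = Counter()
    for params in records:
        unique = sorted(set(params))
        single.update(unique)
        pair.update(combinations(unique, 2))
    return single, pair


def always_together(single, pair):
    """Pairs of IDs that never appear without each other."""
    return sorted(
        (a, b) for (a, b), n in pair.items() if n == single[a] == single[b]
    )


# Hypothetical labeling: torsion IDs found on each central bond.
records = [
    ("t1", "t2"),
    ("t1", "t2"),
    ("t3", "t4"),
    ("t3",),
    ("t4", "t5"),
]

single, pair = cooccurrence(records)
inseparable = always_together(single, pair)
print(inseparable)
```

Any pair reported here would be fit only in combination, so adding molecules where one of the two torsions appears alone would help separate their contributions.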
Fingerprint type (slide 7)
CIB – MACCS keys are old, and I’m not a fan of them. Now there are ECFP, Tree, and Path, which are really good (with all defaults). LINGO is interesting, but not particularly good in my opinion. Regarding DBSCAN, I’ve had a good experience using H-DBSCAN, which only has “one knob”, so I’d recommend H-DBSCAN if DBSCAN is being difficult. It’s also in scikit-learn.
HJ – When I tried MACCS, the clustering was a lot less dependent on epsilon.
Slide 9
CIB – I’m really impressed by this. I interpret from this that there are real differences between tree and MACCS keys. I trust this sort of investigation more than my own generic intuition. Let’s talk offline in more depth.
LPW – Want to confirm that JM’s optimization datasets (the geometry optimizations) can also be submitted to QCA before the end of the month.
JM – Could have everything ready for submission by tomorrow.