2022-06-08 All-hands meeting notes

Participants

Discussion topics

Item	Presenter	Notes
General updates	Jeffrey Wagner	Lily Wang is now acting FF lead. She is working on a US visa but will be in Australia for a few months, so her general availability will be in the afternoon (starting at 2 or 3 PM US Pacific). She is only 80% employed, so she will only regularly be available Mon-Thurs (Americas), though she will attend meetings outside of these hours and on Fridays if she’s involved in scheduling.
Reminder that the annual meeting is on June 28 from 8-10 AM US Pacific; see the OpenFF calendar.
Interchange adoption is moving forward. JW became approver in place of SB, and “Part 1” is likely to be accepted this week. The final issue is that <1% of molecules sporadically get different partial charges; this seems to be run-to-run variation.
New releases of Evaluator, merging up some old PRs. Thanks Jeff S, Jake A, Matt T, Simon B, Owen M for the updates!
Additional clarification is going into the BespokeFit docs based on user feedback. We have a very active user in the UK (“smoking duck person”).
The industry benchmarking paper has a new draft ready for review. See the #benchmarks-partners channel.
The OpenFF Toolkit pre-alpha now loads PDBs faster, but is temporarily incompatible with OpenEye for PDB loading (hope to fix this ASAP!).
Analysis of crystal observables	Chapin Cavender	CC – Deeper description of “Protein observables benchmarks” WBS. https://docs.google.com/presentation/d/1pgkQ1zuJ6Kl_7Km0tam96nyzMhFJIkPxPuQiV4P7Qis/edit#slide=id.g12c71da2858_0_0
CC – MS, you’d mentioned that a student of yours may be interested in working on biopolymer FF validation in the Fall. I’d love to talk more about this.
MS – Do we feel like there’s a good sense of what should be done, and it’s just a matter of making it happen?
CC – For the NMR observables there are pretty good precedents for how to do this work/best practices. For xtals the concepts are understood, but the actual code is mostly in the form of one-off scripts instead of available, reusable tools.
MS – If we know what should be done then we should be good to go.
MS – I think that xtal observables should be promoted to “should” instead of its current state of “could”.
CC – I understand the goal of this project as being “making a FF comparable to/not inferior to ff14SB”. By that metric, we only “need” to validate by NMR.
MS – I see the NMR observables as more of an “it’s convenient because the code already exists”, but that doesn’t actually propagate into value.
JW – So many things in OpenFF are being planned based on the completion date of Chapin’s work that a small change in scope here that adds a little uncertainty will propagate into a large amount of uncertainty in most parts of the project.
DN – We should probably look at the yellow items here as being “things that other people can do”.
CC – I’m taking personal responsibility for the timely completion of items in blue. If I have time after that, then I’ll work on the things in green. But the only way that the things in yellow will be completed will be if someone else does it.
Update on LJ optimization techniques	Owen Madin	Link slides here
MS – Are there benchmarks for the mixture simulations yet, or are you still working on that?
OM – The data that I have now has weird issues with hexanol. When I remove those, the performance on the mixture datasets shows slight improvement for binary densities, and roughly the same results for binary enthalpies of mixing.
JW – On slide 15, epsilons seem to always decrease in the good fit. Does this mean that things are too tightly packed?
OM – Maybe. The RMSE of density does decrease significantly.
MS – The epsilons don’t have much of an effect on density, that’s more sigma.
CC – What is an “optimization iteration” on slide 15?
OM – The surrogate objective line is the function…. (can’t follow, see recording)
MS – What motivated all of this is that we eventually want to replace FB, and we are looking at this in the context of being a more flexible way to do general optimization.
JS + MS – How general do we think this is? How transferable are the generated parameters? How generalizable is this process?
OM – The parameters from the pure-only optimizations I showed won’t be generalizable. E.g., on slide 13, we’re clearly overtraining to particular carbon+oxygen motifs. To make it more transferable you’d need to have a wider training set or smarter regularization. The advantage of being able to test regularization on the surrogate level is that we can optimize the regularization parameters via iteration, because the surrogate model will make evaluations so cheap.
OM – For the second question, “scale up”: the results I showed today were looking at 12 dimensions, but Sage was more like 30 dimensions and 1000 targets. It will definitely be more difficult, but I can’t predict right now how much more difficult it will be.
JS – Is this implemented outside ForceBalance?
OM – Yes, it’s my own implementation that wraps Evaluator for simulation-level evaluations and a “blowtorch” library for the surrogate stuff.
JW – …
OM – The `n` numbers on slide 12 are apples-to-apples for comparing methods. The `n` numbers are directly proportional to GPU hours.
TH – How do you determine when the iterations have converged?
OM – Right now it’s based on total compute budget. In the future I’d like to implement smarter criteria.
MS – What’s the next step for this? Are people interested in making this more production-ready? Question for everyone.
JW – I’m happy that this is a tangible example of an interface that we’d build towards. This is a big requirement that’s been missing in our efforts to replace.
OM – The way that this code is put together is very “prototype”, so we’ll want to
MS – One thing that this doesn’t handle yet is the “non-observables”, like the QM fitting. So that would need to be replaced if this will be a FB replacement.
OM – Adding QM fitting to this won’t be fundamentally different.
JW – I’ll talk to LW and DN about this. It’ll be a commitment of a few months of effort/a strategic decision, and right now we need to prioritize Rosemary and vsites.
JS – Is surrogate modeling guaranteed to always find a minimum? Will every step decrease?
OM – My experience is “yes”, but in some weird cases it doesn’t decrease. It’s not clear if this is because of usage errors/invalid use though.
MT – On the topic of production-izing this, it’s important to communicate that a lot of our software is intended for stable use, so we have to tiptoe around things like API changes in the toolkit. I’d like to put this on a different footing, like a second tier of software of things for LW’s team’s use.
JW – Agree, a new tier of “unstable but public facing” software will be great to have.
OM – Perfect.
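
A minimal sketch of the budget-based stopping rule OM describes, assuming a toy 1-D problem. Everything here is hypothetical and not taken from OM’s implementation: `expensive_objective` stands in for a simulation-level evaluation, the inverse-distance-weighted `surrogate` is a simple stand-in for whatever surrogate model is actually used, and `budget` plays the role of the `n` evaluation count.

```python
import random


def expensive_objective(x):
    """Toy stand-in for a costly simulation-level evaluation (hypothetical)."""
    return (x - 0.3) ** 2


def surrogate(x, samples, power=2.0):
    """Cheap inverse-distance-weighted prediction from evaluated points."""
    num = den = 0.0
    for xi, yi in samples:
        d = abs(x - xi)
        if d < 1e-12:
            return yi  # exactly on a known point: return its value
        w = 1.0 / d ** power
        num += w * yi
        den += w
    return num / den


def optimize(budget, n_init=4, n_candidates=200, seed=0):
    """Run until `budget` expensive evaluations have been spent."""
    if budget < 1:
        raise ValueError("budget must be at least 1")
    rng = random.Random(seed)
    samples = []
    # Initial design: a few random expensive evaluations.
    for _ in range(min(n_init, budget)):
        x = rng.random()
        samples.append((x, expensive_objective(x)))
    # Stopping rule: total compute budget, not a convergence test.
    while len(samples) < budget:
        # Screen many candidates on the cheap surrogate...
        candidates = [rng.random() for _ in range(n_candidates)]
        x_next = min(candidates, key=lambda c: surrogate(c, samples))
        # ...and spend one expensive evaluation on the most promising one.
        samples.append((x_next, expensive_objective(x_next)))
    best_x, best_y = min(samples, key=lambda s: s[1])
    return best_x, best_y, len(samples)


if __name__ == "__main__":
    print(optimize(budget=20))
```

Because the loop stops on the evaluation count alone, runs with the same `budget` cost the same number of expensive evaluations, which is what makes such counts comparable across methods. A smarter criterion would replace only the `while` condition.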

Action items

Decisions