2021-11-17 Thompson/Wagner Check in

Participants

@Matt Thompson
@Jeffrey Wagner

Discussion topics

Item	Notes

General updates	JW – Did we run out of VU meetings on the calendar? Did JW accidentally delete the event?
	MT – Not sure what happened, though both of us seem to be stalled unless we have meetings. So let’s start them up again.
	JW – Agree on all counts.
	JW DMs the VU crew
	JW – I really appreciate your responses on technical discussions and the issue/PR tracker. I had a moment of serious concern last week when I was helping a user with bespokefit and got really concerned about technical debt in our stack. So I’m really hoping to focus on that after the biopolymer release.
	JW – Great blog post.
	MT – Thanks. A lot of my previous talks had been on things I did the night before. This time I got to talk about some months-old stuff that I’ve been thinking more about. Also, next time we have a big talk to give, maybe we should update the infrastructure flowcharts. One could be kinda simple - “How to use our force fields” (kinda linear, something like the VU flowchart but rejiggered) - and the other could be “How WE develop force fields” (which would be like the one with FB and evaluator, but with a lot more loops and additional detail).
	JW – I like that idea. I’d love to work on a fully detailed technical diagram of all this, even if it ends up being too complicated for a snappy presentation slide.
	MT – Should I wait on JMitchell review?
	JW – No need, feel free to ship it. I just want to make sure that we either tell people how to run the code blocks (like, a little installation snippet with a yaml) or we explicitly say that they shouldn’t try to run them.
	MT – Will do.
	JW – I’ll tell JMitchell that he can propose changes if he sees anything egregiously wrong, but that he’d do so in a follow-up PR.
Interchange directions	MT – I was hoping to get more concrete feedback from the protein-ligand examples for interchange. Not clear which direction development should move in from here.
	JW – PIs have been talking about replacing ForceBalance again, in connection with possible wider adoption of espaloma. This may overlap with your interest in learning/applying more ML-ish stuff and making a pathway to get interchange users (since this could lead to a pathway where the ONLY way to train ML FFs would be through interchange).
	MT – Where does espaloma fit? And what is a ForceBalance replacement?
	JW – `from openff.toolkit.typing.engines.smirnoff` → `from openff.toolkit.typing.engines.espaloma`. So, like, we'd have espaloma `ParameterHandlers`, but for cases like WBO or partial charge calculation, maybe sometimes we treat it as a `ToolkitWrapper` as well.
	MT – So, it’d load a bunch of ML model weights, and maybe also some OFFXML terms or something like that.
	JW – I wonder whether, for example, espaloma partial charge assignment should be via a `ToolkitWrapper` or an espaloma `ParameterHandler`?
	MT – I’m trying to wrap my head around how serialized models/force field ML models will look, and how we’ll ship them as force fields.
	General – There’s a distinction between “an espaloma model that we use to reproduce AM1BCC charges” and “an espaloma model that we’re training to produce good charges for simulation”. The former is generally IMMUTABLE and could happily live in a `ToolkitWrapper`, whereas the latter needs to be packaged “with” the force field, and be trainable through backpropagation.
	MT – I’m interested in leading the productionization of an ML model, whether it’s espaloma or a considerable rewrite. I’d strongly prefer a rewrite, and would allocate time to surveying the code and how it could be refactored/improved before committing to it.
PR clearance	MT – I’ve opened a PR to move evaluator to openff-units, SBoothroyd is reviewing and JSetiadi is providing feedback. It took longer than anticipated since Simon had found additional unit conversion edge cases that I hadn’t considered. JW – Looks great. Thanks for taking the lead on this!
Topology sync up	JW – Proposed new API point: `Topology.identify_chemically_identical_molecules()`. This will fix the performance regression with `ToolkitAM1BCCHandler`, and should speed up parameter assignment for topologies with lots of identical solvent.
	JW – Anticipate substantial speedup for `Molecule.from_pdb` using graphlib - that's up next.
	MT – I’d favor doing all isomorphism checks through graphtool, rather than replacing everything individually.
	(General) – How will topology refactor work interact with unit package migration work? There will be a huge merge conflict at some point; the question is “what’s the best way to handle it?”
	    We’ll aim to have the units package migration PR reviewed+ready in December
	    We’ll merge the pint unit migration into the topology refactor branch
	    We’ll release a prototype topology refactor package in Jan for user feedback and bug-finding
	    We will have a soft feature freeze on the Toolkit `master` branch, for everything except critical bugs and internal blockers
	(General) – How should we update the internal ecosystem in response to the Feb API break?
	JW – One option would be to start pinning all new conda packages to toolkit<0.11
	MT – Another option is to rebuild everything on the stack, though this could make old envs fail to solve
	JW – Three categories of deployment:
	    Blog post/openff-benchmark release: COMPLETELY pinned environment
	    New package release: Only GREATER THAN pins
	    “I want to reproduce results from this old paper → `conda install openff-toolkit=0.9.0`”:
	        JW – Could have the single-file-installer builds also output an `environment.txt` with the EXACT version of each dep that it’s installed (look into a format that makes it easier to rehydrate envs in a single conda command)
	        MT – Agree
	        JW – I’ll task @Josh Mitchell with making that kind of conda env in our installer action.
	MT – When we make the big breaking release, how do we want to coordinate updating OpenFF packages? Do we want to make those packages reverse-compatible?
	JW – I think we should probably not be reverse-compatible, since that would be really complex.
	General – This is unresolved. We can make the process a lot smoother with some planning, but working out scheduling/details will take more thought.
	JW – Not sure how to speed up SMARTS-based parameter assignment for proteins
	MT –
	JW – Could discuss aspirational behavior of `Topology.from_pdb` today?
	JW – Still concerned about parameter application time.
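
Illustrative sketch (not the Toolkit implementation): the idea behind the proposed `Topology.identify_chemically_identical_molecules()` is roughly "group the molecules in a topology by graph isomorphism, so that expensive per-molecule work such as AM1BCC charges only runs once per unique molecule." The version below is a minimal pure-Python stand-in under several assumptions: molecules are plain `(elements, bonds)` tuples, bond orders and stereochemistry are ignored, and the function name `group_identical_molecules` and its return shape are hypothetical. A real implementation would presumably use a graph library's isomorphism routine rather than this backtracking check.

```python
from collections import Counter


def signature(mol):
    """Cheap invariant used to bucket candidates before the full check."""
    atoms, bonds = mol
    degree = Counter()
    for i, j in bonds:
        degree[i] += 1
        degree[j] += 1
    pairs = sorted((el, degree[idx]) for idx, el in enumerate(atoms))
    return (tuple(pairs), len(bonds))


def is_isomorphic(mol_a, mol_b):
    """Backtracking search for an atom mapping that preserves elements and bonds."""
    atoms_a, bonds_a = mol_a
    atoms_b, bonds_b = mol_b
    if len(atoms_a) != len(atoms_b) or len(bonds_a) != len(bonds_b):
        return False
    adj_a = {frozenset(b) for b in bonds_a}
    adj_b = {frozenset(b) for b in bonds_b}
    mapping = {}  # atom index in mol_a -> atom index in mol_b
    used = set()

    def extend(i):
        if i == len(atoms_a):
            return True
        for j in range(len(atoms_b)):
            if j in used or atoms_a[i] != atoms_b[j]:
                continue
            # Bonded/non-bonded status must agree with every atom mapped so far.
            consistent = all(
                (frozenset((i, k)) in adj_a) == (frozenset((j, mapping[k])) in adj_b)
                for k in mapping
            )
            if consistent:
                mapping[i] = j
                used.add(j)
                if extend(i + 1):
                    return True
                del mapping[i]
                used.discard(j)
        return False

    return extend(0)


def group_identical_molecules(molecules):
    """Map each molecule index to the index of the first identical molecule."""
    buckets = {}  # signature -> indices of unique representatives
    same_as = {}
    for idx, mol in enumerate(molecules):
        reps = buckets.setdefault(signature(mol), [])
        for rep in reps:
            if is_isomorphic(molecules[rep], mol):
                same_as[idx] = rep
                break
        else:
            reps.append(idx)
            same_as[idx] = idx
    return same_as


# Two waters with different atom ordering, plus a two-atom O-H fragment.
water_a = (["O", "H", "H"], [(0, 1), (0, 2)])
oh_fragment = (["O", "H"], [(0, 1)])
water_b = (["H", "O", "H"], [(0, 1), (1, 2)])
groups = group_identical_molecules([water_a, oh_fragment, water_b])
print(groups)
```

With a mapping like this, a charge or parameter handler could do its work once for each representative and copy the result to the other members of the group, which is where the speedup for topologies with lots of identical solvent would come from.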
