2021-03-08 Core developers Meeting notes

Date

Mar 8, 2021

Participants

  • @Jeffrey Wagner

  • @David Hahn

  • @Simon Boothroyd

  • @Lorenzo D'Amore

  • @Matt Thompson

  • @David Dotson

  • @Pavan Behara

Discussion topics

Item

Notes

Updates

  • DH

    • I will be transitioning to a scientist position at Janssen – I’m introducing Lorenzo. He’ll be transitioning onto my work in the PLBenchmarks repo and the pharma partner benchmarks.

    • I’ve been working on the benchmarking workflow – Putting together infrastructure to do OPLS 3e benchmarks.

    • Protein-ligand simulations are waiting on resources. Possibly we could use Folding@home.

      • Want to add a GROMACS core to Folding@home. In GROMACS 2021 they introduce GPU-accelerated PME, which should accelerate simulations a lot.

    • DD – For getting PLBenchmarks on F@H, I may be assigned to work on that in mid-summer. Would that timeline work for you?

      • DH – The summer schedule would be reasonable. I don’t know how complex this would be. LD, would this work for you?

      • LD – Probably.

      • DD – I’d plan on driving the infrastructure development, with LD leading the science once the simulations are able to run.

  • LD

    • Thanks, DH, for the introduction.

    • I’m working on getting benchmarking infrastructure set up. I only have access to an old Mac, and I wasn’t able to update Xcode remotely.

    • Expecting to get a computer from Janssen in the next few days.

    • I’m curious about the iodine issues in benchmarking. I’m wondering whether psi4 provides an error integral (difference between exact density and fit density) – This gives an estimate of whether the density fitting is appropriate for the atoms/whether the basis set is appropriate. Ideally the value of the error integral would be small.

      • LD will join Friday QC submission meeting

  • SB

    • Helped out with WBO fits

    • Led the decision on whether to commit to WBO torsions in Sage; decided to postpone indefinitely until the science is clear

    • Worked on BCC fits

    • Made my packages OpenEye-free, using OFF Toolkit interfaces. Would like to have streamed loading of molecules.

    • Coordinating with Josh Horton to convert openforcefield-forcebalance into bespokefit. This won’t aim to replace openforcefield-forcebalance, but will be a second place where the functionality is accessible.

    • JW – Some non-reproducing error in

      • Is it OK if we make the new release with this?

        • SB – Yes, I was unable to reproduce this on any of my local machines

        • SB – There’s a general issue with deciding on how much we want to lean on OpenEye as “ground truth”, when their actual implementation is very complicated, and we’ll have to invest a lot into reverse-engineering if we continue with that. Maybe we should make AmberTools our “ground truth”

        • SB – Are there other packages that will let us do AM1 calculations? Maybe XTB?

        • PB – Would having other programs that can do AM1 calcs really fix this issue? Or is the core issue that OE is a black box?

        • SB – Having packages other than OE would give us flexibility and make us not reliant on OE’s “black box”

        • JW – I’d be in favor of making AmberTools our “ground truth”

        • MT – What does that mean?

        • SB – We’re probably going to continue using OE for exploratory work, since it’s much faster and easier to use. But the decision above could still let us do exploratory work using OE, but do the final fits using AmberTools.

        • MT – Will we want to rearrange the default toolkitregistry precedence?

        • SB – Probably not. JMaat is doing important work on seeing whether OE vs. AmberTools makes a significant difference in WBOs. In the future I’d like to expand this analysis up to things like solvation free energies.

        • (General) When will we have a PI-involved decision on whether AT or OE is “ground truth” for AM1 calculations?

          • (General) A future call once we have clear evidence that WBOs improve performance

        • MT – How would this decision (changing fitting method / default toolkit precedence) affect different use cases (eg conf gen vs. simulation setup?)

          • JW – 90% of users don’t have OE, so changing the precedence won’t change the results?

        • DD – Any trouble with the way we’re using AmberTools re: license?

          • JW – Some background on AT licensing is

          • SB – In the AT license file, it doesn’t indicate what sqm is. I think this means it’s something like GPL3. libsander is LGPL.

          • SB – AmberTools is open enough that we don’t have to treat their implementation like a black box, which is much better than OE
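
The precedence question raised above (which toolkit is tried first, and whether reordering changes results for users without OE) can be illustrated with a small fallback pattern. This is a minimal sketch in plain Python, not the OpenFF Toolkit API: `PrecedenceRegistry`, `BackendUnavailable`, and both backend functions are hypothetical stand-ins, and the returned values are placeholders.

```python
from typing import Callable, Dict, List, Tuple


class BackendUnavailable(Exception):
    """Raised by a backend that cannot service a request (hypothetical)."""


Backend = Callable[[str], Dict[str, float]]


class PrecedenceRegistry:
    """Try backends in the given order and return the first successful result."""

    def __init__(self, backends: List[Tuple[str, Backend]]):
        self.backends = backends

    def call(self, molecule: str) -> Tuple[str, Dict[str, float]]:
        errors = []
        for name, func in self.backends:
            try:
                return name, func(molecule)
            except BackendUnavailable as exc:
                # Remember why this backend was skipped, then fall through.
                errors.append(f"{name}: {exc}")
        raise RuntimeError("no backend succeeded: " + "; ".join(errors))


def licensed_backend(molecule: str) -> Dict[str, float]:
    # Stand-in for a toolkit that needs a license this user does not have.
    raise BackendUnavailable("no license found")


def open_backend(molecule: str) -> Dict[str, float]:
    # Stand-in for an open-source toolkit; the value is a placeholder.
    return {"total_charge": 0.0}


licensed_first = PrecedenceRegistry(
    [("licensed", licensed_backend), ("open", open_backend)]
)
open_first = PrecedenceRegistry(
    [("open", open_backend), ("licensed", licensed_backend)]
)

used_a, result_a = licensed_first.call("CCO")
used_b, result_b = open_first.call("CCO")
```

In this sketch a user without the licensed backend ends up on the open backend under either ordering, which is the situation JW describes; the ordering only matters for users who have both.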

  • MT

    • Focused on interoperability. Wrote a LAMMPS exporter. Getting matching energies (vs OpenMM) for small molecules.

      • This is helpful because LAMMPS gives a lot of control over electrostatics handling. OpenMM doesn’t let us do this, which makes it hard to validate implementations.

    • Making all OFF System exporters also include energy evaluation functionality.

    • Made partial reparameterization example – eg, parameterize with one FF, then replace the torsion terms with those from another FF. (It’s hard to find a meaningfully different pair of SMIRNOFF FFs right now, but in the future this could be more interesting)

      • SB – Two potential use cases

        • 1) Swap all eg. torsion terms

        • 2) Take parts of a system and parameterize them with different FFs.

      • MT – Agree. The notebook that I shared earlier focused on case 1). In my head I refer to Point 2) as “system combination”. One could think about an example of point 2 as “having some molecules be GAFF and others be Parsley”.

      • SB – I’m mostly interested in point 2 – Something like doing the same protein with a ligand parameterized in different FFs.

      • MT – This should be doable. I could envision some complexity around covalent bonds between molecules parameterized with different FFs

        • SB – Agree.

        • JW – This seems like a complex case that we haven’t considered in our planning yet. But I could see this being somewhat common among people trying to do interesting science.

        • MT – Agree that this will be hard.
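
Case 1) above (parameterize with one FF, then replace all torsion terms with those from another) can be sketched with plain dictionaries. This is an illustration only, assuming both parameterizations cover the same atoms; it is not the notebook MT shared or the OFF System API. The function `swap_terms`, the dictionary layout, and all numeric values are hypothetical.

```python
from copy import deepcopy
from typing import Any, Dict

# Hypothetical layout: term type -> {atom-index tuple -> parameter dict}
Parameterized = Dict[str, Dict[tuple, Dict[str, Any]]]


def swap_terms(base: Parameterized, donor: Parameterized, term: str = "torsions") -> Parameterized:
    """Return a copy of `base` whose `term` parameters are taken from `donor`.

    Every key present in base[term] must also be present in donor[term];
    otherwise the combined system would silently lose terms.
    """
    missing = [key for key in base[term] if key not in donor[term]]
    if missing:
        raise KeyError(f"donor has no {term} for atoms: {missing}")
    combined = deepcopy(base)
    combined[term] = {key: deepcopy(donor[term][key]) for key in base[term]}
    return combined


# Placeholder numbers, not taken from any real force field.
ff_a = {
    "bonds": {(0, 1): {"k": 620.0, "length": 1.09}},
    "torsions": {(0, 1, 2, 3): {"k": 0.20, "periodicity": 3, "phase": 0.0}},
}
ff_b = {
    "bonds": {(0, 1): {"k": 640.0, "length": 1.10}},
    "torsions": {(0, 1, 2, 3): {"k": 0.35, "periodicity": 3, "phase": 0.0}},
}

combined = swap_terms(ff_a, ff_b, term="torsions")
```

Case 2) ("system combination") would need more than a key-by-key swap, since terms spanning a covalent bond between differently parameterized molecules have no obvious owner, which is the complexity noted in the discussion.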

    • Made a big refactor to some core classes in System. Now, eg, Bonds have two particle indices and a pointer to a potential. Previously these were magic strings that only I understood, but now they’re a more approachable class.
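
The refactor described above (a Bond holding two particle indices and a pointer to a potential) might look roughly like the following. This is a hedged sketch of the idea, not the actual OFF System classes; the names `Potential` and `Bond`, their fields, and the numbers are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class Potential:
    """A named set of parameters that several bonds may share."""
    name: str
    parameters: Dict[str, float]


@dataclass
class Bond:
    """Two particle indices plus a reference to the potential applied to them."""
    particle_indices: Tuple[int, int]
    potential: Potential


# Placeholder values; two bonds point at the same Potential object.
c_h = Potential(name="C-H", parameters={"k": 680.0, "length": 1.09})
bonds = [Bond((0, 1), c_h), Bond((0, 2), c_h)]

# Because bonds hold a reference rather than a copy, editing the shared
# potential is visible through every bond that points at it.
c_h.parameters["k"] = 700.0
```

Holding a reference instead of an encoded string means a parameter change is made once and picked up by every bond that uses it.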

    • Talked about how to support atom-types FFs. One big question is whether atom types go on Topologies, or if they just exist in the assigned parameters.

    • Lots of work on VU collaboration,

      • built a very important component of the shared roadmap (OFF Mol → mBuild mol) in a work session w/ JW.

      • big meeting on Friday.

    • Preparing to work with Jason Swails and find areas of overlapping effort. Need to resolve residue template functionality (ParmEd has it, OFF system doesn’t and shouldn’t). We have our second meeting scheduled for tomorrow; I’m optimistic that our goals overlap.

  • DD

    • Worked with Bill Swope on iodine benchmarking issues. He pointed out that many calculations report success, but return clearly bad geometries/energies. Pavan and Trevor have done a lot of good digging into this problem. We’ll need to make a choice on how we handle this – could do different settings for iodine-containing molecules, or unconditional use of a more complex basis set/settings

    • Working on benchmarking. Many partners have finished already. Need to decide whether to do a “season 2” of benchmarking. Would include torsiondrives (including w/ ANI) and several other pipeline upgrades

      • SB – Could you elaborate?

      • DD – Most partners in the benchmark aren’t running QC jobs using a QCF server. Instead they’re using the “optimize execute” pathway, which calls QCEngine directly in batch jobs. However torsiondrives are “services” on the server-side, which require some compute activity on the server itself, which requires a persistent server. Most of the pharma partners are using either no server or a snowflake. So I’d like to make a special process that can do a torsiondrive locally.

      • SB – Sounds good. How much code duplication does this lead to?

      • DD – It’s reasonably direct. Surprisingly it’s more complex to manage the server-based approach. For torsiondrives specifically, there will need to be some duplication of code from QCF, but it’s just code that calls the TorsionDrive package, and it’s not well-localized in QCFractal.

      • SB – Sounds good. Ideally, this will generalize to the bespoke pathway (since we want to do torsiondrives without setting up a whole server). Where does the code live?

      • DD – Currently openff-benchmark

      • SB – It’d be good to have this live somewhere more permanent in the long run. I’m excited to see the functionality that’s coming out of the benchmarking effort.

    • Worked on Swope’s needs for QCA molecule exporting. We added a local hack to help him move forward in his own work.

    • In QCA, I made some improvements to our QCA dataset management infrastructure. Can now change dataset priority using GH labels. These get updated every 8 hours, which handles some tricky issues related to torsiondrives.

      • In the future, this will also help with high-resource jobs like PEPCONF. So we can route big jobs into a high memory queue.
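
The label-driven priority and queue routing described above could be sketched as a simple lookup. The label names, priority levels, and queue names below are hypothetical and are not the labels used in the actual QCA dataset repository; this only illustrates the mapping idea.

```python
from typing import Iterable, Tuple

# Hypothetical label vocabularies (assumptions for illustration).
PRIORITY_LABELS = {"priority-low": 0, "priority-normal": 1, "priority-high": 2}
QUEUE_LABELS = {"compute-highmem": "highmem"}


def resolve_labels(
    labels: Iterable[str],
    default_priority: int = 1,
    default_queue: str = "default",
) -> Tuple[int, str]:
    """Map a submission's labels to a (priority, queue) pair.

    Unknown labels are ignored. If several priority labels are present,
    the highest one wins; the first queue label found is used.
    """
    labels = list(labels)
    priorities = [PRIORITY_LABELS[l] for l in labels if l in PRIORITY_LABELS]
    queues = [QUEUE_LABELS[l] for l in labels if l in QUEUE_LABELS]
    priority = max(priorities) if priorities else default_priority
    queue = queues[0] if queues else default_queue
    return priority, queue


urgent = resolve_labels(["tracking", "priority-high"])
big_job = resolve_labels(["compute-highmem"])
```

A periodic job (every 8 hours in the notes above) would re-read the labels and apply the resolved values to each dataset.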

    • Meeting with Chodera to take over management of QC workers on Lilac (MSKCC cluster).

    • Working on other user experience areas on QCA dataset submission and implementation of standards V3.

  • JW

    • Onboarded Lily Wang (polymers) and Lorenzo D’Amore (new Janssen postdoc)

    • VU coordination w/ Matt

    • 0.8.4+0.9.1 release prep (still outstanding question about one test)

    • Formed SMIRNOFF steering committee to help us make any decisions (or adopt a system of making decisions)

      • SB – Could include Josh Mitchell in this planning process.

      • JW – That’s a good idea. I’ll ask Josh Mitchell to pick some time where he can meet synchronously with folks outside of CA and Australia

  • PB

    • Mostly WBO work – Analyzed a fit proposed by JMaat

    • Gave presentation for most of FF release meeting. Got lots of feedback during talk.

      • Will make RMSD check for torsiondrive fitting

      • Will re-evaluate steric/nonbonded energy cutoff

      • Will look into using ELF10 WBOs – Currently getting these from development build of OFF toolkit

    • Question: I’m looking to use Hessian fitting instead of vibfreq fitting. Is this production-ready?

      • SB – HJ is still working on this and checking into the best way to set different hyperparameters. It’s not production ready yet.

    • JW – Are available forcebalance conda packages still on the old namespace?

      • SB – Yes, there’s a GitHub release with the new namespace, but no conda package yet.

  • SB – Packages that aren’t updated for the namespace change

    • QCSubmit

      • SB – Not sure what the blocker here is

      • PB – I see it importing openff.toolkit in some places

      • (General) – No released qcsubmit packages use openff-toolkit, they’re all openforcefield

      • DD – Which openff toolkit should this use?

        • SB – openff-toolkit-base

      • JW – The real blocker may be a new forcebalance release, since that’s required by bespokefit

    • Evaluator

      • SB – There’s a bug in the new version re: OpenMM. I don’t want to make a new release until I know what’s happening here.

      • SB – Peter never responded to the issue I opened – Could someone else ping it?

        • DD will reproduce this locally and ping the issue to ask Peter to respond.

Sprint Planning

Will return for sprint planning at 29 past the hour

Action items

Decisions