2020-12-14 Core Devs Meeting notes

Date

Dec 14, 2020

Participants

  • @Jeffrey Wagner

  • @David Dotson

  • @David Hahn

  • @Pavan Behara

  • @Matt Thompson

  • @Simon Boothroyd

Discussion topics

Item

Notes

Roundtable updates

  • DH –

    • Last week, did lots of benchmarking work. Worked on CLI for benchmarking analysis and integrating whole pipeline.

    • Worked on benchmarking best practices manuscript.

    • Will begin first run of pharma partner benchmark.

    • Worked on PLBenchmarks repo. Decided that git LFS will let me keep API and data in same repo, which is nice. Will work with MT on release process.

    • Still considering whether to change PLBenchmarks repo name. We had talked about it but there wasn’t much feedback. I’ll take the current possibilities and open a poll on Slack.

  • SB –

    • Mostly scientific studies. Lots of BCC refitting, some work with ML models to predict AM1 partial charges and BCC.

    • Some changes to recharge API to enable better integration.

    • Some studying about best practices in RESTful APIs. Conclusion seems to be that there aren’t best practices.

  • MT –

    • Pushing on many tasks. Switched openff-system over to use implcit units while unit discussion is underway. Working on getting system to export to openmm.

    • Can now get an entire parameterized system exported to dictionary. Need OpenFF Toolkit Topology to have working to_dict and am working on a PR for that.

    • Helped test c-f OpenMM package.

    • Tested out unit models, tried idea discussed in previous unit meetings. It seems like the wiring-up-different-public-and-private-attributes/setters may be too difficult to meaningfully implement.

      • JW – Another meeting on that this week? Scheduled for Thursday 9 - 10:30 AM Pacific / 11 AM - 12:30 PM Central / 5 - 6:30 PM GMT

    • MT – Not taking much time off for holidays. Will be working before Christmas and between Christmas and New Year’s.

  • DD –

    • Lots of work on benchmarking. Worked closely with JW, DH, and JH. Goal was to get full-scale testing running this week.

    • Worked with Bill Swope on protocol document, testing, environment setup.

    • Developed pathways for persistent vs. nonpersistent server for benchmarking QCF server.

    • Expect to work with Thomas Fox, Bill Swope, David Hahn this week on benchmarking trial runs, documentation refinement.

    • This week, will meet with AWS spot team. Need to prepare written proposal in advance so they can prepare for technical discussion. Need to help them justify accepting our project into the pool of groups that can use AWS resources for “the public good”.

      • JW – Need support on this? You seem to be the primary driver on this interaction

      • DD – May loop you in on review for documents/messages we send to them about our project.

      • SB – Is Spot intended to just be for QCF? Or would other OpenFF workflows be run on this?

        • DD – Would prefer to keep scope focused for now to improve our odds of getting included. Winning the ability to run QCF would get our foot in the door for subsequently running phys prop calculations/free energy workflows

        • SB – Agree

  • JW –

    • Worked with MT on cutting 0.8.1 release. Went pretty smoothly except that we hadn’t previously had tests involving mdtraj, which is needed for Topology.from_mdtraj

    • Worked a lot on benchmarking workflow. Found three major categories of RDKit stereochemistry issues:

      • Double bond stereochemistry is not set for some molecule loading pathways (fix in benchmarking-fixes branch)

      • Chiral centers added by different kekulization of symmetric groups – Fluorene

      • Phosphorus

        • SB + DD – Should post this on RDKit issue tracker, even if they end up saying “our methods do what we document”

  • PB –

    • Worked on WBO interpolated parameters. Looking back on results based on SB’s comments during ff-release call. Doing another fit using correct FF file. Objective function is now below 1.3.0 but above the fully interpolated FF.

    • Also checking performance on Lim+Mobley benchmarking set, may need help from DH or others on getting it running again.

    • Working on OFFTK #720 (on roundtripping to/from qcschema). Still looking into problem statement and different options. Could use try/except statements to see whether the CMILES is present in various fields.

    • SB – How far along is automated benchmarking? Could that be used for this case?

      • JW – This would look like the last few steps of the benchmarking workflow, but with a custom OFFXML. The current workflow is a bit rigid, but we’ll need to expand the MM FFs to include s99F, so this may be a good opportunity to generalize the entry point

      • DD – This will need to work with non-released FFs, which require some finessing with QCEngine.

      • PB – It looks like benchmarkff can take a custom field for FF files

      • DD – For new benchmarking workflow (openff-benchmark), this field isn’t exposed. But we could do some work and figure out how to inject it.

      • PB – I’m able to pull data from QCA, but the MM minimization script is generating some errors.

      • JW – Let’s give this a try together – If benchmarkff can work with WBO with some small changes, then it should be possible to move forward with that.

      • SB – Would like to make openff-benchmark a good general-purpose benchmarking tool that we can use moving forward on subsequent releases.

      • DD – Agree. Though there’s some friction about the stability of the expected behavior and the customization desired by internal users.

      • JW – For the next few weeks, I’m opposed to adding complexity to the master branch of this repo, since this software absolutely needs to deploy and run on industry machines. Happy to have more complexity added on a branch, and eventually master in the future (once industry users verify that this can run)

      • DD – Agree

      • SB – Agree. Would like to ensure that the broader goals remain on the table in the long run though.

      • JW will work with PB to get benchmarkff running again with more current OpenFF toolkit/OpenMM versions
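A minimal sketch of the try/except idea PB raised for OFFTK #720, looking for the CMILES in several places of a record before giving up. It works on plain dictionaries only. The key name and the two container fields ("attributes", "extras") are assumptions for illustration, and the helper find_cmiles is hypothetical, not part of the toolkit.

```python
# Assumed key under which the mapped SMILES (CMILES) is stored.
CMILES_KEY = "canonical_isomeric_explicit_hydrogen_mapped_smiles"


def find_cmiles(record, containers=("attributes", "extras")):
    """Return the first CMILES found in any of the candidate containers.

    `record` is a plain dict standing in for a QCSchema-like record.
    The container names are assumptions, checked in the order given.
    """
    for container in containers:
        try:
            return record[container][CMILES_KEY]
        except (KeyError, TypeError):
            # KeyError: container or key is missing.
            # TypeError: container is present but is None or not a mapping.
            continue
    raise KeyError(f"No CMILES found in any of: {', '.join(containers)}")


if __name__ == "__main__":
    example = {"attributes": None, "extras": {CMILES_KEY: "[H:1][O:2][H:3]"}}
    print(find_cmiles(example))
```

Raising a single KeyError that lists every location checked keeps the failure message useful when none of the fields carries a CMILES.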

Benchmarking needs

  • DD – Conda package

    • JW – Could build locally or copy toolkit’s noarch building actions

    • DD – Let’s copy the toolkit’s action to run on every update to master and push packages to omnia

    • JW –

      • How to ensure that latest packages/versions are recognized as latest? Git hashes won’t necessarily sort in chronological order

        • SB – Check out how psi4 does this? They do nightly/hash-based builds that sort correctly

      • main or other branch?

      • Do we need to worry about functionality changes/users recording versions in which each step was run?

        • DD – This is just for a trial run, so it’s OK if the versions get a little mixed up

        • JW – Agree

  • Decision: We’ll copy in the github action from OFFTK and upload packages to omnia:main
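To illustrate the sorting concern JW raised, here is a small sketch of why a bare git hash cannot order builds, and how a version string that carries the number of commits since the last tag can. The "TAG+N.gHASH" format and the example version strings are assumptions for illustration. This is not a description of how psi4 or the OFFTK action actually names builds, and a package manager may apply its own ordering rules.

```python
import re

# Assumed format: "<release>" or "<release>+<commits since tag>.g<short hash>"
_VERSION = re.compile(r"^(\d+(?:\.\d+)*)(?:\+(\d+)\.g([0-9a-f]+))?$")


def sort_key(version):
    """Order by release numbers, then by commit count since the tag."""
    match = _VERSION.match(version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    release = tuple(int(part) for part in match.group(1).split("."))
    commits = int(match.group(2) or 0)
    # The hash (group 3) is deliberately ignored: it carries no ordering.
    return (release, commits)


if __name__ == "__main__":
    # Hypothetical build names, listed out of order.
    builds = ["0.8.1+10.g0a1b2c3", "0.8.2", "0.8.1+2.gfedcba9", "0.8.1"]
    print("plain string sort:", sorted(builds))
    print("commit-count sort:", sorted(builds, key=sort_key))
```

In the plain string sort, the build that is 10 commits past the tag lands before the one that is 2 commits past it, because the comparison is character by character. The numeric key avoids that.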

0.8.2 release?

  • JW – Purpose would be to get double bond stereochemistry fix out to industry benchmarking testers

  • MT – Don’t think there would be a problem with unusual release cadence.

  • DD – Agree

  • Decision: We’ll merge double bond stereo fix and cut 0.8.2 ASAP

C-F migration before holidays?

  • JW –

    • Con: Adds complexity and potential fires (10%ish chance) before people leave for holidays

    • Pro: JC wants to release openmm-forcefields on C-F, which requires OFFTK on conda-forge

    • MT – I’m fine with waiting. Very small chance of emergencies, but not worth the risk.

    • SB – Partially agree, but don’t see that much risk from this step.

    • JW – Not so concerned about new environments being broken, but rather deployment issues like having openff-toolkit and openforcefield in the same env

    • SB – Is there run-constrained defined in any recipes, to prevent both from being present in the same env?

    • MT – It doesn’t seem to be there. Though it is there for ambermini

    • SB – We should make a note to add run-constrained.

    • DD – Also concerned about deployment-related complexity

    • Decision: There’s not a pressing need for this before the holidays, so let’s not risk it.

C-F openmm size

  • JW – New openmm package requires between 400 MB and 1 GB of downloads on linux, largely due to hard dependency on cudatoolkit

    • General – Possible to unbundle CUDA from openmm?

    • JW – Eastman wants all openmm installs to “just work”, and the size penalty isn’t significant to him. NVIDIA has a legal/policy issue with unbundling cudatoolkit into just needed runtime libraries, and in fact it’s kinda extraordinary that they allow conda distribution at all

    • DD – The size issue shouldn’t be too significant in the context of something like a docker image

Sprint planning

  • Taking a break – Will return at 9:15 Pacific for sprint planning

Action items

Decisions