2021-02-03 Benchmarking for Industry Partners - Development Meeting notes

Date

Feb 3, 2021

Participants

  • @David Dotson

  • @Jeffrey Wagner

  • @David Hahn

  • @Joshua Horton

Goals

  • Updates from project team members

  • Clear for release and production run start?

  • Identify and address project risks

Discussion topics

Item

Presenter

Notes

Updates

 

  • JW: last week, helped make initial release for industry consumption for burn-in; have both conda packages and single-file installers

    • the release process feels a bit circular: the environment file reflects master, so for now we update it with a version pin of openff-benchmark after each release

    • also did a bunch of cleanups; changed inter-conformer RMS; merged Josh’s coverage reporter PR (very impressive work!)

  • DH: testing at Janssen with burn-in dataset; switched to option C for optimizations, fewer filesystem timeouts

    • it’s a workstation queue, so have to be careful with how we use resources

    • got some complaints with memory usage for option B; switching to option C gives no complaints so far

    • additionally looking into OPLS3e optimization protocol

      • might be a bit of work to integrate it into workflow

      • run FF builder, then run minimization

      • would be nice if we had some default params so it’s a fair comparison

    • JW: you might have to budget two solid weeks of your time to making sure errors are handled well, outputs within expectation, etc.

    • DD: can we call the Schrodinger tooling through subprocess, feed in the input structure, extract the initial energy, final energy, and final molecule from the outputs, then write out an SDF as we do for the primary optimization path?

      • DH: should be able to do this

  • JH: finished up coverage report, Jeff merged

    • JW: we also have your PR for parallel conformer generation; tried to pick back up, merge diff is too mean

      • not sure how to do sane exception handling for the process pool approach

      • JH: the function being called by the process pool needs to catch all exceptions to work smoothly; have to rely on reporting via logs for errors

  • DD: I worked with Jeff to react to burn-in feedback

    • have resolution on coverage report stratification; will do no stratification, partners can share more info at their discretion

    • JW: do we want to provide guidance on performance?

      • if they put in something with many rotatable bonds, could get a ton of conformers

      • JH: cost scales as n^4 in the number of electrons

      • JW: perhaps for guidance

        • relative cost ≈ (relative molecule size)^4 * (relative number of molecules) * (10 conformers, upper bound)

          • relative == (prod/burnin), i.e. the production-set value divided by the burn-in value

    • JW: should we set a planned end date? It would light a fire under slower partners without punishing fast partners or ones really trying with hairy infrastructure

      • DH: will ask Gary what his expectations for completion date are; could help us to drive progress
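
On the process pool question raised in the updates above (the worker function needs to catch all exceptions, with errors reported via logs), a minimal sketch of that pattern might look like the following. The names generate_conformers, safe_generate, and run_pool are hypothetical and are not taken from openff-benchmark; the conformer function is a stand-in for the real per-molecule work.

```python
import logging
from concurrent.futures import ProcessPoolExecutor

logger = logging.getLogger("conformer_pool")


def generate_conformers(smiles):
    """Hypothetical stand-in for the real per-molecule conformer generation."""
    if not smiles:
        raise ValueError("empty molecule input")
    return [f"{smiles}-conf{i}" for i in range(2)]


def safe_generate(smiles):
    """Worker wrapper: catch everything so no exception escapes the worker."""
    try:
        return smiles, generate_conformers(smiles), None
    except Exception as exc:  # deliberately broad, per the discussion
        # The log record is the primary error report; the returned string
        # only lets the parent process keep a tally of failures.
        logger.exception("conformer generation failed for %r", smiles)
        return smiles, None, f"{type(exc).__name__}: {exc}"


def run_pool(inputs, max_workers=2):
    """Map the safe wrapper over inputs; split successes from failures."""
    results, failures = {}, {}
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        for smiles, conformers, error in pool.map(safe_generate, inputs):
            if error is None:
                results[smiles] = conformers
            else:
                failures[smiles] = error
    return results, failures


if __name__ == "__main__":
    ok, bad = run_pool(["CCO", "", "c1ccccc1"])
    print("succeeded:", sorted(ok))
    print("failed:", bad)
```

Because the wrapper returns a plain tuple instead of raising, one bad molecule does not abort the rest of the map; the trade-off is that failures are only visible in the logs and in the returned failure dictionary.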

Clear for release and production run start?

David

  • DD: my next several hours are devoted to release, protocol coverage report update, issuing instructions to partners

    • could use help filling in the protocol bit for coverage report we want

    • circling back to assist partners

  • DH: think we’re good to go

  • JH: once you make that release, are you planning to run the public compounds as well?

    • DD: yes, this is the plan; do we want to use the exact same protocol with openff-benchmark?

    • JH: sounds good, yes I would use the same protocol as the partners are

Project risks

 

  • DH: the Schrodinger minimization tool starts up a Python script

    • should we use this directly?

    • DD: might be a bit fragile to import from that script and use components; I’m inclined to still wrap the executables, since this is the published UI for the tooling and is likely the more reliable pathway

  • DH: need to check if we can publish code that wraps Schrodinger CLI calls; asking Gary about license terms

    • also need to ask Gary about license terms around publishing benchmark results from OPLS
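
The "wrap the executables" approach discussed above could be sketched roughly as follows. This is only an illustration under stated assumptions: the real Schrodinger command line and its output format are not specified in these notes, so the "initial_energy:" / "final_energy:" line format, the run_minimizer name, and the FAKE_TOOL stand-in command are all hypothetical.

```python
import re
import subprocess
import sys

# Hypothetical output format: one "initial_energy: <float>" line and one
# "final_energy: <float>" line on stdout. The real tool will differ.
ENERGY_PATTERN = re.compile(
    r"^(initial|final)_energy:\s*(-?\d+(?:\.\d+)?)\s*$", re.MULTILINE
)


def run_minimizer(command, timeout=600):
    """Run an external command and extract energies from its stdout."""
    completed = subprocess.run(
        command, capture_output=True, text=True, timeout=timeout
    )
    if completed.returncode != 0:
        # Surface the tool's own stderr so failures are easy to diagnose.
        raise RuntimeError(
            f"minimizer failed ({completed.returncode}): "
            f"{completed.stderr.strip()}"
        )
    energies = {
        kind: float(value)
        for kind, value in ENERGY_PATTERN.findall(completed.stdout)
    }
    missing = {"initial", "final"} - energies.keys()
    if missing:
        raise ValueError(f"missing energies in output: {sorted(missing)}")
    return energies


# Stand-in for the real executable so the sketch runs anywhere.
FAKE_TOOL = [
    sys.executable,
    "-c",
    "print('initial_energy: -10.5'); print('final_energy: -12.25')",
]

if __name__ == "__main__":
    print(run_minimizer(FAKE_TOOL))
```

Keeping the external call behind one function with explicit checks on the return code and on the parsed output is one way to address the error-handling effort JW flagged; writing the final molecule out as an SDF would be a separate step and is not shown here.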

Action items

@David Hahn will work on implementing an OPLS3e optimization component that wraps Schrodinger tooling and consumes inputs and produces outputs the same way our existing openff-benchmark optimize execute entrypoint does
@David Dotson will cut the release and send an announcement to partners kicking off production benchmarking runs
@David Dotson will prepare the public datasets provided by each partner using the same openff-benchmark Season 1 protocol
@David Hahn will ask @Gary Tresadern for a proposed target date for benchmark results and for Schrodinger's license terms on tool wrapping

Decisions