2021-02-03 Benchmarking for Industry Partners - Development Meeting notes

Date

Feb 3, 2021

Participants

@David Dotson
@Jeffrey Wagner
@David Hahn
@Joshua Horton

Goals

Updates from project team members
Clear for release and production run start?
Identify and address project risks

Discussion topics

Item	Presenter	Notes

Updates		JW: last week, helped make the initial release for industry consumption for burn-in; we have both conda packages and single-file installers. The release process feels a bit circular: the environment file reflects `master`, so currently we update it with a version pin of `openff-benchmark` after release. Also did a bunch of cleanups; changed inter-conformer RMS; merged Josh’s coverage reporter PR (very impressive work!).
		DH: testing at Janssen with the burn-in dataset; switched to option C for optimizations, which gives fewer filesystem timeouts. It’s a workstation queue, so we have to be careful with how we use resources. Got some complaints about memory usage with option B; switching to option C has given no complaints so far. Additionally looking into an OPLS3e optimization protocol; might be a bit of work to integrate it into the workflow: run FF builder, then run minimization. Would be nice if we had some default params so it’s a fair comparison.
		JW: you might have to budget two solid weeks of your time for making sure errors are handled well, outputs are within expectation, etc.
		DD: can we call the Schrodinger tooling through `subprocess`, feed in the input structure, extract the initial energy, final energy, and final molecule from the outputs, then put out an SDF as we do for the primary optimization path?
		DH: should be able to do this.
		JH: finished up the coverage report; Jeff merged it.
		JW: we also have your PR for parallel conformer generation; tried to pick it back up, but the merge diff is too mean. Not sure how to do sane exception handling for the process pool approach.
		JH: the function being called by the process pool needs to catch all exceptions to work smoothly; have to rely on reporting via logs for errors.
		DD: I worked with Jeff to react to burn-in feedback. We have resolution on coverage report stratification: will do no stratification; partners can share more info at their discretion.
		JW: do we want to provide guidance on performance? If they put in something with many rotatable bonds, they could get a ton of conformers.
		JH: n^4 electron scaling.
		JW: perhaps for guidance: (relative molecule size)^4 * (relative number of molecules) * (10 conformers (upper bound)), where relative == (prod/burnin).
		JW: should we set a planned end date? Light a fire under slower partners, but not punish fast partners or ones really trying with hairy infrastructure.
		DH: will ask Gary what his expectations for completion date are; could help us to drive progress.
Clear for release and production run start?	David	DD: my next several hours are devoted to the release, the protocol coverage report update, and issuing instructions to partners. Could use help filling in the protocol bit for the coverage report we want; then circling back to assist partners.
		DH: think we’re good to go.
		JH: once you make that release, are you planning to run the public compounds as well?
		DD: yes, this is the plan; do we want to use the exact same protocol with `openff-benchmark`?
		JH: sounds good; yes, I would use the same protocol as the partners are using.
Project risks		DH: the Schrodinger minimization tool starts up a Python script; should we use this directly?
		DD: might be a bit fragile to import from that script and use its components; I’m inclined to still wrap the executables, since this is the published UI for the tooling and is likely the more reliable pathway.
		DH: need to check if we can publish code that wraps Schrodinger CLI calls; asking Gary about license terms. Also need to ask Gary about license terms around publishing benchmark results from OPLS.
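
JH's point under Updates, that the function handed to the process pool should catch every exception and report failures through logs, could look roughly like the sketch below. It is not the code from the parallel conformer generation PR; `generate_conformers`, `safe_generate`, `run_all`, and the molecule ids are hypothetical names used only for illustration.

```python
import logging
from concurrent.futures import ProcessPoolExecutor

logger = logging.getLogger("conformer_pool")


def generate_conformers(mol_id):
    """Hypothetical stand-in for the real per-molecule conformer generation."""
    if mol_id.startswith("bad"):
        raise ValueError(f"cannot process {mol_id}")
    return f"{mol_id}-conformers"


def safe_generate(mol_id):
    """Runs inside a worker process and never lets an exception escape."""
    try:
        return mol_id, generate_conformers(mol_id)
    except Exception:
        # Log the full traceback; the caller only sees None for this molecule.
        logger.exception("conformer generation failed for %s", mol_id)
        return mol_id, None


def run_all(mol_ids, max_workers=2):
    """Map the safe wrapper over all molecules; return results and failed ids."""
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        results = dict(pool.map(safe_generate, mol_ids))
    failed = sorted(mol_id for mol_id, value in results.items() if value is None)
    return results, failed
```

Calling `run_all(["mol-1", "bad-2"])` from under an `if __name__ == "__main__":` guard would be expected to report `bad-2` as failed while the other molecule completes. Depending on the process start method, logging may need to be configured inside each worker for the records to end up in the same place.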

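JW's rough performance guidance from the Updates discussion can be written out as a small helper. This is only a literal transcription of the rule of thumb in the notes, assuming "molecule size" means electron count (following JH's n^4 remark); the function and parameter names are hypothetical.

```python
def relative_cost(prod_size, burnin_size, prod_count, burnin_count, max_conformers=10):
    """Rule-of-thumb cost of a production set relative to the burn-in set.

    Sizes are typical molecule sizes (assumed here: electron counts),
    counts are numbers of molecules, and max_conformers is the upper
    bound of 10 conformers per molecule mentioned in the notes.
    """
    size_ratio = prod_size / burnin_size      # relative molecule size
    count_ratio = prod_count / burnin_count   # relative number of molecules
    return size_ratio ** 4 * count_ratio * max_conformers


# Example: molecules twice as large and twice as many as in burn-in.
print(relative_cost(prod_size=40, burnin_size=20, prod_count=200, burnin_count=100))
```

Because of the fourth power, molecule size dominates the estimate: doubling the size costs as much as sixteen times more molecules.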
Action items

@David Hahn will work on implementing an OPLS3e optimization component that wraps Schrodinger tooling and consumes inputs and produces outputs the same way our existing `openff-benchmark optimize execute` entrypoint does

@David Dotson will cut release, send announcement to partners kicking off production benchmarking runs

@David Dotson will prepare the public datasets provided by each partner using the same openff-benchmark Season 1 protocol

@David Hahn will get, from @Gary Tresadern, a proposed target date for benchmark results and the Schrodinger license terms on tool wrapping
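
The wrap-the-executables approach discussed above (call the external tool through `subprocess`, feed it an input structure, collect its output file) might start from a sketch like this. The actual Schrodinger command line is deliberately not shown: `command` is a placeholder to be filled in, `run_external_optimizer` is a hypothetical name, and parsing of initial and final energies is left out because it depends on the tool's output format.

```python
import subprocess
from pathlib import Path


def run_external_optimizer(command, input_path, output_path, timeout=3600):
    """Run an external optimizer executable and check that it wrote output.

    `command` is the executable plus its leading arguments as a list;
    the input and output paths are appended as the last two arguments
    (an assumption made for this sketch, not a real tool's interface).
    """
    result = subprocess.run(
        [*command, str(input_path), str(output_path)],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if result.returncode != 0:
        raise RuntimeError(
            f"optimizer failed ({result.returncode}): {result.stderr.strip()}"
        )
    if not Path(output_path).exists():
        raise RuntimeError("optimizer exited cleanly but wrote no output file")
    # Energies would be extracted from this text or from the output file.
    return result.stdout
```

Checking both the return code and the presence of the output file is one way to address JW's concern about making sure errors are handled well before results are converted to SDF.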
