2021-08-30 Benchmarking workshop prep meeting notes

Participants

  • @David Hahn

  • @Lorenzo D'Amore

  • @David Mobley

  • @Jeffrey Wagner

Goals

Discussion topics

Item

Notes

Summary

  • Notebook:

    • LD – Worked last week with DD, DH, DM to prepare the following notebook

    • LD – Identified a weak torsion in openff-1.3.0. This notebook shows how to use the public dataset to find the bad torsion by counting violations.

    • LD – Two fill-in-the-blank exercises for participants. Will give them a 5-minute break for each.

  • Schedule

    • Slides (< 1 hour)

      • LD – Short summary

      • DH – Protein-ligand benchmarking (up to 25 minutes, including some time for feedback)

      • LD – Small molecule benchmarking (up to 25 minutes, including some time for feedback)

    • Live notebook (rest of time)

      • LD – Live notebook

  • LD – Slides are 90% done. Preview: https://docs.google.com/presentation/u/1/d/1Vdda7wZubSVAra3Gum9X6ih_MUNyHqtwKzLaVM0UPiY/edit#slide=id.p1

  • DH – I’m more like 80%. I’m planning to focus on the differences between OpenFF versions.

  • LD – Had a question for SB - The 4th workshop results plot was re-scaled to be more slide-friendly. Is this code available?

    • DM + JW – This was done manually by Simon. Not sure whether the code is available; should wait to hear from SB.

    • DM – Could make “slide friendly plots” into a new option in openff-benchmark

  • JW and DM will review LD (and possibly DH) slides today/tomorrow

  • DH – Important things to mention about P-L benchmarking during the workshop?

    • JW – People may ask how the Open Free Energy consortium will change how we do benchmarking. We should say “OpenFE doesn’t exist yet, and their software won’t be available for 6+ months. So we’ll coordinate with them to ensure that we meet each other’s needs.”

    • DH – A lot of people in pharma are wondering how to get free energy calcs running using OpenFF force fields. Right now there’s no open way; they have to use Cresset or OE. Eventually OpenFE will offer free solutions.

    • JW – People may ask how to modify your code to run on their data. How should we communicate the risks/difficulties of this?

      • DM – Also, we should mention that your workflow expects “prepared” files, and that there’s not a real “open” solution for that. May be good to prepare a hidden slide for steps in the process that are missing “industry quality” open software. This could guide what people ask for from OpenFE.

    • JW – Future plans? Larger or more diverse dataset?

      • DH – This could be a good discussion topic. Even in the existing datasets, there are some perturbations that are “bad” for various reasons.

        • JW + DM – This could be a great slide topic: “What is good benchmarking data and what is bad benchmarking data?” It could guide future dataset releases.

  • DH – Would be interested to hear about other benchmarks people would want to run, whether it’s small molecule or solution. And what other tools/methods they find useful.

    • (General) – Could show a slide of “here are future ideas for benchmarking from what you’ve asked before. Which of these is most important? Is there anything missing? Would anyone want to start one of these studies?”
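
Sketch for the notebook item above (finding a bad torsion by counting violations). This is a rough illustration of the idea only, not the notebook’s actual code. The record layout, the parameter ids (tA, tB, tC), the RMSD values and the cutoff are all made-up assumptions, and nothing here calls the OpenFF tooling or reads the public dataset.

```python
from collections import Counter

# Hypothetical records: (molecule id, torsion parameter ids applied, RMSD vs. QM).
# All values are invented for illustration; they are NOT from the public dataset.
records = [
    ("mol-001", {"tA", "tB"}, 0.21),
    ("mol-002", {"tB", "tC"}, 1.35),
    ("mol-003", {"tC"}, 1.10),
    ("mol-004", {"tA", "tC"}, 0.95),
    ("mol-005", {"tA"}, 0.18),
]

# Assumed threshold: a molecule with RMSD above this counts as a violation.
RMSD_CUTOFF = 0.9


def count_violations(records, cutoff):
    """Return {parameter id: (violating molecules, molecules using it)}."""
    violations = Counter()
    uses = Counter()
    for _mol_id, params, rmsd in records:
        for param in params:
            uses[param] += 1
            if rmsd > cutoff:
                violations[param] += 1
    return {param: (violations[param], uses[param]) for param in uses}


def rank_by_violation_fraction(counts):
    """Sort parameter ids by fraction of violating molecules, worst first."""
    return sorted(counts, key=lambda p: counts[p][0] / counts[p][1], reverse=True)


counts = count_violations(records, RMSD_CUTOFF)
ranking = rank_by_violation_fraction(counts)
for param in ranking:
    bad, total = counts[param]
    print(f"{param}: {bad}/{total} molecules over cutoff")
```

Ranking by fraction rather than raw count keeps a parameter that appears in many molecules from looking bad only because it is common.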

DH slide prep

  • JW – Dataset names are a bit confusing, could paste over them on the slides with something standard.

  • JW – Let’s be sure to say that all the data is preliminary, since it’s new analysis.

Technical needs

  • JW will make a release of the repo and attach single-file installers built from the conda env.

Action items

Decisions