2020-11-09 Sprint Planning Meeting notes

Date

Nov 9, 2020

Participants

  • @Jeffrey Wagner

  • @David Hahn

  • @Matt Thompson

  • @David Dotson

  • @Pavan Behara

Discussion topics

Item

Notes

 

DH

  • Attended and presented at a German cheminformatics conference. Need to analyze Vytas’s data.

  • Pharma benchmarking call.

  • Working on writing benchmarking paper.

MT

  • Wrapped up/finished lots of PRs/issues.

  • Units meeting – Figuring out goalposts and which package meets our needs

    • (General) – How can we convince VU to adopt Pint? What’s the cost of spending time/energy on this? What are the best and worst outcomes?

DD

JW

  • Worked on fixing Zenodo authorship – Webhooks are still tied to Chodera. Got login for info@openff and will move Zenodo hooks over that way.

  • Contacted conda-forge folks on gitter. Probably made us look like clowns.

  • Played with the conformer generation CLI. Working on improving performance re: stereo and partial charges, and on output deduplication.

  • Some back-and-forth on issues and PRs.

  • Started Toolkit Showcase example PR.

  • TG discussion – Directions for authorship. Need to talk with him and Mobley about how to get on papers.

    • DD – Spoke with him a few weeks ago about this. Agree that we should find a paper-destined direction for his efforts.

PB

  • Worked on WBOs and fitting. Some progress on separating trends based on chemical series. Iterating on some sets – will do Trevor’s next (currently working on Lim+Mobley mols)

  • Worked on preparing some datasets involving protomer/tautomer enumeration. JH is thinking about the Ligand Expo set.

  • Talked with Hyesu about getting involved in QM benchmark, expect to hear from her this week.

  • Q: How to filter benchmark molecules? Using RDKit fingerprint. I’d like to exclude benchmarking molecules from my testing set.

    • JW – Would recommend doing graph isomorphism checks. Should not compare bond order, formal charge, or stereochemistry.
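A minimal sketch of the kind of check JW describes, assuming molecules are reduced to element symbols plus connectivity. The helper names (`make_graph`, `are_isomorphic`, `exclude_benchmark`) are hypothetical and use only the Python standard library. Hydrogens are left out of the toy molecules for brevity, so a real comparison should include them or rely on a cheminformatics toolkit's own isomorphism routine. Bond order, formal charge, and stereochemistry are never stored, so they cannot affect the match.

```python
from collections import Counter


def make_graph(elements, bonds):
    """Return (elements, adjacency) built from element symbols and (i, j) bond pairs.

    Bond order, formal charge, and stereochemistry are deliberately not recorded.
    """
    adjacency = [set() for _ in elements]
    for i, j in bonds:
        adjacency[i].add(j)
        adjacency[j].add(i)
    return list(elements), adjacency


def _signature(graph):
    """Count atoms by (element, number of neighbours); a cheap pre-filter."""
    elements, adjacency = graph
    return Counter((elements[i], len(adjacency[i])) for i in range(len(elements)))


def are_isomorphic(graph_a, graph_b):
    """Backtracking search for an element-preserving, bond-preserving atom mapping."""
    if _signature(graph_a) != _signature(graph_b):
        return False
    elems_a, adj_a = graph_a
    elems_b, adj_b = graph_b
    n = len(elems_a)
    mapping = {}  # atom index in A -> atom index in B
    used = set()  # atom indices in B that are already assigned

    def extend(i):
        if i == n:
            return True
        for j in range(n):
            if j in used:
                continue
            if elems_a[i] != elems_b[j] or len(adj_a[i]) != len(adj_b[j]):
                continue
            # Every already-mapped neighbour of i must land on a neighbour of j.
            if all(mapping[k] in adj_b[j] for k in adj_a[i] if k in mapping):
                mapping[i] = j
                used.add(j)
                if extend(i + 1):
                    return True
                del mapping[i]
                used.discard(j)
        return False

    return extend(0)


def exclude_benchmark(candidates, benchmark):
    """Keep only candidates that match no benchmark molecule."""
    return {
        name: graph
        for name, graph in candidates.items()
        if not any(are_isomorphic(graph, ref) for ref in benchmark.values())
    }


# Toy heavy-atom graphs (illustrative only).
ethanol = make_graph(["C", "C", "O"], [(0, 1), (1, 2)])
ethanol_reordered = make_graph(["O", "C", "C"], [(0, 1), (1, 2)])
dimethyl_ether = make_graph(["C", "O", "C"], [(0, 1), (1, 2)])

benchmark = {"ethanol": ethanol}
candidates = {"ethanol_reordered": ethanol_reordered, "dimethyl_ether": dimethyl_ether}
kept = exclude_benchmark(candidates, benchmark)
```

Unlike a fingerprint similarity cutoff, this drops a molecule only when its connectivity matches a benchmark molecule exactly, regardless of atom ordering.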

 

DH – Thinking about PLBenchmark repo. Looked at doing a release, but I find it overwhelming to keep track of the data. Should it go in GH release assets? Git tree? Zenodo assets?

  • DH – Would like to be able to track the dataset changes between releases

  • DH – Currently datasets are in the 100 MB to 1 GB range

  • DD – Would be good to make a policy document on this.

  • (General) – Goals:

    • Stable link/URL

    • Track changes – Not necessarily line by line, but see when a file changes or is added/deleted

    • Unlimited storage

    • Explorable in browser

  • (General) – Options

    • GitHub release assets

      • Does get a stable URL

      • Doesn’t track changes

    • Zenodo datasets

      • Does get a stable URL

      • Doesn’t track changes

    • GitHub git tree

      • Does track changes

      • Explorable in browser

      • 100GB storage

      • Stable-ish link

      • Could compress individual files, so diffs would still indicate when a file changed/was added/removed.

    • Git LFS?

      • MT – Looks like it compresses the contents and just displays hashes

    • Google drive link
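One possible way to meet the "track changes" goal on any of these hosts is to ship a checksum manifest with each release and compare manifests between releases. The sketch below is only an illustration, not something the group decided on. The function names (`sha256_of`, `build_manifest`, `diff_manifests`) are hypothetical and use only the Python standard library.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so that GB-sized datasets are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path relative to root (POSIX style) to its SHA-256 digest."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def diff_manifests(old, new):
    """Report which files were added, removed, or changed between two manifests."""
    return {
        "added": sorted(set(new) - set(old)),
        "removed": sorted(set(old) - set(new)),
        "changed": sorted(k for k in set(old) & set(new) if old[k] != new[k]),
    }
```

Because only digests are compared, this works even when the individual files are compressed or stored outside the git tree. It shows that a file changed, not what changed inside it.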

Sprint planning

Begin at 8:45 Pacific

Action items

Decisions