2020-10-19 Developer Coffee Meeting notes

Date

Oct 19, 2020

Participants

  • @Jeffrey Wagner

  • @David Hahn

  • @Pavan Behara

  • @David Dotson

  • @Matt Thompson

Discussion topics

Notes


David Hahn

  • Merged big PR update to master. Thanks to MT for the review. Will cut initial release shortly.

  • Continued data analysis (PLBenchmarks repo). Some notebooks are still broken.

  • Met with DD and JH (pharma partners didn’t join).

  • Janssen asked me to test the ReScale compute platform, a GUI-based job dispatcher that can control multiple clusters, for running QCFractal jobs.

Matt Thompson

  • Phase 1: Merged a ton of toolkit PRs. Also canary tests and openff-cli.

    • Thinking about process for testing + releasing conformer generation CLI. I tried it out earlier this week and found some pain points.

    • JW – Was thinking of announcing this at Wednesday advisory board meeting, but was also cautious about it not being tested.

    • (General) – We should recruit more testers for the script.

    • DH – It wouldn’t be horrible if it returned a bunch of redundant conformers, but it would be preferable if it didn’t. Documentation/being clear upfront could let us do either.

    • (General) – We could do a pruning at the end.

    • DD – Could expose max_conformers and rms_cutoff, and use those values both for the initial conformer generation and for the final deduplication.

    • DH – I recall that RDKit conformer generation would sometimes hang indefinitely if I asked for too many conformers (e.g., 100 conformers of methane). We should add logic to prevent this.

      • We will look for a reproducing case of this and open a bug report on OFFTK.

    • MT:

    • JW/DD – API stability of openff-cli

      • JW – We want it to be very stable since the versions aren’t readily apparent

      • DD – Since it’s a new package, we can’t guarantee stability, and it would be overly cumbersome to try and lock down the API immediately.

    • (General) – Do we want the benchmarking workflow to use the conformer generation script/machinery?

      • DD/DH – Will discuss further. We’ll want to make sure that it uses something similar “enough” to the Lim work.

  • Phase 2: Updating CMILES

    • MT – Looked into updating this. Not sure which code paths need to be maintained. Some RDKit tests failed due to version update. The 2020 RDKit release generated different CMILES than the old version. There was a script to update this.

      • One molecule, PO_2, was identified as having stereochemistry.

      • Not sure how to handle this sort of change in the future

      • JW – The only thing we could do to control this behavior is to pin CMILES to RDKit 2019 until the end of time. Which is such a bad solution that it’s probably not worth pursuing.

  • Phase 3: System/Interoperability

    • MT – Lots of meetings, writing, thinking about stuff. I don’t think we’ve yet figured out our to-dos.

    • MT – Not sure what the final goal is. Sometimes we talk about using a unified internal representation. Other times we talk about making a central converter/API for all of computational chemistry. Our org works differently enough from other orgs that we should be leading the charge, since we have experience making production code/packages. Not sure that the formality of our structure can work well with academic groups.

    • DD – Agree that engineering orgs and academic orgs have some trouble working together due to pacing difficulties.

    • MT – Cautious about depending on another team for timely deliverables when neither has formal power over, or accountability toward, the other.

    • JW – I think a lot more discussion is justified, before we commit to any solution

    • MT – I think a shared internal representation is not fully feasible. I’d advocate for US building something great, that’s easy for other people to plug into. I’d advocate for US building up the OpenFF system, and letting Vanderbilt make output in that format. We can still use this as a concrete discussion point in subsequent discussions/meetings.

    • DD – How much do project goals between us and Vanderbilt overlap?

    • MT – At a high level, we agree on the contents. We disagree on some internal structure, like separation of physics and chemical topologies.
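
Illustration for the Phase 1 conformer discussion above (final pruning using max_conformers and rms_cutoff): a minimal sketch of what a greedy deduplication pass could look like. This is not the openff-cli implementation. The function names `rmsd` and `prune_conformers` are hypothetical, conformers are assumed to be plain lists of (x, y, z) tuples with matching atom order, and the RMSD here is a simple coordinate RMSD with no alignment or symmetry handling, which a real tool would likely need.

```python
import math


def rmsd(conf_a, conf_b):
    """Plain coordinate RMSD between two conformers (no alignment; assumption)."""
    if len(conf_a) != len(conf_b):
        raise ValueError("conformers must have the same number of atoms")
    squared = sum(
        (a - b) ** 2
        for atom_a, atom_b in zip(conf_a, conf_b)
        for a, b in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(conf_a))


def prune_conformers(conformers, max_conformers, rms_cutoff):
    """Greedily keep conformers that are at least rms_cutoff from all kept ones.

    Stops once max_conformers have been kept. Both parameter names follow
    the meeting notes; the function itself is a hypothetical sketch.
    """
    kept = []
    for conf in conformers:
        if len(kept) >= max_conformers:
            break
        if all(rmsd(conf, other) >= rms_cutoff for other in kept):
            kept.append(conf)
    return kept


# Toy two-atom "molecule": the second conformer nearly duplicates the first.
confs = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.01, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    [(0.0, 0.0, 0.0), (0.0, 0.0, 1.0)],
]
pruned = prune_conformers(confs, max_conformers=2, rms_cutoff=0.1)
```

Here the near-duplicate second conformer is dropped and the pass stops after two conformers are kept. Capping the kept count up front is also one simple way to bound the work when a user requests far more conformers than a small molecule can have.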

David Dotson

Jeffrey Wagner

  • Finished up N-1 CIMH.

  • Interoperability meetings/prep.

  • May have family stuff this week/half-day disruption.

  • Meeting with Anaconda CEO later today

  • Beginning planning for object refactor. Starting with “Slots” description.

  • Preparing to find+contract technical writer-ish person

  • JW will get Galileo one-pager

 

Pavan Behara

  • Worked with JM on WBO torsions analysis.

    • Encountered a round-tripping/path-dependence issue when calculating WBOs for molecules loaded directly into OE vs. through the OFF toolkit.

  • Worked with DD on a few things.

  • Didn’t do a ton of molecule representation/reliability work.

    • JW – this is fine, it’s strictly less important than scientific goals

    • PB – yes, but did encounter some format-related issues in the science work.

 

  • 0.8.0 release

    • JW – Will cut today. Any objections?

 



Action items

Decisions