2020-08-14 QCA Submission Meeting notes

Date

Aug 14, 2020

Participants

  • @David Dotson

  • @Trevor Gokey

  • @Joshua Horton

Goals

  • Plan of attack for MM calculation failures

  • Deployment instructions for OpenFF managers on qca-dataset-submission

    • conda config --set channel_priority cannot be “strict” for the anaconda hosted spec (must be “flexible”)

  • Lifecycle handling for compute.json.

  • New datasets

    • Disaccharide submission

    • Protein Fragments Dataset V2

    • Add additional compute to OpenFF-Benchmark-Ligands

  • QCSubmit: implement a top-level qcsubmit.submit("dataset.json", client=None)?

    • submit("dataset.json", "spec-default.json", client=None)?

    • submit("dataset.json", "spec-openff-1.0.0.json", client=None)?

    • MM specs are inserting/using psi4 options

    • Using unconstrained openff versions / prevent constrained?

      • geomeTRIC does not respect system.xml <Constraint> elements (natively)

  • GHA restarting not working on non-default specs?
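
As a rough illustration of the constrained/unconstrained point above, the sketch below checks whether a serialized system.xml carries any <Constraint> elements before it is handed to an optimizer that would not apply them. It assumes the file is an OpenMM-style XML serialization; the function name `count_constraints` and the sample XML are hypothetical and for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample resembling a serialized system; contents are illustrative only.
SAMPLE_SYSTEM_XML = """\
<System>
  <Constraints>
    <Constraint d="0.09" p1="0" p2="1"/>
  </Constraints>
</System>
"""


def count_constraints(system_xml_text):
    """Return the number of <Constraint> elements found in the XML text."""
    root = ET.fromstring(system_xml_text)
    # iter() matches the exact tag name, so the <Constraints> wrapper is not counted
    return sum(1 for _ in root.iter("Constraint"))


if __name__ == "__main__":
    n = count_constraints(SAMPLE_SYSTEM_XML)
    if n:
        print(f"system.xml contains {n} constraint(s); an unconstrained force field may be needed")
```

A check like this could run at submission time to flag constrained systems, though where it belongs in the workflow was not decided in this meeting.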

Discussion topics

Item

Presenter

Notes

QCA conda deployment

 

  • Pinning MM dependencies

  • Create a separate prod env on the basis of harness we are running to avoid conda dependency conflicts between programs

    • OpenMM (off-tk and friends)

    • Psi4 (no off-tk?)

    • ANI (no off-tk?)

  • Also, can we put a manager config somewhere (with creds substitutable)?

    • Perhaps write a script that clears old env, installs new env, pulls manager template, injects credentials?
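
The credential-injection step of such a script might look like the minimal sketch below. It assumes the manager template uses `${NAME}` placeholders filled from environment variables; the template fields, the variable names (`QCA_URI`, `QCA_USER`, `QCA_PASSWORD`), and the function `render_manager_config` are hypothetical and not taken from any existing config.

```python
import os
from string import Template

# Hypothetical manager config template; field and variable names are illustrative only.
MANAGER_TEMPLATE = """\
server:
  fractal_uri: ${QCA_URI}
  username: ${QCA_USER}
  password: ${QCA_PASSWORD}
"""


def render_manager_config(template_text, env=None):
    """Fill ${NAME} placeholders from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    # substitute() raises KeyError when a placeholder has no value,
    # so a missing credential fails loudly instead of producing a broken config
    return Template(template_text).substitute(env)


if __name__ == "__main__":
    example_env = {"QCA_URI": "example.org:443", "QCA_USER": "user", "QCA_PASSWORD": "secret"}
    print(render_manager_config(MANAGER_TEMPLATE, example_env))
```

Keeping the template in the repository and the credentials in the environment would let the same file be shared across resources without committing secrets.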

Lifecycle handling for compute.json

Josh

  • should we decouple compute from molecule submission?

    • what’s the convention we could use?

    • would give us flexibility on adding compute after a submission

  • David: given the usage pattern we want for submission, we then need to think about how to make automation handle this deterministically.

    • will report back on #131 proposed path forward for automation support

GHA restarting

Trevor

  • Restarts perhaps not working for TorsionDrives based on #119; would expect to see ERRORs turn back into INCOMPLETEs, but we may need to wait until the queue is less full of running TDs

  • David: error cycling last night failed because the comments were too big. Added truncation when error tracebacks exceed 60,000 chars collectively
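
One way such truncation could work is sketched below. This is not the code that was added to the automation. It assumes each traceback gets an equal share of the 60,000-character budget and that the tail of a traceback is the part worth keeping; the function name `truncate_tracebacks` and the marker text are hypothetical.

```python
def truncate_tracebacks(tracebacks, max_total=60000, marker="[...truncated...]\n"):
    """Trim tracebacks so their combined length is at most max_total characters."""
    if sum(len(tb) for tb in tracebacks) <= max_total:
        return list(tracebacks)

    # Assumption: split the budget evenly across all tracebacks
    budget = max_total // len(tracebacks)
    result = []
    for tb in tracebacks:
        if len(tb) <= budget:
            result.append(tb)
        elif budget > len(marker):
            # Keep the tail, where the raised exception usually appears
            keep = budget - len(marker)
            result.append(marker + tb[len(tb) - keep:])
        else:
            # Budget too small for the marker; keep only the last characters
            result.append(tb[len(tb) - budget:])
    return result


if __name__ == "__main__":
    sample = ["a" * 50000, "b" * 50000, "short"]
    trimmed = truncate_tracebacks(sample)
    print(sum(len(tb) for tb in trimmed) <= 60000)
```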

New datasets

David

  • Disaccharide submission

    • Josh: need Cerutti to add molecules

    • David: I will ping

  • Protein Fragments Dataset V2

    • Josh: need Cerutti to answer clarification question; then ready to go

    • David: I will ping

  • Add additional compute to OpenFF-Benchmark-Ligands

    • David: will chew on this approach of decoupling the molecule submission from the compute; need to make sure we can automate what we decide on doing

    • David: will report back on that PR with a proposed approach

QCSubmit questions

Trevor

  • QCSubmit might benefit from some minimal usage docs and what programs, basis, methods it supports for specs

    • This is not necessarily straightforward, since what QCSubmit supports depends on what the tools it calls can support

Action items

@David Dotson will submit a PR including latest qcarchive-worker-openff.yaml changes. Will include a separate env file for each QCEngine harness we use.
@David Dotson will create a PR for manager configs we are using on various resources. Will include instructions for straightforward credential injection.
@David Dotson will write up proposed path forward on #119 given automation concerns.
@Joshua Horton will consider an approach for minimal docs on QCSubmit submission usage.
@David Dotson will ping @David Cerutti (Deactivated) for remaining action items on #124 and #129.

Decisions