2020-07-10 QCArchive Submission Meeting notes

Date

Jul 10, 2020

Participants

  • @David Dotson

  • @Trevor Gokey

  • @Joshua Horton

  • Ben Pritchard

Goals

  • Submit any ready datasets in the PR queue for qca-dataset-submission

  • Address any issues with datasets not ready for submission

  • Discuss and establish paths forward on any issues in our process

Discussion topics

Item

Presenter

Notes

Rowley Biaryl torsiondrive set status

Josh, Trevor

  • Now have a conda package for QCSubmit

    • not sure how to add basis-set-exchange, which is only a pip-installable dependency

  • Moved all validators to run when you load up the dataset

  • Working on CI that will run validators on PR

  • More datasets coming through from Dave Cerutti

    • not the same kinds of things we’ve gotten before

    • weird constrained optimizations

    • will be quite a few of these coming through

  • @David Dotson and @Joshua Horton will meet up to get validation CI up and running today or next week

Remediation pathway for INCOMPLETE results

Ben

  • Site review went well; however, Ben will be less available over the next three weeks

  • The issue with old jobs never completing has been fixed; it had multiple causes

    • psi4 issue - fixed

    • restart of old jobs didn’t handle the schema properly; a job would never complete and would perhaps rerun indefinitely - fixed (in the QCFractal server)

Root cause analysis for other database consistency issues

Ben

  • database inconsistency where there’s an incomplete result but no associated task - requires a root cause fix and a remediation on the public database

Psi4 release?

Ben

  • no new release, but the psi4 fix for hanging file descriptors is in a dev release

    • Ben is checking with Lori as to whether we can get a new release or, if not, whether there is any substantial difference between the release and dev versions

Error cycling and lifecycle automation

David

  • Presented qca-dataset-submission automation

  • Trevor: grid optimization dataset may throw a wrench in things

    • like a lightweight torsiondrive

  • Trevor: won’t want to do restarts on SCF convergence errors

    • we should add error handling so that these are not restarted, and call them out in the error cycle report

  • Ben will look into creating an openff-bot account on QCArchive for automated submission

  • @David Dotson will fill in PRs for legacy datasets
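
A minimal sketch of the restart handling Trevor suggested, assuming errors are available as a plain mapping of record id to error message. The function names, the record ids, and the message matching rule are hypothetical illustrations, not part of QCFractal or the qca-dataset-submission automation.

```python
from typing import Dict, List, Tuple


def is_scf_convergence_error(message: str) -> bool:
    # Assumed heuristic: treat any message mentioning both "SCF" and
    # "converge" as an SCF convergence failure.
    text = message.lower()
    return "scf" in text and "converge" in text


def triage_errors(errors: Dict[str, str]) -> Tuple[List[str], List[str]]:
    """Split errored records into (restart, report_only) lists of record ids."""
    restart: List[str] = []
    report_only: List[str] = []
    for record_id, message in sorted(errors.items()):
        if is_scf_convergence_error(message):
            # A restart will not help; list it in the error cycle report instead.
            report_only.append(record_id)
        else:
            restart.append(record_id)
    return restart, report_only


# Illustrative input only; the ids and messages are made up.
example_errors = {
    "rec-1": "Could not converge SCF iterations in 100 iterations.",
    "rec-2": "Worker process died unexpectedly",
}
restart_ids, report_ids = triage_errors(example_errors)
```

Keeping the classification in a single predicate means other non-restartable error types can be added later without touching the triage loop.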

Protein dataset

 

@Trevor Gokey will work with @Joshua Horton to build out validators for the Initial Protein dataset v1.0

Action items

@Joshua Horton will work to add a validation.yml GitHub Actions workflow to qca-dataset-submission. @David Dotson will assist next week.
@Trevor Gokey will work with @Joshua Horton to build out validators for the Initial Protein dataset v1.0 #109.
Ben Pritchard will add dataset submission accounts for @Trevor Gokey and openff-bot to the public QCArchive.
Ben Pritchard will ask Lori Burns if a new psi4 release can be made (last one is a year old), or if a dev release is a safe bet (though not preferred)
Ben Pritchard will continue to root cause and develop remediation pathway for INCOMPLETE tasks
Ben Pritchard will push toward a new QCFractal release
Ben Pritchard will work to add a conda package for basis-set-exchange (needed for QCSubmit)
@David Dotson will work with Ben on clearing the above tasks

Decisions