2020-12-04 QCA Submission Meeting notes

Date

Dec 4, 2020

Participants

  • @David Dotson

  • @Pavan Behara

  • @Trevor Gokey

  • @Joshua Horton

  • Ben Pritchard

Goals

  • New advancements

    • Partial submission handling in QCSubmit:

    • Local optimization debug script added:

  • New submissions

  • Compute bottlenecks

    • Pursuing more compute from AWS Spot.

  • Upcoming infrastructure improvements

    • STANDARDS-based versioning #137

    • Dataset index on qca-dataset submission #147

    • Do we need to change the index system we use for molecule submissions?

    • More targeted error cycling; what else do we need in the report for decision-making?

  • Upcoming science support

    • Enforced c1 symmetry in psi4 is almost ready

    • Visualizing the density in QCArchive

  • Larger advances

    • Automated FF coverage gap identification, torsion prioritization, submission generation

    • Benchmarking (dashboard, etc.)

Discussion topics

Item

Presenter

Notes

Submission handling for slow compute spec adding

David

  • JH: will add thread-pool submission of compute specs internally to QCSubmit
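A minimal sketch of what thread-pool submission of compute specs could look like, using only the Python standard library. `submit_spec` is a hypothetical stand-in for the slow, network-bound call that adds one compute spec; it is not a QCSubmit function, and the real implementation may differ.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def submit_spec(spec_name: str) -> str:
    """Hypothetical stand-in for a slow, network-bound 'add compute spec' call."""
    return f"submitted:{spec_name}"


def submit_all(spec_names, max_workers=4):
    """Submit every spec concurrently; collect results and failures by spec name."""
    results, failures = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to the spec it belongs to.
        futures = {pool.submit(submit_spec, name): name for name in spec_names}
        for future in as_completed(futures):
            name = futures[future]
            try:
                results[name] = future.result()
            except Exception as exc:
                # Keep going so one failed spec does not block the others.
                failures[name] = exc
    return results, failures
```

Threads fit here because the work is dominated by waiting on the server rather than by local computation.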

Local optimizations

Trevor

  • TG: added local optimization script as a PR to QCSubmit

  • DD: I’ll review! This is an important debug pathway for users we’ll want to support

UCI compute

Trevor

  • Need to switch to launching one manager per job; can’t run the manager on the head node (too much memory use, admins don’t like it)

  • HPC3 now has an openforcefield user that uses a guest QOS (pre-empt with core limit)

New datasets

 

  • DD: will review Genentech dataset

  • PB: Protomers/tautomers reviewed, ready for merge

  • PB: working on another Genentech set; getting an error I’m not sure about

    • In putting together the submission, separating out molecules with more than 3 rotors for fragmentation; is this a reasonable approach here?

    • JH: Yes, that should work well; there’s also a module/filter for this in QCSubmit

      • might not have all the nuance needed; makes sense to create an issue on QCSubmit

    • PB: Silicon Therapeutics dataset, doesn’t have CMILES

      • JH: Will need to use OpenEye locally to generate; predates existing process

  • JH: ANI1x, not really any demand at this time, and so may not make sense to submit

    • DD: agreed; we’ll let it sit, can always submit later if demand resurges

  • TG: Hessians/wfn; Tobias wants to do chemical perception stuff, so there is demand for this now

    • Does QCSubmit do Hessians?

      • JH: Results class in QCSubmit that can pull out existing dataset, make a new one, add Hessian spec, make the submission; probably will take a while

      • JH: Will point you to a test that does something like this, can give you assistance in preparing

    • JH: How much of the wavefunction does Tobias need?

      • TG: He says he needs the full wavefunction, but we’ll see if what we can give him is enough to work with.

      • JH: I think the other bits that were broken in psi4 are fixed; can now get everything instead of just “orbitals and eigenvalues”; may be worth waiting a little bit and trying to get “all”

      • TG: This is exploratory, so willing to go with what we have now? But definitely interested if we can get “all” working

        • Just need to see if there’s a release since then; do local testing

  • DD: Compute expansion to PEPCONF wB97X/6-31G*

    • TG: Will be expensive, may want to route to its own compute tag

      • DD: We’ll want this run at “normal” priority, and consider a dedicated compute tag for John to control

  • BP: Can stand up a manager on local VT clusters, use the scavenger queue (pre-emptible, low priority); could add the openff compute tag at lower priority than the tags MolSSI uses for its own jobs

    • Can give manager multiple compute tags; listed in order of priority

    • Opens up possibilities for advanced routing of tasks
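The priority ordering of compute tags described above can be illustrated with a short sketch. This only shows the ordering idea; the task dictionaries, the tag names, and `next_task` are hypothetical and are not how QCFractal implements task claiming.

```python
def next_task(queue, manager_tags):
    """Return the first waiting task for the highest-priority tag that has work.

    `manager_tags` is listed in order of priority, highest first.
    Returns None when no task in the queue matches any of the manager's tags.
    """
    for tag in manager_tags:
        for task in queue:
            if task["tag"] == tag:
                return task
    return None


# Hypothetical queue: one openff task and one MolSSI task are waiting.
queue = [{"id": 1, "tag": "openff"}, {"id": 2, "tag": "molssi"}]

# A manager with MolSSI tags first only picks up openff work when no MolSSI work is left.
picked = next_task(queue, ["molssi", "openff"])
```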

STANDARDSv3

Trevor

  • Seeded the approach, got a lot of good feedback, but the open points now need to be resolved

    • Will review feedback, resolve points of contention

    • DD: we’ll hold final discussion and a laying on of hands at next week’s call

  • PB: how much work will this put on submitters after adoption?

    • TG: the standards largely capture practices we are already doing or want to do; the intention is that these standards define the desired behavior of e.g. QCSubmit, so that users simply need to use QCSubmit to comply with most of the STANDARDS

  • TG: Do we adopt then implement? Or implement then adopt?

    • JH: I think once it’s adopted, can move quickly to ensure it has clear implementation

    • [decision] we’ll adopt, then implement quickly after

Error cycling nuance

David

  • DD: What kind of nuance should we add to error cycling? Limited number of retries? Certain error types we don’t restart?

    • JH: Sounds like limited number of retries makes sense to handle at the manager or server level

    • TG: We are seeing retries for task result sends in the manager; could use a similar mechanism for this

    • DD: will create an issue for retries at the manager level; wouldn’t require task specs to know anything about retries; could eliminate need for much error cycling

  • TG: Can we use psi4’s native optimizer instead of geomeTRIC?

    • BP: not yet, but this is aspirational

    • Not yet possible to slot in other optimizers, and may not be for a while
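The retry idea discussed above (a limited number of retries, with some error types never restarted) could be sketched as follows. `NonRetryableError`, `run_with_retries`, and the `max_retries` default are hypothetical names chosen for illustration; nothing here reflects an existing manager or server setting.

```python
class NonRetryableError(Exception):
    """Hypothetical marker for errors that should never be restarted."""


def run_with_retries(task, max_retries=3):
    """Run `task`, retrying on failure up to `max_retries` extra times.

    Returns (result, attempts). Non-retryable errors are raised immediately;
    other errors are raised once the retry budget is used up.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            return task(), attempts
        except NonRetryableError:
            raise
        except Exception:
            if attempts > max_retries:
                raise
```

Because the retry logic wraps the task from the outside, the task spec itself does not need to know anything about retries, which matches the point DD raised.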

Enforced c1 symmetry

Josh

  • This is only an issue for old datasets, since we handle specifying this on submission for new ones

    • This looks like it will be merged really soon!

Wavefunction visualizer

Josh

  • fortecubeview?

  • DD: could make a place for this in the QCFractal user interface; would be worth experimenting with this and perhaps other visualizer candidates

  • JH: Tried giving it a play and loading the wavefunction data; did hit some segfaults, though

Action items

@Joshua Horton will add thread-pool submission of compute specs internally to QCSubmit
@David Dotson will review Trevor’s local optimization executor (PR on QCSubmit) for user debugging
@Trevor Gokey will switch to launching one manager per job, similar to Lilac, PRP, on UCI compute resources
@David Dotson will review Pavan’s Genentech dataset submission
@Joshua Horton will point Trevor Gokey to a test/example that pulls out an existing dataset as a Results class, makes a new one with Hessian spec ready for submission
@David Dotson will submit PEPCONF wB97X/6-31G*; coordinate with John on execution and conclusions
Ben Pritchard will stand up a manager on local VT clusters, use the scavenger queue, add in openff compute tag below tags used by MolSSI
@Trevor Gokey will address/resolve feedback on STANDARDSv3; prepare for final discussion and adoption on 12/11
@David Dotson will create an issue for task retries at the manager level; make configurable in manager config; assess changes to error cycling given in-manager retries
@David Dotson will create an issue on QCFractal for visualizer candidates as part of QCPortal; assess fortecubeview

Decisions

  1. After adoption of STANDARDSv3, we will move quickly to implement in QCSubmit, deploy in validation/lifecycle on qca-dataset-submission