2020-11-20 QCA Submission Meeting notes

Date

Nov 20, 2020

Participants

  • @David Dotson

  • @Simon Boothroyd

  • Ben Pritchard

  • @Trevor Gokey

  • @Pavan Behara

  • @Joshua Horton

Goals

  • New advancements

    • PCM-based implicit solvent pathway

  • New submissions

  • Compute bottlenecks

    • Do we need to pursue more compute?

    • Open discussion on strategies for meeting user timelines, managing expectations.

  • Upcoming infrastructure improvements

    • STANDARDS-based versioning #137

    • Dataset index on qca-dataset submission #147

    • Local Optimization executor

    • Do we need to change the index system we use for molecule submissions?

    • More targeted error cycling; what else do we need in the report for decision-making?

  • Upcoming science support

    • Enforced C1 symmetry in Psi4 is almost ready

  • Larger advances

    • Automated FF coverage gap identification, torsion prioritization, submission generation

    • Benchmarking (dashboard, etc.)

Discussion topics

Item

Presenter

Notes

PCM-based implicit solvent

Simon

  • PCM appears to be working on the COH submission

  • JH: Also first dataset storing wavefunctions/eigenvalues, so another first

  • SB: storage and retrieval working just fine!

  • DD: would be worth showing this off at next show-and-tell; I’ll find out from Jeff

Submissions



  • SB: COH is about 50% complete

    • don’t have error cycling in place for basic DataSets yet; will get today

  • PB: Genentech optimization; working on first submission. Only 20% of the dataset would be submitted in this first run

    • is this acceptable?

    • Yes, we’ll proceed with the smaller, 127 molecule (20%) subset for the first submission

    • DD: feel free to reach out to me when desired; we’ll re-roll the PR off of master (DD messed up long-lived branches with squash merges)

  • PB: protomers/tautomers

    • JH: fewer tasks than there are conformers, because the QCF index is not case sensitive and some of the SMILES clash when reduced to lowercase

      • Do we have another solution? Do we drop the use of SMILES for the index?

      • TG: For torsiondrives, this is still useful.

      • JH: Still want to be able to group molecules that are just peer conformers

    • JH: change how we index molecules, just do molecule-0, conformer-0, basically avoid SMILES for OptimizationDatasets, Basic DataSets; keep SMILES as index on TorsionDrives

    • TG: May still run into issues on this with TorsionDrives, but like this because we tag the driven torsion

    • DD+BP: Could also go with removing the lowercase-casting on indices; would be almost a trivial change, and non-destructive for database access (we’ll pursue this)

      • Issue raised:

  • DD: PEPCONF

    • We’re getting some user pressure; why is it proceeding slowly?

    • Decide on a rebalancing of priorities for datasets:

      • reduce priority to low for some optimization sets

    • TG: Many of these molecules will take a lot of memory (> 50 GiB)

    • DD: Perhaps time to scale up all our nodes to a minimum amount of memory for QM jobs

      • Do we know if there are ways to reduce the memory usage of Optimizations?

      • BP: Psi4 can write to disk if needed when memory gets constrained

      • DD: I will reduce the memory offered to the manager to below the constraints given to each worker; may trigger writing to local storage

        • also increase the total memory of each replica to 64GiB

        • Could also scale the CPUs to 32, perhaps even 64

    • We’ll increase the priority of PEPCONF to high

    • TG: will reduce number of workers deployed, see if this reduces pre-emption frequency

  • Phenyl Dataset - will start to starve others

    • DD: I’ll touch base with Jessica, find out timeline needs for Phenyl set
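The index clash raised under protomers/tautomers above can be reproduced in a few lines. This is only a sketch: the entry names and the `-0` conformer suffix are hypothetical, and `find_case_collisions` is an illustrative helper, not part of QCFractal. It shows how lowercasing merges distinct SMILES, for example aromatic benzene (`c1ccccc1`) and cyclohexane (`C1CCCCC1`).

```python
from collections import defaultdict


def find_case_collisions(indices):
    """Group index strings that become identical once lowercased."""
    groups = defaultdict(list)
    for index in indices:
        groups[index.lower()].append(index)
    # Keep only lowercase keys shared by more than one original index
    return {key: names for key, names in groups.items() if len(names) > 1}


# Hypothetical index entries: SMILES plus a conformer counter
entries = ["c1ccccc1-0", "C1CCCCC1-0", "CCO-0"]
collisions = find_case_collisions(entries)
```

Running a check like this over a dataset's entry names before submission would flag the molecules that end up with fewer tasks than conformers.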

Strategies for user timelines, expectations

David

  • JH: I think we can be faster in merging datasets now, especially with STANDARDS coming into place

  • DD: we’re already defaulting to ‘high’ priority for fitting datasets, more discretionary for others

  • JH: Some of the datasets were from PI pressure to get things running; could be re-tagged to ‘low’ priority

  • DD: compute tags are an avenue for controlling flows, but dangerous if we park tasks in a compute tag for which we have no managers

Dataset index

Josh

  • Probably good to merge; can’t find the script used to generate

  • DD: we can merge and manually curate for now, add automation later

Error Cycling

David

  • TG: Restarts after SCF convergence and optimization convergence errors appear to clear often enough; we probably don’t want to exclude these

    • High memory use in Psi4 can be handled through better worker configuration (setting the available memory below the memory allocated on the node)

  • DD: We’ll close for now; can chew on more ways to utilize compute tags for routing, how we want to filter error cycling
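The worker configuration idea above (advertise less memory than the node or container actually has) amounts to simple arithmetic. A minimal sketch follows, assuming a 64 GiB replica as discussed under Submissions and a hypothetical 8 GiB headroom; the function names are illustrative and do not correspond to any QCFractal manager setting.

```python
def manager_memory_gib(container_limit_gib, headroom_gib):
    """Memory to advertise to the manager, kept below the container limit."""
    if headroom_gib <= 0 or headroom_gib >= container_limit_gib:
        raise ValueError("headroom must be positive and smaller than the container limit")
    return container_limit_gib - headroom_gib


def memory_per_task_gib(manager_gib, concurrent_tasks):
    """Even split of the manager's memory across tasks running at once."""
    return manager_gib / concurrent_tasks


# Assumed values: 64 GiB replica, 8 GiB headroom, one task at a time
manager_gib = manager_memory_gib(container_limit_gib=64, headroom_gib=8)
per_task_gib = memory_per_task_gib(manager_gib, concurrent_tasks=1)
```

The right headroom is an open question; the point is only that the value handed to the manager stays strictly below the limit enforced on the container.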

Enforced C1 symmetry

Josh

  • C1 symmetry is coming in Psi4, old datasets where we didn’t do this will still work

    • if a method requires a specific symmetry, Psi4 will set it itself

Action items

@David Dotson will get next show-and-tell date from @Jeffrey Wagner, relay to group for PCM, wavefunction demonstration
@David Dotson will add in error cycling for basic DataSets to lifecycle
@Pavan Behara will proceed with Genentech dataset, with initial submission only including smaller molecules (~20% of the full dataset); reach out to @David Dotson for help fixing the branch/PR when ready
@Trevor Gokey will experiment with reducing the number of workers deployed on pre-emptible queues, see if this positively impacts pre-empt frequency; potentially reach out to admins for assistance
@David Dotson will re-work PRP deployment of QM workers with manager limits below those given to the container; use fewer CPUs, more memory per replica, more replicas
@David Dotson will touch base with Jessica Maat on timeline needs for Phenyl set; assess priority of other sets relative to it
@David Dotson will review and merge the index on qca-dataset-submission; create issue for automated curation
@David Dotson will tag the PEPCONF dataset with priority “high”
@David Dotson will add compute tagging on the basis of submission priority-* GH tag to lifecycle error cycling

 

Decisions