2021-02-12 QCA Submission Meeting notes

Date

Feb 12, 2021

Participants

  • @David Dotson

  • @Pavan Behara

  • @Joshua Horton

  • @Trevor Gokey

Goals

  • New advancements

    • Compute expansion rework - considering how to improve it

  • New submissions

    • Torsiondrive benchmark update

    • Genentech set2 - molecules that have more than three rotors fragment

  • Upcoming infrastructure improvements

    • STANDARDSv3 submission machinery in QCSubmit

    • STANDARDSv3 submission machinery in qca-dataset-submission

    • Multiple PR templates

      • New submission

      • Compute expansion

      • Infrastructure modification

  • Upcoming science support

  • Larger advances

    • Automated FF coverage gap identification, torsion prioritization, submission generation

    • Benchmarking (dashboard, etc.)

Discussion topics

Item

Presenter

Notes

geomeTRIC memory limits

Trevor

  • geomeTRIC has no memory constraint machinery

    • TG: decided to let a molecule run indefinitely, found that geomeTRIC had held on to more than 1000 B-matrices

    • made Lee-Ping aware; suggested these don’t need to be held on to

    • [proposal] profile what geomeTRIC is doing; start by limiting how many B-matrices are stored

      • TG: will create an issue on geomeTRIC detailing the observations; DD will pursue solution
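
The "limit how many B-matrices are stored" proposal can be sketched as a small bounded cache. This is an illustration only and is not geomeTRIC code: the class name `BoundedCache`, the `max_entries` parameter, and the idea of keying entries by optimization step are all assumptions made for the example.

```python
from collections import OrderedDict


class BoundedCache:
    """Hypothetical cache that keeps at most `max_entries` values.

    When the limit is exceeded, the least recently used entry is dropped,
    so memory use stays flat no matter how long a job runs.
    """

    def __init__(self, max_entries=10):
        if max_entries < 1:
            raise ValueError("max_entries must be at least 1")
        self.max_entries = max_entries
        self._store = OrderedDict()

    def get(self, key, compute):
        """Return the cached value for `key`, calling `compute()` on a miss."""
        if key in self._store:
            # Mark as most recently used.
            self._store.move_to_end(key)
            return self._store[key]
        value = compute()
        self._store[key] = value
        if len(self._store) > self.max_entries:
            # Evict the oldest (least recently used) entry.
            self._store.popitem(last=False)
        return value

    def __len__(self):
        return len(self._store)
```

With `max_entries=3`, requesting 1000 distinct keys leaves only the three most recent values in memory; an evicted value is simply recomputed if it is requested again.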

High-mem, low-mem split

Trevor

  • May make sense to split out high-mem, low-mem jobs

    • DD: we could do this with compute tags; on my list is to make this easy to declare from GitHub

Torsiondrive priority resets

Trevor

  • Need to make it possible to change priority, compute tag for services

    • TG: if we’re going to have many compute tags, need a nice way to query what exists, how many tasks on each
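
A minimal sketch of the kind of summary requested here: which compute tags exist, and how many tasks sit on each. It works on plain dictionaries and does not call FractalClient; the record fields `tag` and `status`, the status strings, and the function name `tasks_per_tag` are assumptions for illustration.

```python
from collections import Counter


def tasks_per_tag(tasks):
    """Group hypothetical task records by compute tag and count by status.

    Each record is assumed to be a dict with a "status" key and an
    optional "tag" key; records without a tag are grouped as "(untagged)".
    """
    summary = {}
    for task in tasks:
        tag = task.get("tag") or "(untagged)"
        summary.setdefault(tag, Counter())[task["status"]] += 1
    return summary
```

The result maps each tag to a `Counter` of statuses, so the total on a tag is `sum(summary[tag].values())`.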

Torsiondrive submission broken in QCSubmit with threading!=1

Josh

  • In an awkward place for QCSubmit; supporting benchmarking work, but also need to switch to using openff-toolkit

    • DD: should be fine to leave 0.2.x as it is for benchmarking; not anticipating any needed changes, since most partners are not using a server approach

      • use 0.3.x as the openff-toolkit basis, breaking change

    • JH: also planning to migrate packaging to conda-forge for 0.3.x

      • would like to get STANDARDSv3 machinery in there as well if possible

  • On the threading for torsiondrives, not clear what the problem is at the moment

    • TG: the function being called modifies a dictionary on a shared self; with threads this is shared mutable state that can behave unpredictably, since it amounts to a race condition

    • TG: we’re using threads because it’s low-memory, but could sidestep this easily by switching to a process pool

    • DD: probably makes sense to switch to processes since these submissions do a good deal of client-side processing

    • TG: will need to be able to set processes used on submission, since memory usage can be high

    • DD: [committed] will pursue switch to process pool for submission
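
The pattern discussed above can be sketched as follows: each worker returns its own result instead of mutating a shared dictionary, and only the parent merges results. This is not QCSubmit code; `prepare_entry`, `build_dataset`, and the record fields are hypothetical names, and `max_workers` stands in for the "set processes used on submission" option TG asked for.

```python
from concurrent.futures import ProcessPoolExecutor


def prepare_entry(name):
    """Hypothetical per-molecule work.

    A pure function: it builds and returns its own record and touches no
    shared state, so it is safe to run in a separate process.
    """
    return name, {"name": name, "n_chars": len(name)}


def build_dataset(names, executor_cls=ProcessPoolExecutor, max_workers=2):
    """Process `names` in a worker pool and merge results in the parent.

    `max_workers` caps the number of workers, which matters when each
    worker's memory use is high.
    """
    dataset = {}
    with executor_cls(max_workers=max_workers) as pool:
        # Only this (parent) loop writes to `dataset`, so there is no race.
        for key, record in pool.map(prepare_entry, names):
            dataset[key] = record
    return dataset
```

When run as a script with the default process pool, the call to `build_dataset` should sit under an `if __name__ == "__main__":` guard so that worker processes can import the module safely. Because the worker function shares nothing, the same code also works unchanged with a thread pool passed as `executor_cls`.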

Compute expansion rework

Josh

  • Would like feedback on proposal; definitely want to improve the picture there, though we have time

Optimization and Torsiondrive starter notebooks

Trevor

  • TG: been working on some candidate notebooks; finding issues in QCSubmit and raising them, Josh largely addressing

    • waiting for fixes before PRing notebooks

PR templates

David

  • Fairly easy task, will be adding multiple PR templates to improve submitter experience

New submissions

David

  • DD: reviewed torsiondrive benchmarks, just need an input file

Action items

@Trevor Gokey will make issue on geomeTRIC with memory usage observations, in particular B-matrices piling up
@David Dotson will address geomeTRIC memory leaks, likely starting with allowing B-matrices to be dropped after use (or limiting their caching)
@David Dotson will add ability to set submission compute tag on GitHub PR declaratively; in particular, want to be able to split on high-mem, low-mem
@David Dotson will add a way to get compute tags, list of tasks represented in each, from FractalClient
@David Dotson will add ability to set submission priority on GitHub PR declaratively
@David Dotson will add ability to change priority, compute tag on services, e.g. TorsionDrives
@David Dotson will make PR to QCSubmit switching submission from a ThreadPool to a ProcessPool
@David Dotson will add multiple PR templates (new submission, compute expansion, infrastructure modification) to improve user experience on qca-dataset-submission
@Joshua Horton will add input file selected_mol2s to TorsionDrive benchmark consolidation submission
@Joshua Horton will proceed with openff-qcsubmit 0.3.x as desired; benchmarking will stay on 0.2.x, which is unlikely to require patches or backports for this season
@Trevor Gokey will PR starter notebooks for Optimizations, TorsionDrives, to openff-qcsubmit when satisfied with functionality

Decisions