2020-09-25 QCA Submission Meeting notes

Date

Sep 25, 2020

Participants

  • @David Dotson

  • @Trevor Gokey

  • @Joshua Horton

  • @Simon Boothroyd

  • Ben Pritchard

Goals

  • New advancements

  • New submissions

    • Protein Fragments TorsionDrives

    • Enamine REAL subset Optimization

    • Jessica Maat’s dataset

    • PhAlkEthOH dataset

  • Upcoming infrastructure improvements

    • Psi4Harness error reporting fix QCEngine#266

    • STANDARDS-based versioning #137

    • Torchani changes - QCEngine release

    • Identifier support in QCElemental.Molecule?

  • Upcoming science support

    • PCM-based implicit solvent pathway

    • ESPs and wavefunction storage

    • Uploading datasets calculated on private server

  • Larger advances

    • Automated FF coverage gap identification, torsion prioritization, submission generation

    • Benchmarking (dashboard, etc.)

Discussion topics

Item

Presenter

Notes

QCFractal release, QCArchive upgrade

Ben

QCFractal release on track for next week

  • modify and regenerate tasks

  • maybe have the ability to set defunct status

  • changes to modify_tasks

  • Considering upgrading Postgres 10 → 13

Reworking the Fractal interface is going to be a challenging task

  • TG: We need to change some data where the status is incomplete but the data exists

  • BP: The new modify_tasks will create new tasks, which will allow the job to be restarted. The problem could be that the underlying task was already deleted and so couldn’t be restarted.

  • We should use the create task functionality on a debugging basis only

There is some talk about QCSchema changes

Entry creation

Trevor

  • TG: Entries aren’t created until tasks are created

    • dataset doesn’t show up until you call compute

  • BP: One of many things that needs a rework

Protein Fragments

David

  • Need to go through constraints with fine-tooth comb

Enamine REAL

Trevor

  • Working on performance optimizations for QCSubmit to handle a dataset of this size

PhAlkEthOH dataset

Trevor

  • Small dataset that was used for initial fit

    • finally getting into QCArchive

    • useful sandbox to test typing schemes

    • ~40k conformers (different from the conformers used in the original dataset)

STANDARDS-based versioning

Trevor

  • Coupled with QCSubmit development

QCEngine Release

Ben

  • QCEngine release coinciding with Fractal next week

QCElemental Identifiers

Trevor, Josh

  • TG: it is implemented, present

  • JH: We pushed for using Identifiers for CMILES early on, but this isn’t necessarily stable information, since the connectivity can change over the course of the optimization

    • conclusion: we didn’t want to put information that was searchable and meant to reflect the molecule as is but could very well be inaccurate

    • the compromise was to put this information into extras since that is free-form and not considered to be as much of a source of truth

  • TG: is it possible to guess connectivity at every optimization step?

    • or do we have a postprocessing step at the end that guesses connectivity after an optimization?

    • The drawback to the current implementation in QCElemental is that it does not perceive bond order.
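The postprocessing idea raised above can be sketched as a distance-based connectivity guess that is run on the initial and final geometries and compared. This is only an illustration of the approach, not QCElemental's implementation: the function names, the radii table, and the 1.2 scale factor are all assumptions made for this example. Like the limitation noted above, it finds which atoms are bonded but does not perceive bond order.

```python
from itertools import combinations
from math import dist

# Approximate covalent radii in angstrom (illustrative values, assumed here).
COVALENT_RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66}


def guess_connectivity(symbols, coords, scale=1.2):
    """Return the set of bonded index pairs (i, j), i < j, from distances only.

    Two atoms count as bonded when their distance is within `scale` times
    the sum of their covalent radii. No bond order is assigned.
    """
    bonds = set()
    for i, j in combinations(range(len(symbols)), 2):
        cutoff = scale * (COVALENT_RADII[symbols[i]] + COVALENT_RADII[symbols[j]])
        if dist(coords[i], coords[j]) <= cutoff:
            bonds.add((i, j))
    return bonds


def connectivity_changed(symbols, initial_coords, final_coords):
    """True if the guessed bond graph differs between the two geometries."""
    return guess_connectivity(symbols, initial_coords) != guess_connectivity(
        symbols, final_coords
    )
```

A check like this could flag records whose stored identifiers may no longer describe the optimized molecule, without needing a guess at every optimization step.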

PCM calculations

Josh

  • Can do PCM calculations, need to restrict symmetry

    • latest conda build for Mac!

  • Just need to update QCSubmit with corresponding options

Data Access layer in service of future advancements

David, discussion

  • DD: presented value streams and the upcoming needs a data access layer would have to meet

    • SB: really agree with the approach; partial to fewer packages

    • TG: agree, will be working on QCSubmit more to include extraction bits, make sure these line up with input standards

    • JH: agree, and like this approach; have some minimal data access bits currently to support bespoke-workflow

    • TG: have a lot of existing functions that I can put into QCSubmit

QCFractal managers

 

TG: Using a preemptible queue causes a lot of jobs to be killed; these are sent back with ERROR status and need to be restarted manually. Can this be detected so that those jobs are restarted automatically?

DD & BP: This will be challenging, as the exceptions are adapter-specific

TG: A short-term solution is a loop that scans the dataset from a client and restarts tasks with specific exception strings in the output
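A minimal sketch of that short-term loop, kept independent of any particular client API. The pattern list, the function names, and the `(record_id, error_message)` input shape are hypothetical. In practice `restart` would wrap whatever restart call the client provides (for example the modify_tasks operation discussed earlier), and the patterns would need tuning per queue adapter.

```python
import re

# Hypothetical patterns; the real strings depend on the queue adapter in use.
RESTART_PATTERNS = [r"preempt", r"killed", r"walltime"]


def find_restartable(error_records, patterns=RESTART_PATTERNS):
    """Return ids of errored records whose message matches any pattern.

    `error_records` is an iterable of (record_id, error_message) pairs,
    however the client happens to expose them.
    """
    compiled = [re.compile(p, re.IGNORECASE) for p in patterns]
    return [
        record_id
        for record_id, message in error_records
        if any(rx.search(message or "") for rx in compiled)
    ]


def restart_matching(error_records, restart, patterns=RESTART_PATTERNS):
    """Call `restart(ids)` once for all matching records; return the ids."""
    ids = find_restartable(error_records, patterns)
    if ids:
        restart(ids)
    return ids
```

Filtering on the error text keeps genuine failures (for example, an SCF that did not converge) out of the restart set, so only scheduler-related kills are resubmitted.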

Action items

Ben Pritchard will prepare a QCFractal release for next week, with deployment to the public QCArchive instance
@Trevor Gokey will prepare PhAlkEthOH dataset for QCSubmit submission into QCArchive
@Joshua Horton will update QCSubmit with options needed to support PCM calculations with recent Psi4
@David Dotson will create a prototype for Jessica Maat’s and Dominic Rufa’s requests as a PR to QCSubmit (following discussion with Trevor)

Decisions