2020-12-11 QCA Submission Meeting notes

Date

Dec 11, 2020

Participants

  • @David Dotson

  • @Trevor Gokey

  • @Pavan Behara

  • Ben Pritchard

  • @Joshua Horton

Goals

  • New advancements

    • STANDARDS-based versioning #137

  • New submissions

  • Upcoming infrastructure improvements

    • Dataset index on qca-dataset submission #147

    • Stale jobs fix on QCFractal server/managers

  • Upcoming science support

    • Enforced c1 symmetry in psi4 is almost ready

  • Larger advances

    • Automated FF coverage gap identification, torsion prioritization, submission generation

    • Benchmarking (dashboard, etc.)

Discussion topics

Item

Presenter

Notes

Stale job updates

Ben

  • Ben is planning an update that should fix this in the future

  • e.g. manager.manager_name: PacificResearchPlatformMM

    • this Manager configuration value is inconsistent with the cluster key stored server-side

Database error provenance

Ben

  • Error records grow monotonically in database; any reason to keep them?

    • should we delete ones that have no references to them elsewhere in the DB?

    • TG: perhaps the sequence of errors should be preserved if we have yet to complete a result?

      • only delete the sequence of errors once it’s complete?
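A minimal sketch of the pruning rule discussed above, assuming errors can be deleted once no incomplete record still references them. The `Record` class, its field names, and the status strings are hypothetical stand-ins for illustration; they are not the QCFractal schema.

```python
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class Record:
    """Hypothetical stand-in for a server-side calculation record."""
    record_id: int
    status: str  # assumed values, e.g. "COMPLETE", "ERROR", "RUNNING"
    error_ids: List[int] = field(default_factory=list)


def prunable_error_ids(all_error_ids: Iterable[int],
                       records: Iterable[Record]) -> List[int]:
    """Return the error ids that are safe to delete.

    An error is kept only while a record that has not completed
    still references it, so the error sequence of an unfinished
    calculation is preserved. Unreferenced errors, and errors
    referenced only by completed records, are prunable.
    """
    protected = set()
    for record in records:
        if record.status != "COMPLETE":
            protected.update(record.error_ids)
    return sorted(set(all_error_ids) - protected)


records = [
    Record(record_id=10, status="COMPLETE", error_ids=[1]),
    Record(record_id=11, status="ERROR", error_ids=[2, 3]),
]
# Error 4 is referenced by nothing; error 1 belongs to a completed record.
to_delete = prunable_error_ids([1, 2, 3, 4], records)
```

Here errors 2 and 3 are kept because record 11 has yet to complete, which matches TG's suggestion.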

STANDARDSv3

Trevor

  • DD: Trevor presents STANDARDS, discuss comments addressed, open floor discussion before we move to adopt

  • TG: CMILES provenance info still unresolved; looking for a solution that doesn’t introduce complexity on top of QCArchive’s built-in approaches if possible

    • DD: Not sure it’s possible to achieve the information structure we want without introducing some complexity on top, handled via a data access layer that encodes these STANDARDS

    • BP: Thinking on an approach to support this more deeply within QCArchive

      • Thinking of cases that require changes to QCSchema; will require some thought around the whole stack

    • DD: I don’t think this is a blocker for STANDARDSv3; think we could establish a working group for improving graph molecule support in the whole QC* stack, which would translate to better MM and ML support more generally

    • BP: there is an MMSchema effort in MolSSI; this may be a good route to pursue for building out a representation that can be consumed by QC*

    • JH: Tried building some WBO estimators from wavefunction data; not straightforward, relies on finicky cutoff choices

    • BP: thinking of building a table that stores connectivity, where there can be multiple records for a given molecule, storing provenance info for the tool that generated or interpreted that connectivity from the geometry, wavefunction, etc.

    • DD: think there is an opportunity to build a library that ties together QC mols to MM mols more generally, given interpretations from tools like RDKit, OpenEye, etc. Could then be used to populate the table that Ben suggested

    • TG: if we had connectivity and formal charges, we really could make the whole graph; could then use that instead of CMILES if desired

    • 3 ayes, 2 abstentions

    • Adopted!

    • JH: happy to walk through the doc, break it out into issues primarily on QCSubmit, and put them together into a STANDARDSv3 project board for execution

  • DD: will schedule working session with Josh, Hyesu for consolidating benchmark datasets, adding mols
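The connectivity table BP proposed in the STANDARDSv3 discussion could look roughly like the sketch below: several connectivity records per molecule, each carrying provenance for the tool that produced it. All class and field names are hypothetical and for illustration only; nothing here is an existing QCArchive or QCSchema structure, and the version strings are placeholders.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ConnectivityRecord:
    """One interpretation of a molecule's graph (hypothetical layout)."""
    molecule_id: str
    bonds: Tuple[Tuple[int, int, int], ...]  # (atom_i, atom_j, bond_order)
    formal_charges: Tuple[int, ...]          # one entry per atom
    tool: str                                # provenance: which tool
    tool_version: str                        # provenance: which version
    source: str                              # e.g. "geometry", "wavefunction"


class ConnectivityTable:
    """In-memory stand-in for a table allowing many records per molecule."""

    def __init__(self) -> None:
        self._records: Dict[str, List[ConnectivityRecord]] = {}

    def add(self, record: ConnectivityRecord) -> None:
        self._records.setdefault(record.molecule_id, []).append(record)

    def for_molecule(self, molecule_id: str) -> List[ConnectivityRecord]:
        # Return a copy so callers cannot mutate the stored list.
        return list(self._records.get(molecule_id, []))


# Water, interpreted by two different tools (placeholder versions).
water_bonds = ((0, 1, 1), (0, 2, 1))
table = ConnectivityTable()
table.add(ConnectivityRecord("mol-1", water_bonds, (0, 0, 0),
                             tool="RDKit", tool_version="placeholder",
                             source="geometry"))
table.add(ConnectivityRecord("mol-1", water_bonds, (0, 0, 0),
                             tool="OpenEye", tool_version="placeholder",
                             source="geometry"))
```

Keeping connectivity and formal charges together in each record follows TG's point that the two are enough to build the whole graph.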

Action items

Ben Pritchard will pursue resolving inconsistency in manager.manager_name in Manager configuration vs. cluster key stored server-side
Ben Pritchard will consider an approach to pruning old, unreferenced errors accumulating in server DB
@Joshua Horton will walk through STANDARDSv3, translate components that still need to be implemented into issues on QCSubmit, qca-dataset-submission automation, etc.; will set up a project board on QCSubmit to parallelize implementation work with others
@David Dotson will schedule a working session with Hyesu, Josh for consolidating benchmark datasets, adding molecules

Decisions