Item

Presenter

Notes

Updates from MolSSI

Ben

  • Fixed issues with inconsistent state

    • tens of thousands of jobs in the task queue were associated with complete records

    • if you resubmit a record that already exists, even if COMPLETE, a task will get created

      • know where this hole is; can fix and produce new release, deploy to public QCArchive

  • BP: removed all duplicate tasks

    • DD: will turn managers back on to compute industry datasets, put them on their own compute tag and observe if backward-forward behavior persists
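The resubmission hole described above can be illustrated with a minimal sketch. This is not the QCFractal implementation: the Server class, its records and tasks dictionaries, and the purge_stale_tasks helper are all hypothetical names, used only to show the intended behavior (no new task for a COMPLETE record, plus a cleanup pass for tasks that slipped through).

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    WAITING = "waiting"
    COMPLETE = "complete"


@dataclass
class Server:
    """Toy in-memory stand-in for the record and task tables (hypothetical)."""

    records: dict = field(default_factory=dict)  # record key -> Status
    tasks: dict = field(default_factory=dict)    # record key -> task id
    _next_task_id: int = 0

    def submit(self, key: str) -> bool:
        """Submit a record; return True only if a new task was created."""
        if self.records.get(key) is Status.COMPLETE:
            # Already computed: a resubmission must not create a task.
            return False
        if key in self.tasks:
            # A task is already queued for this record.
            return False
        self.records[key] = Status.WAITING
        self.tasks[key] = self._next_task_id
        self._next_task_id += 1
        return True

    def complete(self, key: str) -> None:
        """Mark a record COMPLETE and drop its task."""
        self.records[key] = Status.COMPLETE
        self.tasks.pop(key, None)

    def purge_stale_tasks(self) -> int:
        """Remove tasks whose record is already COMPLETE; return the count."""
        stale = [k for k in self.tasks if self.records.get(k) is Status.COMPLETE]
        for k in stale:
            del self.tasks[k]
        return len(stale)
```

The check in submit corresponds to closing the hole; purge_stale_tasks corresponds to the one-off removal of tasks already attached to complete records.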

User issues, new submissions


  • PB: #220 - need new qcsubmit

    • JH: just have one PR on qcsubmit blocking release; working on this

    • PB: have one compute spec on this submission, DF-CCSD(T)/CBS, that will take up to 150 GiB of memory for 16 heavy atoms

    • PB: typically also use 48 cores

  • JH: working on #223, blocked by validation issues as well

    • going to add some ANI specs, will need some ANI workers

      • will still use ANI2x, which we had issues with, but with relaxed convergence criteria

      • DD: until we have a release from Adrian, unfortunately can’t use the latest fixes to this in prod

  • JH: ML stuff, adding HDF5 support for QCSubmit

    • instead of a ton of SDFs, can use one file

    • JW: what are the contents?

    • JH: conformers and mapped SMILES

    • JW: is this file going to contain the same content as the other files, or is there something fundamentally different here?

      • one thing that makes SDF safer is that readers and writers are not something we’re defining

      • JH: there’s a lot of repeated info in the SDF

        • also want to pave the way for multimolecule support, dimers, etc.

      • JW: good point

      • JH: understand concerns on future variability; would like to get a spec down as much as possible

    • JH: any feedback anyone has on this issue (https://github.com/openmm/qmdataset/issues/4) appreciated

  • DD: concerned about collection size; will run into same issue as before

    • SB+JH: not clear if it’s a single collection with a million conformers, or spread across several collections, or multiple million conformer collections

    • BP: the metadata object for a collection gets very big as more and more objects are involved (molecules, specs), so this becomes an issue in the way collections are currently implemented

      • is getting fixed in the next branch

    • SB: can see this taking another month for John and Peter to resolve; what is the timeline for next branch deployment?

      • BP: end of the year earliest? Can’t make a guarantee there, though

    • JW: DD, would you be willing to jump onto next OpenMM call to lay out constraints?

      • DD: yes

  • JH: is a test submission still in play?

    • SB: yeah, can push for this, also as a way to prove the core idea of the dataset works before we push through a massive set

  • SB: Chapin’s dataset; what’s the status?

    • DD: worked with him to set up manager on UCSD resources; can switch on and off at will; waiting for word on new submission status

    • SB: think there may still be some ambiguity on what data, how it will be different from the Cerutti sets; will coordinate with Chapin and see where we’re at

Science support needs

Infrastructure needs

  • BP: with this fix for the submission deduplication, can also include the fix for the slow queries encountered recently

    • this is adding indices to a single table; remove combined index, add a bunch of single indices to columns

    • is a DB migration in practice

    • will require more memory on the server; shouldn’t be an issue

  • PB: question for Josh

    • using one spec in new submission where method is a “ + ”-joined method, no basis (https://psicode.org/psi4manual/master/cbs.html)

    • JW: short form for this method looks like it might present a discoverability issue; perhaps use a long form instead?

    • PB: should I leave it like this, or use long form?

      • JH: I think long form is supported through keywords; qcsubmit won’t like None for basis in psi4
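The index change BP describes above (remove the combined index, add single-column indices) could look roughly like the sketch below. The notes do not say which table or columns are involved, so the base_result table, its columns, and the index names are hypothetical. SQLite is used only so the example is self-contained; the production database and migration tooling may differ.

```python
import sqlite3

# In-memory database so the sketch runs without any server.
conn = sqlite3.connect(":memory:")

# Hypothetical table and combined index, for illustration only.
conn.execute(
    "CREATE TABLE base_result ("
    "id INTEGER PRIMARY KEY, program TEXT, method TEXT, basis TEXT, status TEXT)"
)
conn.execute(
    "CREATE INDEX ix_combined ON base_result (program, method, basis, status)"
)

COLUMNS = ("program", "method", "basis", "status")


def migrate(conn: sqlite3.Connection) -> None:
    """Drop the combined index and create one index per column."""
    with conn:  # commits on success, rolls back on error
        conn.execute("DROP INDEX IF EXISTS ix_combined")
        for col in COLUMNS:
            conn.execute(
                f"CREATE INDEX IF NOT EXISTS ix_base_result_{col} "
                f"ON base_result ({col})"
            )


def index_names(conn: sqlite3.Connection) -> list:
    """Return the sorted names of all indices on base_result."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'index' AND tbl_name = 'base_result'"
    ).fetchall()
    return sorted(row[0] for row in rows)
```

A combined index mainly helps queries that filter on its leading columns, whereas per-column indices can serve a filter on any one of those columns. The cost is more indices to store and maintain, which fits the note that the server will need more memory.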

Action items

  •  

Decisions