2022-10-18 QC Meeting notes

 Date

Oct 18, 2022

 Participants

  • @Pavan Behara

  • @Chapin Cavender

  • BenjaminPritchard

  • @David Dotson

 

 Discussion topics

Item

Notes

Updates from MolSSI

  • BP: Nothing much to update, the server is running well. DD and I can coordinate on qcsubmit updates.

    • DD: Sometime in Dec is the timeline.

    • BP: Okay, we can touch base in two weeks then.

    • DD: There are a lot of things in qcf-next that will replace some of the functionality of qcsubmit; we can revamp those.

Infrastructure advances

  • Geometric 1.0 released, thanks to LPW. Thanks to David Dotson for updating the prod envs.

    • DD: Will merge the PR and kick off workers with the new env. I will ping folks on Slack to update their worker envs.

Throughput status

  • OpenFF Protein Capped 3-mer Backbones v1.0

    • Opts: 289670 → 293477 → 299557

    • TDs: 12 → 16 → 19

  • RNA Single Point Dataset v1.0 - moved to end-of-life after Ken Takaba’s confirmation.

User questions/issues

  • PE, JC questions on SPICE sets (thanks to DD for coordination)

    • non-deterministic database access

    • hdf5 files on QCA ML repo

    • DD: I will try to get feedback in a face-to-face meeting with BP/PE/JC. The objective is to draw up some actionable items that don’t burden the devs, since QCA is not a traditional OSF/Zenodo/Dropbox-style system for storing files.
      I remember JH/SB caching QCA data for local runs.

    • BP: I talked to Ana and we do have traditional repos for immutable data.

    • DD: Yeah, an exporter would be great.

    • DD: Maybe renaming the project to QCF would let people know that it’s more of a compute engine than a data storage server.

  • BP: I think we can place an ML tag on the datasets to show them on the ML-repo.

    • DD: Is the metadata editable?

    • BP: I think so; otherwise it will be possible with qcf-next.

    • DD: What’s the exact name of the tag?

    • BP: Might be “machine learning”.

    • DD: So, we still have to add a file and generate the hdf5 link?

    • BP: Yeah.

    • DD: So none of these are computed by QCF?

    • BP: Yeah, these are aggregated from data submissions. They’re all Zenodo links.

    • DD: Okay, I will try to do the same for the SPICE sets.

  • BP: For the past few weeks I have been assigning users to groups. I will also change file/access ownership and how one can use/modify records.

 

  • CC: Can we redo some of the protein datasets with the new geometric release?

    • DD: Sure, we can slide them back to error cycling on the dataset tracking page.

Science needs

 

 Action items

 Decisions