2023-11-21 QCSubmit dataset submission meeting

Participants

  • @David Dotson

  • @Brent Westbrook

  • @Alexandra McIsaac

  • @Lily Wang

  • @Matt Thompson

Goals


Discussion topics

Item

Presenter

Notes

Handoff of compute management for NRP Nautilus

David Dotson / open discussion

  • DD: Just met with BP; we have compute workers now

  • DD: NRP Nautilus working as far as we can tell

  • DD: Wants to hand off compute management, since it won’t be in his scope next year. Who should take it over?

    • LW: Happy to take it over. Should infrastructure be involved?

    • MT: Could be involved, if you think it would be helpful, or could just be you

    • LW: Happy to take it over initially, could revisit if we change our minds

  • DD: Jeff is officially the admin of the OpenFF namespace. I might be able to give you permissions myself; if not, we may have to wait for Jeff. Once LW has an account, I will run a training session on how to manage compute resources, how to manage environments, etc.

  • DD: Now that compute is up, we should start seeing completions on the dataset from last meeting soon. Will try to use 300 workers


General discussion



  • LW: How are the scripts, etc. looking in QCA dataset submission?

    • DD: Looking good, should all work now. Error cycling reports are showing up again, and GitHub label prioritization and compute tagging should be working

    • DD: “Should” be working but haven’t tested it for production

    • DD: Updated TorsionDrive dataset error cycling, which should also be working. It is more complicated because TorsionDrives generate their own optimizations

    • DD: Would be good to try a TD dataset soon to see how it’s working

  • DD: Currently 3 environments: psi4, OpenMM, and XTB. There is also a torchani environment, which is currently broken, but Lily has a fix (downgrade pytorch)

  • DD: Current dataset has one spec (HF/6-31G*). In the future, you can add more specs to the same dataset by making a PR that adds a compute.json file in the same directory, instead of creating a whole new dataset

    • Example:

    • Leave the dataset object empty; this signifies to QCA that it’s a compute expansion for an existing dataset

    • LW: If you wanted to expand compute specs, would you do so by editing the existing compute.json, or by making a new one?

      • DD: Make a new one. The filename just has to follow the pattern compute[*].json, so you can put other text after “compute”
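The naming rule DD describes can be sketched as a simple filename check. This is only an illustration: it assumes the `[*]` in compute[*].json means "any text", approximated here with the glob `compute*.json`. The helper name `is_compute_file` and the sample filename `compute-extra-spec.json` are hypothetical, and the actual submission tooling may match files differently.

```python
from fnmatch import fnmatchcase

# Assumed glob equivalent of the "compute[*].json" pattern from the notes.
COMPUTE_PATTERN = "compute*.json"


def is_compute_file(filename: str) -> bool:
    """Return True if the filename starts with 'compute' and ends with '.json'.

    Hypothetical helper; uses case-sensitive glob matching.
    """
    return fnmatchcase(filename, COMPUTE_PATTERN)


if __name__ == "__main__":
    # Sample filenames are made up for illustration.
    for name in ["compute.json", "compute-extra-spec.json", "dataset.json"]:
        print(name, is_compute_file(name))
```

Under this reading, a second spec for the same dataset would go in a new file such as the hypothetical `compute-extra-spec.json` rather than in an edit to the existing compute.json.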

Action items

Decisions