2024-03-12 QCSubmit dataset submission meeting

Participants


  • @Jeffrey Wagner

  • @Lily Wang

  • @Alexandra McIsaac

  • @Brent Westbrook

Discussion topics







Discussion of Lily and Brent’s previous submissions


  • https://github.com/openforcefield/qca-dataset-submission/projects/1

  • Doing scientific review for stalled datasets

  • XtalPi fragment TD (torsion drive)

    • looked sensible, LW happy with timing, few errors

    • Moving to end of life (EOL), i.e. we stop monitoring it and give up on the remaining calculations that aren’t converging

  • Brent’s torsion coverage dataset

    • BW: error cycling looks pretty good, happy to stop it, haven’t looked at results

    • JW: will move to EOL; if something turns out to be weird about the results, we’d make a new submission rather than wait for the few remaining jobs here to finish

  • LW: Jeff, are you happy to stay on top of compute?

    • JW: training you and getting you credentials for compute is on my to-do list, will get on that for the future

    • LW: not high priority, just so you don’t have to deal with it

    • JW: doesn’t mind doing it, but it’s probably easier to have someone else do it and cut out the middleman


  • LW: If I wanted to do a 10,000-molecule dataset (optimizations only), would it take up too much space?

    • JW: no problem; you may run into problems if you wanted TD or to save wavefunctions

    • JW: have room for ~millions of calculations without wavefunctions. According to BP, current wavefunction storage averages 3.27 MB per wavefunction and scales as n^2 (n = number of basis functions), but we don’t know how many basis functions correspond to 3.27 MB, so we don’t know the baseline

  • JW: would be good to have a standardized workflow of when we need to ask BP about storage vs when we can just go for it. Have tons of compute but storage is limited
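
A rough sketch of the wavefunction storage estimate discussed above, assuming size scales as n^2 from the 3.27 MB average. The reference basis-function count (`ref_n_basis`) is a hypothetical placeholder, since the real baseline is unknown per the notes; the function name and the example numbers are likewise illustrative, not something agreed in the meeting.

```python
def wavefunction_storage_mb(n_basis, n_calcs, ref_mb=3.27, ref_n_basis=300):
    """Estimate total wavefunction storage in MB, assuming size ~ n_basis**2.

    ref_mb      : average size per stored wavefunction quoted in the notes (3.27 MB).
    ref_n_basis : ASSUMPTION. The number of basis functions that corresponds to
                  ref_mb is unknown; 300 is a placeholder to be replaced once
                  the baseline is known.
    """
    if n_basis <= 0 or ref_n_basis <= 0:
        raise ValueError("basis function counts must be positive")
    if n_calcs < 0:
        raise ValueError("n_calcs must be non-negative")
    # Quadratic scaling relative to the (assumed) reference point.
    per_wf_mb = ref_mb * (n_basis / ref_n_basis) ** 2
    return per_wf_mb * n_calcs


if __name__ == "__main__":
    # Illustrative only: 10,000 stored wavefunctions at two hypothetical sizes.
    for n in (300, 600):
        total_gb = wavefunction_storage_mb(n, 10_000) / 1024
        print(f"n_basis={n}: ~{total_gb:.1f} GB")
```

Once the baseline is known, only `ref_n_basis` needs changing; doubling the number of basis functions quadruples the estimate, which is why wavefunction storage is the case worth checking with BP first.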

Action items
