...

👥 Participants

🥅 Goals

  • Updates from MolSSI

  • Infrastructure needs/advances

  • Throughput status

    • OpenFF dipeptides torsiondrives v2.1: 25/26 TD complete

      • a single calculation is not getting any workers (?)

    • OpenFF Protein Capped 1-mer Sidechains v1.1: from 29 to 36 of 46 TDs complete

      • CC submitted a revised version last week with modified initial structures

    • SPICE PubChem Set 1 Single Points Dataset v1.2: 128 remaining records are consistent errors (SCF or MBIS charge convergence failures); can be moved to end-of-life

    • SPICE PubChem Set 2 Single Points Dataset v1.2: from 40,538 to 84,398 completed calcs; making progress, though a bit slowed in the last few days (was around 8K-10K calcs/day, now down to ~4K/day)

  • User questions/issues

  • Science support needs

🗣 Discussion topics

...


Updates from MolSSI

  • BP – Copy of the new server is up and running and ready for testers.

    • It only has some datasets so far. Not sure what to do next. To install, check out the next branch of qcfractal-compute and qcportal.

    • What’s the best way to get feedback? GH Issues may be a bit slow, though we could use a tag. Could also use an issue template for next-related issues.

    • A Slack channel on QCArchive could work.

  • DD – From OpenFF’s perspective, I’m interested to see how the submission-compute-retrieval cycle looks now. So I may make a copy of qca-dataset-submission and try submitting old datasets to see how they go.

    • BP – That sounds like a great idea. Except that there may be problems if the dataset has been seen before. But feel free to submit experimental sets, since they’ll be deleted after the testing period.

    • DD – Ok, I’ll plan on basically resubmitting a recent dataset to see how it goes through. And the submission machinery should be just a fork of qca-dataset-submission.

    • BP – That should work. The test server has data from last ~September, so newer datasets should look unique. Previously existing usernames and passwords should work.

    • CC + PB – We’re mostly interested in testing data retrieval and checking the status of ongoing datasets (see the sketch at the end of this section).

    • DD will make a copy of qca-dataset-submission and try resubmitting an old dataset that the migrated server doesn’t know about – probably a protein torsiondrive set and a big single-point set.

    • BP – The managers should be roughly the same (just be sure to use the qcfractal-compute repo now). Also, datasets must now have tags. The old URL will work, and connections using the new client versions will get routed to the new server.

    • BP will drop install instructions onto the new Slack channel

    • BP – No docs yet, I’ll need to get those built.

  • BP – Reaction datasets aren’t done yet, but that doesn’t matter for you.
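
Hedged sketch of the retrieval/status test CC and PB described: this assumes the next branch of qcportal is installed per BP’s instructions, that existing credentials carry over, and it uses a placeholder server address; the dataset name is just one from this meeting’s throughput list. It is a sketch of the idea, not the actual test plan.

```python
# Minimal sketch: connect to the migrated test server, list what it already
# holds, and check record status counts for one dataset.
from qcportal import PortalClient

client = PortalClient(
    "https://qcarchive-test.example.org",  # placeholder address for the test copy
    username="my_user",                    # pre-existing usernames/passwords should work
    password="my_password",
)

# The migrated server holds data up to roughly last September.
for ds_info in client.list_datasets():
    print(ds_info["dataset_type"], ds_info["dataset_name"])

# Pull one ongoing dataset and check its status counts (the main thing to verify).
ds = client.get_dataset("torsiondrive", "OpenFF Protein Capped 1-mer Sidechains v1.1")
print(ds.status())
```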

Infrastructure needs/advances

  • DD – I’ve got the singlepoint-after-optimization functionality on my radar; haven’t made progress on that yet, but hoping to do so this week (a rough sketch of the idea is below).
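
As a hedged illustration of what singlepoint-after-optimization might look like with the new qcportal dataset API (not DD’s actual implementation plan; the dataset names, QC specification, and tag below are assumptions):

```python
# Sketch only: take final geometries from completed optimizations and submit
# them as a new singlepoint dataset. All names and specs are illustrative.
from qcportal import PortalClient
from qcportal.singlepoint import QCSpecification  # import path per the new (next) qcportal

client = PortalClient("https://qcarchive-test.example.org")  # placeholder address

opt_ds = client.get_dataset("optimization", "Some Completed Optimization Dataset")

sp_ds = client.add_dataset("singlepoint", "Some Completed Optimization Dataset - Singlepoints")
sp_ds.add_specification(
    name="default",
    specification=QCSpecification(program="psi4", driver="energy",
                                  method="b3lyp-d3bj", basis="dzvp"),
)

# Use each completed optimization's final molecule as the singlepoint input geometry.
for entry_name, spec_name, record in opt_ds.iterate_records(status="complete"):
    sp_ds.add_entry(name=f"{entry_name}-{spec_name}", molecule=record.final_molecule)

sp_ds.submit(tag="openff")  # datasets now require a compute tag
```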

Throughput status

  • OpenFF dipeptides torsiondrives v2.1: 25/26 TD complete

    • a single calculation is not getting any workers (?)

    • CC – This is the same one that’s been running for a long time or erroring out with a geomeTRIC error. So we can call this EOL.

    • JW – Is the 26th one usable, or do you need to do manual intervention to make it usable?

    • CC – I’ll need to do manual intervention to pick one of the other optimizations from that grid point.

    • BP – Could that be automated in the future, and the torsiondrive marked complete?

    • CC – There should be OTHER optimizations that completed on that grid point. So we’d likely be fine taking the next-lowest energy.

    • BP – I wonder if we could update the TorsionDrive package to be tolerant of a small number of failures. Basically the TorsionDrive package sees all the optimizations and reports a status, so we could change how the status is reported.

    • DD – Would there be a strategy of submitting duplicates that could get around this problem?

      • CC – That would work. I think that’s what DCole’s group does: basically they seed each torsiondrive with multiple starting conformations.

      • JW – Each grid point consists of multiple opts as part of wavefront propagation.

      • CC – That’s true, but starting from multiple confs would make it more likely that at least ONE of them completes, and then we could look at the different TorsionDrive jobs and either find the one that’s totally complete, or mark the group of TorsionDrives complete once it’s possible to stitch together torsiondrives to get all grid points completed (see the stitching sketch after this throughput list).

      • DD – I’m suggesting we do independent replicates for each TorsionDrive in a set; not that these replicates have any awareness of each other, because they don’t in the current implementation. I’m saying we can make use of this.

    • PB – What’s the root cause here? Bad initial geometry?

      • CC – Maybe. But it’s important to note that there are ~10 completed opts on that grid point; it’s just the 11th that’s not running (and that one is unlikely to be the minimum-energy one anyway).

    • PB + CC – Let’s move this to EOL

  • OpenFF Protein Capped 1-mer Sidechains v1.1: from 29 to 36 of 46 TDs complete

    • CC submitted a revised version last week with modified initial structures

    • PB – Acceptable progress?

      • CC – The progress looks good. I noticed that one entry got in with a bad initial structure, so I’ll submit a PR this week that replaces it.

      • DD – I still see this moving forward.

      • CC – So, don’t move to EOL yet.

  • SPICE PubChem Set 1 Single Points Dataset v1.2: 128 remaining records are consistent errors (SCF or MBIS charge convergence failures); can be moved to end-of-life

  • SPICE PubChem Set 2 Single Points Dataset v1.2: from 40,538 to 84,398 completed calcs; making progress, though a bit slowed in the last few days (was around 8K-10K calcs/day, now down to ~4K/day)

    • DD – This bottleneck is genuinely coming from the compute side: we’re getting a little throttled on PRP and Lilac. We’re running SPICE only on PRP.

    • DD – We’ll be getting access to Max Planck cluster resources via Bert de Groot; very nice of Bert to offer that. It’s mostly a GPU cluster, so I’m not sure what to expect from CPU power. 128 GB RAM max, so it may not be suitable for SPICE, but for regular jobs that may be fine. Note that we’ll be at very low priority, so I’m not sure how much throughput we’ll get.
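
A rough sketch of the TorsionDrive "stitching" idea from the dipeptide discussion above: combine replicate TorsionDrives of the same torsion by keeping the lowest final energy found at each grid point, so the merged profile can be complete even if any single replicate has a hole. The server address and record IDs are placeholders, and the attribute names assume the new qcportal TorsionDriveRecord interface, so they may need adjusting.

```python
# Sketch only: merge grid-point energies across replicate TorsionDrive records.
from qcportal import PortalClient

client = PortalClient("https://qcarchive-test.example.org")  # placeholder address

replicate_ids = [111, 222]  # placeholder record IDs for replicate TorsionDrives
replicates = client.get_torsiondrives(replicate_ids)

stitched = {}  # grid point -> lowest final energy seen across replicates
for td in replicates:
    for grid_point, energy in td.final_energies.items():
        if grid_point not in stitched or energy < stitched[grid_point]:
            stitched[grid_point] = energy

best_single = max(len(td.final_energies) for td in replicates)
print(f"Merged profile covers {len(stitched)} grid points "
      f"(best single replicate covers {best_single}).")
```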

User questions/issues

  • DD – Upcoming dataset needs?

    • PB – Not that I know of. But ~400K more SPICE jobs are ready for submission, so we won’t be troubled by idle compute.

Science support needs

✅ Action items

  •  

⤴ Decisions

...