| |
---|
Updates from MolSSI | BP – A copy of the new server is up and running and ready for testers. It only has some datasets. Not sure what to do next. To install, check out the next branch of qcfractal-compute and qcportal.
BP – What’s the best way to get feedback? GH Issues may be a bit slow, though we could use a tag. Could also use an issue template for next-related issues. A Slack channel on QCArchive could work.
DD – From OpenFF’s perspective, I’m interested to see how the submission-compute-retrieval cycle looks now. So I may make a copy of qca-dataset-submission and try submitting old datasets to see how they go.
BP – That sounds like a great idea, except that there may be problems if the dataset has been seen before. But feel free to submit experimental sets, since they’ll be deleted after the testing period.
DD – Ok, I’ll plan on basically resubmitting a recent dataset to see how it goes through. And the submission machinery should be just a fork of qca-dataset-submission.
BP – That should work. The test server has data from last ~September, so newer datasets should look unique. Previously-existing usernames and passwords should work.
CC + PB – I’m mostly interested in testing data retrieval and checking the status of ongoing datasets.
DD will make a copy of qca-dataset-submission and try resubmitting an old dataset that the migrated server doesn’t know about. So probably a protein torsiondrive set, and a big single point set.
BP – The managers should be roughly the same (just be sure to use the qcfractal-compute repo now). Also, datasets must now have tags. The old URL will work, and connections using the new versions will get routed to the new server.
BP will drop install instructions onto the new Slack channel.
BP – No docs yet, I’ll need to get those built.
BP – Reaction datasets aren’t done yet, but that doesn’t matter for you.
|
Infrastructure needs/advances | |
Throughput status | OpenFF dipeptides torsiondrives v2.1: 25/26 TD complete; a single calculation is not getting any workers (?)
CC – This is the same as the one that’s been running for a long time, or erroring out with a geomeTRIC error. So we can call this EOL.
JW – Is the 26th one usable? Or do you need to do manual intervention to make that usable?
CC – I’ll need to do manual intervention to pick one of the other optimizations from that grid point.
BP – Could that be automated in the future, and the torsiondrive marked complete?
CC – There should be OTHER optimizations that completed on that grid point. So we’d likely be fine taking the next-lowest energy.
BP – I wonder if we could update the TorsionDrive package to be tolerant of a small number of failures. Basically the TorsionDrive package sees all the optimizations and reports a status, so we could change how the status is reported.
DD – Would there be a strategy of submitting duplicates that could get around this problem?
CC – That would work. I think that’s what DCole’s group does. Basically they seed
JW – Each grid point consists of multiple opts as part of wavefront propagation.
CC – That’s true, but starting from multiple confs would make it more likely that at least ONE of them completes. Then we could look at the different TorsionDrive jobs and either find the one that’s totally complete, or mark the group of TorsionDrives complete once it’s possible to stitch together torsiondrives so that all grid points are completed.
DD – I’m suggesting we do independent replicates for each TorsionDrive in a set; not that these replicates have any awareness of each other, because they don’t in the current implementation. I’m saying we can make use of this.
PB – What’s the root cause here? Bad initial geometry?
PB + CC – Let’s move this to EOL.
OpenFF Protein Capped 1-mer Sidechains v1.1: 29 to 36/46 TD
SPICE PubChem Set 1 Single Points Dataset v1.2: 128 calculations are consistently erroring with SCF convergence or MBIS charge convergence; these can be moved to end-of-life.
SPICE PubChem Set 2 Single Points Dataset v1.2: From 40538 to 84398 calcs. Making progress, but it has slowed down a bit in the last few days (used to be around 8K-10K calcs a day, dropped to 4K a day).
DD – This bottleneck is genuinely coming from the compute side. We’re getting a little throttled on PRP and Lilac. We’re doing SPICE only on PRP.
DD – We’ll be getting access to Max Planck cluster resources via Bert de Groot. Very nice of Bert to offer that. It’s mostly a GPU cluster, so I’m not sure what to expect from CPU power. 128GB RAM max, so it may not be suitable for SPICE, but for regular jobs that may be fine. Note that we’ll be at very low priority, so I’m not sure how much throughput we’ll get.
|
User questions/issues | |
Science support needs | |
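
Sketch of the replicate-stitching idea from the throughput discussion (taking the next-lowest completed optimization at a grid point, and combining independent replicate TorsionDrives until every grid point is covered). This is only an illustration under assumptions: the `Opt` record, the `stitch` function, and the dict-of-lists layout are hypothetical and are not part of the TorsionDrive package or the QCFractal/QCPortal API.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Opt:
    """One constrained optimization at a grid point (hypothetical record)."""
    energy: Optional[float]  # final energy; None if the optimization never finished
    complete: bool


# A replicate maps a grid angle to the optimizations run at that angle.
Replicate = Dict[int, List[Opt]]


def stitch(replicates: List[Replicate], grid: List[int]) -> Tuple[Dict[int, Opt], List[int]]:
    """Pick the lowest-energy completed Opt per grid angle across all replicates.

    Returns (best, missing), where `missing` lists grid angles with no
    completed optimization in any replicate.
    """
    best: Dict[int, Opt] = {}
    for angle in grid:
        candidates = [
            opt
            for rep in replicates
            for opt in rep.get(angle, [])
            if opt.complete and opt.energy is not None
        ]
        if candidates:
            best[angle] = min(candidates, key=lambda o: o.energy)
    missing = [angle for angle in grid if angle not in best]
    return best, missing


# Toy data: replicate A has a failed grid point at 0 degrees, replicate B covers it.
grid = [-180, -90, 0, 90]
rep_a = {
    -180: [Opt(-1.0, True)],
    -90: [Opt(None, False), Opt(-0.8, True)],  # one failure, one usable opt
    0: [Opt(None, False)],                     # nothing usable here
    90: [Opt(-0.9, True)],
}
rep_b = {-180: [Opt(-1.1, True)], 0: [Opt(-0.7, True)]}

best, missing = stitch([rep_a, rep_b], grid)
only_a_best, only_a_missing = stitch([rep_a], grid)
```

With a single replicate the 0 degree grid point stays uncovered, while the two replicates together cover the full grid, which is the case where the group of TorsionDrives could be marked complete.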