...
Goals
Updates from MolSSI
Compute
New submissions
User questions/issues
Science support needs
Infrastructure needs / advances
Discussion topics
Item | Notes |
---|
General updates | DD – Moving forward, Pavan will be running this call and the QC submission call. I’ll be in attendance, but he’ll be leading it. PB – I’d like to change the meeting time to the same time on Tuesday.
|
Updates from MolSSI
| Major milestones in refactoring; changing how things are packaged, which will make things a lot easier. BP – Planning an informal workshop for after the refactor is done. Maybe February? I’ll keep you informed. BP – Now that I have the manager stuff working and can run calcs on the new version, I’d like to work with you to try running ~100k calculations and see how things scale. DD – I’d love that and can help. BP – I’ll share how to do this. One big new change is that everything must have tags. Previously, blank tags would pull down all jobs, but now the tag must explicitly be set to * to pull down any tag.
DD – Does the permission system (in terms of resources we can access) work with namespaces? So, like, if I have a user account specifically for running managers, I’d like it to pull down tasks only from my tags. I’m concerned about a malicious actor running a weird manager and returning results for my tagged calcs.
|
Compute | |
New submissions | |
- Good progress, 3 of 5 TorsionDrives complete
- v2.0 - Expanded number of amino acids from 3 to 26, PR here: https://github.com/openforcefield/qca-dataset-submission/pull/264
- v2.0 dataset validation fails due to file rename. This seems to be a known bug in trilom/file-changes-action v1.2.3, fixed in v1.2.4: https://github.com/trilom/file-changes-action/issues/100
DD – Nice debugging, could you update this in the submission PR? CC – Yes DD – PB, can you review this, or should I? PB – I’ll review it, or tap another person if I don’t have time.
|
New submissions | SB – There’s one that I’m considering, but I’m not sure how to move forward. I want to generate a lot of wavefunctions: basically take 40,000 molecules with up to 12 heavy atoms, make 5 conformers of each, then optimize (possibly two-stage, like XTB then HF-6-31G*), then get wavefunctions of the final results.
DD – This could be a challenge. It would need to be a multi-operation dataset: start with an XTB optimization dataset, then send the results of that to an HF-6-31G* optimization set, then run a single point wavefunction set.
SB – QCSubmit has some new API points that should help start optimizations from the completed results of a previous set. Though I’m kinda wondering whether this will conform with the provenance requirements for our datasets/general QCFractal use.
SB – So, in terms of operations, do we anticipate size issues for submission? Storage? And what would it look like timeline-wise?
Compute time
DD – Timeline-wise, with 6-31G*, I think Hyesu may have done some… Could you try running the optimizations on lilac, with the goal of preparing the inputs for wavefunction calcs?
Storage space
200,000 wavefunctions x 5 MB = 1 TB. BP – 6-31G* would be smaller than def2-tzvp, and 12 heavy atoms is also smaller than pubchem. So we may not hit any giant storage space issues here.
Dataset size
|
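The storage estimate from the wavefunction discussion above can be reproduced with a short calculation. This is only a back-of-the-envelope sketch: the molecule and conformer counts come from the notes, the 5 MB per wavefunction is the rough per-record figure quoted in the discussion rather than a measured value, and the function name `estimate_storage_tb` is hypothetical.

```python
# Figures quoted in the discussion above.
N_MOLECULES = 40_000
CONFORMERS_PER_MOLECULE = 5
# Assumption: ~5 MB per stored wavefunction (rough figure from the notes, not measured).
MB_PER_WAVEFUNCTION = 5


def estimate_storage_tb(n_molecules, conformers_per_molecule, mb_per_wavefunction):
    """Return (number of wavefunctions, total storage in TB, decimal units)."""
    n_wavefunctions = n_molecules * conformers_per_molecule
    total_mb = n_wavefunctions * mb_per_wavefunction
    return n_wavefunctions, total_mb / 1_000_000  # 1 TB = 1,000,000 MB


n_wfn, total_tb = estimate_storage_tb(
    N_MOLECULES, CONFORMERS_PER_MOLECULE, MB_PER_WAVEFUNCTION
)
print(f"{n_wfn:,} wavefunctions, ~{total_tb:g} TB")
```

Changing `MB_PER_WAVEFUNCTION` gives a quick way to see how the total would move for a larger basis set such as def2-tzvp, once a per-record size for that case is known.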
Action items
- Pavan Behara will reschedule QCArchive submission and user group calls for Tuesdays, 8am PT
- David Dotson will work with Ben Pritchard to do burn tests with new QCFractal instance, calculations mimicking production
- Pavan Behara will attempt to reproduce high memory requirements of certain SPICE records, e.g. #257; raise with psi4 devs
- David Dotson will keep high-memory nodes up for SPICE and monitor usage; if it looks like we are past calculations with high requirements, will re-deploy many smaller workers
- David Dotson will deploy workers with priority for new dipeptide submission
- David Dotson will push remaining pubchem set submissions through local infrastructure
- Simon Boothroyd will prepare HF-6-31G* set on Lilac, prepare single point dataset from final conformers; may need to be multiple datasets at 200,000 calculations
Decisions