2021-11-12 QC meeting notes

Participants

@Chapin Cavender
@Pavan Behara
Ben Pritchard
@David Dotson
@Jeffrey Wagner
@Willa Wang

Goals

Updates from MolSSI
Compute
- QM-200g workers on Lilac
- ANI and MM workers on Newcastle
- QM, ANI workers on PRP
New submissions
- dipeptide dataset
- imposed electric field in QM - wavefunction not stored for optimizations
- ML dataset for OpenMM
User questions/issues
- following tasks stuck in RUNNING state: ['12669181', '12669182', '12669211', '12669338', '12669339', '12669341']
  - result ids: ['76498626', '76498467', '76498496', '76498466', '76498624', '76498623']
- B97 fix; does it work for existing datasets, or only new submissions?
Science support needs
- new openff-qcsubmit release
Infrastructure needs / advances
- psi4 on conda-forge
- update prod envs to use latest qcelemental, qcengine, qcfractal

Discussion topics

Item	Presenter	Notes
Updates from MolSSI	Ben
- BP: Upgraded public QCA to 0.15.7; it went very smoothly.
  - DD: Indeed! Thank you for this; we saw no disruptions on our end.
  - JW + PB: This looks great to us.
- BP: We still have traffic routing through the backup host.
  - (General): This actually works better than before.
  - BP: Maybe this is because the new gateway has higher bandwidth.
- BP: Working on improvements in `next`: failed calculation history (necessary for in-server error cycling); also doing tons of renaming to get consistent verbiage in the codebase (e.g. "datasets" everywhere, instead of alternating between "collections" and "datasets"). Tentatively looking at a Feb release.
- BP: Saw new QCFractal issues raised by @Lily Wang; many of these are going to be addressed in `next`.
  - JW: Maybe worth declaring a feature freeze on the `master` branch?
  - DD: I’ll try to intercept these requests and communicate plans to the public.
- DD: High-memory jobs fell into a weird state. The following tasks are stuck in the RUNNING state: `['12669181', '12669182', '12669211', '12669338', '12669339', '12669341']`
  - result ids: `['76498626', '76498467', '76498496', '76498466', '76498624', '76498623']`
  - BP: Updated from my end. These should be fixed now.
  - DD: How does this happen? Race condition in the manager?
  - BP: Yeah. It can happen when a task gets assigned to a manager, but the manager shuts down at the same time. In the `next` branch, the manager will be locked during this time.
- PB: Question regarding the B97 fix: does it work for existing datasets, or only new submissions?
  - BP: This should work for existing datasets, though we may need to do recomputations. You may need to call `.compute()` on those again. I’m pretty sure this will work.
  - DD: That makes sense. I’ll push PR #227 through the submission pipeline again.
  - PB: For new submissions, will it split into two calculations? Like into functional and dispersion?
  - BP: This won’t split those, since psi4 doesn’t understand the functional part.
- PB: I still see two sets of records for the new submission #239. When I run `client.get_collection` and then `recs=get_records`, `len(recs)` returns `2` instead of `1`.
  - BP: I think this is due to using an old qcportal. It should be fixed after upgrading; you should have 0.15.7. This problem should be entirely local.
  - PB: Would this have been a problem if I also used an old qcsubmit?
  - BP + DD: I don’t think so. It shouldn’t be necessary to rerun things for the new submissions.
Existing datasets
- DD: Do any datasets need reprioritization?
  - WW: The RESP Polarizability dataset #235 seems to have topped out with about 20% failures for some basis sets.
  - DD: We’ll need to pull down wavefunction info, which may require some special code. I can work with you on this next week. We can also get this running locally on your resources.
  - WW: Sounds great. We’ll meet on this next week.
- WW: Is there another dataset with stored wavefunctions that I can look at?
  - DD: The BCC refit study 1 dataset has stored wavefunctions: `OpenFF BCC Refit Study COH c1.0`
- DD: Do you have local HPC that you can use for your jobs?
  - WW: We have TSCC.
  - CC: I also use TSCC.
  - DD: One sticking point is that running locally also requires running a local server. This will require a machine that’s long-lived and that can talk to the worker nodes on the clusters.
  - PB: I’m running the server on a 14 hour cluster job. Is the goal to add compute for certain global QCA jobs, or to run local work?
  - DD: The goal is to have a place to do large-scale local execution that doesn’t communicate with global QCA.
  - CC: I’ll send my TSCC config to WW.
- DD: CC, we’ve added support for additional keywords for torsiondrive-level constraints (qcsubmit #172). You should be able to install from qcsubmit on GitHub. JHorton will make a new release of this on Monday, then we can submit your tasks. The change is in the new keywords in QCSubmit’s `TDSettings` in `common_structures`.
  - CC: Sounds great.
New submissions
- TorsionNet
- Solvated amino acids
- PB: These datasets should help OpenFF. We have had complaints that training and benchmarking are done with the same QM method/basis; these datasets will give us greater variety to test against.
- JW: With pubchem set1, it looked like we have funny conformer counting.
  - PB: That was related to an issue with InChI keys.
  - JW: Saw that there were an average of 50 conformers generated per molecule, which seemed suspicious. Are there more than 1k unique graphs in this dataset?
  - PB: Yes, and each one has 50 conformers; one has exactly 100.
  - JW: Oh, I was under the impression we were generating conformers!
  - PB: No, Peter generated his own conformers specifically for these molecules.
- DD: There was just a comment that the dataset includes `I` (iodine).
  - PB: I’ll check into this. We’d need to use a basis set that will do the right thing for iodine.
- DD: How should I proceed with the SPICE dataset? Should we wait to see success on SPICE before we submit the others?
  - (General): Let’s proceed immediately with reviews of the others; we can submit as desired after.
Infrastructure needs
- JW: psi4 on conda-forge isn’t guaranteed. We’re just looking into it at the moment, so we shouldn’t make blocking plans based on it.
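
The race condition BP described under Updates from MolSSI (a task is assigned to a manager while that manager is shutting down, leaving the task stuck in RUNNING) can be illustrated with a small toy model. This is only a sketch of the general locking idea, not QCFractal code: `ToyRegistry` and its methods are hypothetical names, and the actual locking planned for the `next` branch may work differently.

```python
import threading


class ToyRegistry:
    """Hypothetical in-memory stand-in for a server's manager and task tables."""

    def __init__(self):
        self._lock = threading.Lock()
        self.active = {}  # manager name -> True while the manager is alive
        self.tasks = {}   # task id -> (status, owning manager or None)

    def add_manager(self, name):
        with self._lock:
            self.active[name] = True

    def add_task(self, task_id):
        with self._lock:
            self.tasks[task_id] = ("WAITING", None)

    def claim(self, name, task_id):
        """Assign a task to a manager only if the manager is still active.

        The liveness check and the assignment happen under one lock, so a
        concurrent shutdown cannot slip in between them.
        """
        with self._lock:
            if not self.active.get(name, False):
                return False
            status, _ = self.tasks[task_id]
            if status != "WAITING":
                return False
            self.tasks[task_id] = ("RUNNING", name)
            return True

    def shutdown(self, name):
        """Deactivate a manager and return its RUNNING tasks to WAITING."""
        with self._lock:
            self.active[name] = False
            returned = []
            for task_id, (status, owner) in list(self.tasks.items()):
                if owner == name and status == "RUNNING":
                    self.tasks[task_id] = ("WAITING", None)
                    returned.append(task_id)
            return returned

    def orphaned(self):
        """Task ids marked RUNNING whose manager is no longer active."""
        with self._lock:
            return [
                task_id
                for task_id, (status, owner) in self.tasks.items()
                if status == "RUNNING" and not self.active.get(owner, False)
            ]
```

Without the shared lock, a claim could pass the liveness check just before shutdown finished releasing tasks, which would leave a RUNNING task owned by a dead manager. With the lock, `orphaned()` should stay empty however claims and the shutdown interleave.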

Action items

@David Dotson will schedule a working session with @Willa Wang for local QCFractal execution on HPC for faster science iteration (so prod isn’t a bottleneck)

@Chapin Cavender will proceed with dipeptide dataset with openff-qcsubmit master branch; release coming around Monday from @Joshua Horton

@David Dotson will follow up on the remaining 6 tasks that were stuck from @Pavan Behara's single points dataset

@David Dotson will create single points dataset for @Willa Wang using final geometries from optimizations

@David Dotson will review Pavan’s ML datasets for OpenMM; hold off on merging until we observe initial submission behavior
