2021-11-12 QC meeting notes

Participants

  • @Chapin Cavender

  • @Pavan Behara

  • Ben Pritchard

  • @David Dotson

  • @Jeffrey Wagner

  • @Willa Wang

Goals

  • Updates from MolSSI

  • Compute

    • QM-200g workers on Lilac

    • ANI and MM workers on Newcastle

    • QM, ANI workers on PRP

  • New submissions

    • dipeptide dataset

    • imposed electric field in QM - wavefunction not stored for optimizations

    • ML dataset for OpenMM

  • User questions/issues

    • following tasks stuck in RUNNING state: ['12669181', '12669182', '12669211', '12669338', '12669339', '12669341']

      • result ids: ['76498626', '76498467', '76498496', '76498466', '76498624', '76498623']

    • b97 fix; does it work for existing datasets, or only new submissions?

  • Science support needs

    • new openff-qcsubmit release

  • Infrastructure needs / advances

    • psi4 on conda-forge

    • update prod envs to use latest qcelemental, qcengine, qcfractal

Discussion topics

Item

Presenter

Notes

Updates from MolSSI

Ben

  • BP: upgraded public QCA with 0.15.7, went very smoothly

    • DD: indeed! Thank you for this; we saw no disruptions on our end

    • JW + PB – This looks great to us

  • BP: do still have traffic routing through backup host

    • (General) – This actually works better than before.

    • BP – Maybe this is because the new gateway has higher bandwidth

  • BP: working on improvements in next

    • failed calculation history; necessary for in-server error cycling; also doing tons of renaming to get consistency in verbiage in the codebase (e.g. datasets instead of alternatively collections and datasets)

    • Tentatively looking for Feb release

  • BP: saw new QCFractal issues raised by @Lily Wang ; many of these are going to be addressed in next

    • JW – Maybe worth declaring a feature freeze on the master branch?

    • DD – I’ll try to intercept these requests and communicate plans to the public.

  • DD – High-memory jobs fell into a weird state:

    • following tasks stuck in RUNNING state: ['12669181', '12669182', '12669211', '12669338', '12669339', '12669341']

    • result ids: ['76498626', '76498467', '76498496', '76498466', '76498624', '76498623']

    • BP – Updated from my end – These should be fixed now

      • DD – How does this happen? Race condition in the manager?

      • BP – Yeah. It can happen when a task gets assigned to a manager, but the manager shuts down at the same time. In the next branch, there will be a locking of the manager during this time.

  • PB – Question regarding B97 fix – Does this work for existing datasets, or only new submissions?

    • BP – This should work for existing datasets. Though we may need to do recomputations. You may need to call .compute() on those again – I’m pretty sure this will work.

    • DD – That makes sense. I’ll push PR #227 through the submission pipeline again.

    • PB – For new submissions, will it split into two calculations? Like into functional and dispersion?

    • BP – This won’t split those, since psi4 doesn’t understand the functional part.

    • PB – I still see two sets of records for the new submission #239

      • Like, when I run client.get_collection and then recs=get_records, then len(recs) returns 2 instead of 1

    • BP – I think this is due to using an old qcportal. Should be fixed after upgrade – Should have 0.15.7. This problem should be entirely local.

    • PB – Would this have been a problem if I also used an old qcsubmit?

    • BP + DD - I don’t think so. It shouldn’t be necessary to rerun things for the new submissions.
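The race BP described earlier for the tasks stuck in RUNNING (a task is assigned to a manager while that manager shuts down) can be sketched as follows. This is a minimal illustration of the locking idea only, not QCFractal's actual implementation: the `ManagerSlot` class, its methods, the manager name, and the task dictionaries are all hypothetical.

```python
import threading


class ManagerSlot:
    """Hypothetical stand-in for the server-side record of one compute manager."""

    def __init__(self, name):
        self.name = name
        self.active = True
        self.tasks = []
        self._lock = threading.Lock()

    def assign(self, task):
        """Hand a task to this manager only if it is still active."""
        with self._lock:
            if not self.active:
                return False  # the caller leaves the task in WAITING
            task["status"] = "RUNNING"
            task["manager"] = self.name
            self.tasks.append(task)
            return True

    def shutdown(self):
        """Deactivate the manager and return every claimed task to the queue."""
        with self._lock:
            self.active = False
            for task in self.tasks:
                task["status"] = "WAITING"
                task["manager"] = None
            released, self.tasks = self.tasks, []
            return released


slot = ManagerSlot("example-highmem-manager")  # hypothetical manager name
early = {"id": "a", "status": "WAITING", "manager": None}
late = {"id": "b", "status": "WAITING", "manager": None}

slot.assign(early)            # claimed while the manager is alive
slot.shutdown()               # resets task "a" to WAITING
accepted = slot.assign(late)  # refused because the manager is gone
```

Because assignment and shutdown take the same lock, a task is either claimed before shutdown (and then released by it) or refused afterwards; under these assumptions it cannot end up RUNNING on a manager that no longer exists.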

Existing datasets



  • DD – Any datasets need reprioritization?

    • WW – The RESP Polarizability dataset #235 seems to have topped out with about 20% failures for some basis sets.

    • DD – We’ll need to pull down wavefunction info, which may require some special code. I can work with you on this next week. We can also get this running locally on your resources.

    • WW – Sounds great. We’ll meet on this next week.

    • WW – Is there another dataset that has wavefunction stored that I can look at?

    • DD – BCC refit study 1 dataset has stored wavefunctions – OpenFF BCC Refit Study COH c1.0

    • DD – Do you have local HPC that you can use for your jobs?

      • WW – We have TSCC.

      • CC – I also use TSCC.

      • DD – One sticking point is that running locally also requires running a local server. This will require a machine that’s long-lived and that can talk to the worker nodes on the clusters.

      • PB – I’m running the server on a 14 hour cluster job. Is the goal to add compute for certain global QCA jobs, or to run local work?

      • DD – Goal is to have a place to do large-scale local execution that doesn’t communicate with global QCA.

      • CC – I’ll send my TSCC config to WW

  • DD – CC, we’ve added support for additional keywords for torsiondrive-level constraints (qcsubmit #172). You should be able to install from qcsubmit on GitHub. JHorton will make a new release of this on Monday, then we can submit your tasks. The change is in the new keywords in QCSubmit’s TDSettings in common_structures.

    • CC – Sounds great.
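WW's observation above that roughly 20% of calculations fail for some basis sets is the kind of figure that can be tallied from record statuses. The sketch below assumes records have already been pulled down into plain dictionaries with hypothetical `basis` and `status` keys; the basis labels and the sample data are made up for illustration and are not taken from dataset #235.

```python
from collections import defaultdict


def failure_fraction_by_basis(records):
    """Return the fraction of records with status ERROR for each basis set."""
    totals = defaultdict(int)
    failures = defaultdict(int)
    for rec in records:
        totals[rec["basis"]] += 1
        if rec["status"] == "ERROR":
            failures[rec["basis"]] += 1
    return {basis: failures[basis] / totals[basis] for basis in totals}


# Made-up example data: five records for one basis, two for another.
records = (
    [{"basis": "basis-A", "status": "COMPLETE"}] * 4
    + [{"basis": "basis-A", "status": "ERROR"}]
    + [{"basis": "basis-B", "status": "COMPLETE"}] * 2
)

fractions = failure_fraction_by_basis(records)
```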

New submissions

 

  • TorsionNet

  • Solvated amino acids

  • PB – these datasets should help OpenFF; we have had complaints that training and benchmarking are done with same QM method/basis; these datasets will give us greater variety to test against

  • JW – had seen with pubchem set1, looked like we have funny conformer counting

    • PB – was related to an issue with inchi keys

    • JW – saw that there were an average of 50 conformers generated per molecule; seemed suspicious

      • there are more than 1k unique graphs in this dataset?

      • PB – yes, and each one has 50 conformers, one has exactly 100

      • JW – oh, I was under the impression we were generating conformers!

      • PB – no, Peter generated his own conformers specifically for these molecules

    • DD – There was just a comment that the dataset includes iodine (I)

      • PB – I’ll check into this. We’d need to use a basis set that will do the right thing for Iodine.

  • DD – How should I proceed with the SPICE dataset? Should we wait to see success on SPICE before we submit the others?

    • (General) – Let’s proceed immediately with reviews of the others; can submit as desired after
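The two checks raised above for the pubchem set (conformers per unique graph, and whether any molecule contains iodine) can be sketched on plain data. This is an illustrative sketch only: the `graph_key` and `symbols` fields and the sample entries are hypothetical and do not reflect how qcsubmit stores dataset entries.

```python
from collections import Counter


def conformer_counts(entries):
    """Count conformers per unique molecular graph key."""
    return Counter(entry["graph_key"] for entry in entries)


def graphs_with_element(entries, symbol):
    """Return the sorted graph keys that contain the given element symbol."""
    return sorted({e["graph_key"] for e in entries if symbol in e["symbols"]})


# Made-up entries: two conformers of an iodine-containing molecule, one of another.
entries = [
    {"graph_key": "mol-1", "symbols": ["C", "H", "H", "H", "I"]},
    {"graph_key": "mol-1", "symbols": ["C", "H", "H", "H", "I"]},
    {"graph_key": "mol-2", "symbols": ["C", "H", "H", "H", "H"]},
]

counts = conformer_counts(entries)
mean_conformers = sum(counts.values()) / len(counts)
iodine_graphs = graphs_with_element(entries, "I")
```

A mean far from what the conformer-generation settings would produce, or any hit in `iodine_graphs`, would be a cue to look more closely before submission.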

Infrastructure needs

 

  • JW – psi4 on conda-forge isn’t guaranteed – We’re just looking into it at the moment and so we shouldn’t make blocking plans based on it.

Action items

@David Dotson will schedule a working session with @Willa Wang for local QCFractal execution on HPC for faster science iteration (so prod isn’t a bottleneck)
@Chapin Cavender will proceed with dipeptide dataset with openff-qcsubmit master branch; release coming around Monday from @Joshua Horton
@David Dotson will follow up on remaining 6 tasks that were stuck from @Pavan Behara's single points dataset
@David Dotson will create single points dataset for @Willa Wang using final geometries from optimizations
@David Dotson will review Pavan’s ML datasets for OpenMM; hold off on merge until we observe initial submission behavior

Decisions
