2021-11-05 QC meeting notes

Participants

  • @Pavan Behara

  • @Jeffrey Wagner

  • @Joshua Horton

  • @Chapin Cavender

  • Ben Pritchard

Goals

  • Updates from MolSSI

    • QCFractal release + prod QCArchive upgrade

  • Compute

    • QM and MM workers on Lilac

    • ANI and MM workers on Newcastle

    • QM, MM, and QM-200g workers on PRP

  • New submissions

    • dipeptide dataset

    • imposed electric field in QM - wavefunction not stored for optimizations

    • ML dataset for OpenMM

  • User questions/issues

  • Science support needs

  • Infrastructure needs / advances

    • psi4 on conda-forge

Discussion topics

Item

Presenter

Notes

Updates from MolSSI

Ben

  • BP: how’s the server doing?

  • JW: still seeing sporadic failures, or at least have been for the last couple of months

  • BP: have a dedicated(?) firewall that routes traffic to the QCArchive server; used for traffic inbound from outside VT

    • recently that firewall stopped forwarding traffic reliably

    • IT has not been helpful

    • currently traffic being routed via the backup server as a workaround

  • JW: seeing better success than before, but understand this isn’t a permanent solution

  • BP: Long-term options could be renting cloud compute (e.g. AWS) or building a server in the office.

  • DD: Lots of advantages to AWS

  • BP: Planning a QCFractal release+upgrade tomorrow

    • DD: Thanks for this. And thanks JHorton for opening the qcelemental PR.

    • JH: You’re welcome. BP did a lot of the work on the PR.

    • BP: Conda env solve times have been pretty long too.

    • JH: Switching to mamba shaved 2 hours off the 5-hour total test time.

    • BP: Old managers should still work. Change should be backwards-compatible.

    • DD: It’s not hard for us to update workers.

    • BP: I don’t think it will be necessary, but it can’t hurt. I’ll let you know on Slack. Could pin QCEngine 0.20 and QCElemental 0.23(.1?)

      • steal pins from QCFractal (a quick environment check is sketched after this list)

  • BP – Is anyone using Parsl to manage QCFractal workers? The Parsl team is up for renewal and wants to report usage stats.

    • (General) – Basically everyone uses “pool”
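
A minimal sketch (not from the meeting) of how a worker environment could be sanity-checked against the upgraded server before restarting managers. It assumes the legacy qcportal FractalClient API; the 0.20/0.23 targets are the pins suggested above, not confirmed release pins.

```python
# Sanity-check sketch: compare the upgraded server info with local worker package versions.
# Assumptions: legacy qcportal FractalClient API; 0.20/0.23 are the pins suggested in the
# meeting, not confirmed release pins.
import qcelemental
import qcengine
import qcportal as ptl

client = ptl.FractalClient()          # defaults to the public QCArchive server
print(client.server_information())    # server name/version after the upgrade

print("qcengine:", qcengine.__version__, "(meeting suggested ~0.20)")
print("qcelemental:", qcelemental.__version__, "(meeting suggested ~0.23)")
```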

Compute status

David

  • DD – All the worker jobs on Lilac have been consumed. Just spun up more QM workers on PRP.

  • PB – I could use more XTB and ANI workers

    • DD – I can spin those up. JH, are you running these?

    • JH – Not ANI since they keep running out of memory. But I can run XTB.

    • DD – I will spin up some ANI workers on PRP

  • DD – I recall there were issues with single point calcs that required high memory allocation and manual observation. PB, are those still a problem?

    • PB – I think we’re in good shape; only about 6 jobs remaining

    • DD – will fire off more high-memory jobs on Lilac

New submissions

 

  • CC – Dipeptide set – I’m waiting for QCSubmit #172 to be merged

    • DD – QCSubmit #172 depends on the new QCFractal release, which should be cut around this weekend. Then we can push out a new QCSubmit version.

    • CC – Understood. Once that release happens, I’ll ping you if there are further problems with submitting the dataset

  • DD – Worked with Willa Wang before I was out last week on qca-dataset-submission #235. Tried to pull down results from the dataset but didn’t see wavefunctions in the results. Not sure what’s going on here.

    • JH – psi4#2242 seemed possibly related, but I haven’t looked closely.

    • DD – I looked at that; seemed to find something similar to what Simon found before (no wavefunction data attached)

      • Dataset is OptimizationDataset: OpenFF RESP Polarizability Optimizations v1.1

    • BP: possible they don’t make it into the QCEngine execution for the gradient

    • DD – I can chase this down – It seems like the store_wavefunction keyword isn’t making it all the way down the stack.

    • BP – We need to be careful that we don’t store the wavefunction for every result record. I did an analysis of what takes up space in the database, and a recent deposition of wavefunction info made the db size shoot up.

    • JW – refresh my memory: we were originally thinking we’d store grids, but that would have been way too large; now we’re thinking orbitals and eigenvalues, which are large but not as bad?

      • BP – these scale as order N**2 with the number of orbitals

    • BP – Could simplify(?) things by not recording the wavefunction for every step and instead doing a single point calc only at the end of an optimization.

    • DD + BP – (some question about where the logic would need to live to record only the wavefunction for the final step)

    • DD – Best thing to do now would be to take the final structures of Willa Wang’s optimizations and submit single points for all of them (see the sketch after this list)

  • PB – may submit another dataset soon; will be a single point dataset

  • JW – saw that JH had opened a PR for turning an OpenFF topology into SMILES:

    • this might introduce a regression fairly soon in multi-component topology generation from multiple SMILES

    • JH – I do depend on this, so we might need to have a discussion about the implementation
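
Below is a minimal sketch of the wavefunction follow-up plan discussed above: check whether wavefunctions are attached to the existing optimization results, then submit single points on the final geometries with wavefunction storage requested. Only the dataset name comes from the notes; the "default" specification, the hf/def2-tzvp method and basis, and the availability of a protocols argument on add_compute are assumptions for illustration.

```python
# Sketch under the assumptions above (legacy qcportal FractalClient API, QCFractal 0.15.x era).
import qcportal as ptl

client = ptl.FractalClient.from_file()  # or FractalClient(address=..., username=..., password=...)

ds = client.get_collection(
    "OptimizationDataset", "OpenFF RESP Polarizability Optimizations v1.1"
)

final_molecules = []
for entry in ds.data.records.values():
    opt = ds.get_record(entry.name, specification="default")  # spec name assumed
    if opt.status != "COMPLETE":
        continue
    final_molecules.append(opt.get_final_molecule())

    # Diagnostic for the issue above: does the last trajectory result carry wavefunction data?
    last_result = opt.get_trajectory()[-1]
    print(entry.name, "wavefunction attached:", last_result.wavefunction is not None)

# Submit single points on the final geometries only, asking for orbitals and eigenvalues
# (payload grows roughly as N**2 in the number of orbitals, so restricting storage to these
# end-point records keeps database growth bounded).
client.add_compute(
    program="psi4",
    method="hf",           # placeholder; use the dataset's actual method/basis
    basis="def2-tzvp",
    driver="energy",
    keywords=None,
    molecule=final_molecules,
    protocols={"wavefunction": "orbitals_and_eigenvalues"},  # assumed supported by this client version
)
```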

ML dataset for OpenMM

Pavan

  • PB – Had an issue with the B93 functional where it didn’t have dispersion. DD and BP put in a fix for this, so I’m just waiting on a release; JH’s upcoming release should solve this.

Action items

@Joshua Horton will spin up some xtb workers on Newcastle HPC
@David Dotson will spin up some ANI workers on PRP
@David Dotson will fire off more high-memory jobs on Lilac
@David Dotson will pull openff-qcsubmit#172 to completion, given new QCFractal release
@David Dotson will prepare a single-points dataset using the final geometries of Willa’s optimization dataset
@David Dotson will draft implementation of wavefunction retention for final geometry ResultRecord
@Jeffrey Wagner will follow-up with @Joshua Horton on openff-toolkit#1086
@Pavan Behara will draft first OpenMM ML dataset submission; reach out to group if issues encountered

Decisions