2021-03-12 QCA Submission Meeting notes

Date

Mar 12, 2021

Participants

  • @David Dotson

  • @Trevor Gokey

  • @Joshua Horton

  • @Simon Boothroyd

  • Ben Pritchard

Goals

  • Highest priority

    • path forward for iodine

  • New advancements

    • Priority tags on submissions now declaratively set priority for the whole submission

    • Switching to using openff.toolkit internally in openff-qcsubmit, bespoke-fit

  • New submissions

    • Genentech set 3, torsiondrives #186

    • Benchmark partners submissions

    • Submissions from Hyesu; Hessian datasets - will need some priority

  • Upcoming infrastructure improvements

  • Upcoming science support

  • Larger advances

    • Automated FF coverage gap identification, torsion prioritization, submission generation

    • Benchmarking (dashboard, etc.)

Discussion topics

Item

Presenter

Notes

Iodine

Trevor

  • TG: current thinking: no iodine

    • basis set is fundamentally invalid for Rb and higher

    • Lee-Ping thought we could get away with using ECP

    • but something is missing, and gradients for ECP must be calculated, which is numerically expensive (iterative solver)

    • we’re currently stuck where we began

    • Susi Lehtola in the psi4 community has code that can automatically generate the fitting basis for all-electron dzvp up to iodine

      • but needs testing cycles by Lee-Ping before we can adopt

      • Lee-Ping is engaging directly with Susi, Lori on psi4 slack, working on a path forward

  • SB: is the current plan for iodine to switch to the higher basis set being developed in psi4, using the autoaux approach?

    • also, iodine not high priority at the moment, so okay if it’s not included in our fitting

  • DD: there is iodine in our FF params, yes?

    • TG: there is one iodine-specific parameter

  • DD: is Lee-Ping’s goal to arrive at a new basis set then?

    • TG: I think that’s the impression; this has to end up in mainline psi4, basis-set-exchange

  • BP: one thing I’ve wanted is to be able to use basis-set-exchange inside QCEngine

    • say e.g. bse:auto

    • qcschema can specify basis set you’re using, coefficients per element, per atom

  • JH: with a normal psi4 run, can we get this with just a keyword?

  • BP: Yes, but I’m wondering whether we lose something with that path

  • TG: Kinda breaks the model?

  • BP: Your choice of DF basis theoretically doesn’t change the result…well, guess it is kinda like your model

  • JH: the keywords are part of the hash, yes?

    • BP: yes, but fyi we’re moving away from keyword hashing

  • BP: have a PR that might speed up calculation submission (I hope)

    • added an index to the database column that had to be searched for every single computation when looking for duplicates

    • should now be fast due to hash index in DB

    • DD: should be able to submit #160 finally, even without multiprocessing/multithreading client submission
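
The duplicate lookup BP describes can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the table and column names, the JSON-based hashing scheme, and the use of SQLite (whose indexes are B-trees, not hash indexes) as a stand-in for the server database. It is not the QCFractal schema or the code in BP's PR; it only shows why an index on the searched column turns the per-submission duplicate check into an indexed lookup.

```python
import hashlib
import json
import sqlite3
from typing import Tuple


def spec_hash(spec: dict) -> str:
    """Deterministic hash of a task specification (hypothetical scheme)."""
    # Sorting keys makes the hash independent of dictionary ordering.
    canonical = json.dumps(spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


def make_db() -> sqlite3.Connection:
    """Create an in-memory stand-in database with an index on the hash column."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE task ("
        "id INTEGER PRIMARY KEY, spec_hash TEXT NOT NULL, spec TEXT NOT NULL)"
    )
    # Without this index, every duplicate check scans the whole table.
    conn.execute("CREATE INDEX ix_task_spec_hash ON task (spec_hash)")
    return conn


def submit(conn: sqlite3.Connection, spec: dict) -> Tuple[int, bool]:
    """Return (task id, created). Reuses the existing row if the spec is a duplicate."""
    h = spec_hash(spec)
    row = conn.execute("SELECT id FROM task WHERE spec_hash = ?", (h,)).fetchone()
    if row is not None:
        return row[0], False
    cur = conn.execute(
        "INSERT INTO task (spec_hash, spec) VALUES (?, ?)",
        (h, json.dumps(spec, sort_keys=True)),
    )
    return cur.lastrowid, True
```

Calling `submit` twice with the same specification returns the same id, with `created` set to `False` the second time. Running `EXPLAIN QUERY PLAN` on the lookup query should show that it searches via `ix_task_spec_hash` and does not scan the table.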

Test QCA server

Ben

  • test server is almost up; will notify when it’s in place

  • also new release incoming for QCFractal

Genentech set 3

Josh

  • Touching base with Pavan next week; fragmenter issues

  • SB: might be easier to do just a complete rewrite of fragmenter, from what it looks like

    • JH: in talks with Danny Cole for benchmarking fragmentation methods; this could fit the bill for determining if a rewrite diverges in results from the original

    • Also, have a big dataset (benchmark ligands) with 480 fragments that could be a test case; basically, re-create it and observe differences

  • SB: how much does tautomer enumeration factor into fragmentation?

    • JH: before qcsubmit did much itself, it used fragmenter for tautomer enumeration, but that’s now done outside of fragmenter

      • fragmenter use limited to fragmentation itself, capping

  • DD: will touch base with Jeff on setting up a sprint on fragmenter rewrite; needs to be soon (next few weeks)

Other upcoming datasets

Simon

  • will have priority datasets coming through from Hyesu for next FF fit

  • JH: not sure we’ve ever done a Hessian dataset with QCSubmit

    • may want to do a test of this earlier with a smaller dataset

  • SB: is there still a plan for a Hessian dataset? Where we run an optimization, then a single point calculation at the end to get the Hessian? May require a service on the server like torsiondrives

  • JH: may be able to handle this externally to QCFractal if there isn’t any movement on it in the project itself

    • could make this a GHA (GitHub Action) that operates on a labeled optimization dataset and submits a corresponding Hessian dataset that uses the final mols as the objects of the single point calculations. It would run repeatedly to pick up additions to the dataset as optimizations complete in the source dataset.

  • SB: MolSSI hasn’t raised any concerns about wavefunction storage, have they?

    • TG: think it’s okay for single point calculations; we’re also only storing orbitals and eigenvalues, so lightest-weight possible

  • DD: could turn around GHA fairly fast; need a test set to work on

    • SB: Let’s get one from Hyesu; she can probably turn one around quickly that would also be useful for her beyond just testing the approach
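
The selection step of the proposed GHA can be sketched as below. This is a sketch under stated assumptions, not an implementation against the QCFractal or QCSubmit APIs: `OptimizationRecord` and `new_hessian_entries` are hypothetical names, and the status strings are illustrative. It shows only the idempotent core of the idea, which is that each run picks up completed optimizations whose final molecules have not yet been added to the Hessian dataset.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Set


@dataclass(frozen=True)
class OptimizationRecord:
    """Hypothetical, simplified view of one entry in the source optimization dataset."""
    entry_name: str
    status: str                       # illustrative values: "COMPLETE", "INCOMPLETE", "ERROR"
    final_molecule_id: Optional[str]  # only available once the optimization has finished


def new_hessian_entries(
    optimizations: Iterable[OptimizationRecord],
    already_submitted: Set[str],
) -> Dict[str, str]:
    """Map entry name -> final molecule id for entries that still need a Hessian calculation."""
    pending: Dict[str, str] = {}
    for rec in optimizations:
        # Skip anything that has not finished or has no final geometry.
        if rec.status != "COMPLETE" or rec.final_molecule_id is None:
            continue
        # Skip entries a previous run already added to the Hessian dataset.
        if rec.entry_name in already_submitted:
            continue
        pending[rec.entry_name] = rec.final_molecule_id
    return pending
```

A scheduled run would call `new_hessian_entries`, submit the returned molecules as single point Hessian calculations, and record their entry names as submitted. Repeated runs then add only the optimizations that completed since the previous run.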

Action items

Ben Pritchard will notify #qcfractal folks on test server stand-up
@Joshua Horton will work with @Pavan Behara to address fragmenter issues on qca-dataset-submission#186
@David Dotson will touch base with @Jeffrey Wagner to set up a sprint for addressing fragmenter blocker; this is now a severe blocker for @Joshua Horton's work on e.g. openff-qcsubmit
Ben Pritchard will notify on new release of QCFractal, deployment to production server instance
@David Dotson will attempt submission of qca-dataset-submission#160 once production server upgrade in place with submission performance improvement
@David Dotson will reach out to @Hyesu Jang, ask if she can assemble an optimization dataset that Hessians would be useful for (may already be an existing one); we will use this as the target for testing a Hessian point-calculation submitter mechanism, likely via GHA

Decisions