2020-09-01 QCFractal Users Meeting notes

Date

Sep 1, 2020

Participants

  • @Jeffrey Wagner

  • @David Dotson

  • @Trevor Gokey

  • @Joshua Horton

  • @Matt Thompson

  • @David Cerutti (Deactivated)

Discussion topics

Item

Notes

Updates from MolSSI

  • BP – Planning on update near end of September. Working on issues regarding stuck jobs.

Manager/Queue status

  • DD – TG added lots of power from UCI, spinning up PRP. Queue is emptying out, so scaling down PRP.

  • DD – Most submissions done:

  • DD – Submitted first ANI2x job last week. Currently 1893 complete, 33k incomplete.

    • JW – Seems slower than expected for an ML model

    • DD – Lots of errors due to unsupported elements. E.g., for 460 torsiondrives, many are impossible. Also, there are no ANI workers on PRP, and ANI workers are multi-capable and collate with QM.

    • TG – For ANI, I see lots of molecules that won’t run. Lots of constraint errors.

    • JW – Is there a long startup time? Or another factor?

    • TG – Running on single core, each optimization takes ~30 minutes.

  • (General) – There’s room in the QM queue. Room for more work soon.

User questions

  • Wavefunction stuff

    • JH – Made issues to get things implemented in psi4. Functionality is mostly there; it just needs some API additions/modifications to reach it. The only thing still needed is loading the wavefunction into psi4 from QCSchema.

  • Cerutti sets/submissions

    • DC – Added new JSONs with constraints.

      • JH – I see the constraints are there, but JSON formatting error prevents deserialization. Trailing comma after constraint indices.

      • DC – (Working on fixing)

      • Test for JSON formatting validation:

        • “It may be worth trying to run these through a JSON-parsing utility as a check before committing. Either of these will throw errors if there's something wrong with the JSON:”

          1. python -m json.tool <file.json>

          2. cat <file.json> | jq
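The same check can be sketched in Python for use in a pre-commit script. This is a minimal sketch using only the standard-library json module; the function names and the example constraint entries are hypothetical and are not taken from the actual submission files.

```python
import json


def check_json_text(text):
    """Return None if text parses as JSON, else a short error description."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return f"line {err.lineno}, column {err.colno}: {err.msg}"
    return None


def check_json_file(path):
    """Read a file and run the same check on its contents."""
    with open(path, encoding="utf-8") as handle:
        return check_json_text(handle.read())


# Hypothetical constraint entries; the first has a trailing comma after the indices.
bad = '{"constraints": [{"type": "dihedral", "indices": [0, 1, 2, 3,]}]}'
good = '{"constraints": [{"type": "dihedral", "indices": [0, 1, 2, 3]}]}'

for label, text in (("bad", bad), ("good", good)):
    problem = check_json_text(text)
    print(label, "->", problem if problem else "valid JSON")
```

Because json.loads follows strict JSON, a trailing comma inside a list is reported as an error rather than silently accepted.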

    • DC – Working on my sets locally, using non-DZVP basis sets.

    • DC – Looking at time-averaged solvent models. Running locally at Rutgers.

    • DC – Doing ESP calculation using new format. Gathering 5000 points per molecule. Will fit using a RESP/RESP2 model.

  • JW – Is the solvent model we’re looking at (PCM) appropriate for MM work?

    • JH – This was used in the Schauperl/RESP2 paper. So it’s likely appropriate/suitable for parameterizing MM molecules.

  • DD – Re: constraints set, I misspoke – we would want more than constraints: also a PDB for each structure.

    • DC – Can do. What is specification?

    • JH – Would want PDB with same coordinates as JSON molecule, in addition to current contents.

    • DC – Pipeline doesn’t have a clear starting point. I get a PDB from my initial QM optimization, and this is used as the input to QCF submission – Should have identical coordinates to JSON molecule. I will include this PDB for each molecule submitted.
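A quick consistency check between each PDB and its JSON molecule could look like the sketch below. It assumes the JSON molecule follows QCSchema (a flat "geometry" list in Bohr), that atoms appear in the same order in both files, and that the PDB uses the standard fixed-column ATOM/HETATM layout. The function names and the tolerance are illustrative choices, not part of any existing pipeline.

```python
BOHR_TO_ANGSTROM = 0.529177210903  # CODATA 2018 value


def pdb_coordinates(pdb_text):
    """Extract (x, y, z) in Angstrom from ATOM/HETATM records (fixed columns 31-54)."""
    coords = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            coords.append(
                (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            )
    return coords


def coordinates_match(pdb_text, molecule, tol=2e-3):
    """Compare PDB coordinates with a QCSchema-style molecule dict.

    Assumes molecule["geometry"] is a flat list in Bohr and that atom order
    is identical. tol is in Angstrom; PDB files carry only three decimals,
    so an exact comparison would be too strict.
    """
    geometry = molecule["geometry"]
    json_coords = [
        tuple(value * BOHR_TO_ANGSTROM for value in geometry[i:i + 3])
        for i in range(0, len(geometry), 3)
    ]
    pdb_coords = pdb_coordinates(pdb_text)
    if len(pdb_coords) != len(json_coords):
        return False
    return all(
        abs(a - b) <= tol
        for p, q in zip(pdb_coords, json_coords)
        for a, b in zip(p, q)
    )
```

The check only compares positions; it does not verify element symbols or connectivity.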

  • TG – Do submissions rely on OpenEye? Since we’re interpreting connectivity from PDB.

    • JH – Need OE to generate CMILES. Could use RDKit, but we only allow that for more well-defined input types.

    • JH – Can use SDF or SMILES as a starting point.

    • JW – Per OFFTK issues #511 and #697, it’s safest to write SMILES with explicit single and double bonds, and not use aromaticity (lowercase letters).
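As a rough pre-submission screen for the aromaticity point, a string-level check can flag SMILES that use lowercase (aromatic) atom symbols. This is a heuristic sketch only: it does not parse SMILES fully, does not check that bonds are written explicitly, and the function name is hypothetical. A cheminformatics toolkit would be the more reliable way to rewrite the strings in Kekulé form.

```python
import re

# Lowercase symbols that denote aromatic atoms outside brackets in SMILES.
_AROMATIC_ORGANIC = set("bcnops")
# Bracket atom: optional isotope digits, then the element symbol.
_BRACKET_ATOM = re.compile(r"\[\d*([A-Za-z][a-z]?)[^\]]*\]")


def uses_aromatic_notation(smiles):
    """Heuristically report whether a SMILES string contains aromatic atoms."""
    # Inside brackets, a symbol starting with a lowercase letter is aromatic
    # (e.g. [nH], [se]); [Na+] or [CH3:1] are not.
    for match in _BRACKET_ATOM.finditer(smiles):
        if match.group(1)[0].islower():
            return True
    # Outside brackets, drop bracket atoms and the two-letter symbols Cl and Br
    # so their second letters are not mistaken for aromatic atoms.
    outside = re.sub(r"\[[^\]]*\]", "", smiles)
    outside = outside.replace("Cl", "").replace("Br", "")
    return any(ch in _AROMATIC_ORGANIC for ch in outside)


for example in ("c1ccccc1", "C1=CC=CC=C1", "ClC1=CC=C(Br)C=C1", "[nH]1cccc1"):
    print(example, uses_aromatic_notation(example))
```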

  • MT – BP, is it possible to do a basis_set_exchange PyPI release?

    • BP – I can look into this. It will require a website update, so there’s a process to follow.

Action items

Decisions