2020-10-30 QCA Submission Meeting notes

Date

Oct 30, 2020

Participants

  • @David Dotson

  • @Trevor Gokey

  • @Joshua Horton

  • @Pavan Behara

  • Ben Pritchard

  • @Simon Boothroyd

Goals

Discussion topics

Item

Presenter

Notes

OpenFF BCC Refit

Simon

  • Simon will add a directory of SDFs written out from the OpenFF Molecules used for the dataset, tar+gz it

  • TG: should we include a default spec?

  • SB: this is point calculations only

  • TG: different workflow than we’ve been doing; is there a preference to use the omega stuff?

  • SB: reason we use omega, we want openeye’s ELF10 implementation; rdkit doesn’t have this

    • would have to roll your own if you want to use rdkit for this
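The tar+gz step for the SDF directory could be sketched as below. This is a minimal illustration using only the Python standard library; the function name `archive_sdf_directory` and the directory layout are assumptions for the example, not part of the submission tooling.

```python
import tarfile
from pathlib import Path


def archive_sdf_directory(sdf_dir, archive_path):
    """Pack every .sdf file directly under sdf_dir into a gzip-compressed tarball.

    Hypothetical helper: returns the file names that were archived.
    """
    sdf_dir = Path(sdf_dir)
    sdf_files = sorted(sdf_dir.glob("*.sdf"))
    with tarfile.open(archive_path, "w:gz") as tar:
        for sdf_file in sdf_files:
            # Keep the directory name as a prefix so the archive unpacks
            # into a single folder rather than the current directory.
            tar.add(sdf_file, arcname=f"{sdf_dir.name}/{sdf_file.name}")
    return [f.name for f in sdf_files]
```

Calling `archive_sdf_directory("molecules", "molecules.tar.gz")` would write one compressed archive containing only the SDF files found in `molecules/`.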

Protein peptide fragments constrained

Josh+Trevor

  • Trevor: would like to be able to rename datasets; is that possible?

    • BP: if you pull down the dataset and resubmit, might work

    • TG: “OpenFF Protein Peptide Fragments constrained” v1.0 is good to go name-wise

  • TG: Do we know the rationale for the dihedrals DC wanted constrained?

    • JH: Major torsions in the three residues

(ANI, ANI1ccx), ANI2x submissions

Josh

  • JH: Are we even sure this is going to work, even with 3000 iterations?

    • Still seeing some that don’t finish running locally

  • TG: will put together ANI2x supplemental compute

  • JH: will do ANI, ANI1ccx

Enamine REAL subset

Trevor

  • TG: not a submission yet, more of a toolkit-testing side project at this time

Unconstrained protein fragments

David

  • TG: rename to “OpenFF Protein Peptide Fragments unconstrained v1.0”

  • DD: will rename today

PEPCONF OptimizationDataSet

David

  • TG: should follow the PhAlkEthOH pipeline, so can use that for dataset.json preparation

STANDARDS-based versioning

Trevor

  • TG: working on this, want to separate mentally what it means to have a dataset standard vs. a FF-fitting standard

    • still have two weeks, feeling on track

INDEX

Trevor

  • TG: I’ll review

  • JH: will add the script used to generate the table

  • DD: can add a GH action in a future PR to update at any frequency, commit to master

Ambertools pinning

David

  • DD: does this fix our problems on Linux?

  • TG: better to pin it than not to pin it with openmm+psi4

ESPs

Trevor

  • TG: Use of AO density?

    • just sent a link of the wavefunction properties, have a script from Daniel Smith, but wasn’t sure if it generalizes to our problem

  • JH: as of last week, psi4 can save different parts of the wavefunction; the “everything” option now works

    • will need to update psi4 in all managers to support this

  • JH: will check for the “everything” wavefunction support in latest OSX conda package for psi4

  • TG: Submitting the dataset from SB will give us a way to try out ESP reconstitution from stored wavefunction components

QCFractal advances

Ben

  • Cleaning up databases, removing columns that aren’t used, etc.

    • still planning release next week; downtime up to 6 hours

    • everyone will need to update their managers

  • Survey feedback: QCF is slow

    • considering putting a profiler in place on the server

  • Should be a fix for multiple managers pulling down the same task

    • no easy way to test, but we’ll see if problem disappears or decreases in frequency

  • Adding STDOUT from TorsionDrive service to the TorsionDrive record itself

  • TG: still getting stale bombs

    • BP: initial thought for a solution doesn’t help; could increase the limit again, or perhaps improve the manager logic so it separates out a bad results submission and tries to submit it on its own

  • BP: size of the body is really the problem

    • gets managers stuck; breaking up the results may be the answer, but we would need to engineer that ourselves if the chosen web server can’t handle it

    • BP: some are over 200MB

      • current limit is 175MB

  • BP: I’ll increase the limit; may start seeing timeouts more often though
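The idea of breaking up results so that each request body stays under the limit could look roughly like the sketch below. It is an illustration only: `split_by_body_size` is a hypothetical helper, not QCFractal manager code, and it assumes results are JSON-serializable and sent as a JSON list.

```python
import json


def split_by_body_size(results, max_bytes):
    """Group results into batches whose JSON-encoded size stays under max_bytes.

    Hypothetical helper. A single result larger than max_bytes still ends up
    alone in its own (oversized) batch, so it can be isolated and handled
    separately instead of blocking the other results.
    """
    batches, current, current_size = [], [], 2  # 2 bytes for the enclosing "[]"
    for result in results:
        # +2 conservatively accounts for the ", " separator between items.
        size = len(json.dumps(result).encode("utf-8")) + 2
        if current and current_size + size > max_bytes:
            batches.append(current)
            current, current_size = [], 2
        current.append(result)
        current_size += size
    if current:
        batches.append(current)
    return batches
```

With the limit discussed above, a manager-side caller might use `max_bytes=175 * 1024 * 1024` and submit each batch as its own request. Whether the server counts the limit in exactly those units is an assumption here.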

PCM

Josh

  • Almost have PCM support in QCSubmit; still need to add validations

  • For Simon’s dataset, we’ll probably want to add a supplemental compute that uses PCM

Action items

@Trevor Gokey will submit the new constrained peptide fragments from JH
@Trevor Gokey will submit the theory benchmarking sets from HJ
@Trevor Gokey will submit the new ANI2 spec to the ligand fragment dataset
@Trevor Gokey will submit the BCC refit dataset from SB (wait for JH feedback)
@Trevor Gokey will review the new index PR
@Trevor Gokey will review the AT pin issue
@Simon Boothroyd will add a directory of SDF files serialized from OpenFF molecules for the OpenFF BCC Refit submission, tar+gz compressed
@David Dotson will rename the unconstrained protein fragments dataset for consistency with the constrained version
@Joshua Horton will create supplemental compute for benchmark ligands using ANI, ANI1ccx
@David Dotson will develop John’s PEPCONF OptimizationDataset submission using Trevor’s PhAlkEthOH pipeline
@Joshua Horton will check for “everything” wavefunction support in the latest OSX conda package for psi4
Ben Pritchard will increase HTTP body limit beyond 175MB; observe if this reduces frequency of issues with large results from managers
@Joshua Horton will finish adding PCM support for QCSubmit

 

Decisions