2021-04-28 Benchmarking for Industry Partners - Development Meeting notes

Participants

  • @Jeffrey Wagner

  • @David Dotson

  • @David Hahn

  • @Lorenzo D'Amore

  • @Joshua Horton

Goals

  • Torsiondrive requirements from Mobley, Hahn

  • Problem with filtering via SMIRKS

  • Swope and Lucas analyses

  • Updates from team

Discussion topics

Topic

Presenter

Notes

Swope and Lucas analyses

Lorenzo

  • LD: working on the analyses desired by Swope and Lucas

    • working with David H., been a big help

    • next step is plotting and checking results

Torsiondrive requirements from Mobley, Hahn

Lorenzo

  • QM torsiondrive runs crash frequently

    • tried executions on more than one machine, failures appear consistent

  • Would like indices to start from 1; useful for folks

    • DD: implemented, not pushed yet

  • Some problems with frequent crashing

  • LD – There’s an output format that I like – It comes from the torsiondrive package interface. (JH: This is a dump of torsiondrive object’s state).

    • DD – We currently print a JSON object to STDOUT

    • LD – I didn’t see that

    • DD – My mistake – I haven’t merged the change to make this output yet. In the future, I’ll have this dumped into a JSON file. Or, if it would help, I could have each optimization dump out its own JSON.

  • Getting the same number from different FFs

    • LD – I’m seeing the same exact numbers for all the FFs during MM torsion scans

    • (Live debugging)

    • (General) – The problem appears to be in compute.py, where state is being defined outside of the loop where the jobs are submitted for each different spec.

    • JH – Would it be easier to move this functionality over to openff-gopt before we merge it to openff-benchmark?

      • DD – This was made a bit specifically for Xavier’s requests in benchmarking.

      • JH – Moving to gopt early would allow for easier support for ANI.

      • LD – My goal is to run torsiondrives on small datasets for benchmarking. So I can use either approach.

      • DH – Also, XL was the only one who wanted to run torsiondrives.

      • JW – May make things a bit more complex to split out gopt this early, since then there will be two versions to keep track of. But if there’s no risk that this will get mixed in with the season 1 benchmarking data, then it should be OK to make this a secondary package.

      • DD - will push my changes, ensure some self-consistency, then start up with openff-gopt

      • DD will set up a new gopt repo using cookiecutter, DD will open two issues for the needed features, and then LD will port code from DD’s PR into PRs in that repo.

      • JH – We should have the torsiondrive executor utilize the optimization executor

    • JH – LD, are you able to spin up a QCF server to store results? Or are you just running locally?

      • LD – Currently just running no-server-distributed-worker.

      • DH – We should be able to set up a QCF server, we’ve just chosen not to.

      • JH + LD will meet to set up a local QCF server on LD’s machine.

    • DD – Other requirements?

      • LD – Would be good to have the coordinates of each gradient step.

      • DD – this will be part of the final output (when I actually get it outputting)

    • LD – I’m interested in getting the LJ component of final geometry. Is this currently calculated somewhere, or would we need to recalculate this after the fact?

      • DD – Currently this isn’t computed anywhere in QCEngine. It may make sense to extend QCVars to hold things like LJ component, coulomb, etc. But that change will need to be in QCEngine.

      • LD + DH – So, unless that’s available, the remaining option is to recalculate it after the final geometry optimization.
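The "same numbers from different FFs" bug found during live debugging can be illustrated with a minimal sketch. This is not the actual compute.py code: `SPECS`, `ENERGIES`, `submit`, `scan_buggy` and `scan_fixed` are hypothetical names chosen for illustration, and the energies are placeholder values. It only shows how state created once outside the submission loop can make every spec reuse the first spec's settings.

```python
# Hypothetical stand-ins for illustration; not the openff-benchmark implementation.
SPECS = ["ff-a", "ff-b", "ff-c"]                    # made-up force field spec names
ENERGIES = {"ff-a": 1.0, "ff-b": 2.0, "ff-c": 3.0}  # placeholder energies per spec


def submit(state, spec):
    """Pretend to run one job; the force field is only set if the state lacks one."""
    state.setdefault("force_field", spec)
    return ENERGIES[state["force_field"]]


def scan_buggy(specs):
    state = {}  # defined once, outside the loop, so every spec shares it
    results = {}
    for spec in specs:
        results[spec] = submit(state, spec)
    return results


def scan_fixed(specs):
    results = {}
    for spec in specs:
        state = {}  # fresh state for each spec
        results[spec] = submit(state, spec)
    return results


print(scan_buggy(SPECS))  # every spec reports the first spec's energy
print(scan_fixed(SPECS))  # each spec reports its own energy
```

Creating the state inside the loop (or copying it per spec) keeps each spec's job independent of the ones submitted before it.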

Problem filtering via SMIRKS

Jeff

  • JH: think it’s where the toolkit is trying to make an RDKit molecule?

    • JW: we do this check to make sure that we are getting the stereochemistry right; this must be some exotic chemistry

    • this can happen when we have a molecule with 3 stereocenters

    • we may really need a reproducing case to properly address this

      • JW: I will come up with a reproducing case so we can come up with a tangible fix

    • DD: as a workaround, would it be appropriate to ask Thomas to manually filter the molecule (if he can identify it)?

      • JW: I think so, yes

      • DH: Perhaps also ask him if he can provide a fragment that produces the same error?

  • DD: will reply to Thomas Fox, give him workaround

  • JW: will investigate issue, submit reproducible case to appropriate issue tracker

Team updates

  • DH: Working on the Schrodinger pathway to support OPLS4; mainly need to make the tool not search for existing parameters

Action items

@David Dotson will wrap up work on torsiondrive executor PR in openff-benchmark, notify for review
@David Dotson will create new openff-gopt repo from cookiecutter
@David Dotson will seed issues for 1) optimization executor, 2) torsiondrive executor. Will assign @Lorenzo D'Amore to (2), work with him on porting and integration from openff-benchmark
@Jeffrey Wagner will investigate SMIRKS inchikey issue for exotic molecule, attempt to create reproducible case and submit to either openff-qcsubmit or openff-toolkit issue tracker
@David Dotson will follow up with Thomas Fox on workaround, ask if he can provide a fragment that isn’t proprietary that may reproduce the issue
@David Dotson will prototype mmvars or similar in QCEngine OpenMMHarness, with separated energy terms for LJ, other contributions, as well as dipole moments

Decisions

  1. We will proceed with implementation of general-purpose executors for optimizations and torsiondrives in openff-gopt.