Interface

...

Discovery

Workflow components

...

Each workflow component from the diagram above is numbered below.
Options for software components are indicated for each.

Separability of required workflow components will allow for parallelism in development activity. The dev status label on each workflow component indicates, qualitatively, how much development it requires:

  • dev (red) will require the most development, and so should be prioritized.
  • dev (yellow) may require some development.
  • dev (green) has a well-known and heavily-used software pathway.

  1. Identifier assignment

    Status: dev (yellow)

    • new functionality; include in the benchmarking library

  2. Conformer generation (~10 conformers per molecule)

    Status: dev (green)

    • openff-toolkit{rdkit}

  3. Parameterization of molecules

    Status: dev (green)

    • openff-toolkit{rdkit}

  4. FF coverage report

    Status: dev (yellow)

    • Reach out to Trevor, Jessica, Pavan for existing implementations

    • QCSubmit can give a list of all parameters used; it doesn’t currently do counts, but could be made to

      • we’ll want counts, as this is richer information and allows us to prioritize coverage gaps

  5. Energy minimization with Psi4 (QM), OpenMM (MM)

    Status: dev (red)

    • multiple options

      • QCSubmit->QCFractal(->QCEngine->GeomeTRIC->QCEngine->Psi4/OpenMM)

        • this path allows easier extension to torsiondrives, which is not possible with the other paths without significant development work

        • orchestration is mostly solved in this path, unlike in the others

      • QCEngine->GeomeTRIC->QCEngine->Psi4/OpenMM

      • GeomeTRIC->QCEngine->Psi4/OpenMM

    • each option requires different considerations for deployment on queueing systems

      • options that are simpler in terms of components may require additional development for deployment

  6. Analysis and report generation

    Status: dev (red)

    • can use components from benchmarkff; need to extract and fold into benchmarking library

    • no matter the approach chosen for optimizations in (5), we will need extraction tooling for flat file output, reports
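The per-parameter counts called for in the FF coverage report (4) can be sketched with the standard library alone. This is a minimal illustration, not an existing QCSubmit feature: the function names, the molecule identifiers, and the input shape (a mapping from molecule identifier to the parameter IDs applied to it) are all assumptions made for the example.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def count_parameter_usage(labels: Dict[str, Iterable[str]]) -> Counter:
    """Count how often each force field parameter ID is applied across all molecules."""
    counts: Counter = Counter()
    for parameter_ids in labels.values():
        counts.update(parameter_ids)
    return counts


def coverage_gaps(
    counts: Counter, all_parameter_ids: Iterable[str], threshold: int = 1
) -> List[Tuple[str, int]]:
    """Return (parameter ID, count) pairs used fewer than `threshold` times, least-used first."""
    gaps = [
        (pid, counts.get(pid, 0))
        for pid in all_parameter_ids
        if counts.get(pid, 0) < threshold
    ]
    return sorted(gaps, key=lambda item: (item[1], item[0]))


# Hypothetical input: molecule identifier -> parameter IDs applied to that molecule.
labels = {
    "MOL-00001": ["b1", "b2", "a1", "t1", "t1"],
    "MOL-00002": ["b1", "a1", "a2", "t1"],
}
counts = count_parameter_usage(labels)

# Parameters in the (hypothetical) force field that are exercised fewer than twice.
gaps = coverage_gaps(counts, ["b1", "b2", "b3", "a1", "a2", "t1", "t2"], threshold=2)
```

Sorting the gaps by count puts parameters that are never exercised first, which is the ordering needed to prioritize coverage gaps.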

...

  1. QCSubmit

    • encoder of OpenFF's preferences for dataset submissions to QCArchive

    • performs no compute on its own; requires QCFractal if it is part of the workflow

    • important to ensure CMILES metadata is in place to allow seamless MM calculations

  2. QCFractal

    • client+worker+server for executing and storing procedures, such as optimizations

    • perhaps not strictly necessary, but may still be the easiest path

    • as a complex solution, it may present failure modes that we have a hard time pinning down

  3. QCEngine {vital}

    • features a wrapper procedure for GeomeTRIC that takes a QCElemental OptimizationInput as input

    • no need for QCFractal

    • value-add over using GeomeTRIC directly is uncertain, unless it simplifies input

  4. GeomeTRIC {vital}

    • optimization protocol

    • can use QCEngine internally to optimize using gradients from a variety of programs (engines)

  5. benchmarkff

    • evaluation analyses are of high value

    • not currently installable as a package; only scripts/notebooks

    • dependent on OpenEye Toolkit

    • will likely pull functionality out and create infrastructure home in openff-benchmark

  6. openff-toolkit {vital}

    • required for parameterization of molecules for OpenFF forcefields

  7. openmmforcefields

    • required for GAFF, but also usable as abstraction layer for OpenFF forcefields, others

    • used in QCEngine for OpenMM execution

  8. openff-spellbook

    • new functionality for working with QCArchive data; utility functions in service of Trevor Gokey's research and work

    • may be possible to pull in some prototype functionality that we don't yet have in an infrastructure package
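To make the QCEngine->GeomeTRIC path more concrete, the following is a rough sketch of the input that QCEngine's GeomeTRIC procedure takes. It rests on several assumptions that should be checked against the installed versions: that `qcengine.compute_procedure(input, "geometric")` is available, that the OpenMM harness selects an OpenFF force field through the `method`/`basis` pair shown, and that it reads the mapped SMILES from the molecule's `extras`. The helper names, the water geometry, and the choice of keywords are illustrative only.

```python
from typing import Any, Dict


def build_optimization_input(
    molecule: Dict[str, Any], program: str, method: str, basis: str
) -> Dict[str, Any]:
    """Build a dict laid out like QCElemental's OptimizationInput model."""
    return {
        # Options for the optimizer, including which engine supplies gradients.
        "keywords": {"program": program, "coordsys": "tric"},
        # The gradient calculation the optimizer requests at each step.
        "input_specification": {
            "driver": "gradient",
            "model": {"method": method, "basis": basis},
        },
        "initial_molecule": molecule,
    }


def run_optimization(opt_input: Dict[str, Any]) -> Any:
    """Run the optimization; requires QCEngine, GeomeTRIC, and the chosen engine."""
    import qcengine  # imported lazily so the builder works without QCEngine installed

    return qcengine.compute_procedure(opt_input, "geometric")


# Illustrative water molecule (coordinates in bohr). The CMILES entry in `extras`
# is assumed to be what the MM path needs in order to parameterize the molecule.
water = {
    "symbols": ["O", "H", "H"],
    "geometry": [0.0, 0.0, 0.0, 0.0, 0.0, 1.8, 1.7, 0.0, -0.5],
    "extras": {
        "canonical_isomeric_explicit_hydrogen_mapped_smiles": "[H:2][O:1][H:3]"
    },
}

# Same molecule, two engines: only the program and model change.
qm_input = build_optimization_input(water, "psi4", "b3lyp-d3bj", "dzvp")
mm_input = build_optimization_input(water, "openmm", "openff-1.0.0", "smirnoff")
```

Because only the `program` and `model` entries differ between the QM and MM inputs, the surrounding orchestration and extraction code can stay the same for both.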

...

  1. OpenEye Toolkit

    • cannot be used for this purpose; must not be required by any part of the workflow

Packaging Options

openff-benchmark

Library components and entry points can be placed in openff.benchmark.geometry_optimizations.

openff-cli

Could introduce an entrypoint in this package for distribution.

openff-benchmark

Currently abandoned; may be the perfect place for this right now.
Doesn't feature anything at this time.
Bit of a clean slate (optional, and for later)

Proposal

Interface

A command-line interface executable from any shell is preferable.
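As a sketch of what such an interface could look like, the following uses Python's standard `argparse` module. The program name, the subcommands (`optimize`, `report`), and all of the options are hypothetical placeholders chosen for illustration; none of them is an agreed design.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with one subcommand per workflow stage (names are hypothetical)."""
    parser = argparse.ArgumentParser(
        prog="openff-benchmark",
        description="Geometry optimization benchmarking (illustrative sketch).",
    )
    subparsers = parser.add_subparsers(dest="command", required=True)

    # Energy minimization: choose the engine and the execution approach.
    optimize = subparsers.add_parser("optimize", help="run energy minimizations")
    optimize.add_argument("input_path", help="file or directory of input molecules")
    optimize.add_argument("--program", choices=["psi4", "openmm"], default="psi4")
    optimize.add_argument(
        "--mode", choices=["fractal", "debug", "local"], default="fractal"
    )

    # Analysis and report generation from whatever results are available.
    report = subparsers.add_parser("report", help="extract results and build reports")
    report.add_argument("results_path", help="location of optimization results")
    report.add_argument("--output", default="report.csv")

    return parser


# Parse an explicit argument list, as a shell invocation would supply it.
args = build_parser().parse_args(["optimize", "molecules.sdf", "--program", "openmm"])
```

Exposed as a console entry point, the same parser would be callable from any shell, for example `openff-benchmark optimize molecules.sdf --program openmm`.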

Minimization Execution

Proposing a three-pronged approach.

  1. High-throughput (primary)

    • QCSubmit->QCFractal(->QCEngine->GeomeTRIC->QCEngine->Psi4/OpenMM)

    • output extraction executable at any time for pulling available data

    • needs an error cycling process

  2. High-throughput debug approach (secondary)

    • Trevor's local optimization executor

      • add this to QCSubmit; generally usable for OpenFF QCArchive users in debugging

    • components shared with (3)

    • GeomeTRIC->QCEngine->Psi4/OpenMM

    • output still usable for reporting

  3. Fully-local execution (alternative)

    • Like Horton's local TorsionDrive script, minus QCFractal execution if possible

    • components shared with (2)

    • GeomeTRIC->QCEngine->Psi4/OpenMM

    • output still usable for reporting
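The error cycling process mentioned for approach (1) could follow the pattern sketched below: restart failed optimizations up to a fixed limit, and set aside those that keep failing so they can be examined with the debug approach (2). The record class, the status strings, and the `restart` callback are stand-ins invented for this example; they are not the QCFractal client API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass
class OptimizationRecord:
    """Hypothetical stand-in for a stored optimization and its state."""

    identifier: str
    status: str  # assumed values: "COMPLETE", "INCOMPLETE", or "ERROR"
    restarts: int = 0


def cycle_errors(
    records: Iterable[OptimizationRecord],
    restart: Callable[[OptimizationRecord], None],
    max_restarts: int = 3,
) -> Dict[str, List[str]]:
    """Restart errored records, giving up on any that reached `max_restarts`."""
    restarted: List[str] = []
    abandoned: List[str] = []
    for record in records:
        if record.status != "ERROR":
            continue
        if record.restarts >= max_restarts:
            # Repeated failure: leave for manual debugging instead of looping forever.
            abandoned.append(record.identifier)
            continue
        restart(record)
        record.restarts += 1
        record.status = "INCOMPLETE"
        restarted.append(record.identifier)
    return {"restarted": restarted, "abandoned": abandoned}


records = [
    OptimizationRecord("MOL-00001-00", "COMPLETE"),
    OptimizationRecord("MOL-00002-00", "ERROR"),
    OptimizationRecord("MOL-00003-00", "ERROR", restarts=3),
]

# The callback here does nothing; a real one would resubmit the task to the server.
summary = cycle_errors(records, restart=lambda record: None, max_restarts=3)
```

Capping the number of restarts keeps persistent failures from consuming compute indefinitely, and the abandoned list gives a natural hand-off to the local debug executor.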

...