Interface

A command-line interface, executable from any shell, is preferred.
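
As a rough illustration of what such a command-line interface might look like, here is a minimal sketch using Python's standard argparse module. The program name and the subcommand names are hypothetical, chosen only to mirror the workflow components listed under "Workflow components"; none of them is an existing command.

```python
import argparse

# Hypothetical subcommand names, one per workflow component; not an existing CLI.
STAGES = [
    "assign-ids",
    "generate-conformers",
    "parameterize",
    "coverage-report",
    "optimize",
    "report",
]


def build_parser():
    """Build a parser with one subcommand per workflow stage."""
    parser = argparse.ArgumentParser(prog="openff-benchmark")  # assumed program name
    subparsers = parser.add_subparsers(dest="stage", required=True)
    for stage in STAGES:
        sub = subparsers.add_parser(stage, help=f"run the {stage} stage")
        sub.add_argument("input", help="input file or directory")
        sub.add_argument("-o", "--output", default="out", help="output directory")
    return parser


def main(argv=None):
    """Parse arguments; a real implementation would dispatch to the stage here."""
    args = build_parser().parse_args(argv)
    print(f"stage={args.stage} input={args.input} output={args.output}")
    return args
```

Keeping one subcommand per stage would let each stage be developed and invoked independently from any shell, which fits the separability goal described under "Workflow components".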

Workflow components

Each workflow component from the diagram is numbered below.
Software component options are listed for each.

Keeping the required workflow components separable will allow development to proceed in parallel.

  1. Identifier assignment

    • new functionality; to be included in the benchmarking library

  2. Conformer generation

    • openff-toolkit{rdkit}

  3. Parameterization of molecules

    • openff-toolkit{rdkit}

  4. FF coverage report

    • Reach out to Trevor, Jessica, Pavan for existing implementations

  5. Energy minimization with Psi4 (QM), OpenMM (MM)

    • multiple options

      • QCSubmit->QCFractal(->QCEngine->GeomeTRIC->QCEngine->Psi4/OpenMM)

      • QCEngine->GeomeTRIC->QCEngine->Psi4/OpenMM

      • GeomeTRIC->QCEngine->Psi4/OpenMM

    • each option requires different considerations for deployment on queueing systems

      • options with fewer components are simpler, but may require additional development for deployment

  6. Analysis and report generation

    • can use components from benchmarkff; need to extract and fold into benchmarking library

    • no matter the approach chosen for optimizations in (5), we will need extraction tooling for flat file output, reports
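
The extraction tooling mentioned in (6) could look roughly like the sketch below, which flattens optimization records into CSV. It assumes each record is a plain dict with "success" and "energies" keys, loosely modeled on the fields of a QCElemental optimization result; the identifier format, the column names, and the energy values are placeholders made up for illustration.

```python
import csv
import io


def results_to_csv(results, stream):
    """Write one CSV row per optimization record to `stream`.

    `results` maps a molecule identifier to a dict with the assumed keys
    "success" (bool) and "energies" (list of floats, one per step).
    """
    writer = csv.writer(stream)
    writer.writerow(["molecule_id", "success", "n_steps", "final_energy"])
    for mol_id, record in sorted(results.items()):
        energies = record.get("energies") or []
        # Leave the final energy blank when no optimization step completed.
        final_energy = energies[-1] if energies else ""
        writer.writerow([mol_id, record.get("success", False), len(energies), final_energy])


# Placeholder records; identifiers and energies are illustrative only.
results = {
    "MOL-00001-00": {"success": True, "energies": [-76.30, -76.41, -76.42]},
    "MOL-00002-00": {"success": False, "energies": []},
}
buffer = io.StringIO()
results_to_csv(results, buffer)
print(buffer.getvalue())
```

Because the function only depends on plain dicts, the same extraction step could be reused whichever optimization route in (5) produced the records.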

Available software components for implementation

  1. QCSubmit

    • encoder of OpenFF's preferences for dataset submissions to QCArchive

    • performs no compute on its own; requires QCFractal if used as part of the workflow

  2. QCFractal

    • client+worker+server for executing and storing procedures, such as optimizations

    • perhaps not strictly necessary, but may still be easiest path

    • a complex solution; may present failure modes that are hard to pin down

  3. QCEngine {vital}

    • features a wrapper procedure for GeomeTRIC that takes a QCElemental OptimizationInput as input

    • no need for QCFractal

    • value-add over using GeomeTRIC directly is uncertain, unless it simplifies input

  4. GeomeTRIC {vital}

    • optimization protocol

    • can use QCEngine internally to optimize using gradients from a variety of programs (engines)

  5. benchmarkff

    • evaluation analyses are of high value

    • not currently installable as a package; only scripts/notebooks

    • will likely pull functionality out and create infrastructure home in openff-benchmark

  6. openff-toolkit {vital}

    • required for parameterization of molecules for OpenFF forcefields

  7. openmmforcefields

    • required for GAFF, but also usable as abstraction layer for OpenFF forcefields, others

    • used in QCEngine for OpenMM execution

  8. openff-spellbook

    • nouveau functionality for working with QCArchive data; utility functions in service to Trevor Gokey's research and work

    • possible to pull some prototype functionality we don't have in an infrastructure package
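
To make the QCEngine->GeomeTRIC route more concrete, the sketch below builds an optimization input as a plain dict in the layout QCElemental's OptimizationInput model expects and hands it to QCEngine's compute_procedure. The helper names are hypothetical, and the keyword values (coordinate system, iteration limit), method, and basis are assumptions for illustration rather than settled choices. QCEngine is imported lazily so the input can be constructed and inspected without it being installed.

```python
def build_optimization_input(symbols, geometry_bohr, program, method, basis=None):
    """Return a dict shaped like a QCElemental OptimizationInput.

    `geometry_bohr` is a flat list of Cartesian coordinates in Bohr.
    `program` names the engine that supplies gradients (e.g. "psi4").
    """
    model = {"method": method}
    if basis is not None:
        model["basis"] = basis
    return {
        # Keywords for the optimizer; values here are illustrative assumptions.
        "keywords": {"program": program, "coordsys": "dlc", "maxiter": 300},
        "input_specification": {"driver": "gradient", "model": model},
        "initial_molecule": {"symbols": list(symbols), "geometry": list(geometry_bohr)},
    }


def run_optimization(opt_input):
    """Run the optimization through QCEngine's geomeTRIC procedure.

    Requires qcengine, geometric, and the chosen engine to be installed.
    """
    import qcengine  # imported lazily; only needed when actually running

    return qcengine.compute_procedure(opt_input, "geometric")


# Approximate water geometry in Bohr, used only as a starting structure.
water = build_optimization_input(
    symbols=["O", "H", "H"],
    geometry_bohr=[0.0, 0.0, -0.1294, 0.0, -1.4941, 1.0274, 0.0, 1.4941, 1.0274],
    program="psi4",
    method="b3lyp-d3bj",
    basis="dzvp",
)
```

Calling run_optimization(water) would then perform the optimization, provided the required packages are available in the environment.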

Restricted components

  1. OpenEye Toolkit

    • cannot use for this purpose; must not be necessary for any part of the workflow

Packaging Options

openff-cli

Could introduce an entrypoint in this package for distribution.

openff-benchmark

Currently abandoned; may be the perfect place for this right now.
Doesn't feature anything at this time.
Bit of a clean slate.

Minimization Execution

Proposing a three-pronged approach.

  1. High-throughput (primary)

    • QCSubmit->QCFractal(->QCEngine->GeomeTRIC->QCEngine->Psi4/OpenMM)

    • output extraction can be run at any time to pull whatever data is available

  2. High-throughput debug approach (secondary)

    • Trevor's local optimization executor

    • components shared with (3)

    • GeomeTRIC->QCEngine->Psi4/OpenMM

    • output still usable for reporting

  3. Fully-local execution (alternative)

    • Like Horton's local TorsionDrive script, minus QCFractal execution if possible

    • components shared with (2)

    • GeomeTRIC->QCEngine->Psi4/OpenMM

    • output still usable for reporting

In principle, (2) and (3) could be served via the same entrypoint.
(1) would make use of QCFractal with a persistent server to handle most of the compute orchestration.
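
The relationship between the three approaches can be sketched as a small lookup, shown below. The mode names and the plan_execution helper are hypothetical; the component chains are copied from the lists above. The point is only that (2) and (3) resolve to the same chain and so could share an entrypoint, while (1) additionally needs a persistent server.

```python
# Chain shared by the debug and fully-local approaches, as listed above.
LOCAL_CHAIN = ["GeomeTRIC", "QCEngine", "Psi4/OpenMM"]

# Hypothetical mode names mapped to the component chains from this section.
CHAINS = {
    "high-throughput": ["QCSubmit", "QCFractal", "QCEngine"] + LOCAL_CHAIN,
    "debug": LOCAL_CHAIN,
    "local": LOCAL_CHAIN,
}


def plan_execution(mode):
    """Describe which components a given execution mode would involve."""
    if mode not in CHAINS:
        raise ValueError(f"unknown mode {mode!r}; expected one of {sorted(CHAINS)}")
    return {
        "mode": mode,
        # Only the high-throughput approach relies on a persistent QCFractal server.
        "needs_server": mode == "high-throughput",
        "chain": list(CHAINS[mode]),
    }


for mode in CHAINS:
    plan = plan_execution(mode)
    print(mode, "->".join(plan["chain"]), "server" if plan["needs_server"] else "no server")
```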
