Infrastructure Roadmap 2022

 

List of infrastructure tasks for 2022. Each task should be linked to its Confluence or GitHub page with more information. See also https://openforcefield.atlassian.net/wiki/spaces/FF/pages/1111556120 .

Labels

Category

Labels

Priority

HIGH | MEDIUM | LOW

Effort

HIGH | MEDIUM | LOW

Status

NOT STARTED | IN PROGRESS | PROTOTYPE | COMPLETED | BLOCKED

Roadmap

Infrastructure tasks

Priority

Effort

Blocking science?

Infrastructure Dependencies

Start date

End/Due date

Status

Driver

Architecture / General infrastructure

95%+ core package uptime and deployment

(OpenFF TK, s99F, OpenFFs)

 

High

High

 

 

Ongoing

 

Completed

@John Chodera @Jeffrey Wagner @Matt Thompson @David Dotson

Add openff-qcsubmit, openff-evaluator, and bespokefit to core packages

High

 

 

 

 

 

 

 

openff-qcsubmit and bespokefit user’s guide

MEDIUM

 

 

 

 

 

 

@Josh Mitchell

Refresh, consolidate, and prune OpenFF toolkit examples

MEDIUM

 

 

 

 

 

 

@Josh Mitchell

SQM AM1 optimization connectivity change handling

MEDIUM

 

 

 

 

 

 

@Jeffrey Wagner @Connor Davel

openff-benchmark refactor (make components Python-first and more modular to enable more flexible workflows)

High

 

 

 

 

 

 

@Jeffrey Wagner

Streamline ForceBalance CI and pre-release testing

MEDIUM

 

 

 

 

 

 

 

ForceBalance developer’s guide

LOW

 

 

 

 

 

 

 

Refactor espaloma for production use or implement in OFF Toolkit

Unresolved: timeline for adoption is unknown. Bring up in leadership/gov board meeting. Infra team will look into the completeness of tests/reference values to estimate the refactor cost.

 

 

 

 

 

 

 

Accept, reject, or request specific feedback for SMIRNOFF spec proposals within 4 weeks of submission

High

 

 

 

 

 

 

@Jeffrey Wagner @David Mobley @John Chodera @Simon Boothroyd

Automated upstream RC tests

MEDIUM

 

 

 

 

 

IN PROGRESS

@Matt Thompson

“Did I break something else?” tests against master/main branches of OpenFF packages

MEDIUM

 

 

 

 

 

 

@Matt Thompson @Jeffrey Wagner

QCA Standards v3 implementation

High

 

 

 

 

 

 

new qca hire?

QCA in-server, policy-based error cycling

MEDIUM

 

 

 

 

 

 

 

QCA 2D torsiondrive support (may just need to verify that this works)

High

 

 

 

 

 

 

 

Psi4 to conda-forge

High

 

 

 

 

 

 

 

QCA chained operations

LOW (may increase if found to be blocking)

 

 

 

 

 

 

 

General “reproducible computation” records and data infrastructure

High

 

 

Interoperable molecule class

 

 

 

@Simon Boothroyd @Joshua Horton

Bayesian infrastructure: ML frameworks

 

 

Bayesian Fitting

Analytically Differentiable System Object

 

 

BLOCKED

 

Off-site charges (support for conversion to other packages)

MEDIUM

 

 

Hard to spec without VirtualSite Handler implementation

June 2021

 

IN PROGRESS

@Matt Thompson

Define and maintain specific goals for Bespokefit deployability/stability (succeeds on 95% of minidrugbank? In under some set number of CPU-hours? Regression test suite incorporated into CI?)

High

 

 

 

 

 

 

@Jeffrey Wagner @Matt Thompson @Joshua Horton

Local torsiondrive executor (default qc, ANI, and XTB - a more formalized version of this command-line command in openff-benchmark)

MEDIUM

 

 

 

 

 

 

@David Dotson

Interchange: Either deprecate ForceField.create_openmm_system or have it wrap an Interchange call

MEDIUM

 

 

GBSA support in Interchange

Plugin support in Interchange, and a few months of lead time for scientists who need to port ParameterHandler plugins

 

 

 

@Jeffrey Wagner @Matt Thompson

Interchange: System combination

MEDIUM

 

 

 

 

 

PROTOTYPE

@Matt Thompson

Interchange: AMBER export

High

 

 

Biopolymer topologies

SMIRNOFF updates

 

 

IN PROGRESS

@Matt Thompson

Interchange: GROMACS export

High

 

 

Biopolymer topologies

SMIRNOFF updates

 

 

PROTOTYPE

@Matt Thompson

Interchange: OpenMM export

High

 

 

 

 

 

PROTOTYPE

@Matt Thompson

Interchange: LAMMPS export

LOW

 

 

SMIRNOFF updates

 

 

 

@Matt Thompson

Interchange: ParmEd export

MEDIUM

 

 

Biopolymer topologies

SMIRNOFF updates

 

 

PROTOTYPE (not to be advertised for production use)

@Matt Thompson

Interchange: AMBER import

MEDIUM

 

 

Biopolymer topologies

 

 

IN PROGRESS

@Matt Thompson

Interchange: GROMACS import

LOW

 

 

Biopolymer topologies

 

 

IN PROGRESS

@Matt Thompson

Interchange: OpenMM import

LOW

 

 

Biopolymer topologies

 

 

PROTOTYPE

@Matt Thompson

Interchange: ParmEd import

MEDIUM

 

 

Biopolymer topologies

 

 

PROTOTYPE (not to be advertised for production use)

@Matt Thompson

Interchange: Track parameter provenance on import (i.e., hold a single value for shared GAFF parameters)

LOW

 

 

Biopolymer topologies

 

 

 

@Matt Thompson

Interchange: Interfacing with ML-based fitting

 

 

Blocked by lack of specification / needs to be broken into more discrete deliverables

 

 

 

BLOCKED

@Matt Thompson

Interchange: “Book” documentation/user’s guide

MEDIUM

 

 

 

 

 

IN PROGRESS PROTOTYPE

@Josh Mitchell @Matt Thompson

CLI tool infrastructure

HIGH

 

 

 

June 2020

 

IN PROGRESS PROTOTYPE

@Matt Thompson @Jeffrey Wagner

Remove smirnoff_hack.py

MEDIUM

 

 

 

 

 

 

@Jeffrey Wagner

Implement CachingToolkitWrappers

High

 

 

 

 

 

PROTOTYPE IN PROGRESS

@Jeffrey Wagner @Connor Davel

Toolkit

AMBER-derived SMIRNOFF-format FF

HIGH

 

Biopolymer fitting

 

March 2020

 

PROTOTYPE

@Jeffrey Wagner @Chapin Cavender

Polarizability ParameterHandler

LOW

 

Polarizable fitting

 

 

 

 

 

Custom GBSA handler

(Follow up with @Jeffry Setiadi @Michael Gilson to understand long term plans and infrastructure needs)

 

 

 

 

 

PROTOTYPE

 

WBOs for improper torsions

 

 

Waiting on research results to assign priority

 

 

 

 

 

A deep dive into toolkit parametrization differences (Josh Fass SMIRKS differences) / Automatically flag cases where the incoming molecule/chemistry is bad or misformatted

High

 

 

 

 

 

IN PROGRESS

@Connor Davel @Jeffrey Wagner

Refactor/make our own Exception hierarchy; implement some problems as catchable warnings.

MEDIUM

 

 

 

 

 

IN PROGRESS

@Matt Thompson @Simon Boothroyd @Jeffrey Wagner

Implement friendly default behavior when loading large molecule datasets/high-volume pipelines, with option for custom validation logic. Consider making moleculefixer for common data problems.

MEDIUM

 

 

 

 

 

 

 

openforcefield-core/pydantic refactor (possibly driving a SMIRNOFF spec update)

High

 

 

Aromaticity refactor

Stereochemistry refactor

 

 

 

 

Remove OpenFF-Toolkit’s hard dependency on OpenMM (migrate to pint/openff-units) (patch)

MEDIUM

 

 

 

Sep 2021

0.11.0 release

IN PROGRESS

@Matt Thompson

Protonation state enumeration

LOW

 

 

RDKit doesn’t have helpful protonation state enumeration; need to publicize and see if community wants to contribute there https://github.com/openforcefield/openforcefield/issues/526

Could use Epik from the Schrödinger suite? Example in OpenMolTools.

Mar 2020

July 2020 (incomplete)

PROTOTYPE BLOCKED

 

Interoperable molecule/stereochemistry/aromaticity refactor

MEDIUM

 

 

Need to decide on the desired behavior for how stereochemistry and aromaticity are handled. Also need to decide which molecule formats should be losslessly round-trippable.

 

 

 

@Jeffrey Wagner

Biopolymer infrastructure (SMARTS typing optimization)

High

 

Biopolymer fitting

 

 

Dec 31 2020

IN PROGRESS PROTOTYPE

@Jeffrey Wagner @Connor Davel

Biopolymer infrastructure (infra improvement/Topology refactor/automated polymer unit recognition)

High

 

Biopolymer fitting

Should discuss design with OpenEye

 

 

IN PROGRESS

@Jeffrey Wagner @Iván Pulido

Biopolymer infrastructure (graph charges and/or other scalable solution)

High

 

Biopolymer charge fitting

 

 

 

 

 

CMAP torsions in OFFTK/SMIRNOFF spec

LOW

 

CMAP fitting

 

 

 

 

 

Fitting

Migrate FF optimization to ML framework

HIGH

 

 

 

 

 

 

 

QM-MM / iPolQ solvent calcs on QCA

MEDIUM

Needs research cycle: Could another QC program offer a performant continuum solvent model and be compatible with QCF?

 

 

(Maybe: Way to generate solvent configuration in QCA? Or is this cheap enough to do outside QCA?)

 

 

 

@Chapin Cavender Maybe @new-qca-hire @Trevor Gokey @Jeffrey Wagner

Single-point QM-MM of a subset of packed/folded protein on QCA

LOW

 

 

Need to decide on a QC program and ensure feature support/QCA compatibility

 

 

 

 

Benchmarking

H-G calculations in OpenFF-Evaluator

MEDIUM

 

 

 

 

 

 

@Jeffry Setiadi

P-L benchmarking (repo)

MEDIUM

 

 

 

Mid 2019?

 

IN PROGRESS PROTOTYPE

@David Hahn @David Dotson

Protein Xtal/NMR observable based benchmarking and fitting (chemical shifts/scalar couplings/RDCs/Kirkwood-Buff integrals/etc.) (specific effort to be directed by @Chapin Cavender)

HIGH

 

 

 

 

 

 

@Chapin Cavender @Jeffrey Wagner

PL Benchmarking on Folding@Home (aligning with architecture from bespokefit where possible/appropriate)

HIGH

 

 

 

 

 

 

@David Dotson

Automated benchmarking + dashboard

May include geometry tools (MM minimization, conformer generation, torsion scanning, conformer scoring)

HIGH

 

 

(Optional) Reliable QCMol → OFFMol conversion/CMILES deviation checks

???

 

IN PROGRESS

Dashboard: @Jaime Rodríguez-Guerra @David Dotson @John Chodera @Trevor Gokey

Documentation / Community / Training

Reference energies data package (curation write-up)

MEDIUM

MEDIUM

 

SMIRNOFF updates

Sep 2021

 

IN PROGRESS

@Matt Thompson

CHARMM-GUI integration / validation

LOW

 

 

A way to create CHARMM residue template files (ParmEd Issue #1103)