Geometry Optimization Benchmarking for Industry Partners

Driver

@David Dotson @Joshua Horton

Approver

@David Hahn

Contributors

@Trevor Gokey @Jeffrey Wagner

Stakeholders

Gary Tresadern, Industry Partners

Objective

Obtain performance benchmarks of OpenFF force fields from geometry optimizations that industry partners run on their in-house molecules.

Due date

Jan 31, 2021

Key outcomes

  • partners can deploy required software with minimal difficulty

  • partners can execute geometry optimizations for the desired list of forcefields on their in-house molecules using their own compute and storage infrastructure

  • geometry optimization results are readily usable to reproduce key figures from the recent benchmarking preprint by Lim et al.

Status

COMPLETE

Problem Statement

The Open Force Field Initiative’s industry partners are keen to benchmark the performance of the recent OpenFF force fields on their in-house, proprietary molecules. To do so, they will require minimally complex tooling to perform geometry optimizations on these molecules at reasonable scale (100 to 1000 molecules) using a variety of force fields. They will also require tooling to reproduce key figures from the Lim et al. preprint for these data.

The results of these calculations will be compiled for publication.

Scope

Must have:

  • Each partner should be able to run geometry optimizations on 100 to 1000 proprietary molecules, each with 10 conformers (1,000 - 10,000 optimizations total).

    • Molecules that partners are willing to make public will be submitted to the public QCArchive for hosting/compute

  • The approach cannot be dependent on the OpenEye Toolkit.

  • Each optimization should be identified by a three-letter company code (COM), molecule-index (XXXXX), conformer-index (YY): COM-XXXXX-YY

  • Optimizations should be readily usable to reproduce key figures from the Lim et al. preprint.

  • The following QM computation specs will be used:

    • default

      • program: psi4

      • method: B3LYP-D3BJ

      • basis: DZVP

  • The following MM force fields shall be used for optimizations with OpenMM:

    • smirnoff99Frosst

    • openff 1.0.0, 1.1.0, 1.1.1, 1.2.1 (latest), 1.3.0 (upcoming as of 2020.10.22)

    • gaff2.1

  • Results from calculations must be flat files
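
The identifier scheme and the computation specs listed above can be sketched as plain Python data and helpers. This is only an illustrative sketch, not the project's actual tooling: the names `make_id`, `parse_id`, `QM_SPEC`, and `MM_FORCE_FIELDS` are hypothetical, the zero-padded widths (five digits for the molecule index, two for the conformer index) are inferred from the XXXXX and YY placeholders, uppercase letters for the company code are assumed, and "ABC" is a made-up company code.

```python
import re

# QM spec as listed above; the dictionary layout is an assumption.
QM_SPEC = {"default": {"program": "psi4", "method": "B3LYP-D3BJ", "basis": "DZVP"}}

# MM force fields as listed above; the label strings are illustrative only.
MM_FORCE_FIELDS = [
    "smirnoff99Frosst",
    "openff 1.0.0",
    "openff 1.1.0",
    "openff 1.1.1",
    "openff 1.2.1",
    "openff 1.3.0",
    "gaff2.1",
]

# COM-XXXXX-YY: company code, molecule index, conformer index.
ID_PATTERN = re.compile(r"^([A-Z]{3})-([0-9]{5})-([0-9]{2})$")


def make_id(company: str, molecule_index: int, conformer_index: int) -> str:
    """Format an optimization identifier as COM-XXXXX-YY."""
    if not re.fullmatch(r"[A-Z]{3}", company):
        raise ValueError("company code must be three uppercase letters")
    if not 0 <= molecule_index <= 99999:
        raise ValueError("molecule index must fit in five digits")
    if not 0 <= conformer_index <= 99:
        raise ValueError("conformer index must fit in two digits")
    return f"{company}-{molecule_index:05d}-{conformer_index:02d}"


def parse_id(identifier: str) -> tuple:
    """Split COM-XXXXX-YY back into (company, molecule index, conformer index)."""
    match = ID_PATTERN.match(identifier)
    if match is None:
        raise ValueError(f"not a COM-XXXXX-YY identifier: {identifier!r}")
    return match.group(1), int(match.group(2)), int(match.group(3))


# 100 molecules with 10 conformers each, the lower end of the stated scale.
ids = [make_id("ABC", mol, conf) for mol in range(100) for conf in range(10)]
print(len(ids), ids[0], ids[-1])
```

Keeping the identifier fixed-width means flat-file names sort naturally by molecule and then by conformer.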

Nice to have:

  • Coverage reporting (how often each parameter used; fragments that could not be parameterized)

    • Check if this is a must have

  • Energy minimizations with Schrodinger using OPLS3e as a separate software path (cannot be done with QC* software stack)

    • need to obtain consent from Schrodinger for publishing; perhaps a partner can negotiate on behalf of the effort?

  • Dipole moment-based comparison for optimized geometries

  • Torsional difference comparison for optimized geometries
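
For the torsional difference comparison, one detail to handle is that dihedral angles are periodic, so a naive subtraction can overstate the difference between two geometries. A minimal sketch, assuming angles are given in degrees; the function name `torsion_difference` is hypothetical and is not taken from the project's tooling:

```python
def torsion_difference(angle_a: float, angle_b: float) -> float:
    """Signed difference angle_a - angle_b in degrees, wrapped to [-180, 180)."""
    return (angle_a - angle_b + 180.0) % 360.0 - 180.0


# 179 and -179 degrees are only 2 degrees apart, not 358.
print(abs(torsion_difference(179.0, -179.0)))
```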

Not in scope (for this project; of interest for future projects):

  • Possibility to run torsion drives

  • Bespoke parameterization for molecules that can’t be parameterized by existing OpenFF force fields, followed by benchmarking (requires injection of new parameters into QC* stack; not currently possible)

  • Benchmarking with MMFF94 and/or CGenFF

Clarifications Needed

  1. What will the input file formats be? Will we enforce 3D SDF for minimal ambiguity?

    1. Yes, enforce 3D SDF to remove the possibility of ambiguity in inputs
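
One way the 3D SDF requirement could be checked before submission is sketched below. It assumes V2000 mol blocks (three header lines, a counts line, then fixed-width atom lines with x, y, z in the first three 10-character fields). It treats a record as 3D if the header's optional dimension code says "3D" or if any atom has a nonzero z coordinate, so a genuinely 3D but planar molecule lying in the xy-plane without a dimension code would be reported as flat. The names `is_3d_molblock` and `example_molblock` are hypothetical, and the example blocks are bond-free placeholders rather than real molecules.

```python
def is_3d_molblock(molblock: str) -> bool:
    """Heuristically decide whether a V2000 mol block carries 3D coordinates."""
    lines = molblock.splitlines()
    if len(lines) < 4 or "V2000" not in lines[3]:
        raise ValueError("expected a V2000 mol block with a counts line")
    # Header line 2 may carry a dimension code ("2D" or "3D") in columns 21-22.
    if lines[1][20:22] == "3D":
        return True
    n_atoms = int(lines[3][0:3])
    atom_lines = lines[4:4 + n_atoms]
    if len(atom_lines) != n_atoms:
        raise ValueError("atom block is shorter than the counts line claims")
    # Atom lines hold x, y, z in three 10-character fields; z is the third.
    return any(abs(float(line[20:30])) > 1e-6 for line in atom_lines)


def example_molblock(coords):
    """Build a bond-free V2000 block of carbon atoms, for demonstration only."""
    header = ["example", "", ""]
    counts = f"{len(coords):3d}{0:3d}  0  0  0  0  0  0  0  0999 V2000"
    atoms = [
        f"{x:10.4f}{y:10.4f}{z:10.4f} C   0  0  0  0  0  0  0  0  0  0  0  0"
        for x, y, z in coords
    ]
    return "\n".join(header + [counts] + atoms + ["M  END"])


flat = example_molblock([(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)])
bent = example_molblock([(0.0, 0.0, 0.0), (1.2, 0.3, 0.9)])
print(is_3d_molblock(flat), is_3d_molblock(bent))
```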

Execution Workflow

Workflow diagram: https://drive.google.com/file/d/1t3zCYj9gYip5FKrvF3Z0GiVdBMHZxmjr/view

Milestones and deadlines

Milestone | Owner | Deadline/Date | Status
Meeting with industry partners; gather comprehensive requirements/desires | @David Dotson | Oct. 23 | COMPLETE
Software approach decision settled; ready for execution | @David Dotson | Nov. 1 | COMPLETE
Call for public datasets from industry partners | @Gary Tresadern | Nov. 5 | COMPLETE
Approach ready for testing by David Hahn, Bill Swope | @David Dotson | Dec. 15 | COMPLETE
Protocol feature-complete | @David Dotson | Jan. 15 | COMPLETE
Present protocol to partners | @David Dotson | Jan. 22 | COMPLETE
Deployment at industry partner sites; protocol burn-in on test set | @David Dotson | Jan. 25 | COMPLETE
First production runs at industry partner sites initiated | @David Dotson | Feb. 1 Feb. 5 | COMPLETE
Publicly-sharable exports submission open | @David Dotson | Mar. 23 Apr. 9 | COMPLETE
Season 1 campaign complete and closed | @David Dotson | Jul. 8 | COMPLETE

Results

Results can be retrieved from the OpenFF Public Drop Zone.

Reference materials

Meeting Notes