OpenFF Science Roadmap 2021

Author: @Simon Boothroyd

This roadmap outlines the key scientific goals planned for 2021. Only goals that either have people assigned, or that we have committed to but still need people to work on, are listed here. For a full scientific ‘wish-list’, see https://openforcefield.atlassian.net/wiki/spaces/PS/pages/423854401.

Important Dates / Timelines

Feature

Communicated timeline

Hard deadline

Parsley 1.4 / Sage Release

  • LJ Refit

  • WBO Torsions

  • Beta release by early May

  • Looking toward a fuller release candidate in early June

N / A

 

 

 

Parsley Release

Outstanding Issues

Issue

Assignee

Status

Comment

The Parsley 1.3.0 benchmark results have not been broadly shared

@Pavan Behara and / or @Jessica Maat?

IN PROGRESS

The benchmarks have been performed but have not yet been written up, e.g. as a blog post.

Sage Release

Planned Features

Feature

Team

Status

Comment

LJ Refit

@Simon Boothroyd @Owen Madin

READY

The mixture feasibility study is complete and shows improvement after benchmarking, on both mixture properties and solvation free energies.

WBO Torsions

@Jessica Maat (Deactivated) @Pavan Behara @David Mobley

DEFERRED

Initial exploratory fits have been performed, but it is not clear that we have seen any improvement.

Expanded QM data set

@David Mobley @Hyesu Jang @Simon Boothroyd

IN PROGRESS

https://openforcefield.atlassian.net/wiki/spaces/FF/pages/1548681282

We should include sulfonic and phosphonic acids, as well as sulfoximines and sulfonimidamines:

https://www.ncbi.nlm.nih.gov/pubmed/23934828

Change the QM theory level

@Hyesu Jang @Lee-Ping Wang @Pavan Behara

IN PROGRESS

The data should have been generated, but it has not been acted on yet.

 

Are we considering QM theories which are compatible with ANI?

Benchmarking

Data Type

Team

Status

Comment

Repeat Lim + Hahn QM Benchmark

@Pavan Behara ?

UNASSIGNED

Should this also include the Open set from the Pharma partners?

Free Energy Benchmarks

 

UNASSIGNED

@Simon Boothroyd will sync up with @David Hahn

These will likely be limited to MNSol, FreeSolv (make these relative solubilities?), and TYK2 with Perses.

Outstanding Issues

Issue

Assignee

Status

Comment

The cis / trans amide issue

@Simon Boothroyd @Hyesu Jang

IN PROGRESS

Previous work:

Currently generating new TD data for a diverse, amide-like set which exercises the parameters @Hyesu Jang identified in previous work as potentially problematic:

Rosemary Planning

Bio-polymer Features

Planned

Feature

Team

Status

Comment

Initial bio-polymer parameter refit. This will likely follow a similar strategy to the small molecule refit, but with bio-polymer data / fragments.

Chapin

PROPOSED

Mostly blocked by the infrastructure side of things.

Proposed

Feature

Team

Status

Comment

Cap QM bio-polymer fragments?

Chapin

PROPOSED

 

Small Molecule Features

Under Investigation

Feature

Team

Status

Comment

Refit BCC Parameters

@Simon Boothroyd @Owen Madin

IN PROGRESS

 

Virtual Sites

@Simon Boothroyd @Trevor Gokey

IN PROGRESS

@Trevor Gokey is currently working on extending openff-recharge to facilitate fitting these.

Potential collaboration with @Danny Cole?

Refit of Select Impropers

@Jessica Maat

IN PROGRESS

An initial study of the WBO dependency of the aniline improper is underway.

Proposed

Feature

Team

Status

Comment

Re-evaluate whether WBO interpolated parameters are now feasible after other FF improvements.

@Pavan Behara

 

 

Fuzzy charges / polarisability?

 

 

 

Reduced LJ Types?

@Michael Gilson @Willa Wang @Tobias Huefner

PROPOSED

 

Outstanding Issues

Issue

Assignee

Status

Comment

The rationale and methodology for selecting the bio-polymer data sets is currently undocumented.

@Michael Shirts

@Michael Shirts will follow up with Dave to get an explanation of why and how these sets were chosen.

These are the selected data sets:

2020-07-06-OpenFF-Protein-Fragments-Initial

2020-07-27-OpenFF-Benchmark-Ligands

2020-08-12-OpenFF-Protein-Fragments-version2

2020-09-16-OpenFF-Protein-Fragments-TorsionDrives

2020-10-27-OpenFF-Protein-Fragments-Unconstrained

2020-10-26-PEPCONF-Optimization

External Collaborations

Issue

External Driver

Status

Comment

Explore different non-bonded functional forms

@Daniel Cole

 

Currently supported in our infrastructure using plugins. Consider rolling this into the toolkit instead?