Info

This scientific roadmap covers the next two planned force field releases and lists the scientific studies that need to be performed in 2020. Each study has a priority assigned to it. The roadmap and its priorities will be revised and updated in June 2020.

Force Fields

Upcoming force field versions:

...

The list of scientific studies that need to be performed in 2020 will be updated every 3 months, as suggested in the science project management workflow. Each study should be linked to its Confluence page with more information about study design, execution and results. The study design should be submitted before the study begins.

...

Study

Priority

Effort

Science dependencies

Infrastructure dependencies

Comment

Start date

End date

Status

Driver/Team

Chemical perception

Addition of new parameters – manually fixing problems

Status
colourRed
titlehigh

Status
colourRed
titlehigh

Made easier by benchmarking dashboard (Optional)

Made easier by benchmarking dashboard (Optional)

Status
colourYellow
titlein progress

Hyesu Jang David Mobley Jessica Maat (Deactivated) Victoria Lim (Deactivated)

Automated typing inference from scratch

Status
colourYellow
titleMedium

Status
colourRed
titlehigh

Organise a meeting to coordinate efforts.

Full-time person needed – to be discussed further. Work of Josh Fass (Deactivated) and Tobias Huefner may assist here. Owen Madin interested.

Mixture Properties

Binary Mixture Data Feasibility Study

Status
colourRed
titlehigh

Status
colourYellow
titlein progress

Driver: Simon Boothroyd
Team: Michael Shirts Owen Madin

Non-bonded optimization

Status
colourRed
titlehigh

Status
colourRed
titlehigh

Status
colourYellow
titlein progress

Driver: Simon Boothroyd
Team: Michael Shirts Owen Madin

Chemical potential-like properties

Status
colourYellow
titleMedium

Non-bonded optimization

Implementation in Evaluator

Need to evaluate the data first (testing needed)

Status
colourYellow
titleIn Progress

Simon Boothroyd Spinoff (student)

Octanol-water partition coefficients

Status
colourYellow
titleMedium

Implementation in Evaluator

Data needed, harder problem

Status
titlenot started

Simon Boothroyd spinoff (Student)

Data coverage and availability

Status
colourRed
titlehigh

Feasibility studies

Check the available data and identify missing data points. How to fill the gaps will be decided later; for Sage we will use the data we already have.

Ongoing

Simon Boothroyd Owen Madin Michael Shirts

QM Data Generation

QM dataset selection (training data)

Status
colourRed
titlehigh

Need to expand to benchmarking set.

Status
colourYellow
titlein progress

David Mobley Jessica Maat (Deactivated) Hyesu Jang

Benchmarking/re-evaluating our choice of QM theory

Status
colourRed
titlehigh

(Optional) QC Dataset submission infrastructure

Test of the whole torsiondrive. Keep within 10-50 torsiondrives. More is better.

Hyesu Jang lead; Lee-Ping Wang

Hyesu Jang also leading molecule set selection with help from Jessica Maat (Deactivated) and Victoria Lim (Deactivated)

Protomer/tautomer enumerated molecules

Status
colourRed
titlehigh

QM level of theory validation (QMLoTV)

Protonation/tautomer enumeration integration (Joshua Horton doing OE version in toolkit; there’s currently no good protonation state enumeration with RDKit – see https://github.com/openforcefield/openforcefield/issues/526)

Data on molecules with nonzero formal charges

Status
colourRed
titlehigh

QM level of theory validation (QMLoTV)

(Optional) QC Dataset submission infrastructure

Enamine REAL fragment coverage

Status
colourYellow
titleMEDIUM

Automated fragmentation integration (Joshua Horton)

Ligand Expo fragment coverage

Status
colourYellow
titleMEDIUM

Automated fragmentation integration (Joshua Horton)

Ligand Expo has higher priority than Enamine Real.

Richer torsion data for WBO fitting

Status
colourGreen
titleLow

WBO torsion implementation

(person needed to continue work of Chaya Stern (Deactivated))

Biopolymer data selection (ensure sidechain data is available in QCA)

Status
colourRed
titlehigh

ASAP

David Cerutti (Deactivated)

Biopolymer data computation

Status
colourYellow
titleMEDIUM

(Optional) QC Dataset submission infrastructure

David Cerutti (Deactivated)

More efficient torsion sampling with fewer grid points during scan

Status
colourGreen
titleLow

Status
colourPurple
titlespinoff

Fitting

Addition of new parameters – manually fixing problems

Status
colourRed
titlehigh

Status
colourRed
titlehigh

Ongoing

Status
colourYellow
titlein progress

Hyesu Jang David Mobley Jessica Maat (Deactivated) Victoria Lim (Deactivated)

LJ refitting (Sage)

Status
colourRed
titlehigh

Non-bonded optimization

Status
colourYellow
titlein progress

Simon Boothroyd and Owen Madin

WBO refitting (Sage)

Status
colourRed
titlehigh

More torsion data

WBO torsion implementation

Implement what Chaya has already done, as soon as infrastructure is ready.

After May meeting

Late 2020 (Sep 2020)

Jessica Maat (Deactivated) Hyesu Jang; someone else to continue where Chaya left off

BCC refitting

Status
colourRed
titlehigh

LJ refit

Patterns for BCCs; could start with something simple like bond SMARTS.

ChargeIncrementModel implementation (early May)

Person needed (spinoff)

David Mobley can help
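
The idea of starting from simple bond patterns can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the pattern table, the increment values, and the function name `apply_bccs` are hypothetical, and a real implementation would match SMARTS/SMIRKS patterns with a cheminformatics toolkit instead of comparing element symbols and bond orders. The sketch only shows the defining property of a bond charge correction: the increment added to one atom of a bond is subtracted from the other, so the total charge is conserved.

```python
# Hypothetical bond patterns and increments (illustrative values only).
# Each key is (element_a, element_b, bond_order), standing in for a bond
# SMARTS such as "[#6:1]-[#8:2]". The increment is added to atom a and
# subtracted from atom b.
BCC_TABLE = {
    ("C", "O", 1): 0.10,
    ("C", "O", 2): 0.15,
    ("C", "N", 1): 0.05,
}


def apply_bccs(elements, bonds, base_charges):
    """Return base_charges corrected by the bond increments in BCC_TABLE.

    elements: list of element symbols, one per atom
    bonds: list of (atom_index_i, atom_index_j, bond_order) tuples
    base_charges: list of starting partial charges, one per atom
    """
    charges = list(base_charges)
    for i, j, order in bonds:
        forward = (elements[i], elements[j], order)
        reverse = (elements[j], elements[i], order)
        if forward in BCC_TABLE:
            q = BCC_TABLE[forward]
            charges[i] += q
            charges[j] -= q
        elif reverse in BCC_TABLE:
            q = BCC_TABLE[reverse]
            charges[j] += q
            charges[i] -= q
        # Bonds without a matching pattern receive no correction.
    return charges


# Usage: heavy atoms of a C-O single bond, starting from zero charges.
corrected = apply_bccs(["C", "O"], [(0, 1, 1)], [0.0, 0.0])
```

Because every increment is applied with opposite signs to the two bonded atoms, the sum of the corrected charges equals the sum of the base charges regardless of the values in the table.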

Study how to set prior widths and weights for different sorts of data during FF optimization

Status
colourGreen
titleLow

Lee-Ping Wang Hyesu Jang Spinoff?
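
For orientation, the roles of weights and prior widths can be written as a generic regularized objective. This is a sketch under assumed notation, not necessarily the exact objective used in the optimization: $w_t$ is the weight of fitting target $t$, $L_t(\theta)$ is that target's misfit for parameters $\theta$, $\theta_i^{0}$ is the starting value of parameter $i$, and $\sigma_i$ is its prior width.

```latex
\chi^2(\theta) \;=\; \sum_{t} w_t \, L_t(\theta)
\;+\; \sum_{i} \frac{\left(\theta_i - \theta_i^{0}\right)^2}{\sigma_i^{2}}
```

In this form, a larger $w_t$ makes target $t$ count more in the fit, while a larger $\sigma_i$ lets parameter $i$ move further from its starting value before the penalty term becomes significant.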

Value of data generated “incidentally” during torsiondrive in fitting, e.g. optimization snapshots, gradients, energies (low control over these data points)

Status
colourGreen
titleLow

Some parts of Bespoke workflow

Joshua Horton (question)

Status
colourPurple
titlespinoff

Benchmarking

Small reference systems for fast testing of free energy (FE) infrastructure: 5-10 small systems, possibly a subset of SAMPL challenges, for comparing different free energy methods without having to use large protein-ligand (P-L) systems for test calculations

Status
colourRed
titlehigh

Status
colourGreen
titleLow

Should use SAMPLing challenge systems plus a couple more similar ones.

ASAP

Status
titleNot started

David Mobley Michael Gilson John Chodera David Hahn – owner

Benchmarking/re-evaluating our choice of QM theory

Status
colourRed
titlehigh

Status
colourYellow
titlemedium

Status
titleNot started

Lee-Ping Wang (question)

CCDC data selection/release

Status
colourGreen
titleLow

Status
colourPurple
titlespinoff

Create a list of tests to judge the “quality” of biopolymer FF with our scientific advisory board

Status
colourYellow
titlemedium

Organise a meeting with our IAB; invite them to the May meeting

April / May

David Cerutti (Deactivated)

openff-1.2.0 (Parsley) benchmarking

Minor release of Parsley

Benchmarking dashboard

Mid 2020

openff-2.0.0 (Sage) benchmarking

Release of Sage

Benchmarking dashboard

Late 2020

Biopolymers

Which quantum method should we use for biopolymers (should it be the same as small molecules)?

Status
colourYellow
titlemedium

QM benchmarking study

Lee-Ping Wang David Cerutti (Deactivated)

Feasibility/benchmarking studies of torsional CMAPs

Status
colourYellow
titlemedium

After protein FF implementation

CMAP support in OFFTK

David Cerutti (Deactivated)

Feasibility/benchmarking studies of other cross-terms

Status
colourGreen
titleLow

Support for cross-terms in OFFTK

Charges

GCN charge model

Status
colourRed
titlehigh

In a few steps:

  • conda-installable tool to assign charges

  • integration of tool into OFFTK under ChargeIncrementModel keyword (and exposure of relevant keywords)

Status
colourYellow
titlein progress

John Chodera Yuanqing Wang

Off-site charge SMIRKS definition/fitting/benchmarking

Status
colourYellow
titlemedium

Status
colourRed
titlehigh

VirtualSite support in OFFTK

Status
colourPurple
titlespinoff
(but interface with David Cerutti (Deactivated) work?)

Bayesian inference and surrogate modeling

Testing Bayesian inference on an analytical model

Status
colourYellow
titlemedium

Status
colourGreen
titleLow

Status
colourYellow
titlein progress

Owen Madin
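
As a hedged illustration of what such a test can look like, the sketch below checks a numerical posterior estimate against a closed-form answer. The model is an assumption chosen for simplicity (a straight line through the origin with Gaussian noise and a Gaussian prior on the slope); the roadmap does not specify the analytical model, and all function names and data values here are hypothetical.

```python
import math


def model(x, a):
    # Hypothetical analytical model: a straight line through the origin.
    return a * x


def log_posterior(a, xs, ys, sigma, prior_mean, prior_width):
    # Unnormalized log posterior: Gaussian prior plus Gaussian likelihood.
    log_prior = -0.5 * ((a - prior_mean) / prior_width) ** 2
    log_like = sum(-0.5 * ((y - model(x, a)) / sigma) ** 2
                   for x, y in zip(xs, ys))
    return log_prior + log_like


def posterior_mean_on_grid(xs, ys, sigma, prior_mean, prior_width,
                           lo, hi, n=2001):
    # Numerical estimate: evaluate the posterior on a uniform grid of
    # slope values and take the weighted average.
    grid = [lo + (hi - lo) * k / (n - 1) for k in range(n)]
    logp = [log_posterior(a, xs, ys, sigma, prior_mean, prior_width)
            for a in grid]
    shift = max(logp)  # subtract the maximum to avoid underflow
    weights = [math.exp(lp - shift) for lp in logp]
    return sum(a * w for a, w in zip(grid, weights)) / sum(weights)


def posterior_mean_exact(xs, ys, sigma, prior_mean, prior_width):
    # Closed-form posterior mean for this conjugate Gaussian case.
    precision = 1.0 / prior_width ** 2 + sum(x * x for x in xs) / sigma ** 2
    numerator = (prior_mean / prior_width ** 2
                 + sum(x * y for x, y in zip(xs, ys)) / sigma ** 2)
    return numerator / precision


# Usage with made-up data generated near a slope of 2.
xs = [1.0, 2.0, 3.0]
ys = [2.1, 3.9, 6.2]
numerical = posterior_mean_on_grid(xs, ys, 0.5, 0.0, 10.0, 0.0, 4.0)
exact = posterior_mean_exact(xs, ys, 0.5, 0.0, 10.0)
```

Because the exact answer is known, any inference method (grid evaluation here, or a sampler in a fuller study) can be validated on the analytical model before it is applied to cases that require simulation.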

Generalizing analytical model for Bayesian inference and testing methods

Status
colourYellow
titlemedium

Status
colourYellow
titlemedium

We don’t need the Bayesian framework to work immediately

Status
colourYellow
titlein progress

Simon Boothroyd

Constructing full Bayesian architecture with reweighting and simulation to build surrogate models

Status
colourYellow
titlemedium

Status
colourRed
titlehigh

Analytical Bayesian inference testing

Status
titleNot started

Simon Boothroyd Owen Madin Matt Thompson ?

Automated typing inference from scratch

Status
colourRed
titlehigh

Status
colourRed
titlehigh

Full-time person needed – to be discussed further. Work of Josh Fass (Deactivated) and Tobias Wulsdorf may assist here.

Other

Water co-optimization planning study (to be executed later) – discuss with Lee-Ping Wang

Status
colourGreen
titleLow

Status
colourRed
titlehigh

spinoff

Thinking about metals / ions / salts / ionic liquids

Status
colourGreen
titleLow

Status
colourRed
titlehigh

Owen Madin Matt Thompson spinoff

...