Non-bonded optimization

Driver

Approver

Contributors

Stakeholder

@Simon Boothroyd

 

@Owen Madin

@David Mobley @Michael Shirts @Michael Gilson @Lee-Ping Wang @John Chodera @Karmen Condic-Jurkic

Objective

Develop a plan for a set of studies exploring which data should go into the next main OpenFF refit.

Due date

N / A

Status

Evolving

Project Timeline

The tentative timeline for refitting is to spend:

  • February and March performing feasibility studies to determine which data, and in what composition, will go into the next major non-bonded optimisations.

  • April performing the refit, including the data that the feasibility studies highlight as good targets. This will, however, depend on the timeline of Team QM.

Brainstorming

All rough ideas for the properties which should be trained / tested against should be sketched out in the https://openforcefield.atlassian.net/wiki/spaces/FF/pages/180256852/Brainstorming page before being translated into a feasibility study in the table below, with a corresponding project page which serves as that study's plan.

Feasibility Studies

High Priority:

Study

Status

Project Page

Optimisation with / without $H_{vap}$ while including mixture data.

in progress

https://openforcefield.atlassian.net/wiki/spaces/FF/pages/122454022/Binary+Mixture+Data+Feasibility+Study

Optimization against $V_{excess}(x)$ or $\rho(x)$

in progress

https://openforcefield.atlassian.net/wiki/spaces/FF/pages/122454022/Binary+Mixture+Data+Feasibility+Study

How much pure data do we need to include to get a good parameterization? Can mixture data alone get the energetics right?

in progress

https://openforcefield.atlassian.net/wiki/spaces/FF/pages/122454022/Binary+Mixture+Data+Feasibility+Study

For mixture properties - does including aqueous mixture data in addition to non-aqueous mixture data improve the overall force field performance (i.e. should we only include non-aqueous data to avoid trying to correct for the water model?).

Unplanned

 

TIP3P vs TIP3P-FB

Unplanned

 

How strongly are low-representation molecules affected by the amount of data? Do we need regularization?

Unplanned

 

How reproducible are ForceBalance optimizations with physical properties as targets? If there is an effect, does this effect scale with the number of targets?

Unplanned

 
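The two mixture targets named above, $V_{excess}(x)$ and $\rho(x)$, are closely related: the excess molar volume of a binary mixture can be computed from the mixture density and the two pure-component densities. The sketch below assumes the standard thermodynamic definition; the function name, argument names, and units are illustrative and are not taken from any OpenFF tool.

```python
def excess_molar_volume(x1, m1, m2, rho_mix, rho1, rho2):
    """Excess molar volume (cm^3/mol) of a binary mixture.

    Hypothetical helper for illustration only.
    x1:                 mole fraction of component 1 (0 <= x1 <= 1)
    m1, m2:             molar masses of the components (g/mol)
    rho_mix:            density of the mixture at composition x1 (g/cm^3)
    rho1, rho2:         densities of the pure components (g/cm^3)
    """
    x2 = 1.0 - x1
    # Molar volume of the real mixture.
    v_mix = (x1 * m1 + x2 * m2) / rho_mix
    # Molar volume of an ideal mixture: mole-fraction-weighted pure volumes.
    v_ideal = x1 * m1 / rho1 + x2 * m2 / rho2
    return v_mix - v_ideal
```

Because the pure-component volumes are subtracted out, $V_{excess}(x)$ isolates the non-ideal part of the mixing behaviour, whereas $\rho(x)$ also carries the pure-component density information.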

Medium Priority:

Study

Status

Project Page

How do density and hvap individually constrain the parameters?

Unplanned

 

Do OpenEye and RDKit produce AM1-BCC charges which are identical / close enough to not drastically change the FF performance?

Unplanned

 

Low Priority:

Study

Status

Project Page

What is the difference in fitting between a set where some molecules have only hvap and others have only density, versus a set where all molecules have both hvap and density?

Unplanned

 

Covariance of parameters?

Unplanned

 

A trial run with a sharply reduced set of LJ types? This could reduce the amount of experimental data needed, and Michael Schauperl's results indicate one can do quite nicely with fewer types.

Unplanned

 

Explore the change in performance with a split parameter ([#1:1]-[#6X4]-[#6X3]=[#7,#8]), which was identified as a target for splitting in https://openforcefield.atlassian.net/wiki/spaces/FF/pages/122454022/Binary+Mixture+Data+Feasibility+Study

Unplanned

 

Actions

 

Assignee

 

Coordinate with Team QM (@Lee-Ping Wang, @Hyesu Jang) on the timeline for the full FF refit.

@Simon Boothroyd