# Non-bonded optimization

Objective: Develop a plan for a set of studies exploring which data should go into the next main OpenFF refit.

## Project Timeline

The tentative timeline for refitting is to spend:

• February and March performing feasibility studies to determine which data, and in what composition, will go into the next major non-bonded optimisations.

• April performing the refit, including the data that the feasibility studies highlight as good targets. This will, however, depend on the timeline of Team QM.

## Brainstorming

All rough ideas for the properties to train or test against should be sketched out on the https://openforcefield.atlassian.net/wiki/spaces/FF/pages/180256852 page before being translated into a feasibility study in the table below, each with a corresponding project page that serves as that study's plan.

## Feasibility Studies

#### High Priority:

| Study | Status | Project page |
| --- | --- | --- |
| Optimisation with / without $H_{vap}$ while including mixture data. | In progress | |
| Optimization against $V_{excess}(x)$ or $\rho(x)$. | In progress | https://openforcefield.atlassian.net/wiki/spaces/FF/pages/122454022 |
| How much pure data do we need to include to get a good parameterization? Can mixture data alone get the energetics right? | In progress | https://openforcefield.atlassian.net/wiki/spaces/FF/pages/122454022 |
| For mixture properties, does including aqueous mixture data in addition to non-aqueous mixture data improve the overall force field performance, or should we include only non-aqueous data to avoid trying to correct for the water model? | Unplanned | |
| TIP3P vs TIP3P-FB | Unplanned | |
| How affected are low-representation molecules by the amount of data? Do we need regularization? | Unplanned | |
| How reproducible are ForceBalance optimizations with physical properties as targets? If there is an effect, does it scale with the number of targets? | Unplanned | |
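The choice between $V_{excess}(x)$ and $\rho(x)$ as a target rests on the two carrying closely related information: the excess molar volume is the molar volume of the mixture minus the mole-fraction-weighted molar volumes of the pure components, each of which follows from a density. The snippet below is a minimal sketch of that conversion, not the code used in the studies. The function name, argument names, and units (g/mol, g/cm^3) are assumptions made for illustration, and the numbers in the usage lines are placeholders rather than experimental data.

```python
def excess_molar_volume(x, molar_masses, pure_densities, mixture_density):
    """Return the excess molar volume in cm^3/mol (hypothetical helper).

    x               -- mole fractions of the components (must sum to 1)
    molar_masses    -- molar mass of each component in g/mol
    pure_densities  -- density of each pure component in g/cm^3
    mixture_density -- density of the mixture at composition x in g/cm^3
    """
    if abs(sum(x) - 1.0) > 1e-8:
        raise ValueError("mole fractions must sum to 1")

    # Molar volume of the real mixture: mean molar mass / mixture density.
    v_mixture = sum(xi * mi for xi, mi in zip(x, molar_masses)) / mixture_density

    # Ideal-mixing molar volume: mole-fraction-weighted pure molar volumes.
    v_ideal = sum(
        xi * mi / rho for xi, mi, rho in zip(x, molar_masses, pure_densities)
    )

    return v_mixture - v_ideal


# Placeholder inputs for an equimolar binary mixture (not experimental values).
v_ex = excess_molar_volume(
    x=[0.5, 0.5],
    molar_masses=[18.0, 46.0],
    pure_densities=[1.0, 0.8],
    mixture_density=0.9,
)
print(f"V_excess = {v_ex:.3f} cm^3/mol")
```

Because the pure-component terms are subtracted, $V_{excess}(x)$ isolates the non-ideal part of the mixing behaviour, whereas $\rho(x)$ also carries the pure-component densities.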

#### Medium Priority:

| Study | Status | Project page |
| --- | --- | --- |
| How do density and $H_{vap}$ individually constrain the parameters? | Unplanned | |
| Do OpenEye and RDKit produce AM1-BCC charges which are identical / close enough to not drastically change the FF performance? | Unplanned | |

#### Low Priority:

| Study | Status | Project page |
| --- | --- | --- |
| What is the difference in fitting between a set where some molecules have only $H_{vap}$ and others have only density, versus a set where all molecules have both $H_{vap}$ and density? | Unplanned | |
| Covariance of parameters? | Unplanned | |
| A trial run with a sharply reduced set of LJ types? This could reduce the amount of experimental data needed, and Michael Schauperl's results indicate that one can do quite well with fewer. | Unplanned | |
| Explore the change in performance with a split parameter (`[#1:1]-[#6X4]-[#6X3]=[#7,#8]`), which was identified as a target for splitting in https://openforcefield.atlassian.net/wiki/spaces/FF/pages/122454022 | Unplanned | |

## Actions

Coordinate with Team QM (@Lee-Ping Wang, @Hyesu Jang) on the timeline for the full FF refit.

