Purpose: This page is for brainstorming which data should be included when training and/or benchmarking force fields against physical property data.

Any idea on this page that people wish to see acted upon should be translated into a feasibility study on the main page, with a corresponding project plan page that outlines the rationale for the study, its intended outcomes, its priority, and who intends to conduct it (see the page as an example).

Reproducibility of ForceBalance runs with phys prop targets

  • Because of the uncertainty inherent in the equilibrium simulations used to compare against phys prop targets, there is a stochastic element in each ForceBalance optimization. Is this element significant enough to make the results of an LJ optimization less reproducible?

  • This effect may be more apparent in optimizations against more targets, since the total amount of uncertainty might be increased.

  • It may also depend on the properties used as targets, since mixture properties will on average be noisier than pure properties.

  • Good place to start would be with the sets from
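
One way to frame the reproducibility question is to ask how much scatter the sampling noise alone puts on the objective function. The following is a minimal sketch, assuming a simple weighted least-squares objective (an illustrative form, not ForceBalance's actual implementation) and independent Gaussian noise on each simulated property. All function names and numbers are hypothetical.

```python
import random
import statistics


def objective(simulated, reference, denominators):
    """Weighted least-squares objective (an assumed form, for illustration only)."""
    return sum(
        ((s - r) / d) ** 2
        for s, r, d in zip(simulated, reference, denominators)
    ) / len(reference)


def objective_spread(true_values, reference, denominators, sim_std,
                     n_repeats=1000, seed=0):
    """Mean and standard deviation of the objective when every simulated
    property carries independent Gaussian noise of width sim_std."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_repeats):
        # Draw one noisy "simulation result" per target property.
        noisy = [rng.gauss(v, s) for v, s in zip(true_values, sim_std)]
        samples.append(objective(noisy, reference, denominators))
    return statistics.mean(samples), statistics.stdev(samples)


# Hypothetical example: three densities (g/mL) with small statistical errors.
mean_obj, std_obj = objective_spread(
    true_values=[0.79, 0.87, 1.00],
    reference=[0.80, 0.88, 1.00],
    denominators=[0.05, 0.05, 0.05],
    sim_std=[0.002, 0.002, 0.002],
)
print(f"objective = {mean_obj:.4f} +/- {std_obj:.4f}")
```

If the spread is comparable to the change in the objective between optimizer iterations, repeated runs could plausibly end at different parameters. Repeating the estimate with more targets or with larger per-property noise would mimic the cases raised in the bullets above.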

Reduce the number of VdW types in the force field:

  • The work by @Michael Gilson and @Michael Schauperl suggests that it may be possible to reduce the number of VdW LJ types without compromising the accuracy of the force field.

Differentiate LJ types for ketone/amide/imine hydrogens

  • In the mixture feasibili

Pure Property Data which may be used during fitting:

  • Density is a safe option and likely should be included.

  • Hvap may still be included to ensure we retain some information about the cohesive energies.

    • Which and how much Hvap data should be included? 

    • Only ‘non-polar’ molecules (we would need to define a ‘non-polar’ metric and cutoff)? The same amount of data as for density, or less?

    • Should we fit to a dataset based entirely on enthalpies of mixing, with no Hvap?

  • No surface tension as it is too much of an unknown to implement / test / include by May. 

    • This will be a good candidate to aim for and benchmark against in this round.

    • Is vapor pressure more reliable than Hvap?  Benchmarking? Do we want to use vapor pressure instead of Hvap?

  • No dielectric constant data until we begin to re-fit the electrostatics.

Mixture Property Data which may be used during fitting:

  • The current target candidates are some combination of enthalpy of mixing, binary mass density and excess molar volumes.

  • Should we include aqueous + non-aqueous mixture data, or only non-aqueous (possible feasibility study)?

    • The organic/aqueous interface is so important for bio systems that including these data seems like an opportunity to really improve the FF.

  • Should we use TIP3P or TIP3P-FB? Does changing the water model make a noticeable difference when fitting aqueous mixture data (possible feasibility study)?

    • If we could do another short test optimization that includes water/organic mixtures, we might get a lot of information quickly about what to do about the water model.

  • What can we use for benchmarking?

  • What partitioning data is there, how is it validated, and is it open?

    • How much should we worry about water getting into the organic phase and vice versa? Can we include this in the modelling?

  • Activity coefficients, osmotic coefficients
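
For reference, the excess molar volume and enthalpy of mixing candidates above are both differences between a mixture quantity and the mole-fraction-weighted pure-component quantities. The following is a small sketch of the standard binary-mixture definitions. The function names and the example inputs are hypothetical, not taken from any data set discussed here.

```python
def excess_molar_volume(x1, molar_mass_1, molar_mass_2, rho_mix, rho_1, rho_2):
    """Excess molar volume of a binary mixture in cm^3/mol.

    Molar masses are in g/mol and densities in g/cm^3; x1 is the mole
    fraction of component 1.
    """
    x2 = 1.0 - x1
    # Molar volume of the real mixture.
    v_mix = (x1 * molar_mass_1 + x2 * molar_mass_2) / rho_mix
    # Molar volume of the ideal mixture built from the pure components.
    v_ideal = x1 * molar_mass_1 / rho_1 + x2 * molar_mass_2 / rho_2
    return v_mix - v_ideal


def enthalpy_of_mixing(x1, h_mix, h_1, h_2):
    """Enthalpy of mixing per mole of mixture, given molar enthalpies of the
    mixture and of the two pure components (all in the same units)."""
    return h_mix - (x1 * h_1 + (1.0 - x1) * h_2)


# Hypothetical inputs, for illustration only.
v_excess = excess_molar_volume(
    x1=0.5, molar_mass_1=46.07, molar_mass_2=18.015,
    rho_mix=0.93, rho_1=0.79, rho_2=1.00,
)
print(f"V_excess = {v_excess:.3f} cm^3/mol")
```

Because both quantities are small differences between larger numbers, the statistical errors of the mixture and pure-component simulations combine, which is one reason mixture targets may be noisier than pure-property targets.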

Molecule Selection

  • Reduce the number of overly halogenated compounds. 

    • How do we define this? Maximum number of halogens per molecule? Maximum relative to the molecule size / their relative positioning in the molecule?

  • More heterocycles?

  • More commonly used compounds such as benzene, ethane, ethanol, etc.?

    • (Compounds that can tell us that one specific parameter is wrong will likely help us, whereas having only polyfunctional compounds makes fitting more complex, I think.)

  • Try to include as many common solvents from the GRAS list as possible.
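
The "overly halogenated" question above could be prototyped with two simple metrics: the absolute halogen count and the halogen fraction of heavy atoms. The following is a rough sketch that tokenizes SMILES strings with a regular expression. It is an assumption-laden illustration (a cheminformatics toolkit would be more robust), the thresholds are placeholders, and it does not address the relative positioning of the halogens.

```python
import re

# Bracket atoms (e.g. [Cl-], [nH]) or organic-subset atoms in a SMILES string.
# Two-letter symbols come first so "Cl" is not read as "C" followed by "l".
_ATOM_PATTERN = re.compile(r"\[[^\]]+\]|Cl|Br|[BCNOPSFI]|[bcnops]")
# Element symbol inside a bracket atom, after an optional isotope number.
_BRACKET_ELEMENT = re.compile(r"\[\d*([A-Z][a-z]?|[a-z]{1,2})")
_HALOGENS = {"F", "Cl", "Br", "I"}


def heavy_atoms(smiles):
    """Return the element symbols of all non-hydrogen atoms in a SMILES string."""
    elements = []
    for token in _ATOM_PATTERN.findall(smiles):
        if token.startswith("["):
            match = _BRACKET_ELEMENT.match(token)
            if match is None:
                continue  # unrecognised bracket atom; skipped in this sketch
            symbol = match.group(1).capitalize()
        else:
            symbol = token.capitalize()  # aromatic "c" becomes "C"
        if symbol != "H":
            elements.append(symbol)
    return elements


def halogen_stats(smiles):
    """Return (halogen count, halogen fraction of heavy atoms)."""
    atoms = heavy_atoms(smiles)
    n_halogen = sum(1 for atom in atoms if atom in _HALOGENS)
    fraction = n_halogen / len(atoms) if atoms else 0.0
    return n_halogen, fraction


def is_overly_halogenated(smiles, max_count=3, max_fraction=0.5):
    """Flag a molecule that exceeds either (hypothetical) threshold."""
    n_halogen, fraction = halogen_stats(smiles)
    return n_halogen > max_count or fraction > max_fraction


for smiles in ["CCO", "Clc1ccccc1", "ClC(Cl)(Cl)Cl"]:
    print(smiles, halogen_stats(smiles), is_overly_halogenated(smiles))
```

Reporting both metrics for a candidate set would show how many molecules each proposed cutoff removes before committing to a definition.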

Rough targets for set sizes:

  • 200-300 molecules in training?

    • Should investigate holding out training data to ensure generalisation of parameters during optimisation

  • 200-400 for benchmarking.

Data which may be used for Benchmarking:

  • Surface tension data would be good (will need to figure out system setup issues)

  • Activity coefficients?

  • Vapor pressure (Can we do those calculations robustly)?

  • Host-Guest: Feasibility study?

  • Partitioning data (LogP, …) if we can find high-quality open data.

  • Endpoint heats of mixing/vapor pressure (i.e. properties at the mole fraction extremes).

  • No: Protein-ligand binding? It would be good to have as a more relevant benchmark, but could be expensive/time-consuming.

  • Dielectric coefficients (we mostly get this for free anyway from the pure data simulations)

Amber compatibility

Benchmark on systems where one component is a side chain analogue parameterised with Amber FF parameters (discuss with the VdW Refitters: @Simon Boothroyd, @Michael Shirts, @Owen Madin).