Project Plan

Driver

 

Approver

@Lily Wang

Contributors

@Simon Boothroyd @Joshua Horton

Informed

@David Mobley @Michael Shirts @Michael Gilson @Daniel Cole

Objective

Train a set of generalised virtual sites for inclusion in a general molecular force field

Due date

Key outcomes

Status

in progress

 

Problem Statement

Atom-centred charges do not accurately capture the anisotropy of the electrostatic potential (ESP) around certain moieties, such as the sigma hole present on halogens. Introducing off-centre charges (virtual sites) can alleviate this issue.

Figure: ESP of CBr at 1.4 vdW radius, computed using HF/6-31G*, exhibiting a sigma hole (left); a FF with no v-sites shows no anisotropy (middle), while a FF with a single v-site shows improved anisotropy (right).
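The effect described above can be reproduced with a toy point-charge model. The sketch below is illustrative only: the charges, the probe distance and the v-site offset are made-up values in atomic units, not fitted parameters, and only the halogen (placed at the origin, with the bond along the z axis) is modelled. It compares the ESP along the bond extension with the ESP perpendicular to the bond.

```python
import math


def esp(point, charges):
    """ESP at `point` from (charge, position) pairs: V = sum_i q_i / |r - r_i|."""
    return sum(q / math.dist(point, position) for q, position in charges)


# Illustrative values only (atomic units); none of these are fitted parameters.
PROBE_DISTANCE = 3.5
along_bond = (0.0, 0.0, PROBE_DISTANCE)     # probe on the bond extension
perpendicular = (PROBE_DISTANCE, 0.0, 0.0)  # probe at 90 degrees to the bond

# Model 1: a single atom-centred charge on the halogen.
atom_only = [(-0.10, (0.0, 0.0, 0.0))]

# Model 2: same total charge, but part of it is moved onto a positive
# off-centre site (v-site) placed along the bond extension.
with_vsite = [
    (-0.15, (0.0, 0.0, 0.0)),
    (+0.05, (0.0, 0.0, 1.0)),
]


def anisotropy(charges):
    """ESP along the bond extension minus ESP perpendicular to the bond."""
    return esp(along_bond, charges) - esp(perpendicular, charges)


print(f"atom-centred only: {anisotropy(atom_only):+.4f}")
print(f"with v-site:       {anisotropy(with_vsite):+.4f}")
```

In this toy model the atom-centred charge gives the same ESP in both directions, whereas the v-site makes the ESP along the bond extension more positive than the ESP perpendicular to it, which is the qualitative signature of a sigma hole.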

Scope

Must have:

  •  

Nice to have:

  •  

Not in scope:

  •  

Workplan

Open Software

The software required to carry out this project spans most of the OpenFF stack, since changing the charge model will likely have a large impact on everything downstream (e.g. vdW, valence, …). The project is expected to require maintenance of, and extensions to:

OpenFF Recharge

  • Reconstruct ESP and EF data from QCA records

  • Estimate ESP / EF using a FF model

  • Train BCC and v-site parameters and export these into a SMIRNOFF force field

  • Generate RESP charges to serve as a ‘reference model’

  • SMIRKS representation of AM1BCC BCCs

splore

  • Easily visualise data sets of molecules either local or from QCA

molesp

  • Compute / visualise the ESP on the vdW surface of a molecule

OpenFF Evaluator

  • Used to compute the training set of properties while training the vdW parameters

nonbonded

  • Automate the set-up of training the vdW parameters against the phys-prop data

OpenFF Bespokefit

  • Automate the set-up of training the valence parameters against QC data

absolv

  • Benchmark FF with v-sites against solvation / hydration free energy data

Open Data

At a minimum, the project will need diverse training and test sets of ESP data that are made publicly available via QCA.

Level of theory

It was decided to compute the ESP at the HF/6-31G* level of theory, as is the current norm in the field. Although it is not perfect, it is not clear that there is another candidate with the right balance of computational speed and ‘accuracy’ (defined in terms of how well the final charges reproduce properties of interest, e.g. Gsolv, Gbind). See https://openforcefieldgroup.slack.com/archives/CDR1P66Q2/p1635523613025700?thread_ts=1635406892.011100&cid=CDR1P66Q2 for details on this choice.

Generating ESP

The data is expected to be generated as follows:

  1. Generate a diverse set of ELF conformers for each molecule in the train (/test) set

  2. Minimize each conformer at the HF/6-31G* level of theory (no wave functions are stored, to save storage space on QCA)

  3. Discard any conformers that

    1. Formed h-bonds after minimization

    2. Underwent a connectivity change

  4. Perform a single-point calculation on the minimized conformer to compute and store the wave function

  5. Recompute the ESP from the data on QCA and store it locally using OpenFF Recharge
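The filtering in step 3 can be sketched as a simple predicate over minimized conformers. This is a minimal sketch under stated assumptions: the record layout (`bonds_before`, `bonds_after`, `formed_hbond`) and the helper names are hypothetical, and the bond perception and hydrogen-bond detection are assumed to have been carried out upstream by whichever tools the project settles on.

```python
def bonds(*pairs):
    """Build an order-independent set of bonded atom-index pairs."""
    return {frozenset(pair) for pair in pairs}


def keep_conformer(record):
    """Return True if a minimized conformer passes both filters in step 3."""
    connectivity_changed = record["bonds_before"] != record["bonds_after"]
    return not (record["formed_hbond"] or connectivity_changed)


# Hypothetical records; in practice these would be derived from the QC results.
conformers = [
    {   # unchanged connectivity, no hydrogen bond: kept
        "name": "conf-0",
        "bonds_before": bonds((0, 1), (1, 2)),
        "bonds_after": bonds((1, 0), (2, 1)),
        "formed_hbond": False,
    },
    {   # formed a hydrogen bond during minimization: discarded
        "name": "conf-1",
        "bonds_before": bonds((0, 1), (1, 2)),
        "bonds_after": bonds((0, 1), (1, 2)),
        "formed_hbond": True,
    },
    {   # bond 1-2 was replaced by bond 0-2: discarded
        "name": "conf-2",
        "bonds_before": bonds((0, 1), (1, 2)),
        "bonds_after": bonds((0, 1), (0, 2)),
        "formed_hbond": False,
    },
]

kept = [record["name"] for record in conformers if keep_conformer(record)]
print(kept)
```

Storing each bond as a `frozenset` makes the comparison independent of atom order within a bond, so a bond listed as (1, 0) after minimization still matches (0, 1) before it.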

Selecting the ESP train / test set