This is a template for OpenFF science and research project plans. Each plan must contain the table below with information about the project driver, approver, key goals, and key metrics. It is recommended to follow the format of the rest of the template as well where possible, and to review the project plan on a periodic basis to track progress.

Driver

Alexandra McIsaac

Approver

Lily Wang, Brent Westbrook

Contributors

?

Other stakeholders

David Mobley, Michael Gilson, Michael Shirts, Daniel Cole

Objective

A neural network charge model, trained at a higher level of theory than AM1-BCC, that can assign conformer-independent charges to both small molecules and large systems

Time frame

?

Key outcomes

A neural network charge model that:

  • Is trained on data with a higher level of QM theory than AM1-BCC, with polarization effects from a solvent model

  • Can accurately assign charges to small molecules and large systems at a reasonable speed

  • Assigns charges that perform better in simulation than AM1-BCC

  • Corrects issues with sulfur and phosphorus charges

A force field incorporating:

  • NAGL2 charges

  • Re-trained vdW terms

  • Re-trained valence terms

Key metrics

  • Test-set error equal to or lower than that of NAGL

  • Improved performance over NAGL/AM1-BCC-ELF10 on “real-world” benchmarks (e.g. solvation free energies, protein-ligand benchmarks, or similar targets), especially for molecules containing hypervalent atoms

Status

GitHub repo

A link to a GitHub repo containing work on the project

Slack channel

https://openforcefieldgroup.slack.com/archives/CDR1P66Q2

Designated meeting

The go-to meeting for discussion and updates about this project

Released force field

The first released force field this work appears in, or N/A if the project is ended due to poor results.

Publication

The publication on the project, if any.

Problem Statement and Objective

AM1-BCC charges are trained to reproduce RESP charges, which are calculated at a low level of QM theory (HF/6-31G*) and rely on the overpolarization at that level of theory to fortuitously mimic charge polarization in solution. HF/6-31G* is particularly poorly suited to sulfur and phosphorus, which can be hypervalent, as well as to some other functional groups. Additionally, it has been shown that HF/6-31G* does not overpolarize charges by a consistent amount across systems, and that within a given system it erroneously polarizes solvent-accessible and buried atoms by the same amount. These polarization errors grow with system size, so they cause more problems for large systems than for small molecules.

To model electrostatics accurately, we will train a graph neural network (GNN) charge model that addresses these problems. We will train the GNN on data at a higher level of QM theory to capture more accurately the electrostatics of difficult cases, such as systems containing hypervalent atoms. We will model solvent polarization directly by using a solvent model.

Scope

Must have:

  • Required objectives and features of the finished product for release.

Nice to have:

  • Objectives and features that would be nice to include, but should not block release.

Not in scope:

  • Objectives and features we decide we will *not* consider in this project, e.g. because they will be targeted in a future phase of the project.

Project Approaches

Use the "Science Project Phase Plan" template to create child pages under this one to document each phase of the project. They will be automatically listed below.

References

List all relevant resources for this project (GitHub repos, other Confluence pages, literature).