Why do we need NAGL -LW
Timing: just after NAGL1 force field release?
Draft:
Most terms in a classical MD force field can be fit and applied like templates to molecular substructures, but the electrostatics typically can’t. Whereas a molecule can be decomposed into groups that each receive the appropriate bond, angle, torsion, and van der Waals terms, discretizing the charge distribution across the molecule into fixed partial charges requires, well, the entire molecule. Most force fields in the small molecule and biomolecular space therefore shy away from distributing pre-packaged charges in favour of a charge model that assigns charges on a per-molecule basis. These models are typically based on quantum chemistry methods. The RESP approach, famously, fits charges to the electrostatic potential (ESP) at a grid of points around the molecule (computed at HF/6-31G* in the gas phase in the original publication). Semi-empirical approaches such as AM1-BCC instead carry out an AM1 calculation and apply empirically fit bond-charge corrections on top.
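To make "the ESP at a grid of points" concrete, here is a minimal sketch of the quantity a RESP-style fit tries to reproduce: the potential generated by fixed, atom-centred point charges. The three-site molecule, its charges, and the grid points are made-up illustrative values, not taken from any force field, and a real fit would compare these numbers against a QM-computed ESP on a much denser grid.

```python
import math

# Coulomb constant, approximately 332.0637 kcal*Angstrom/(mol*e^2), so charges
# in e and distances in Angstrom give a potential in kcal/(mol*e).
COULOMB_CONSTANT = 332.0637


def esp_at_point(point, coords, charges):
    """Electrostatic potential at `point` from fixed atom-centred point charges."""
    return COULOMB_CONSTANT * sum(
        q / math.dist(point, xyz) for xyz, q in zip(coords, charges)
    )


# Hypothetical three-site molecule: illustrative charges and two made-up geometries.
charges = [-0.8, 0.4, 0.4]
conformer_a = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]
conformer_b = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.96, 0.0, 0.0)]

# A tiny, hand-picked "grid" of points around the molecule.
grid = [(3.0, 0.0, 0.0), (0.0, 3.0, 0.0), (0.0, 0.0, 3.0)]

esp_a = [esp_at_point(p, conformer_a, charges) for p in grid]
esp_b = [esp_at_point(p, conformer_b, charges) for p in grid]

for point, va, vb in zip(grid, esp_a, esp_b):
    print(point, round(va, 2), round(vb, 2))
```

Because the potential depends on where the atoms sit, the same set of charges gives different grid values for the two geometries.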
There are a couple of issues with QM-based charge models. Firstly, anything that requires a QM calculation is going to choke badly as molecules get larger, requiring tedious workarounds such as fragmenting large molecules into smaller pieces, capping the fragments, assigning charges, then uncapping and re-joining the pieces into the larger molecule with some method for ensuring an overall integer charge. In fact, relative efficiency is one of the reasons AM1-BCC became so popular as [an alternative charge model](https://doi.org/10.1063/5.0019056) to RESP in the GAFF force field; it too was fit to reproduce the ESP at HF/6-31G*. However, even semi-empirical methods have their limits; anything approaching the size of a protein is usually a no-go. Protein force fields usually include charges for canonical residues in their parameters, but what if you want to model a covalently-bound ligand or post-translational modification? Back to Frankenstein’s lab you go.
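As a rough illustration of the last step in that fragment-and-rejoin workflow, the sketch below shows one simple way to restore an integer total charge: spreading the leftover charge evenly over all atoms. This is only one possible scheme, chosen here for brevity; the function name and the example charges are hypothetical, and real workflows may weight or restrict the correction differently.

```python
def redistribute_to_integer_charge(charges, target_total):
    """Shift every charge by the same amount so the charges sum to `target_total`."""
    if not charges:
        raise ValueError("need at least one charge to redistribute over")
    residual = target_total - sum(charges)
    shift = residual / len(charges)
    return [q + shift for q in charges]


# Hypothetical charges left over after uncapping and re-joining fragments;
# they sum to -0.036 e rather than the intended net charge of 0.
rejoined = [-0.512, 0.203, 0.118, 0.094, 0.061]
corrected = redistribute_to_integer_charge(rejoined, target_total=0)

print(sum(rejoined), sum(corrected))
```

An even shift keeps relative charge differences between atoms intact, but it nudges every atom, including ones far from the cut points.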
Speed aside, another problem that is becoming increasingly well-characterised is the conformer-dependence of the charges derived from QM-based methods. The electrostatic potential around the surface of a molecule will necessarily vary as the geometry changes, so the charges obtained from RESP-like methods will too. We observe this to a lesser extent with population-based charge methods like AM1-BCC as well (although its lower conformer dependence is another reason it is a popular alternative). And ok, well, this might be annoying to a force field developer; they’ve just fit a nice tidy potential energy function and here you are using it willy-nilly with all sorts of charges. But how bad is it for *you*?
Turns out it can be pretty bad. Meghan Osato and co-authors have recently released a [PAPER/PRE-PRINT](Evaluating the functional importance of conformer-dependent atomic partial charge assignment ) where they systematically varied several factors that go into the calculation of AM1-BCC molecular charges, and then looked at the effect on absolute hydration free energies. They found not only that the input conformer caused individual charges to vary by as much as XXX e, but also that factors as chemically irrelevant as the *current system load on the machine* could result in differences in the charge output [NB: CONFIRM, FROM VERBAL COMMENTS] by causing a different subroutine to be selected. And these charge differences resulted in fairly significant differences in the hydration free energy; conformers with charge differences as small as 0.026 e resulted in free energy differences of up to 3.3 ± 0.1 kcal/mol.
So what is to be done? Well, one way to ensure consistency is to completely remove conformer dependence by not considering the geometry of the molecule at all when assigning charges. Open Force Field decided to roll its own solution to both the conformer-dependence and speed problems by switching to a graph neural network for assigning charges, which we’ve code-named NAGL1. Our model does not use any geometry-specific features, and requires only the molecular graph. It’s also pretty fast [insert speed graph below], even compared to more traditional methods such as storing and assigning charges by residue templates.
NAGL1 is fit to reproduce AM1-BCC charges, as that was previously the standard OpenFF charge model. However, NAGL1 should *not* be considered another implementation of AM1-BCC charges. Instead, we have released NAGL1 as the canonical charge model of our newest force field, [WHAT ARE WE GOING TO NAME THIS]. We see the release of NAGL1 as a major improvement not because the charges are amazingly better quality – in fact our goal was that this part should remain the same! – but for the consistency it brings to simulation, and the speed of system set-up. For example, did you know you can now: [link to examples that are hopefully mainline in docs by then!]
assign charges to a polymer/protein in XX seconds
build a PTM simulation system in XX lines of code and YY seconds?
????
We would love for you to try it out and give us any feedback on the impact it has on your workflow, good or bad, because spoiler alert: we’re already well into more improvements with NAGL2.
Outline
electrostatics can’t be given template treatment
Most charge models (i.e. QM based ones) are:
slow
conformation-dependent
In fact AmberTools AM1-BCC can vary a lot (Meghan’s paper)
So it makes it hard to assign charges
consistently
quickly
to large molecules
Enter NAGL, which is
fast
consistent
easy [?]
Charge molecules quickly:
link to example that hopefully exists by then
Assign parameters to arbitrarily large molecules
Assign self-consistent charges to protein and ligand
Text graveyard
Most people who’ve used a classical MD force field will realize that electrostatics are treated a bit differently. Generally a force field contains a bunch of “template” parameters that can be applied transferably across molecular substructures. A particular kind of carbonyl group, for example, might get particular parameters for the bond length and force constant, specific Lennard-Jones parameters for the van der Waals interactions, and so on.
Not so for electrostatics. In a classical atom-centered fixed-charge force field each atom in a molecule is assigned a “partial charge”, a fictitious property