Espaloma discussion 10/18/2022

Recording link: Video Conferencing, Web Conferencing, Webinars, Screen Sharing

Attendees:

Yuanqing Wang
David Mobley
Mike Gilson
Michael Shirts
Pavan Behara
Daniel Cole
Chapin Cavender

Notes:

There was some confusion over the GAFF atom type prediction experiment. Can we explain this experiment graphically in a more straightforward manner?

David Mobley: Pavan and I spent some time (with Chris Bayly and others) going through the largest errors in the VEHICLe set, as pointed out by Yuanqing's work, and concluded that we could fix quite a few of them with some simple changes. However, industry feedback was mostly that the "most problematic" ones usually involved chemistries the partners considered "not drug-like", e.g. the compound Yuanqing was showing that looks like an explosive.

Pavan: One point of discussion from earlier slides: I was not expecting the espaloma-predicted FF parameters (bond and angle force constants) to be that close to OpenFF (or GAFF) in Figure 3a. I was expecting some variation, at least for double- and single-bond force constants, since those in our current FFs are very different from the QM (or modified Seminario) suggested values.

Yuanqing is experimenting with fitting parts of SPICE right now, but is mostly focusing on getting fitting to forces to scale to SPICE-sized datasets.

David: Mispredicted geometries for druglike compounds are a significant problem for industry partners: deviations of minima, weird ring puckering, aromatic X-H bond angles, etc.
David: How do we hunt for areas where we need more training data? What about adversarial learning techniques to identify new QM data points?

Could run dynamics and cross-check energies and prioritize conformers where there are big differences in energy.
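
A minimal sketch of that prioritization, assuming per-snapshot energies from two models (for example MM and a QM reference) are already available as arrays. The function name, the mean-offset alignment, and the numbers are illustrative assumptions, not part of any existing tool.

```python
import numpy as np

def prioritize_snapshots(e_model, e_ref, top_k=3):
    """Return indices of the snapshots where two energy models disagree most.

    Each model's mean energy is removed first, because a constant offset
    between methods carries no information about the energy surface.
    """
    e_model = np.asarray(e_model, dtype=float)
    e_ref = np.asarray(e_ref, dtype=float)
    diff = (e_model - e_model.mean()) - (e_ref - e_ref.mean())
    order = np.argsort(-np.abs(diff))  # largest disagreement first
    return order[:top_k], diff

# Hypothetical energies (kcal/mol) for five snapshots from one trajectory.
e_mm = [0.0, 1.2, 3.5, 0.4, 2.0]
e_qm = [0.0, 1.0, 1.0, 0.5, 3.0]
picked, diff = prioritize_snapshots(e_mm, e_qm, top_k=2)
print(picked)  # snapshots to send for new QM calculations
```

The selected snapshots would then be candidates for new QM data points.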

John: May need three levels: large, expensive QM datasets, followed by high-quality ML surrogates of QM (ms/calculation), followed by MM fitting to the ML surrogate.

David: xTB is available in QCEngine, and it’s really fast and reasonably high quality
Grimme lab has a Python interface (accessed via QCEngine)

Yuanqing: We have a straightforward way to predict uncertainty in espaloma parameters by evaluating with different models, so we can in principle use this uncertainty to drive an active learning scheme on compound sets of interest.
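
A minimal sketch of how such an ensemble-based uncertainty could drive selection, assuming each model in the ensemble has already predicted a parameter vector for every molecule. The array layout, function names, and numbers are hypothetical; this is not espaloma's API.

```python
import numpy as np

def ensemble_uncertainty(predictions):
    """Score each molecule by the spread of its predicted parameters.

    predictions has shape (n_models, n_molecules, n_params). The score is the
    standard deviation across models, averaged over parameters. In practice
    parameters with different units would need to be scaled first.
    """
    predictions = np.asarray(predictions, dtype=float)
    std_across_models = predictions.std(axis=0)  # (n_molecules, n_params)
    return std_across_models.mean(axis=1)        # (n_molecules,)

def select_for_labeling(predictions, n_select):
    """Pick the molecules the ensemble is least certain about."""
    scores = ensemble_uncertainty(predictions)
    return np.argsort(-scores)[:n_select], scores

# Hypothetical predictions: 3 models, 3 molecules, 2 parameters each.
preds = [
    [[100.0, 1.0], [200.0, 2.0], [300.0, 3.0]],
    [[100.0, 1.0], [220.0, 2.0], [300.0, 3.2]],
    [[100.0, 1.0], [180.0, 2.0], [300.0, 2.8]],
]
chosen, scores = select_for_labeling(preds, n_select=2)
print(chosen)  # molecules to prioritize for new QM data
```

An active learning loop would add QM data for the chosen molecules, retrain the ensemble, and repeat.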

Yuanqing: Large datasets like ANI are not useful here because they lack topology information.

Pavan is writing up a benchmarking paper: the current QM level of theory for OpenFF is still the best compromise between speed and accuracy

Michael: How do we fit this into NIH grant proposal?

espaloma/convolutional graph nets could be considered as the evolution of recognizing discrete atom types; SMARTS aren’t specific enough to capture subtle details like Wiberg bond orders or charge information
being end-to-end optimizable eliminates the intractability of mixed discrete/continuous optimization of typing and parameter assignment, so everything can be optimized together
enables uncertainty quantification of resulting parameter sets to propagate model uncertainty into parameters
uses modern capabilities of machine learning frameworks

Emphasize that it’s easier to perform experiments with new physical functional forms to quantify which physical models work better and why.

David: OpenFF infrastructure folks have not been able to highly automate benchmarks yet. An NIH grant aim could be to invest more in automated benchmarking infrastructure

Experiments:

“ddE” minimized conformer energy differences: Instead of QM vs MM RMSE for the same snapshots, we should also compute the “ddE energy distribution” where both QM and MM are minimized independently to different conformations in the same basin, and the relative conformer energies are used to quantify the error between QM and MM. Pavan has done some of these comparisons.

BespokeFit paper metrics for torsion profiles (DC - current BespokeFit metrics are described in the discussion around Table 2 here: https://doi.org/10.26434/chemrxiv-2022-6h628 ; reviewers have asked us to add an analysis along the lines of Fig 3 here: https://doi.org/10.1021/acs.jcim.1c01346).

MRS notes: use espaloma and BespokeFit giving better free energies with just a torsion / BAT refit as justification for continuing to improve these terms in the NIH grant.

We have not yet compared fitting to TorsionDrives as well as OptimizationDatasets

We are working to fit to forces as well as energies to see if this eliminates some of the deviations we see from QM minimum conformational locations
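
A minimal sketch of a combined energy and force objective, to illustrate the idea rather than espaloma's actual loss. The mean-centering of energies, the `force_weight` parameter, and the array shapes are assumptions. In a real fit the predicted forces would come from differentiating the MM energy with respect to coordinates inside an autodiff framework.

```python
import numpy as np

def energy_force_loss(e_pred, e_ref, f_pred, f_ref, force_weight=1.0):
    """Mean-squared energy error plus weighted mean-squared force error.

    e_pred, e_ref: shape (n_snapshots,), energies for one molecule.
    f_pred, f_ref: shape (n_snapshots, n_atoms, 3), forces.
    Energies are mean-centered so a constant offset between MM and QM
    does not contribute to the loss.
    """
    e_pred = np.asarray(e_pred, dtype=float)
    e_ref = np.asarray(e_ref, dtype=float)
    f_pred = np.asarray(f_pred, dtype=float)
    f_ref = np.asarray(f_ref, dtype=float)
    de = (e_pred - e_pred.mean()) - (e_ref - e_ref.mean())
    energy_term = np.mean(de ** 2)
    force_term = np.mean((f_pred - f_ref) ** 2)
    return energy_term + force_weight * force_term

# Hypothetical data: 3 snapshots of a 2-atom system.
loss = energy_force_loss(
    e_pred=[1.0, 2.0, 3.0],
    e_ref=[11.0, 12.0, 13.0],      # same shape, constant offset only
    f_pred=np.zeros((3, 2, 3)),
    f_ref=np.ones((3, 2, 3)),
    force_weight=0.5,
)
print(loss)
```

In this example the energy term vanishes because the two energy profiles differ only by a constant, so the loss comes entirely from the force error, which is the information energies alone would miss.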

Currently we have to fiddle with decorators. This is the part of the project where the most human effort is invested; how do we find the right ones? Having typing be continuous would make this easier.

JDC: consistent biopolymer and small molecule force field. Covalent ligands easily parameterized. In principle able to do uncertainty propagation.

ML would not have interpretability. Can try class II FF terms; it suddenly becomes very easy to do that. Could bolt on a module that handles polarizability.

DLM: benchmarking reproducibility/standardization. (Goal for the grant?)

MRS: POTENTIAL ACTION ITEM: As a validity check, it would be good to see whether espaloma can properly predict OpenFF bond/angle/torsion/vdW types, to avoid all the atom type confusion (if we are to say that it’s a natural evolution of chemical environment recognition).

OpenFF toolkit examples/ directory has examples of predicting assigned valence types for bonds, angles, torsions. Can we predict these from discrete classifier stage added to end of espaloma Stage 2?

Action items:

Challenges: incorporate other experimental data into this.
- Josh Fass has shown that we can fit to hydration free energies
Josh has done propagation of experimental free energies.
Straightforward.
Test on mixture properties as well.