2022-11-10 Force Field Release Meeting notes

Date

Nov 10, 2022

Participants

@Pavan Behara
@Christopher Bayly
@Michael Gilson
@Chapin Cavender
@David Mobley
@Diego Nolasco (Deactivated)
@Lily Wang
@Matt Thompson
@Simon Boothroyd
@Trevor Gokey
@Jeffrey Wagner
Tim Bernat
@Michael Shirts
Bill Swope

Discussion topics

Item	Presenter	Notes

Item	Presenter	Notes
GNN charge model status	@Lily Wang	Recording: https://drive.google.com/file/d/1CcSksh9ovsEIEFvwVQJAMWq7NhBjjyYT/view?usp=sharing LW will post slides here DM – Split by diversity = diverse molecules in both sets, or … LW – For the validation set, I tried to pick a broad set across the whole dataset CBy – When you say “no more than 10 molecules each”, could you have chosen less? Could you have chosen 0? LW – What riniker’s paper did is compute the atom environment for all envs as an atom pair fingerprint. Then they pooled it with different limited pool sizes. CBy – So, if one pool had less than 10 molecules, they’d put them all in? LW – Yes DM – So, if there’s something hard to avoid, like an aromatic carbon ring, that would be included several times in the pools for other groups, right? LW – Yes CBy – Would this be missing any data? Are there some atom environments that were very poorly sampled? LW – Yes, that would be from the initial dataset we looked at. CBy - -With AM1BCC, there’s a whole pathology when you have an internal hydrogen bond, so that’s when ELF is used. Did you do something ELF-like? LW – Kinda… SB – When we picked conformers, we used an ELF method DM – Also worth keeping in mind that this talk is focused on picking the graph net. MS – Is it possible that Hs are weird because they only have one connection to the graph? So they have fewer connections to their environment? LW – Possibly. I followed Riniker’s method pretty closely. I think that should account for diversity MS – Also, if the Hs are off, shouldn’t another atom be off as well? Or maybe it’s offset to other hydrogens in the other direction? LW – The ambertools molecules are still computing, so those may have outliers once the data comes in. SB – (detailed question, see video, around 25 minutes) (poorly performing molecule slide) PB – Did you use the high-energy conformations from SPICE? LW – These just took graphs from SPICE, then recomputed using toolkit CBy – There are a lot of sulfonates/sulfonic acids represented here. LW – I wonder if those are underrepresented in training SB – When I did this originally, I also found sulfur to be really tricky. And that was on the industry test set, not the SPICE set. SB – Also, I found that some of our sets had a lot of sulfonic acids, just to keep sulfonates, and the issue persisted. So I don’t think it was sulfonic acid in the protonations tate causing problems. LW – So you filtered sulfonic acids out of the test set? SB – Yes CBy – I’m also seeing a lot of carboxylic acids instead of carboxylates. That’s a frequent source of user input error if they don’t try to keep things close to physiologic pH CBy – Re: population histograms - The more polar a bond is, the more the effect on polar atoms/hydrogens. So those may be the biggest contributors to the max charge difference plots. CBy – When you have a monopole with a carboxylate (like RCO2-), you can get things pretty wrong as long as the charge is on the C or one of the Os, and it’s fine from an outside perspective. So is the question whather we get the point-centered charge right, or should we do something like look at freesolv, and do PB calculations… then use that as a comparison. CBy – I really like this work, like becoming conformationally independent I know a lot about the systematic deficiencies in AM1BCC - They’ll underestimate aldehydes, high-valent sulfurs. So training to AM1BCC is great, but it has these known defects. So in the long run we may benefit from training to an ESP-like model. I think fitting directly to electrostatic potentials would avoid a lot of the numerical instabilities/problems that charge method training usually encounters. Because we use AM1BCC right now, the shortest gap is to produce a replacement that performs the same (and that’s what would be made here). The better thing would be to train to ESPs. When I look at how the hidden layers talk to each other, I start to worry about the effects of delocalization. PB made a dataset that looks at aromatic groups with electron donating/withdrawing substituents. So AM1 can account for delcalization. Can the GNN do this? Does it? LW – The architecture can. I haven’t probed this question directly. So I could run on PB’s dataset as a test. PB – Go for it, it’s the “aniline dataset” https://github.com/openforcefield/qca-dataset-submission/tree/master/submissions/2021-04-02-OpenFF-Aniline-Para-Opt-v1.0 SB – Some trickiness there. For 1, you can only look so many neighbors out. So we by default look 4 bonds out, which can miss things. SB – But I think LW is looking directly at some other delocalized situations, that should be informative CBy – Para-substituted benzenes are the most direct test. SB – Direct ESP fitting: That was looked at previously, but the computational expense to generate the training set was quite high. So when we trained+tested on small fragments the performance was quite bad. SB – What you’d kinda think is that the NN would learn a distinction between buried and unburied atoms, but it didn’t seem to do that in my case. SB – In terms of architectures, edge+global features would be really useful, and especially giving total charge would be cool. There’s a recent paper in JCIM where they looked at solvation free energies using NNs, and it looked a lot like what we’re doing, so that may be cool to look at. There’s also a model recently released by twitter that improves on stable diffusion, and they claim it can look 10 hops, so that would be really promising. MS – In terms of distance, we’ll eventually want to capture conjugation effects. So I wonder if there’s some way to include some long-distance features but not others. Maybe it would start by designing a dataset to capture this. SB – I think one way to do this would be to include specific features that capture this - For example we enumerate resonance forms and average formal charge. But the issue here is that resonance enumeration is combinatorial, so while I have some tricks to reduce this, but it still hits limits in big systems. CBy – Could we look at atoms and say “oh, this is conjugated”. So maybe each atom could have some opinion on whether it’s conjugated, and is electron donating/withdrawing. LW – I think this is where edge features would come in really handy, though calculating partial bond orders is a slow thing unto itself. SB – The MGilson method may capture this CBy – Maybe aromaticity could fill this gap LW – Agree - Fitting to ESPs is a good ultimate goal. But as an intermediate goal, training to AM1BCC is a good bridge CBy – “Blocker to rolling this out” – Shouldn’t we look at how these charges perform on coulombic interactions? Like, look at ddEs compared to AM1BCC? DM – Maybe not “the same”, but “as good” JW – Agree MS – How does runtime look? LW – Much faster than AM1BCC, probably a fraction of a second for even “big” small molecules DM – Oh, potential legal problem - We may need to get approval from OpenEye since we’re using their data as a reference. I’ll reach out to Ant and Jeffs. JW – Can we enumerate problems with rolling this out to a FF? DM – Sulfur containing molecules - JW – Troublesome hydrogens? DM – Legal rights for training MT – Ensure that an charge assignment engine can be packaged. CBy – Runtime may be an issue here. JW – Could we cut down runtime by using a lighter-weight ML model application engine (lighter than pytorch) SB – I think it should be possible. Also look out for blockers related to DGL getting onto conda forge LW – Are the people with power interested in getting DGL on conda forge? SB – A lot of the issue is that DGL vendors a lot of subpackages that need to get ported to c-f. During my time the issue was metis. Now it may be something else. MT – I think we shouldn’t rely on DGL getting ported, try to find a slimmed-down pytorch that we can use today. MT – Needs to be a SMIRNOFF EP. JW – I don’t think it needs to be put in the SMIRNOFF spec - It’ll just be a piece of software that says “I can provide AM1 charges” JW – Would be good to put an entire protein through to check runtime and librarycharge-like outcome MS – Crosslinked polymer that’s 100k atoms? CBy – Maybe a big conjugated polymer? SB – I did GLUx300 earlier, this took 5-10 seconds, most of it was resonance enumeration. SB – Guardrails on really weird inputs - Like checking for hydrogens with a +1.01 charge, and issueing a warning MS – Maybe comparison to gasteiger? CBy - MMFF isn’t too bad either. JW – I think RDKit and OpenEye offer both. LW – That’s a good idea. But running MMFF on a big molecule using my macbook. SB – Even a population analysis would be good. JW – We should reach out to Yuanqing and see how we can align efforts with espaloma. JW – Other datasets? SB – Chodera lab did NCI AM1BCC, maybe some others DM – Maybe Enamine? SB – Enamine technically says “dont pull all our stuff down”, but we emailed them and they said it’s fine. I’ll forward you the email.

2022-11-10 Force Field Release Meeting notes

Date

Participants

Discussion topics

Action items

Decisions