2022-11-10 Force Field Release Meeting notes

 

 

 Date

Nov 10, 2022

 Participants

  • @Pavan Behara

  • @Christopher Bayly

  • @Michael Gilson

  • @Chapin Cavender

  • @David Mobley

  • @Diego Nolasco (Deactivated)

  • @Lily Wang

  • @Matt Thompson

  • @Simon Boothroyd

  • @Trevor Gokey

  • @Jeffrey Wagner

  • Tim Bernat

  • @Michael Shirts

  • Bill Swope

 Discussion topics

Item

Presenter

Notes

Item

Presenter

Notes

GNN charge model status

@Lily Wang

  • Recording: https://drive.google.com/file/d/1CcSksh9ovsEIEFvwVQJAMWq7NhBjjyYT/view?usp=sharing

  • LW will post slides here

  • DM – Split by diversity = diverse molecules in both sets, or …

    • LW – For the validation set, I tried to pick a broad set across the whole dataset

  • CBy – When you say “no more than 10 molecules each”, could you have chosen less? Could you have chosen 0?

    • LW – What riniker’s paper did is compute the atom environment for all envs as an atom pair fingerprint. Then they pooled it with different limited pool sizes.

    • CBy – So, if one pool had less than 10 molecules, they’d put them all in?

      • LW – Yes

    • DM – So, if there’s something hard to avoid, like an aromatic carbon ring, that would be included several times in the pools for other groups, right?

      • LW – Yes

    • CBy – Would this be missing any data? Are there some atom environments that were very poorly sampled?

      • LW – Yes, that would be from the initial dataset we looked at.

  • CBy - -With AM1BCC, there’s a whole pathology when you have an internal hydrogen bond, so that’s when ELF is used. Did you do something ELF-like?

    • LW – Kinda…

    • SB – When we picked conformers, we used an ELF method

    • DM – Also worth keeping in mind that this talk is focused on picking the graph net.

  • MS – Is it possible that Hs are weird because they only have one connection to the graph? So they have fewer connections to their environment?

    • LW – Possibly. I followed Riniker’s method pretty closely. I think that should account for diversity

  • MS – Also, if the Hs are off, shouldn’t another atom be off as well? Or maybe it’s offset to other hydrogens in the other direction?

    • LW – The ambertools molecules are still computing, so those may have outliers once the data comes in.

  • SB – (detailed question, see video, around 25 minutes)

  • (poorly performing molecule slide)

    • PB – Did you use the high-energy conformations from SPICE?

      • LW – These just took graphs from SPICE, then recomputed using toolkit

    • CBy – There are a lot of sulfonates/sulfonic acids represented here.

      • LW – I wonder if those are underrepresented in training

      • SB – When I did this originally, I also found sulfur to be really tricky. And that was on the industry test set, not the SPICE set.

      • SB – Also, I found that some of our sets had a lot of sulfonic acids, just to keep sulfonates, and the issue persisted. So I don’t think it was sulfonic acid in the protonations tate causing problems.

      • LW – So you filtered sulfonic acids out of the test set?

      • SB – Yes

      • CBy – I’m also seeing a lot of carboxylic acids instead of carboxylates. That’s a frequent source of user input error if they don’t try to keep things close to physiologic pH

    • CBy – Re: population histograms - The more polar a bond is, the more the effect on polar atoms/hydrogens. So those may be the biggest contributors to the max charge difference plots.

    • CBy – When you have a monopole with a carboxylate (like RCO2-), you can get things pretty wrong as long as the charge is on the C or one of the Os, and it’s fine from an outside perspective. So is the question whather we get the point-centered charge right, or should we do something like look at freesolv, and do PB calculations… then use that as a comparison.

  • CBy –

    • I really like this work, like becoming conformationally independent

    • I know a lot about the systematic deficiencies in AM1BCC - They’ll underestimate aldehydes, high-valent sulfurs. So training to AM1BCC is great, but it has these known defects. So in the long run we may benefit from training to an ESP-like model.

    • I think fitting directly to electrostatic potentials would avoid a lot of the numerical instabilities/problems that charge method training usually encounters.

    • Because we use AM1BCC right now, the shortest gap is to produce a replacement that performs the same (and that’s what would be made here). The better thing would be to train to ESPs.

    • When I look at how the hidden layers talk to each other, I start to worry about the effects of delocalization. PB made a dataset that looks at aromatic groups with electron donating/withdrawing substituents. So AM1 can account for delcalization. Can the GNN do this? Does it?

      • LW – The architecture can. I haven’t probed this question directly. So I could run on PB’s dataset as a test.

      • PB – Go for it, it’s the “aniline dataset”

      • https://github.com/openforcefield/qca-dataset-submission/tree/master/submissions/2021-04-02-OpenFF-Aniline-Para-Opt-v1.0

      • SB – Some trickiness there. For 1, you can only look so many neighbors out. So we by default look 4 bonds out, which can miss things.

      • SB – But I think LW is looking directly at some other delocalized situations, that should be informative

      • CBy – Para-substituted benzenes are the most direct test.

      • SB – Direct ESP fitting: That was looked at previously, but the computational expense to generate the training set was quite high. So when we trained+tested on small fragments the performance was quite bad.

      • SB – What you’d kinda think is that the NN would learn a distinction between buried and unburied atoms, but it didn’t seem to do that in my case.

      • SB – In terms of architectures, edge+global features would be really useful, and especially giving total charge would be cool. There’s a recent paper in JCIM where they looked at solvation free energies using NNs, and it looked a lot like what we’re doing, so that may be cool to look at. There’s also a model recently released by twitter that improves on stable diffusion, and they claim it can look 10 hops, so that would be really promising.

      • MS – In terms of distance, we’ll eventually want to capture conjugation effects. So I wonder if there’s some way to include some long-distance features but not others. Maybe it would start by designing a dataset to capture this.

      • SB – I think one way to do this would be to include specific features that capture this - For example we enumerate resonance forms and average formal charge. But the issue here is that resonance enumeration is combinatorial, so while I have some tricks to reduce this, but it still hits limits in big systems.

      • CBy – Could we look at atoms and say “oh, this is conjugated”. So maybe each atom could have some opinion on whether it’s conjugated, and is electron donating/withdrawing.

      • LW – I think this is where edge features would come in really handy, though calculating partial bond orders is a slow thing unto itself.

      • SB – The MGilson method may capture this

      • CBy – Maybe aromaticity could fill this gap

  • LW – Agree - Fitting to ESPs is a good ultimate goal. But as an intermediate goal, training to AM1BCC is a good bridge

  • CBy – “Blocker to rolling this out” – Shouldn’t we look at how these charges perform on coulombic interactions? Like, look at ddEs compared to AM1BCC?

    • DM – Maybe not “the same”, but “as good”

    • JW – Agree

  • MS – How does runtime look?

    • LW – Much faster than AM1BCC, probably a fraction of a second for even “big” small molecules

  • DM – Oh, potential legal problem - We may need to get approval from OpenEye since we’re using their data as a reference. I’ll reach out to Ant and Jeffs.

  • JW – Can we enumerate problems with rolling this out to a FF?

    • DM – Sulfur containing molecules -

    • JW – Troublesome hydrogens?

    • DM – Legal rights for training

    • MT – Ensure that an charge assignment engine can be packaged.

      • CBy – Runtime may be an issue here.

      • JW – Could we cut down runtime by using a lighter-weight ML model application engine (lighter than pytorch)

      • SB – I think it should be possible. Also look out for blockers related to DGL getting onto conda forge

      • LW – Are the people with power interested in getting DGL on conda forge?

      • SB – A lot of the issue is that DGL vendors a lot of subpackages that need to get ported to c-f. During my time the issue was metis. Now it may be something else.

      • MT – I think we shouldn’t rely on DGL getting ported, try to find a slimmed-down pytorch that we can use today.

    • MT – Needs to be a SMIRNOFF EP.

      • JW – I don’t think it needs to be put in the SMIRNOFF spec - It’ll just be a piece of software that says “I can provide AM1 charges”

    • JW – Would be good to put an entire protein through to check runtime and librarycharge-like outcome

    • MS – Crosslinked polymer that’s 100k atoms?

      • CBy – Maybe a big conjugated polymer?

      • SB – I did GLUx300 earlier, this took 5-10 seconds, most of it was resonance enumeration.

    • SB – Guardrails on really weird inputs - Like checking for hydrogens with a +1.01 charge, and issueing a warning

      • MS – Maybe comparison to gasteiger?

      • CBy - MMFF isn’t too bad either.

      • JW – I think RDKit and OpenEye offer both.

      • LW – That’s a good idea. But running MMFF on a big molecule using my macbook.

      • SB – Even a population analysis would be good.

  • JW – We should reach out to Yuanqing and see how we can align efforts with espaloma.

  • JW – Other datasets?

    • SB – Chodera lab did NCI AM1BCC, maybe some others

    • DM – Maybe Enamine?

  • SB – Enamine technically says “dont pull all our stuff down”, but we emailed them and they said it’s fine. I’ll forward you the email.

 Action items

 Decisions