2025-07-02 Cole Lab/OpenFF Meeting notes

Participants

  • @Daniel Cole

  • @Finlay Clark

  • @David Mobley

  • @Jennifer Clark

  • @Joshua Horton

  • Julia Rice

  • @Jeffrey Wagner

 

Slides

off_newcastle_check-in-020725.pptx

Discussion topics

Notes

Fitting using smee

  • (see recording before 10 mins for notes before this)

  • DC 10 – if you only have a small number of torsions to train, you want electrostatics and sterics to do some of the work for you

    • JH – So 1-4 electrostatics scaling is going down, so this is contributing less?

    • FC – Yes
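To make the 1-4 scaling point concrete, here is a minimal sketch (not smee's implementation). It assumes the usual AMBER-style convention in which the Coulomb energy between atoms separated by three bonds is multiplied by a 1-4 scale factor, so a lower factor means these pairs contribute less to the torsion profile. The function name, charges, and distance are hypothetical. 0.8333 is the conventional AMBER-style value and 0.5 is an arbitrary lower value for comparison.

```python
# Coulomb constant in kcal * Angstrom / (mol * e^2)
COULOMB_KCAL_ANGSTROM = 332.0637


def coulomb_14(q_i, q_j, r_angstrom, scale_14):
    """Scaled 1-4 Coulomb energy (kcal/mol) for one atom pair.

    q_i, q_j: partial charges in units of e (hypothetical values below)
    r_angstrom: pair distance in Angstrom
    scale_14: 1-4 electrostatics scale factor
    """
    return scale_14 * COULOMB_KCAL_ANGSTROM * q_i * q_j / r_angstrom


# Same pair, two scale factors: the lower factor shrinks the contribution.
e_conventional = coulomb_14(0.4, -0.5, 3.0, scale_14=0.8333)
e_reduced = coulomb_14(0.4, -0.5, 3.0, scale_14=0.5)
print(e_conventional, e_reduced)
```

With a smaller 1-4 contribution, more of the torsion profile has to be carried by the fitted torsion terms.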


Bespokefit-smee

  • JW: Could you explain the metadynamics strategy?

    • FC: We are varying the torsion angles and adding biases to explore more of the conformer space

  • JR – Does OFF 2.2.1 include fragmentation as usual?

    • FC – No, other than the testing dataset being fragments from a dataset of larger molecules (fragmentation performed BEFORE tests are run)

  • DC: Was excluding the amides a conscious decision?

    • FC: Yes, I excluded them because I expected high barriers that we wouldn’t prioritize sampling, but as you see it is important

    • JW: Yes they are usually 8 or 10 kcal/mol, is it that bad to give it a barrier of 20?

    • FC: No that would be fine, but I’m concerned that we are overfitting some low energy regions but excluding others with this high barrier

    • JW: Slide 28, the minimum shows a metastable minimum region that is not as broad as in the QC region, is there a metric we can use as a penalty to achieve that?

    • FC: I’m not sure what the motivation of that strategy would be

    • DC: We definitely want a better fit in that region

    • FC: I’m mostly concerned that it gets worse with additional SMEE fitting

    • DM – Amides are particularly tricky because of the are-they-puckered-or-not question, which kinda plays with impropers too. In practice they end up being a mix of wanting-to-be-puckered and wanting-to-be-planar, giving them a “squishy” wide basin, but the functional form of the impropers is limiting. We have played with alternative functional forms to resolve that.

  • (slide 31) DM: I had a conversation with the espaloma folks on impropers, are they using significantly more? Particularly for amides?

    • FC – Haven’t looked into this/can’t remember.

    • DC – Worth a look. You’re fitting impropers if they appear in the FF, right?

    • FC – Right

    • DM – But worth checking whether espaloma assigns a lot more impropers in general, I wonder if this is true in a comparison of OpenFF vs Espaloma treatment of amides.

  • (Slide 25) JR: With the high barriers the shape of the torsion seems to change at the bottom of the well. I expect that there is an indirect effect on sampling.

    • FC: Sure, something to keep an eye on, but I’m using 500 K, so that should be thermally accessible.

    • JR: Sure, but I think with high thermal noise the actual high temperature simulations would show significantly different profiles in comparison to QM without that noise.

  • DC: Did you say you’re only going through the metadynamics cycle once?

    • FC – All of the metadynamics runs are from one cycle. I tried doing multiple cycles on biaryl but didn’t get any improvement.

    • DC – I’d keep doing multiple cycles. I’m not concerned about time now that we are operating within a small number of minutes. Let’s focus on robustness first, and then accuracy.

  • JW – I wonder if the high energy regions are looking bad because some other terms (bond/angle) are getting really strained, and the high energies that should be accounted for by those are getting dumped into torsions?

    • FC: That’s a good point, I’ll have a deeper look into the outliers. There may also be intramolecular H-bonds that are complicating the energy profiles.

    • DC: I agree that if the outliers aren’t accounted for in fitting that could be an issue, so we can iterate on the lowest-performing class of functional groups.

  • JH – Good to see espaloma being competitive here - Would be good to keep comparing to it and see if we can find regions where it falls down.

    • FC – I was looking at the data and there are some outliers where espaloma gives crazy outputs. But those are pretty rare.

    • JH – It would be great if we could use this to characterize where espaloma falls short to improve it, considering that it’s faster than OpenFF and Bespokefit.

    • FC – Would be interesting to use something like an ensemble of nagl models that try to predict things, and if there’s disagreement between them then it triggers bespokefit
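The ensemble idea in the last point could look roughly like the sketch below. It is only an illustration of the gating logic, not anything that exists in nagl or bespokefit: the function names, the choice of population standard deviation as the disagreement measure, and the threshold are all assumptions. Each model is taken to return one predicted value per quantity (for example per atom or per parameter), and the bespoke fit is triggered when the models disagree by more than the threshold on any quantity.

```python
from statistics import pstdev


def max_disagreement(predictions):
    """Largest spread across models for any single predicted quantity.

    predictions: list of per-model lists, all the same length
    (one value per atom/parameter; hypothetical layout).
    """
    per_quantity = zip(*predictions)  # regroup values by quantity
    return max(pstdev(values) for values in per_quantity)


def needs_bespoke_fit(predictions, threshold):
    """True if the ensemble disagrees by more than `threshold` anywhere."""
    return max_disagreement(predictions) > threshold


# Three hypothetical models, two predicted quantities each.
agreeing = [[0.10, -0.20], [0.12, -0.21], [0.09, -0.19]]
disagreeing = [[0.10, -0.20], [0.50, -0.20], [-0.30, -0.20]]
print(needs_bespoke_fit(agreeing, threshold=0.05))
print(needs_bespoke_fit(disagreeing, threshold=0.05))
```

Taking the maximum over quantities means a single badly predicted atom or parameter is enough to trigger the fit. An average would be more lenient.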


Action items

Decisions