2024-07-17 FF Fitting Meeting

Participants

  1. @Lily Wang

  2. @Brent Westbrook (Unlicensed)

  3. @Michael Shirts

  4. Bill Swope

  5. @Chapin Cavender

  6. @Christopher Bayly

  7. Julianne Hoeflich

  8. @Alexandra McIsaac

  9. Patrick Frankel

  10. @Michael Gilson

  11. @Pavan Behara

  12. @Matt Thompson

  13. @Jeffrey Wagner

Discussion topics

Item

Notes

Phosphate fitting

Recording: Video Conferencing, Web Conferencing, Webinars, Screen Sharing
Passcode: %AKuLE0&

  • Recap from previous ff-release meeting: 2024-06-20 Force Field Release Meeting notes

    • Got weird results with POPC – compressed and moving slowly (APL too small, thickness too high) – see last presentation

    • Distributions of phosphates were very different from Slipids and Macrog

    • Not really any phosphoesters in Sage training set

  • Currently: have dataset of similar chemical groups. Discussed including CC’s data

    • One of the TDs is failing so data not finished yet

  • Phospho- torsions

    • ~5k torsions without filtering for too many conformers / ring torsions

  • Alkane tail parameters

  • CBy – If there’s a free -OH group in the phospho-monoester, would there be strong interactions?

    • MS: they’re all esterified in lipids, there should be no free OHs. I don’t think PF fragmented in this region

    • LW – Should we filter out torsions involving these?

      • MS – That’s my instinct - We’re not interested in those.

    • LW – In response to earlier comment from BW - We should filter out ring torsions.

    • JW – A 1D TD usually does ~50 scan points. So 250k optimizations would take a couple weeks. Happy to run that.

      • BW + MS – There’s some filtering we can do to thin this down (remove ring torsions, unwanted torsions). Likely won’t be 5k.

      • LW – Are any of these pure alkanes? Might not make sense to mix them into this.

      • BW – Yes, I can filter those out too. (After filtering t2 and t3 matches, down to 3k, and still need to filter ring torsions.)

      • MS – Agree, let’s filter those out.

      • CBy – Bridging torsions from glycerol to ester would be good to include

      • MS – Agree, but we don’t need the purely alkane ones.

    • JH – If we’re looking to fit branched alkanes for bonded interactions, and we’re interested in angles and torsions (and we’re concerned the angles are incorrect), in which order would we do that fit?

      • LW – We typically fit them simultaneously - They’re related but not completely interdependent. So we wouldn’t try to fix sequentially, we’d fix them at the same time. I know MS lab was looking at alkane parameters, did you encounter issues with angles?

      • MS + JH – A recent AMBER paper identified that angle parameters can affect lipid tail behavior, and they did a refit to overcome this. So I compared OpenFF to that and found we have similar behavior to what they had before they made these changes. The change was to the equilibrium angle, not the force constant.

      • MS – JH is looking into diffusivities of alkanes, hypothesis is if torsions are too stiff diffusivity will be too low.

    • MS – After QM dataset is complete, what’s the plan for refitting?

      • LW – Up for discussion - If you’re interested to try doing refits we’re happy to work with you. To give a commitment from our side I’ll need to talk to lead team.

      • MS – If JH is interested she could give this a shot. But I’d like to keep our involvement focused on fitting and not QM for now.

      • JH – Yeah, I’m up to learn this.

      • LW – Ok, great. I’d recommend waiting for alkane dataset to finish to start doing fits.

      • MS – Great, and AFriedman got the fitting workflow working on the CU Boulder cluster.

      • LW – That’s good - We have the fitting workflow pretty smooth on the UCI cluster, and we can help with any issues you have there.

    • CBy – WRT lipid properties, and having the membrane too stretched/compressed - Is there generally accepted lore about which FF parameters affect the observed properties?

      • MS – Good Q. Worth having a meeting with eg Jeff Klauda and Olieva? about something similar to our protein best practices. Could be good to line this up timeline-wise with completion of the lipid dataset

      • CBy –

      • MS – Want to eliminate the wizardry/hand tuning as much as possible.

      • CBy – Thinking more about “blocking the matrix” - that is, we divide parameters into bonds, angles, torsions, nonbonded terms, etc. When we do a global fit we fit all the parameters at once, but in this case it would be reasonable to fit only the parameters that are logically related to, or likely to affect, the properties we care about. …

      • MS – Currently LJ params are fit separately from valence.

      • CBy – Just thinking that lipids are systematically different from the other types of mols we’ve fit, and wondering if there are changes in the fitting strategy we should make.

      • MS – Intuitively, we operate on the assumption that doing things right for small molecules should inherently get things mostly right on larger molecules made of the same components. So I’d prefer we continue on that path until we have evidence that it doesn’t work.

      • JH – Here, we wouldn’t be making a FF specifically for lipids, it would be integrated into the general small molecule FF, right?

        • MS – Right. This shouldn’t decrease performance on small mols, our projects for macromolecules should just expand the domain of our accuracy to larger mols and the sorts of things we see in lipids/proteins.

      • PB (chat) – @Brent Westbrook can you split the large dataset of 5000+ torsions to smaller datasets so that we can use some for training and some for testing

        • BW – We were thinking of putting the datasets through as both torsiondrive and optimization dataset, those could be treated as separate.

        • LW – But also splitting things initially would be a good idea for testing/training down the line. Also agree with JC’s suggestion about fragmenting large dataset of diverse lipids. But if fragmenting lipidmaps dataset gives us basically the same as the current dataset then we should go ahead with training/test split.
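The training/test split raised above can be sketched as follows. This is only an illustration, assuming the split is made per molecule (so that all torsions from one molecule land on the same side) and that each torsion record carries a SMILES string. The record layout, the function names `assign_split` and `split_torsions`, the example molecules, and the 20% test fraction are all hypothetical and are not taken from the actual dataset or from any OpenFF tooling.

```python
import hashlib
from collections import defaultdict


def assign_split(molecule_id: str, test_fraction: float = 0.2) -> str:
    """Deterministically assign a molecule to 'train' or 'test' from a hash of its ID."""
    digest = hashlib.sha256(molecule_id.encode("utf-8")).hexdigest()
    # Map the first 8 hex digits onto the interval [0, 1).
    bucket = int(digest[:8], 16) / 16**8
    return "test" if bucket < test_fraction else "train"


def split_torsions(records, test_fraction: float = 0.2):
    """Group torsion records by split, keeping all torsions of a molecule together."""
    splits = defaultdict(list)
    for record in records:
        splits[assign_split(record["smiles"], test_fraction)].append(record)
    return dict(splits)


# Hypothetical records: a SMILES string plus the atom indices of the driven torsion.
records = [
    {"smiles": "COP(=O)(OC)OC", "torsion": (0, 1, 2, 4)},
    {"smiles": "COP(=O)(OC)OC", "torsion": (1, 2, 4, 5)},
    {"smiles": "CCOP(=O)(OC)OC", "torsion": (0, 1, 2, 3)},
    {"smiles": "CCOP(=O)(OC)OC", "torsion": (1, 2, 3, 5)},
]

splits = split_torsions(records)
for name, members in sorted(splits.items()):
    print(name, len(members))
```

Hashing the molecule identifier, rather than drawing random numbers, keeps the assignment stable when more torsions are added later. It does not by itself keep chemically similar molecules on the same side; that would need a clustering step that is not shown here.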


  • MS – Alkanes - Alkane groups in our videos/sims look fairly inflexible. It got us wondering whether our alkane barriers are too high. In the videos of the Slipids and Macrog sims, the time-averaged H positions sat on top of their parent carbons because the groups were rotating so much. But in the OpenFF sims they weren’t rotating very much at all.

    • (see recording ~42 mins for fragments and profiles used in fitting alkanes)

    • LW – We didn’t use all of these in fitting alkane torsions - Just 5ish IIRC.

    • MS – I see

    • BW – I see 10-15 for t2 and t3

    • MS – If we have a parameter that’s responsible for collective behavior - like when a bunch of torsions are strung in sequence in an alkane tail - little discrepancies can become more pronounced. So I’m wondering if we should add more alkanes to the set and give them higher weighting.

    • CBy – (see recording ~45 mins) - (Some torsiondrives may not be ideal for fitting general params. Some fitting mols may be too complex)

    • CBy – I think a problem is that OpenFF looks at alkanes mostly in the context of being a functional group/attachment to a druglike scaffold. But this emphasis on diversity means that we prioritize getting small mols right, possibly to the detriment of molecules like lipids. So 1) agree with MS that we should put additional emphasis on lipid alkane torsions. 2) We can’t let small mol FF nudge critical params in a way that decreases quality

    • JW – Might be good to add specific torsions for longer alkanes to keep this from being zero-sum between small mols and lipids

    • LW – Agree, doesn’t have to be zero sum. In small mols, alkane torsions are sometimes optimized to get the whole mol right, not just the alkane part. And one nice thing is that we can tweak those to get good behavior on long alkanes, and they won’t have much of an effect on the overall energetics of small mols.

    • CBy – Agree, the effect of fitting to longer alkanes on small mols is unlikely to be significantly detrimental - those probably have a lot of room to vary without significant effects.

    • MS – Right, I think there’s a good chance that this will improve overall quality, even for small mols.

    • CBy – Could also nudge priors during fitting to limit damage. (jokingly calls it “cheating” but is unironically in support if needed). Analogous to putting extra emphasis on protein backbones.

    • LW – Suggesting a similar approach to proteinFF project, with “null” vs “specific” FFs?

    • CBy – Ideally a long alkane chain in a lipid is the same as a long linker in a small mol. But

    • CC – Agree with strategy - We hope that things will generalize and we won’t need new params/nudges during fitting, but protein FF project shows that we do need some intervention - eg adding smirks, changing priors

    • MS – Certain measurements amplify the effects of certain aspects of fitting/changes to particular types of parameters. So this focuses uneven attention on certain parameters/types of changes, and it makes sense that we change how we do parameter fitting, because these give us an opportunity to get a better-resolution view of some aspects of the FF.

    • CC – The alkane torsion may have different preference in gas phase vs. polar solvent vs nonpolar solvent. So agree that splitting may be necessary to get the right physics.

    • BS – Richard Pastor was doing lipid work using the CHARMM FF, and had trouble until he started including area per lipid.

    • MS …
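The concern above about barriers being too high can be given a rough sense of scale. The sketch below assumes a single-term periodic torsion of the form k(1 + cos(nφ − phase)) and a simple Arrhenius estimate in which the hop rate scales as exp(−barrier / kT) with an unchanged prefactor. The force constants, the 0.5 kcal/mol barrier difference, and the function names are hypothetical illustration values, not fitted OpenFF parameters. In a real force field the effective barrier also includes 1-4 nonbonded contributions, which are ignored here.

```python
import math

# Molar gas constant in kcal/(mol K), used as kT per mole.
R_KCAL_PER_MOL_K = 0.0019872041


def torsion_energy(phi_deg, k=1.5, periodicity=3, phase_deg=0.0):
    """Single-term periodic torsion energy k * (1 + cos(n*phi - phase)), in kcal/mol."""
    phi = math.radians(phi_deg)
    return k * (1.0 + math.cos(periodicity * phi - math.radians(phase_deg)))


def barrier_height(k=1.5, periodicity=3, phase_deg=0.0, step_deg=5):
    """Barrier (max minus min energy) from a coarse scan over 0-360 degrees."""
    energies = [
        torsion_energy(phi, k, periodicity, phase_deg)
        for phi in range(0, 360, step_deg)
    ]
    return max(energies) - min(energies)


def relative_hop_rate(barrier_a, barrier_b, temperature=300.0):
    """Arrhenius ratio rate_b / rate_a, assuming both share the same prefactor."""
    return math.exp(-(barrier_b - barrier_a) / (R_KCAL_PER_MOL_K * temperature))


# Hypothetical comparison: two force constants differing by 0.25 kcal/mol.
low = barrier_height(k=1.5)    # barrier is 2k for a single cosine term
high = barrier_height(k=1.75)
print(f"barriers: {low:.2f} vs {high:.2f} kcal/mol")
print(f"relative hop rate at 300 K: {relative_hop_rate(low, high):.2f}")
```

Under these assumptions, raising the barrier by 0.5 kcal/mol slows torsional hopping by a bit more than a factor of two at 300 K. Small per-torsion discrepancies could therefore plausibly show up in collective tail dynamics, though this estimate says nothing quantitative about diffusivity itself.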

  • JH will start training to do fitting

  • MS lab will propose alkanes to do


Action items

Decisions