2023-11-16 Protein FF meeting notes

Participants

  • @Chapin Cavender

  • @Pavan Behara

  • @Alexandra McIsaac

  • @Lily Wang

  • @Anika Friedman

  • @David Mobley

  • @Michael Gilson

  • @Brent Westbrook

  • @John Chodera

  • Ken Takaba

  • @Matt Thompson

  • @Yuanqing Wang

  • @Michael Shirts

Goals

  • Benchmarks of FFs trained on NMR observables

  • Strategy for new QM fits

  • Data for Espaloma manuscript

Recording

https://drive.google.com/file/d/1WetoPLLVFtX-gVY2Rx114tP4iFjvVTws/view?usp=sharing

Discussion topics


NMR FF benchmarks

@Chapin Cavender

  • Slides here

  • MG: [slide 26] FF14SB is better at capturing helicity but does worse here. Is improving this metric going to be meaningful?

    • CC: Agrees, thinks we can improve fits here but not sure if it will help overall. Considering revisiting QM fits to address this

QM fits

@Chapin Cavender

  • [starts at slide 27]

  • CC: What are the correct software versions to use? Issues were brought up yesterday about the fitting stack

    • LW: Changes in both the OpenEye backend and the OpenFF software, but the OpenFF issue is worse: in OpenFF Interchange, if conformers don’t have the same atom ordering, they are all assigned charges from the first conformer. Don’t use the most recent OpenFF stack; ForceBalance 1.9.3 is the last version to use, or wait until the patch is released (a sanity-check sketch follows at the end of this item)

    • MT: Hopes to have the patch out in a few hours to days. Would be good to have people check that it’s actually working before relying on it

    • CC: Wants to start fits now, will keep using ForceBalance 1.9.3 and Toolkit 0.10

      • LW: and 2022 version of OpenEye, to be safe

    • MG: Does this affect results so far?

      • CC: No, has been using an earlier version, so results presented so far are unaffected. But wants newer toolkit features, so would like to update the software

  • PB: [Slide 28] Didn’t see any difference when testing pairwise conformer energies for small molecules; used the full Sage 2.1.0 training set

    • DM: Small molecules sometimes have weird things going on; proteins may not behave the same way, so it would be worth testing them separately

    • CC: the protein dataset also has 2D torsion drives for coupled torsions, so we may see things we couldn’t see in 1D scans on small molecules

    • PB: suggests using a lower search tolerance; it’ll reduce the number of Hessian diagonalizations (see the two-stage sketch at the end of this item)

      • CC: are you suggesting a 2-stage fit where one stage has a lower tolerance?

      • PB: yes

  • MS: is the idea with these experiments to get better alpha helix behavior?

    • CC: No; want a better starting point, hence a better QM fitting procedure. Other FFs have gotten a good fit without having to do NMR fits

    • CC: We’re scoring the alpha/beta basins on the Ramachandran map differently than other FFs; even though our QM is good, we’re still seeing that
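
As a sanity check before starting fits, something like the following sketch could pin the stack and guard against the charge-assignment bug LW described. This is an illustrative sketch, not the Interchange patch itself: the distribution names, the SDF file name, and the element-sequence comparison are all assumptions.

    from importlib.metadata import version

    from openff.toolkit.topology import Molecule

    # Pins from the discussion above (distribution names assumed): stay on
    # ForceBalance 1.9.3, Toolkit 0.10, and the 2022 OpenEye release until
    # the patch is out and verified.
    assert version("forcebalance") == "1.9.3"
    assert version("openff-toolkit").startswith("0.10")

    # Coarse guard against the bug: every conformer record should list atoms
    # in the same order before per-conformer charges are trusted. Comparing
    # element sequences only catches reorderings that change the element at
    # some position, so this check is necessary but not sufficient.
    mols = Molecule.from_file("conformers.sdf")  # hypothetical multi-record SDF
    reference = [atom.atomic_number for atom in mols[0].atoms]
    for i, mol in enumerate(mols[1:], start=2):
        if [atom.atomic_number for atom in mol.atoms] != reference:
            raise ValueError(f"SDF record {i} has a different atom ordering")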
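
PB’s two-stage idea could look roughly like the following, with scipy standing in for the actual ForceBalance objective and options (neither is shown here): a loose-tolerance first stage to get near the minimum cheaply, then a tight-tolerance restart from that point.

    import numpy as np
    from scipy.optimize import minimize

    def objective(params):
        # Stand-in quadratic; the real objective is the force-field fitting loss.
        return float(np.sum((params - 1.0) ** 2))

    # Stage 1: loose tolerance, cheap steps toward the minimum.
    stage1 = minimize(objective, x0=np.zeros(4), method="L-BFGS-B", tol=1e-2)
    # Stage 2: restart from the stage-1 solution with a tight tolerance.
    stage2 = minimize(objective, x0=stage1.x, method="L-BFGS-B", tol=1e-8)
    print(stage2.x)  # approximately [1, 1, 1, 1]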

Espaloma Protein Benchmark

@Anika Friedman

  • Slides here

  • JC: Really cool work. Is it a defect with the QM training data in that region, e.g. have we de-emphasized fitting in the region so the QM energies aren’t represented well, or is there an emergent property that causes deviation?

    • AF: hasn’t seen any clear answers so far, would have to dig into it

  • MG: Do you have an intuitive sense of what’s happening? Does GB3 look like it’s unraveling?

    • AF: Yes, visually it’s unraveling in Espaloma simulations, takes a while to unravel but doesn’t re-order

  • What are we doing with torsions? [should be around 33 mins in the recording]

    • Ken: the protein torsion data is the OpenFF peptide 2D torsion drive dataset and a few others: 3-mers, 3-mer omega datasets, 3-mer capped backbones, released in 2022 and 2023

    • MG: Do these have overlap with Chapin’s datasets?

    • CC: Training on ones called dipeptide 2D torsion drives, backbones and side chains of capped 1-mers, used 3-mers as validation

    • CC: Issues with converging torsion drives in 3-mer datasets

    • MG: So espaloma and Chapin’s FF are trained on some of the same data and some different

    • CC: Yes

  • KT: Slide 2, for side chains chi2 is >100, is that meaningful?

    • AF: Trying to figure out why it’s so high; both don’t have much data [30 scalar couplings for lysozyme vs many more for ubiquitin]. Trying to figure out whether it’s something to do with the protein target or something else

    • CC: Need an estimate of the systematic error in the Karplus model to compute chi2; trying to see whether a mis-estimated error is inflating chi2 (see the chi-squared sketch at the end of this item)

  • JC: At what pH and temperature were the experimental measurements done? Experiments sometimes use extreme conditions to keep residues protonated

    • CC: pH 6.5, T ≈ 25 °C

    • JC: GB3 is relatively unstable at that temperature/pH; are they sure it’s mostly folded under those conditions? These simulations would only sample the folded state

    • CC: some scalar couplings are taken at a different pH than the backbone ones, others at the same pH. pH could also affect the Karplus models

    • CC: Should also look at other observables besides scalar couplings. Chemical shifts are implemented now, and order parameters should be available

  • KT: Where are histidines in GB3?

    • CC: There aren’t any histidines

    • JC: pH likely chosen to prevent (or enhance) exchange of protons with the deuterated solvent

  • KT: Timescale of unfolding?

    • AF: It’s a gradual process; 3 residues gradually disorder over 1.5-2 microseconds. Still trying to quantify a cutoff point

    • KT: May want to plot RMSD to see how the structure changes over time across FFs/runs (see the RMSD sketch at the end of this item)

  • AF: Wondering whether we should include this in the Espaloma manuscript? [around 45 mins in the recording; went a bit too quickly for good notes]

    • KT: Thinking of resubmitting at the end of November; wants more analysis (RMSD plots, etc.) to understand the data, but would love to add this once we understand it better. Could it be added during peer review? For resubmission, wants Chapin’s peptide results from 1-2 meetings ago. What do others think?

    • JC: Would want a v2 of peptide results that better compares

  • AF: OK, will start doing more analysis and will share
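
On the chi2 question above, a minimal sketch of how a scalar-coupling chi2 is typically computed from a Karplus relation. The Karplus coefficients and all inputs below are made-up placeholders, and the benchmark’s actual error model may differ.

    import numpy as np

    def karplus(phi, A, B, C, delta):
        # Karplus relation: 3J(phi) = A cos^2(phi + delta) + B cos(phi + delta) + C
        x = np.cos(phi + delta)
        return A * x**2 + B * x + C

    def reduced_chi2(j_calc, j_exp, sigma):
        # sigma combines experimental and Karplus-model (systematic) error;
        # since it sits in the denominator, chi2 is sensitive to its estimate.
        return float(np.mean(((j_calc - j_exp) / sigma) ** 2))

    # Made-up illustration: three backbone phi angles and couplings (Hz).
    phi = np.deg2rad(np.array([-65.0, -120.0, 60.0]))
    j_calc = karplus(phi, A=7.0, B=-1.3, C=1.6, delta=np.deg2rad(-60.0))
    print(reduced_chi2(j_calc, np.array([5.0, 9.0, 4.0]), sigma=0.9))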
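
For KT’s RMSD suggestion, a minimal sketch with MDTraj (the file names and the C-alpha selection are assumptions; any trajectory analysis stack would do):

    import matplotlib.pyplot as plt
    import mdtraj as md

    # Hypothetical file names for one GB3 run.
    traj = md.load("gb3_espaloma.xtc", top="gb3.pdb")
    ref = md.load("gb3.pdb")

    # C-alpha RMSD to the starting structure; md.rmsd superposes each frame
    # onto the reference and returns values in nanometers.
    ca = traj.topology.select("name CA")
    rmsd_nm = md.rmsd(traj, ref, atom_indices=ca)

    plt.plot(traj.time / 1000.0, rmsd_nm * 10.0)  # ps -> ns, nm -> Angstrom
    plt.xlabel("time (ns)")
    plt.ylabel("C-alpha RMSD (Angstrom)")
    plt.savefig("gb3_rmsd.png")

Overlaying these curves for different FFs/runs would show when and how GB3 starts to unravel.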

Espaloma manuscript

@Chapin Cavender
Ken Takaba

  • CC: Anika will do more analysis on proteins, CC will share “v1” results and methods section

  • JC: Want to avoid making choices informed by too small a set of benchmarks, but this is a good starting point. Is there a way we can avoid being biased by looking at too few assessments and make sure we’re not over-optimizing for one particular benchmark?

    • CC: what do you mean?

    • JC: A lot of optimizations have been narrowly targeted at a few benchmarks, tweaking things that shift a small bias between alpha helix and non-helix. Do we have enough breadth to make sure we’re not neglecting other metrics?

    • CC: the protein FF plan page on Confluence has tiers of observables to benchmark, based on how easy/long each will be

    • JC: it looks like these benchmarks are also biased toward helical structures, though there is a beta one as well

    • CC: idea was to pick one alpha and one beta

    • JC: Are there longer peptides?

    • CC: Could add up to 7-mer but decided not to due to weird behavior

    • K19 is about 40% helical at room temperature; to go beyond that we would have to look at more unusual/engineered structures

  • MG: Would be interesting to try simulating another protein and see if it has same helix problem

    • CC: Wasn’t sure if that was how we wanted to spend our CPU time, but could do it

    • MG: 5-mers have some helicity, but training on those didn’t give better results, so maybe don’t do that experiment then

Action items

Decisions