2024-04-04 Protein FF meeting note

Participants

  • @Pavan Behara

  • @Chapin Cavender

  • @Anika Friedman

  • @David Mobley

  • @Michael Shirts

  • Louis Smith

  • @Jeffrey Wagner

  • @Lily Wang

  • @Brent Westbrook

  • @Alexandra McIsaac

Recording

https://drive.google.com/file/d/1QcdyWw5mXxyeUhLciIEWUCjGQyLadsFQ/view?usp=sharing

Goals

  • Benchmarks of 0.0.3 FFs on GB3

  • QC datasets for RNA

  • Adding to QC dataset for proteins

Discussion topics

Item

Presenter

Notes

GB3 benchmarks

 

@Chapin Cavender

  • JW (slide 11): have the simulations on this slide finished yet?

    • CC: still unfinished – prioritised folded protein trajectories instead.

    • JW: Ok, so these are identical to the plots from last meeting.

    • CC: not expecting extended trajectories to change the qualitative results, although the per-residue helicity of the peptide may vary

  • LS: did moving to the single-point workflow mean that the 15-mer no longer folds to a helix?

    • CC: haven’t run this benchmark with this FF yet

  • JW: do you have a hypothesis why smaller benchmarks and larger benchmarks seem to show such different results?

    • CC: we see that the unstructured peptides are often anticorrelated with larger benchmarks, maybe because they’re unstructured and don’t populate the alpha and beta basins

    • LW: does this extend to the 15-mer finally folding to helices?

    • CC: if we go back to free energy of beta to alpha, it might be that alpha structures are too stable and other factors are destabilising the alpha helix in the folded protein

  • PB: could it be nonbonded parameters?

    • CC: haven’t ruled it out, but have been focusing on what other research groups have done, and most major bio FFs use the same nonbonded terms as their small molecule FFs. I’ve seen other FFs modify their torsions and fix the problems

    • CC: could try to work in nonbonded terms, e.g. by fitting to 15-mer simulations

    • DM: another way to test this is to go to the Amber nonbonded parameters and try again?

    • CC: I tried this with 0.0.2 FFs, swapping in (1) just the charges and (2) both charges and LJ; neither variant gave helical content for the 15-mer. Could just be that you shouldn’t Frankenstein FFs.

    • DM: what about re-fitting with Amber NBs? If it gives good results we can focus on nonbondeds.

    • CC: fairly straightforward to do a reweighting to the 15-mer without additional work, since I already have the scripts needed. I couldn’t do this before since our helical content was 0. Now that the Specific 0.0.3 has some helical content, we can reweight this to get the rest of the helix correct.

  • LS: can you look, atom-by-atom, at the differences in energy between ff14sb and the SMIRNOFF candidates in the unfolding simulations? Perhaps there’s a particular LJ pair or charges that would become clear with this analysis

    • CC: I have not looked at an energy breakdown for those yet

    • MS: entropy would also play a role though

    • LS: CC previously mentioned torsions were pretty similar, basins pretty similar to 14sb, so expect that part of the energy to look similar.

    • MS: not sure I agree

    • MS: could take one simulation and re-evaluate with 14sb and see if there’s any particular configuration that’s particularly high or low energy. Or vice versa.

  • PB: if you had different results in different water models, would you attribute differences to torsions or nonbonded?

    • CC: would attribute to nonbonded via intuition

  • LS (slide 17): are you checking that the total sum of alpha + beta matches what you expect? Could get similar ratios but poor populations.

    • CC (slide 13): haven’t explicitly quantified this, but slide 13 shows the ratio of populations across more basins

  • CC: will look at energy breakdowns between force fields (in both directions)
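
LS's point about slide 17 can be illustrated with a small sketch: two ensembles can share the same alpha/beta ratio while placing very different total population in those basins. The rectangular basin boundaries, the function names, and the toy (phi, psi) values below are assumptions made for illustration, not the definitions used in the benchmark.

```python
# Hypothetical rectangular basins in degrees: (phi_min, phi_max, psi_min, psi_max).
# These boundaries are illustrative assumptions, not the benchmark's definitions.
BASINS = {
    "alpha": (-100.0, -30.0, -80.0, -5.0),
    "beta": (-180.0, -45.0, 100.0, 180.0),
}


def basin_fractions(dihedrals):
    """Return the fraction of (phi, psi) frames that fall in each basin."""
    counts = {name: 0 for name in BASINS}
    for phi, psi in dihedrals:
        for name, (phi_lo, phi_hi, psi_lo, psi_hi) in BASINS.items():
            if phi_lo <= phi < phi_hi and psi_lo <= psi < psi_hi:
                counts[name] += 1
                break
    n_frames = len(dihedrals)
    return {name: count / n_frames for name, count in counts.items()}


def summarize(dihedrals):
    """Report both the alpha/beta ratio and the total alpha + beta population."""
    fractions = basin_fractions(dihedrals)
    alpha, beta = fractions["alpha"], fractions["beta"]
    return {
        "alpha": alpha,
        "beta": beta,
        "alpha_plus_beta": alpha + beta,
        "alpha_over_beta": alpha / beta if beta > 0 else float("inf"),
    }


# Toy ensembles: same alpha/beta ratio, very different total population.
ensemble_a = [(-60.0, -45.0)] * 4 + [(-120.0, 130.0)] * 4 + [(60.0, 45.0)] * 2
ensemble_b = [(-60.0, -45.0)] * 1 + [(-120.0, 130.0)] * 1 + [(60.0, 45.0)] * 8

summary_a = summarize(ensemble_a)
summary_b = summarize(ensemble_b)
print(summary_a)
print(summary_b)
```

Comparing `alpha_plus_beta` alongside `alpha_over_beta` is what separates the two toy ensembles; the ratio alone does not.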

RNA QC datasets

@Chapin Cavender

  • PB (in chat): On the small molecule side we may have to look again into phosphate-related parameters; some angle and torsion params were modified in 2.1.0 but I think there is still room for improvement. Paul Labute from CCG also complained about some stiff torsions related to phosphorus.

  • (some discussion on scaling implicit solvation calculations): ~42 min into recording

Protein QC datasets

@Anika Friedman

  • MS: are all structures weighted equally?

    • CC: … (~49 min into recording)

  • MS: are all QCA structures used in optimization?

    • CC: I have side-chains on 1-mers used in training, backbone scans on 3-mers not used in training. Optimizations on 1-mers and 3-mers that are in training. The clusters in slide 4 are probably used in training, the regular points are likely scans that may or may not be used in training.

  • MS: looks like the sparse representation of the alpha region may be drowned out by overrepresentation of the delta region

  • JW: points out areas that are well represented in the right-hand plot but not in the left-hand plot

  • DM: some of this might be xtal artifacts, since these are from whole proteins.

  • PB: do you think single points would fall back to left-hand plot?

    • DM: they’d be geometry optimizations

  • DM: would suggest spot-checking that segments of proteins do or don’t converge to similar structures in geo opt

  • PB: how about a single point dataset?

    • JW: or a dihedral constrained optimization?
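
MS's questions about equal weighting and about the alpha region being drowned out by delta can be made concrete with a small sketch. It shows one possible reweighting, inverse-frequency weights per region, so that each region contributes the same total weight. This scheme was not proposed or decided in the meeting; the function name, region labels, and counts are assumptions for illustration only.

```python
from collections import Counter


def balanced_weights(region_labels):
    """Give each structure a weight so every region has the same total weight.

    region_labels: one label per structure, e.g. "alpha" or "delta"
    (hypothetical labels). The returned weights sum to 1.
    """
    counts = Counter(region_labels)
    n_regions = len(counts)
    return [1.0 / (n_regions * counts[label]) for label in region_labels]


# Toy dataset: 2 structures in the alpha region, 8 in the delta region.
labels = ["alpha"] * 2 + ["delta"] * 8
weights = balanced_weights(labels)

# Total weight per region after balancing.
totals = {}
for label, weight in zip(labels, weights):
    totals[label] = totals.get(label, 0.0) + weight
print(totals)
```

With equal weights the delta region would carry 80% of the total in this toy case; with the balanced weights each region carries half.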

Action items

Decisions