2023-04-20 Protein FF meeting note

Participants

  • @Chapin Cavender

  • @Pavan Behara

  • @Michael Gilson

  • @David Mobley

  • @Jeffrey Wagner

  • @Lily Wang

  • @Trevor Gokey

  • @Matt Thompson

  • @Michael Shirts

  • Anika Friedman

Goals

  • Analysis of GB3 trajectories

  • Other benchmarking resources

Recording

https://drive.google.com/file/d/1m2baL5P30aiNaSgXq0tsm7vXPv1RDtzP/view?usp=share_link

Discussion topics

Item

Presenter

Notes

GB3

 

@Chapin Cavender

  • CC – summary of issues with previous SMIRNOFF simulations – alpha helix unfolds from C-terminus in at least one replicate

  • CC – currently running simulations of short alpha-helical peptides with an amide cap. This required deriving LibraryCharges for the amide cap, using the same procedure I used for the others. The charges are similar to previous values, differing only in the 4th decimal place, so they are probably close enough without repeating with 5-mers

    • MS – why was this necessary?

    • CC – molecules in NMR experiments have caps, and other FFs support this, so seemed like we should support it anyway

  • CC – simulations with AMBER can sample helical conformations within 1 µs; continuing them to hopefully see multiple un/folding events

  • CC – hoping to show results at next meeting

  • CC – Ramachandran cluster analysis of the unstructured short peptides doesn’t show them sampling the alpha-helical region much.

    • MS – using AMBER or SMIRNOFF?

    • CC – AMBER is also not helical (shows slide)

    • MS – but there are differences between them

    • CC – basins (beta, P_II) favoured by SMIRNOFF vs AMBER respectively are close to each other on the map

    • CC – alpha helix basin is subset of delta basin for ala3

    • CC – delta basin is sampled a bit more by AMBER but only by a few %

    • CC – we see more differences in Ramachandran clusters for gly3

    • CC – repeated analysis for GB3

      • SMIRNOFF samples less in alpha region than AMBER, although this is likely indicative of the helix unfolding. They are similar in delta region

  • CC – looking at side-chains with same Ramachandran analysis now (slide 63 – peptide GVG)

    • t = trans, m = gauche -60, p = gauche +60

    • in PDB, rotamers are almost always trans, but much less so in simulation. In simulation, rotamers are often at angles that fall outside the t/m/p bins

    • See similar behaviour in val3 and GB3 too

    • Looking at methionines, we get more complexity

      • Sample quite different side-chains in AMBER vs SMIRNOFF. Not sure which is more correct
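The t/m/p labels used in the side-chain analysis can be assigned by binning each chi dihedral. A minimal sketch in Python, assuming bin centres at 180°, -60° and +60° with a ±30° half-width; the half-width and the function names are illustrative assumptions, not values or code from the meeting. Angles outside every bin are labelled "other", matching the observation that simulated rotamers often fall outside the t/m/p bins.

```python
from collections import Counter

# Bin centres in degrees: t = trans, m = gauche -60, p = gauche +60.
BIN_CENTRES = {"t": 180.0, "m": -60.0, "p": 60.0}


def angular_distance(a, b):
    """Smallest absolute difference between two angles in degrees."""
    diff = (a - b) % 360.0
    return min(diff, 360.0 - diff)


def classify_rotamer(chi, half_width=30.0):
    """Return 't', 'm' or 'p', or 'other' if chi lies outside every bin.

    half_width is an assumed bin half-width, not a value from the notes.
    """
    for label, centre in BIN_CENTRES.items():
        if angular_distance(chi, centre) <= half_width:
            return label
    return "other"


def rotamer_populations(chis, half_width=30.0):
    """Fraction of frames in each bin for a sequence of chi angles."""
    counts = Counter(classify_rotamer(c, half_width) for c in chis)
    total = len(chis)
    return {label: counts.get(label, 0) / total
            for label in ("t", "m", "p", "other")}
```

Applying `rotamer_populations` to the chi1 series of each residue would give the per-bin fractions to compare between force fields and against PDB statistics.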

  • CC – NMR observables for GB3

    • AMBER models them better than SMIRNOFF. Error bars from bootstrapping

    • MS – so Null is doing better than Specific?

      • CC – yes

    • MS – but we do comparably or better on smaller peptides?

      • CC – yes, there are certain residue types where we model them better than AMBER (slide 29)

    • Slide 74

      • CC explains key of x-axis. nh == amide, cb == beta carbon, cg == gamma carbon, etc.

      • In general AMBER models side-chains and H-bonds better

      • CC – Both FFs struggle with H-bonds (last column), but SMIRNOFF does about twice as badly

      • MG – is this analysis telling us something we didn’t know? If we took out the problematic helices, would these numbers settle down?

        • CC – can do this experiment easily, but what this is telling us already is what the differences are between the two force fields. When we compare backbones, we don’t see much difference, so they’re probably not the problem. The problem is likely in the side-chains and H-bonds.

        • MG – the unfolding is the result of many cooperative events, and it’s hard to know how to weight what.

        • MS – correlation vs causation – if it’s started unfolding, the H-bonding will be worse

        • MG – Yes, it’s hard to know what to pinpoint here.

        • CC – Agree, will repeat this analysis without the unfolding helix. But we only see the unfolding happen in one replicate out of 3, and it takes a µs to get there. At worst, we’re sampling this unfolded state a third of the time. But the errors are 2x worse for SMIRNOFF

        • CC – will repeat this for 2 replicates without unfolding

    • CC – my takeaway: we should focus on investigating side-chains instead of backbone

      • DM – what are good focused experiments we could do to isolate where things are going wrong? We would need a cheaper test than simulating the whole system.

      • DM – for nonbonded errors, if we’re using LibraryCharges, that means problems would be from our LJ refit, right?

        • CC – could also be from the charges. Comparing ours to AMBER’s, ours seem larger for polar groups and smaller for non-polar

        • MS – that would make H-bonds stronger, not weaker

        • DM – should we swap in the AMBER charges and see what happens?

        • MG – are you suggesting a PMF or another GB3 simulation?

        • (after some discussion) GB3. It will probably take a week or two, and setting it up will take little of my time while I do other things

      • MS – it would be more interesting to look at deviations when it’s folded than unfolded, because that’s causing the unfolding.

      • MS – going back to the BB analysis, there are clear differences in how AMBER does better on gly/val and SMIRNOFF on bulky hydrophobic residues

        • MS – maybe we should do an alanine dipeptide PMF

      • MG – did the SMIRNOFF FFs have higher RMSDs than AMBER?

        • CC – yes, but on the scale of 2.0 vs 1.8 Å

      • LJ parameters might be of interest, swapping them out for AMBER values should be straightforward

        • MG – I’m interested in the amide hydrogen

        • DM – it was already non-zero – we made hydroxyl hydrogen non-zero for the first time here

    • MG – if we need to tweak parameters, do we propagate the changes back to small-molecule world?

      • MS – yes

      • MG – proteins might be more sensitive to subtleties, if they’re near conformational change

      • MS – most fully folded proteins are stable by 10-15 kcal/mol, or at least by more than 0.5 kcal/mol

      • MS, MG – historically protein FFs have been overstabilised – it’s better now, but just because we have a simulation that’s more stable and folded, we can’t necessarily say this is better without other evidence

      • MS – it’s more problematic that we do worse on experimental observables
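The bootstrapped error bars on the NMR observables mentioned above can be illustrated with a short sketch. This is not the analysis code used for the slides: it assumes the quantity of interest is the RMSE between frame-averaged computed observables and experiment, and that trajectory frames are resampled with replacement. All function names are hypothetical.

```python
import math
import random
import statistics


def rmse(predicted, experimental):
    """Root-mean-square error between two equal-length sequences."""
    squared = [(p - e) ** 2 for p, e in zip(predicted, experimental)]
    return math.sqrt(sum(squared) / len(squared))


def frame_average(frames):
    """Average each observable over frames; frames is a list of per-frame lists."""
    n_frames = len(frames)
    return [sum(column) / n_frames for column in zip(*frames)]


def bootstrap_rmse(frames, experimental, n_resamples=1000, seed=0):
    """Return (RMSE, bootstrap standard error) for frame-averaged observables.

    Frames are resampled with replacement, which treats them as
    uncorrelated; that is an assumption of this sketch.
    """
    rng = random.Random(seed)
    estimate = rmse(frame_average(frames), experimental)
    resampled = []
    for _ in range(n_resamples):
        sample = [rng.choice(frames) for _ in frames]
        resampled.append(rmse(frame_average(sample), experimental))
    return estimate, statistics.stdev(resampled)
```

Because consecutive MD frames are correlated, resampling decorrelated frames or whole blocks would give more honest error bars than resampling every saved frame.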

Benchmarking resources

@Chapin Cavender

  • CC – I’ve maxed out my computing resources. I can share my scripts if other people can run these benchmarks

  • DM – are these benchmarks ready to go? are they already set up?

  • CC – main unknowns are which starting structure and which solvent conditions to use. There’s still some work to figure out at what temperature the NMR was done, etc. But I can look into that and work it out in a single work day

  • MS – one possibility: Anika could do this in our group.

  • MS – can we use non-OpenMM engines? e.g. GROMACS.

    • CC – can share my code (which was intended for release in the future) and point out where you can use gmx instead of openmm, assuming interchange supports it

    • MT – I’m pretty confident in interchange, not sure everyone else should be. It’s ready for use; make sure you’re using version 0.3.0 or newer

    • MS – suggests comparing energies at time 0 between OpenMM and GMX as validity check

  • MS – can get 3 million hours within a week. Have you asked for time via ACCESS yet?

    • You can now write proposals per grant and get up to 1.5 million units (1 unit ~ 1 CPU hour). Turnaround time for a grant is a day. You can write a proposal for each grant, so if you have multiple grants supporting the project, you can have multiple proposals

  • DM – if Anika wants to, benchmarking more systems would be beneficial (general agreement)

  • MS – CC and AF and I can meet to figure out allocations of work
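The validity check MS suggested, comparing energies at time 0 between OpenMM and GROMACS, can be sketched as a per-term comparison. The sketch assumes the single-point energies of the same starting structure have already been extracted from each engine into dictionaries in kJ/mol; the extraction step is not shown, and the function name and default tolerance are hypothetical.

```python
def compare_energies(openmm_terms, gromacs_terms, tolerance=0.5):
    """Return {term: difference} for terms differing by more than tolerance.

    Both inputs map an energy term name (e.g. "Bond") to an energy in kJ/mol.
    The tolerance is a placeholder; choose one suited to the system size.
    """
    mismatches = {}
    for term in sorted(set(openmm_terms) | set(gromacs_terms)):
        if term not in openmm_terms or term not in gromacs_terms:
            # A term reported by only one engine is flagged with None.
            mismatches[term] = None
            continue
        difference = openmm_terms[term] - gromacs_terms[term]
        if abs(difference) > tolerance:
            mismatches[term] = difference
    return mismatches
```

An empty result means every term agrees within the tolerance. Nonbonded terms are often the hardest to match, since cutoff and long-range settings differ between engines.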

Action items

Decisions