2023-06-15 Protein FF meeting note

Participants

  • @Chapin Cavender

  • @Michael Shirts

  • @Anika Friedman

  • @Lily Wang

  • @Matt Thompson

  • @Jeffrey Wagner

  • @David Mobley

Slides (new slides start from 17)

Edit 2023-06-20: corrected plot on slide 20 in response to Jeff’s question below

Discussion topics

Item

Presenter

Notes


HMR sanity check

@Chapin Cavender

  • CC will link slides here

  • CC – This was increasing solute hydrogen masses by 3 daltons, using the Langevin middle integrator.

  • AF – What timestep for equilibration?

    • CC – Short 1ns equil with timestep of 1fs. After that I do 4fs timestep.

    • MS – That makes sense. AF noticed that HMR with 4fs timestep started failing at high temps.

    • CC – Thinking that’s because of water? It may be needed for high temp replica exchange.

    • AF + MS – Good question, not sure. We could try it out.

  • CC – Any objection to using these settings for the rest of the protein benchmarks?

    • MS – Looks good to me.

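For reference, the hydrogen mass repartitioning described above can be sketched in a few lines: mass is added to each solute hydrogen and subtracted from its bonded heavy atom, so the total system mass is unchanged. The 3-dalton shift follows the note; the toy fragment and data layout below are illustrative assumptions, not the actual benchmark setup.

```python
def repartition(masses, h_bonds, shift=3.0):
    """Move `shift` Da from each bonded heavy atom onto its hydrogen.

    masses  -- dict of atom index -> mass in daltons
    h_bonds -- list of (hydrogen_index, heavy_atom_index) pairs
    """
    new = dict(masses)
    for h, heavy in h_bonds:
        new[h] += shift      # heavier hydrogens slow the fastest bond vibrations,
        new[heavy] -= shift  # enabling the 4 fs timestep mentioned above
    return new

# Toy backbone fragment: N-H and CA-HA (indices and masses are illustrative).
masses = {0: 14.007, 1: 1.008, 2: 12.011, 3: 1.008}
h_bonds = [(1, 0), (3, 2)]
new = repartition(masses, h_bonds)
```

Water is typically left unrepartitioned when it is held rigid, which is why the note calls out solute hydrogens specifically.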

Helical peptide benchmark

@Chapin Cavender

  • CC – Previously I’d shown alpha helix in GB3 unfolding in TIP3P, but not OPC. So I wanted to see if that was happening with alpha helices in general.

  • (slide 18)

  • (slide 20)

    • JW – The replicas using the same FF but starting from different geometries are so incredibly close that I wonder if there’s a bug in the data.

      • MS – Agree

    • CC – I saw this and looked as well, but I did see slightly different variations in the columns. I also simulated each of these for 60us across all replicas, so they may just be really well converged.

    • Edit 2023-06-20 There was a bug in the plotting script. The corrected slide is posted above, and the data is qualitatively similar.

  • MS – Next steps?

    • CC – Talked with MG. A few possible next steps:

      • We might look at QM data and see whether alpha/beta pops agree with what we’re seeing. We have capped ala and glu torsiondrives in the datasets, so we might see if there’s something in the capped 3-mers that wasn’t reflected in the capped 1-mers that we trained to.

      • If that doesn’t show any discrepancy, we might “tune” our QM data to reproduce experimental data (like AMBER tuned their backbone QM to reproduce ALA5 scalar couplings). But if we can’t fix that… (see recording ~17 mins).

      • If that doesn’t work, we might modify our training data to include things like we’re seeing for larger structures. This might include doing QM on larger structures, or doing implicit solvent for the QM calcs.

    • DM – Re: tuning QM data to reproduce scalar coupling… (recording, 22ish mins)… On one level, the thing we want is “ff14SB but in our world”. If we’ve critically diverged from ff14SB by not tuning our QM data, maybe we should revisit that. Obviously we could do more exciting and different things, but if all we delivered were a fully consistent version of ff14SB, that would be a considerable success.

    • CC – Yeah, I could see that being an appropriate initial release, and then we could go further back in the pipeline in subsequent releases to clean things up. That would be the path of least resistance to getting a FF out soon.

    • MS – Looking at QM and tuning may be a logical first step.

    • CC – Do we think that tuning to scalar coupling is the right thing to do? Or should we tune to like GB3 scalar couplings or some sort of helicities?

    • DM – My thinking is, if we do something different, it may or may not “work”, and we don’t know what that path may end up looking like. OTOH, if you do something that worked for them, it’s a better-trodden path.

    • LW – Largely agree that it would be best to take the safer path.

    • CC – That makes sense to me. Since we started exploring water models, we should name a reference water model for this tuning?

      • LW – I’d say that, if we can pick, we shouldn’t pick a 4-site model now.

      • MS + JW – Agree

      • CC – Yeah, this will more strongly “bake in” dependence on a water model.

      • MS – One idea is to stick to TIP3P until we refit our own water model. This will reduce the total number of water model shifts.

      • CC – … Worth noting that ff19SB moved away from TIP3P, and we know that TIP3P gives the wrong rate of protein rotation and such.

      • MS – Maybe TIP3P-FB then? If we switch, we should base it on which water model is best for physical properties.

      • DM – I’d advocate not trying to do a new water model study ourselves, and recommend that CC pick a 3-point model that’s likely to give good results and start from there.

      • MS – Have studies been done on water models with solutes?

      • CC – I’m not aware - Generally protein-water model benchmark studies just pick the FB family or the OPC family and just compare within that family. My previous work (slide 13) shows that TIP3P-FB and OPC3 give nearly identical results, with OPC looking slightly better.

    • DM – If we decided to look at tuning the QM data, what would that pathway look like?

      • CC – I’d start by looking at our QM data to see how much of a difference it would make and how to implement it. This would take on the order of weeks.

    • MS – AF and I will try doing a study of water models to see…

  • CC – Could I make changes in ForceBalance to try changing our fitting?

    • MT – Let me get back up to speed on this and we can chat.

    • JW – I think the answer is likely “yes” - We can give you fast reviews and packaging support but we don’t know much about ForceBalance so our reviews will be far worse (but faster!) than LPW’s.

  • CC – I’ll do experiments to pick either OPC3 or TIP3P-FB for our tuning. (Please review this decision point)

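Since tuning against ALA5 scalar couplings comes up above, a minimal sketch of how a 3J(HN–Hα) coupling is estimated from a backbone φ angle via the Karplus relation may be useful. The coefficients below (A = 6.51, B = −1.76, C = 1.60 Hz, the Vuister–Bax parameterization) are a common literature choice, not something specified in this meeting.

```python
import math

def j_hn_ha(phi_deg, A=6.51, B=-1.76, C=1.60):
    """Karplus estimate (Hz) of 3J(HN-Ha) from the backbone phi angle."""
    theta = math.radians(phi_deg - 60.0)  # offset between the H-N-CA-H dihedral and phi
    return A * math.cos(theta) ** 2 + B * math.cos(theta) + C

# Helical phi (~ -60 deg) gives a smaller coupling than extended/beta phi
# (~ -120 deg), which is what makes scalar couplings a useful probe of helicity.
j_helix = j_hn_ha(-60.0)
j_beta = j_hn_ha(-120.0)
```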

Benchmark Allocation Request

@Michael Shirts @Anika Friedman

  • AF – Are we looking to rerun GB3 with all water models in gromacs?

    • MS – This will be changing now that we know we need to refit the FF

    • CC – Yeah, we still need to run lysozyme, gb3, bpti, and ubiquitin, and we should add a ~50% padding on top of the prediction there… (something like “also XXX and chignolin hairpin”, see recording ~40 minutes)

    • AF – For the scaling that I did with gromacs on those systems:

  • AF – Getting 600-800ns/day on 32 cores, which should be a good mix of speed and efficiency. We want to consider 4 (possibly +3) FFs and 8 water models. That will eat up 90ish% of the 3 million CPU hours we’re requesting. The temperature replica exchange will also be quite expensive.

    • CC – I see the temperature replica exchange to be a late-stage benchmark, which would be used for a release candidate FF.

    • MS – … We can keep looking at the numbers, but the big thing is to get the allocation and then we can rebalance between our sims as needed.

    • MS + AF – There are lots of cores/nodes and it’s likely we’ll be able to do these runs in parallel once we start submitting jobs (especially in the summer).

  • AF – Before we start doing 10us runs, I wanted to do more validation of gromacs vs. openmm.

    • CC – I think doing this just for short peptides (like ALA5) would make sense - We know that will converge within ~500ns. Then we can compare the gromacs observables directly to openmm observables.

    • MS – But that shouldn’t be necessary before requesting the time.

    • AF – Right, we can go right ahead with the request.

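The allocation arithmetic above can be checked with a back-of-envelope sketch. The 32 cores, ~600-800 ns/day throughput, 4 FFs, 8 water models, four benchmark systems, and ~50% padding come from the discussion; the 10 µs target length per run and the 700 ns/day midpoint are assumptions filled in for illustration.

```python
def cpu_hours(n_systems, n_ffs, n_waters, target_ns, ns_per_day, cores, padding=0.5):
    """Estimate total CPU hours for a benchmark grid, with safety padding."""
    days_per_run = target_ns / ns_per_day
    hours = n_systems * n_ffs * n_waters * days_per_run * 24 * cores
    return hours * (1 + padding)

# 4 systems x 4 FFs x 8 water models at 10 us each, 700 ns/day on 32 cores.
est = cpu_hours(n_systems=4, n_ffs=4, n_waters=8,
                target_ns=10_000, ns_per_day=700, cores=32)
```

With these placeholder numbers the estimate lands near 2.1 million CPU hours, consistent in scale with the ~3 million CPU hour request mentioned above.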

Replica Exchange Simulations

@Michael Shirts @Anika Friedman


Action items

Decisions