CC: Agree. However, MS mentioned that repex would give better results than independent windows, since configurations can move between windows. An individual replica would find it difficult to transition between the multiple minima seen in the landscape between 0.7 and 1.0 fraction of native contacts.
MG: that would require additional work, whereas running longer is much cheaper
LS: have you looked at the structures in the intermediate replicas? I am pretty convinced we have convergence issues from the data. We can answer the question of whether we’re forming alternative structures by visualizing pairwise RMSDs etc
CC – Thought about it, but haven’t run the analysis. That would help characterize what we’re getting in these ensembles. Should be a pretty straightforward analysis. I’d also proposed changing the collective variable to not be the native contacts of the entire protein, but just the helical residues.
LS – I recall, my initial reaction was in agreement with MG, that you make the sampling problem worse if you don’t include those degrees of freedom. OTOH, the uncertainty in sampling might be reduced by narrowing the umbrella windows. Neither answer is clearly right with the info given. Seems like 3 options:
Repex seems most appealing to me, but longer windows seems the most straightforward.
CC – Agree that making the windows longer will take the least of my time, and that repex would take a lot of my time but have a better chance of improving results.
MG: don’t see these as mutually exclusive. Can extend first and investigate other directions as they’re running
LS – Agree… Wonder how we can ensure we get a representative sample in each iteration.
MG – Is there evidence that running longer won’t bring things into alignment? Agree that repex will do this better, but there’s not clearly a significant barrier to overcome.
DM – This seems like a 1/sqrt(n) thing, where you’d need 100 times more simulation to get 10 times more accuracy
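The scaling DM describes can be written down directly. This is a minimal sketch, assuming the samples are statistically independent so that the standard error falls as 1/sqrt(n); correlated MD frames would need an effective sample size instead. The function names are illustrative only.

```python
import math


def standard_error(sigma: float, n_samples: int) -> float:
    """Standard error of a mean from n independent samples with spread sigma."""
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    return sigma / math.sqrt(n_samples)


def sampling_factor(error_reduction: float) -> float:
    """Factor by which sampling must grow to shrink the error by error_reduction.

    Because the error scales as 1/sqrt(n), the cost grows as the square
    of the desired improvement.
    """
    if error_reduction <= 0:
        raise ValueError("error_reduction must be positive")
    return error_reduction ** 2


# A tenfold gain in accuracy costs a hundredfold more sampling.
print(sampling_factor(10))
```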
LS – Repex would help avoid kinetic trapping. It’s possible that two structures at 70% native contacts won’t interconvert if they need to pass through an 80% intermediate structure to get there.
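For reference, the swap test that lets configurations move between neighbouring umbrella windows can be sketched as below. This assumes harmonic biases of the form 0.5 * k * (q - center)^2 on the collective variable, a shared temperature, and a shared spring constant, so only the bias energies enter the Metropolis criterion. The function names and the numbers in the usage lines are hypothetical, not the settings used in these simulations.

```python
import math

KT_300K = 2.494  # k_B * T in kJ/mol at 300 K


def bias_energy(q: float, center: float, k: float) -> float:
    """Harmonic umbrella bias on collective variable q (assumed form)."""
    return 0.5 * k * (q - center) ** 2


def swap_acceptance(q_i: float, q_j: float, center_i: float,
                    center_j: float, k: float, kT: float = KT_300K) -> float:
    """Metropolis probability of exchanging configurations between two windows.

    q_i is the current CV value in window i, q_j the value in window j.
    The unbiased potential cancels because both windows share it.
    """
    before = bias_energy(q_i, center_i, k) + bias_energy(q_j, center_j, k)
    after = bias_energy(q_j, center_i, k) + bias_energy(q_i, center_j, k)
    delta = after - before
    if delta <= 0:
        return 1.0  # downhill swaps are always accepted
    return math.exp(-delta / kT)


# Each replica sits exactly at its own window centre: swapping costs energy.
print(swap_acceptance(0.7, 0.8, center_i=0.7, center_j=0.8, k=500.0))
```

Swaps are accepted often only when neighbouring windows sample overlapping CV values, which ties this criterion back to the window-overlap question.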
MG – I don’t see evidence of kinetic trapping, though how tight are the windows?
CC – At one point, I did a histogram to look at overlap between windows. My initial attempt didn’t yield enough overlap, so I reduced them; now the overlap is 10-15%. But doing pairwise RMSD might give a clearer signal of meaningful overlap.
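One way to put a number on histogram overlap between two adjacent windows is the shared area of their normalized histograms. This is a sketch of that idea only; it is not necessarily how the 10-15% figure was computed, and the bin count, CV range, and function names are assumptions.

```python
def normalized_histogram(values, lo, hi, n_bins):
    """Fraction of samples falling in each of n_bins equal bins on [lo, hi]."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        if lo <= v <= hi:
            # Clamp so that v == hi lands in the last bin.
            idx = min(int((v - lo) / width), n_bins - 1)
            counts[idx] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no samples inside the histogram range")
    return [c / total for c in counts]


def window_overlap(cv_a, cv_b, lo=0.0, hi=1.0, n_bins=50):
    """Shared area of two normalized histograms: 0 is disjoint, 1 is identical."""
    p = normalized_histogram(cv_a, lo, hi, n_bins)
    q = normalized_histogram(cv_b, lo, hi, n_bins)
    return sum(min(p_i, q_i) for p_i, q_i in zip(p, q))


# Toy CV samples from two neighbouring windows (made-up numbers).
window_a = [0.05, 0.15, 0.25, 0.35]
window_b = [0.25, 0.35, 0.45, 0.55]
print(window_overlap(window_a, window_b, n_bins=10))
```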
MG – I see. Maybe it’d help to do a series with softer restraints to encourage overlap. Or would this make convergence harder in some other way?
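A rough way to see how softer restraints change overlap: if the underlying free energy were flat across a window, a harmonic bias 0.5 * k * (q - center)^2 would give a Gaussian CV distribution with width sqrt(kT / k), and two equal-width Gaussians a distance d apart share an area erfc(d / (2 * sqrt(2) * sigma)). The sketch below rests on that flat-landscape assumption, which the real landscape will not satisfy exactly. The spring constants and spacing are made-up values, with k in kJ/mol because the fraction of native contacts is dimensionless.

```python
import math

KT_300K = 2.494  # k_B * T in kJ/mol at 300 K


def restrained_width(k: float, kT: float = KT_300K) -> float:
    """Std. dev. of the CV under a harmonic bias on a flat free energy surface."""
    if k <= 0:
        raise ValueError("spring constant must be positive")
    return math.sqrt(kT / k)


def gaussian_overlap(spacing: float, k: float, kT: float = KT_300K) -> float:
    """Shared area of two equal-width Gaussians whose centres are `spacing` apart."""
    sigma = restrained_width(k, kT)
    return math.erfc(abs(spacing) / (2.0 * math.sqrt(2.0) * sigma))


# Halving the spring constant widens each window and raises the overlap.
for k in (1000.0, 500.0):
    print(k, round(restrained_width(k), 4), round(gaussian_overlap(0.05, k), 3))
```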
CC – I think you’d generally get more efficient sampling to converge the FE surface if you don’t allow overlaps much more than half a window.
LS: I agree from intuition
CC: other option is to just run additional replicas with same energy constant
LS: could you just do different seeds for starting replicas?
MG: how hard would replica exchange be?
CC: there are protocols for doing it in OpenMM. I would need to get simultaneous GPUs for running the exchange. I can probably do this on the gibbs cluster, but it would take some time to play around with slurm options. < 1 week estimated, optimistically
LS – Shirts group recently published about ?something? related to this, may have tools to solve/help here.
AF – Not sure that that’s applicable to this problem (communication issues, no details)
MG: how do you plan to stage the different approaches wrt cumulative fit, replica exchange and so on?
CC: starting new replicas is easy, can do that immediately. Expecting butane results in the next day or so, so can move forward with that. Can integrate those together and start looking into replica exchange while those run.
MG – That sounds like a reasonable approach.
LS – Makes sense to me. But if the windows are restricting our sampling too much then it’s an uphill battle to throw more sampling at it.
CC – So I’ll keep running the current sims for longer, I’ll look at whether window overlap is an issue, and I’ll work on starting up repex on our local cluster.
DM – Sounds good. Agree that running longer is the way to go.
LW – Agree
CC: new replicas vs longer?
JW: new replicas seems better for avoiding kinetic traps
MG: sounds like it makes sense. What do people think about widening harmonic restraints?
LS: it’s another easy thing to try assuming compute resources aren’t the limiting factor.
DM: sampling is probably best as you get towards the most native structure. Have you looked at all-to-all (pairwise frame) RMSD plots, including combining replicas? Could see if replicas with the same contacts are discovering the same conformations.
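The all-to-all RMSD comparison can be sketched as follows. This minimal version assumes the frames have already been superposed onto a common reference and are given as lists of (x, y, z) tuples; a real analysis would normally use a trajectory library that performs the optimal alignment. The function names and the toy coordinates are illustrative only.

```python
import math


def rmsd(frame_a, frame_b):
    """RMSD between two pre-aligned frames given as lists of (x, y, z) tuples."""
    if len(frame_a) != len(frame_b) or not frame_a:
        raise ValueError("frames must be non-empty and the same length")
    sq_sum = sum((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
                 for (xa, ya, za), (xb, yb, zb) in zip(frame_a, frame_b))
    return math.sqrt(sq_sum / len(frame_a))


def pairwise_rmsd(frames):
    """Symmetric all-to-all RMSD matrix over a list of frames."""
    n = len(frames)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = rmsd(frames[i], frames[j])
    return matrix


# Concatenate frames from two replicas of the same window (toy coordinates).
replica_a = [[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]]
replica_b = [[(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)],
             [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]]
matrix = pairwise_rmsd(replica_a + replica_b)
```

With replicas concatenated this way, low values in the off-diagonal block (replica A rows against replica B columns) would suggest the replicas are visiting the same conformations.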
LS: similar to what I suggested earlier. Again, could do n replicas per window, though that could be a bit sketchy stats-wise.
DM: would prioritise just getting things running, and then looking at sampling quality
CC: makes sense, I think I have a game plan.