MS (slide 3): what’s the number of effective samples?
CC: for 10 ns windows, they’re on the order of a few thousand samples
MS: this doesn’t look good. You should be able to do better. It’s worth running for longer
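For reference, a minimal sketch of how the effective sample count for one window could be estimated from the collective-variable time series. It assumes the usual definition N_eff = N / g, where g is the statistical inefficiency, and truncates the autocorrelation sum at the first non-positive lag. The function names are illustrative only; pymbar's timeseries module provides a production estimator.

```python
import numpy as np

def statistical_inefficiency(x):
    """Estimate g = 1 + 2 * sum_t C(t) * (1 - t/N) for a 1D time series.

    The sum is truncated at the first lag where the normalised
    autocorrelation C(t) is no longer positive.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    dx = x - x.mean()
    var = dx.var()
    if var == 0.0:
        return 1.0  # constant series: treat as uncorrelated
    g = 1.0
    for lag in range(1, n - 1):
        c = np.dot(dx[:-lag], dx[lag:]) / ((n - lag) * var)
        if c <= 0.0:
            break
        g += 2.0 * c * (1.0 - lag / n)
    return max(g, 1.0)

def effective_samples(x):
    """Number of effectively uncorrelated samples, N / g."""
    return len(x) / statistical_inefficiency(x)
```

Running this per window, and again on only the frames that fall in the overlap region with a neighbouring window, would give the two numbers discussed here.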
DM: worried about overlap between windows (slide 1)
MS: looks low but not ridiculous
CC: can make force constant less strong
MS: agree that sounds like a good idea; should reduce the noise
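A rough sketch of why a weaker force constant widens the windows. It assumes a harmonic bias of the form 0.5 * k * (x - x0)^2 acting on an otherwise flat free-energy surface, so each window samples a Gaussian of width sqrt(kT / k). The force constants and the 15 degree spacing are hypothetical values, not the ones used in these runs, and a real dihedral profile will distort the distributions.

```python
import math

KB_KCAL = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def window_sigma(force_constant, temperature=300.0):
    """Std. dev. sampled under a bias 0.5*k*(x-x0)^2 on a flat landscape."""
    return math.sqrt(KB_KCAL * temperature / force_constant)

def gaussian_overlap(spacing, sigma):
    """Shared area of two unit-area Gaussians of equal width whose
    centres are `spacing` apart (1.0 means identical distributions)."""
    return math.erfc(spacing / (2.0 * math.sqrt(2.0) * sigma))

spacing = math.radians(15.0)      # hypothetical spacing between window centres
for k in (100.0, 50.0, 25.0):     # hypothetical force constants, kcal/(mol rad^2)
    sigma = window_sigma(k)
    print(k, math.degrees(sigma), gaussian_overlap(spacing, sigma))
```

Under these assumptions, halving the force constant widens each window by a factor of sqrt(2), and the shared area between neighbours grows faster than the width does.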
DM: the issue is presumably a low number of uncorrelated samples in the overlap regions.
MS: should have relatively low bias but high noise.
DM: *circles the region between the left-most pink and yellow lines on rep 2 of slide 1* to point out that the overlap looks very small. Making the tail overlaps twice as wide will increase accuracy more than linearly.
MS: with 2k effective samples, my instinct is you should be able to predict pretty well. Suggest you run with wider bins and redo the analysis; my guess is that a clear error will show up.
DM: I would be more concerned about effective samples in overlap region.
MS: aiming for at least 5% overlap at absolute minimum. You don’t need as much overlap as you need for sampling.
DM: 5% is important, but you also need a high overall number of samples.
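A minimal sketch for checking the overlap between two neighbouring windows directly from their sampled values. It assumes "overlap" means the shared area of the two normalised histograms, which is only one possible reading of the 5% figure. The function name and bin count are illustrative.

```python
import numpy as np

def histogram_overlap(a, b, bins=50):
    """Shared area of the normalised histograms of samples a and b.

    Returns 1.0 for identical distributions and 0.0 for disjoint ones.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Common bin edges spanning both sample sets
    edges = np.linspace(min(a.min(), b.min()), max(a.max(), b.max()), bins + 1)
    pa, _ = np.histogram(a, bins=edges)
    pb, _ = np.histogram(b, bins=edges)
    pa = pa / pa.sum()
    pb = pb / pb.sum()
    return float(np.minimum(pa, pb).sum())

# Synthetic stand-ins for two adjacent windows (not real simulation data)
rng = np.random.default_rng(0)
window_a = rng.normal(0.0, 1.0, 2000)
window_b = rng.normal(3.0, 1.0, 2000)
overlap = histogram_overlap(window_a, window_b)
print(overlap, overlap >= 0.05)
```

The overlap fraction says nothing about how many uncorrelated samples sit in the shared region, so it would need to be read alongside the effective sample counts.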
(slide 11):
MS: Pink pred/obs look ok… green looks pretty bad.
MG: my recollection is that the overlaps for the protein fits are similar to butane.
(CC shows histograms for GB3, Null OPC3).
MS: they’re not amazing but not as bad. We should look at the overlap matrix from pymbar/alchemlyb. MBAR’s overlap method should do it (compute_overlap in pymbar 4, computeOverlap in pymbar 3).
CC: and multiply by total number of effective samples?
MS: yes, plain matrix is nice too.
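A NumPy-only sketch of the arithmetic behind the overlap matrix and the scaled version. It assumes the definition O[i, j] = N_j * sum_n W[n, i] * W[n, j], which is my understanding of what pymbar computes, where W holds the MBAR weights of every sample in every state. Scaling row i by that window's effective sample count is one reading of CC's suggestion, not something settled in the meeting. The weights and counts below are toy values; in practice pymbar or alchemlyb would supply the matrix directly.

```python
import numpy as np

def overlap_matrix(W, N_k):
    """Overlap matrix from an (n_samples, n_states) MBAR weight matrix.

    O[i, j] = N_k[j] * sum_n W[n, i] * W[n, j]
    """
    W = np.asarray(W, dtype=float)
    N_k = np.asarray(N_k, dtype=float)
    return (W.T @ W) * N_k[np.newaxis, :]

def scaled_overlap(O, n_eff):
    """Scale row i by the effective sample count of state i (assumed reading)."""
    return np.asarray(n_eff, dtype=float)[:, np.newaxis] * np.asarray(O, dtype=float)

# Toy case: two states with two samples each, sharing no configurations
N_k = np.array([2, 2])
W_disjoint = np.array([[0.5, 0.0], [0.5, 0.0], [0.0, 0.5], [0.0, 0.5]])
O = overlap_matrix(W_disjoint, N_k)
print(O)
print(scaled_overlap(O, n_eff=[2000.0, 1500.0]))  # hypothetical n_eff values
```

Off-diagonal entries close to zero between neighbouring windows would flag the poorly connected pairs.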
MS: also think replica-exchange could improve sampling. But would prioritise butane first to make sure there’s not an error somewhere.