GB3 NMR fit | @Chapin Cavender | JR: any estimate of solvent effects in the QM data? CC: no. MS: this is irrelevant here; the NMR fit doesn’t involve QM. It may explain why the original FF is so bad, though. JR: might the original QM data affect the reweighting? MS: we’re looking at why the reweighting fit isn’t working.
JR: any concerns about h-bond modelling? CC: earlier we swapped in AMBER nonbondeds, and we still saw the same unfolding. JR: OPC vs. TIP3P? (General): TIP3P had worse problems.
MS: re loose vs. stiff force constants, the distance between the predicted and observed curves here is troublesome. JW: the y-shift is normalized to give 0 at the same spot, which might make the curves look more different than they are. MS: the different slopes also concern me. DM: seems like either something is wrong with the umbrella sampling analysis code, or the sampling/overlap is insufficient.
MS: in looking for bugs, I would start with the results on slide 6; I would expect changing the FF to change the free energy. CC: the dotted line is a prediction from the dashed line in the same row of the legend. MG: (clarifying the interpretation of the graph). MS: it may be that the -Pred curves used FFs whose parameters didn’t change very much, and it’s possible the noise is overwhelming the signal. CC: I generated these plots with error bars from bootstrapping in pymbar. I didn’t plot the error bars because the figure was busy, but the differences were outside the error bars.
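The bootstrap error bars CC mentions can be sketched generically. This is a stdlib-only illustration of the resampling idea; pymbar computes its bootstrap uncertainties internally, and `bootstrap_stderr` plus the synthetic data here are hypothetical, not from the actual analysis:

```python
import random
import statistics

def bootstrap_stderr(data, stat=statistics.fmean, n_boot=1000, seed=0):
    """Estimate the standard error of `stat` by resampling with replacement."""
    rng = random.Random(seed)
    reps = [stat(rng.choices(data, k=len(data))) for _ in range(n_boot)]
    return statistics.stdev(reps)

# Synthetic observable: 400 draws from a unit Gaussian, so the analytic
# standard error of the mean is 1/sqrt(400) = 0.05; the bootstrap
# estimate should land close to that.
rng = random.Random(1)
data = [rng.gauss(0.0, 1.0) for _ in range(400)]
print(round(bootstrap_stderr(data), 3))
```

Two predictions are then "outside error" if their difference exceeds the combined (e.g. quadrature-summed) bootstrap uncertainties.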
MS: if the number of effective samples is high, the parameters may not have changed very much, and then it’s unsurprising the results haven’t changed very much. MG: but if the force constants didn’t change, why did the FE distribution change so much? MS: could have been a sampling issue; N_eff would give us a sense for this. If it’s 99%, then it’s basically the same FF, since every conformation has nearly the same energy. JR: would it help to change the window spacing? DM: I’m worried about the gaps between windows, e.g. in the bottom right. Sometimes I’ve had to introduce extra windows or non-uniformly weaken the force constants. In the top right, that purple curve is bimodal; could be something wrong in that region. MS: if you add all those histograms together, you’ll get decent overlap, but the fact that two replicas are so different is more evidence of insufficient sampling; I’d expect more consistency if sampling were adequate. MS: I wonder why we don’t see the same issue for stiff… there the predictions don’t agree with either the previous or next observed curves, but for loose they do. MS: the change in parameters does align with my expectation that the “loose” ones change less, so those curves would overlap more.
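The N_eff diagnostic MS describes can be sketched as a Kish effective sample size over reweighting weights. This is a stdlib-only illustration under assumed reduced energies; `n_eff_fraction` and its inputs are hypothetical names, not from the fitting code:

```python
import math

def n_eff_fraction(u_old, u_new):
    """Kish effective sample size for reweighting old-FF samples to a new
    FF, as a fraction of the total sample count.

    u_old, u_new: reduced potentials (U/kT) of the same configurations
    under the old and new FF. A fraction near 1.0 means the weights are
    nearly uniform, i.e. it is basically the same FF.
    """
    d = [uo - un for uo, un in zip(u_old, u_new)]
    m = max(d)  # subtract the max exponent for numerical stability
    w = [math.exp(x - m) for x in d]
    s = sum(w)
    return (s * s) / (len(w) * sum(x * x for x in w))

# Identical FFs -> every weight equal -> fraction of exactly 1.0
# (the "99%+" case MS mentions).
print(n_eff_fraction([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
```

A strongly perturbed FF concentrates weight on a few configurations, driving the fraction well below 1 and signaling that reweighting alone is unreliable.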
LS: my take is in line with sampling issues. I’d add that the “notches” in the loose histograms are problematic. In the stiff histograms things seem stable/cooperative, but for loose there are barriers. Maybe the stiff histograms also aren’t showing great overlap, which would also cause issues when applying MBAR. So the loose sims might have overlap problems AND sampling problems. I’d advocate having twice as many windows. MS: to check overlap, you can just check the overlap matrix. CC: I’d asked before what “good overlap” is, and I got 5% as the answer, so I’ve been ensuring overlaps are generally around 5%; however, I didn’t achieve this with the stiff force constant. MS: what’s the overlap on the diagonal? It’s not always 1; it’s the probability that a sample came from that window. Perfect overlap would give about 0.33 each for a window and its adjacent windows. If the diagonal is 0.99, the energies are quite different; ideally it would be <= 0.8. CC: I think I was following this procedure.
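The rule of thumb MS describes (diagonal near 1.0 means windows that barely see each other; roughly 0.33 spread over a window and its neighbors means good overlap) can be illustrated with a toy stdlib-only calculation. In practice the check would use pymbar’s MBAR overlap output; here the assignment probabilities come from the harmonic bias alone, which is only a rough proxy, and all names are illustrative:

```python
import math
import random

def overlap_matrix(samples_per_window, centers, k):
    """Toy overlap diagnostic for harmonic umbrella windows (reduced units).

    O[i][j] = average probability that a sample from window i would be
    assigned to window j, using only the bias Boltzmann factors
    exp(-k/2 * (x - x0_j)^2). A diagonal near 1.0 means a window's
    samples are essentially invisible to every other window.
    """
    n = len(centers)
    O = [[0.0] * n for _ in range(n)]
    for i, xs in enumerate(samples_per_window):
        for x in xs:
            b = [math.exp(-0.5 * k * (x - c) ** 2) for c in centers]
            z = sum(b)
            for j in range(n):
                O[i][j] += b[j] / (z * len(xs))
    return O

rng = random.Random(0)
centers = [0.0, 1.0, 2.0]
# Loose springs: broad windows, weight spreads to neighbors (good overlap).
loose = overlap_matrix([[rng.gauss(c, 1.0) for _ in range(2000)]
                        for c in centers], centers, k=1.0)
# Stiff springs: narrow windows, diagonal near 1 (poor overlap).
stiff = overlap_matrix([[rng.gauss(c, 0.15) for _ in range(2000)]
                        for c in centers], centers, k=50.0)
print(round(loose[1][1], 2), round(stiff[1][1], 2))
```

The production-grade equivalent is, as far as I know, the matrix returned by pymbar’s `MBAR.compute_overlap()`, built from the full MBAR weights rather than the bias alone.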
(Lots of detailed discussion involving charts/numbers on screen; see recording starting at 48 minutes.) MS (~65 min): would aim for a chi-squared of 3 with 30–40k effective samples; could try a few fits in parallel. LS: could you just re-run the fitting part of the procedure until some % of N_eff is reached? MS: could try changing alpha. Are you using a stochastic optimizer? CC: no, L-BFGS; it should be deterministic. MS: it’d be interesting to see how different the resulting FFs would be.
|
4-mer QM dataset | @Anika Friedman | MG: what’s the lowest energy conformer? JW: can the minimum conformer be in any basin of secondary structure? AF: yes. MG: would it make sense to look at energy differences between alpha and beta directly? AF: we don’t have many molecules with conformers in both basins. JR: do you have enough data to draw statistical conclusions? AF: if we compare QM–MM differences over whole basins, then yes; but for a single molecule, we often don’t have the same molecule in multiple basins.
LW: when looking at energy differences, are the minimum conformers the same in QM and MM? AF: they can be different. LW: maybe we should standardize on the lowest-QM-energy conformer; if we use the lowest-energy MM conformer for the MM side, the basin it falls into could introduce big errors. Also, did the MM calcs do optimizations? AF: no, no optimizations for the MM conformers. JR: could you do some analysis of basin populations? AF: roughly twice as many samples in the alpha basin as in beta.
AF: anything else needed before we use this for a FF fit? JR: for conformers used to determine partial charges, ones with internal h-bonds are removed, but this dataset will have internal h-bonds; so when fitting FFs it will struggle to agree because of limitations of the charge models. CC: we’re using NAGL charges, not AM1-BCC. LW: one concern is using internally h-bonded molecules in AM1-BCC calcs, and NAGL has been trained on a dataset where those h-bonds are filtered out. We’ve looked at the hydrogen parameters that protein FFs use, and those have similar charges to what we use. CC: one more reason for ELF10 is numerical precision: if two conformers are close in energy, the assignment of partial charges becomes less precise.
|