(Slide 4): Pair has initial values set to parameters from AMBER
Sage-Pair has initial values set to parameters from Sage
Direct comparison is Null-0.0.3-NAGL
(Slide 5): PB: was the width of the prior for torsions set to 5 kcal/mol?
CC: yes, set to the same value as 2.1
MS: are the pairwise objective fits trained to data corresponding to the alpha-helical region of the Ramachandran plot?
CC: yes
MS: what’s the key difference between what 14SB was doing and what we’re doing?
CC: the objective functions differ: in ours, I added weighting so that lower-energy points count more than higher-energy points
MS: how different is our training data?
CC: We don’t have their QM data, but it sounds like it was generated in a similar way. They did 1D scans vs our 2D scans.
MS: Would be interesting to compute their obj function vs our obj function for these points.
CC: They’re also using MP2 as the QM method vs our DFT.
MS: do we think that makes a difference?
CC: unclear. Also, the backbone parameters were borrowed from older FFs that were fit to lower-level QM methods. The backbones (BBs) were re-fit to empirical scalar couplings, and the side chains were re-fit to MP2 energies.
CC: they found that training to gas-phase data gave worse properties. Tweaking the gas-phase backbones to fit NMR scalar couplings resulted in better performance. At that point, AMBER helices were too stable, and fitting to unstructured peptides helped; we have the opposite problem. Most recently they fit to implicit-solvent DFT.
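A minimal sketch of the comparison MS suggests, scoring the same scan points with an unweighted objective and with one that upweights low-energy points. The notes do not state the functional form of CC's weighting, so the Boltzmann-style weights, the temperature, the alignment of each profile to its own minimum, and the toy energies are all assumptions for illustration.

```python
import math

KT = 0.593  # kcal/mol at roughly 298 K; assumed temperature for the weights


def boltzmann_weights(qm_energies, kt=KT):
    """Normalized weights that favour low-energy points (assumed functional form)."""
    e_min = min(qm_energies)
    raw = [math.exp(-(e - e_min) / kt) for e in qm_energies]
    total = sum(raw)
    return [w / total for w in raw]


def objective(qm_energies, mm_energies, weights=None):
    """Weighted mean squared error between relative QM and MM energies."""
    n = len(qm_energies)
    if weights is None:
        weights = [1.0 / n] * n  # unweighted: every scan point counts equally
    # Each profile is shifted to its own minimum (an assumption of this sketch).
    qm_min, mm_min = min(qm_energies), min(mm_energies)
    return sum(
        w * ((mm - mm_min) - (qm - qm_min)) ** 2
        for w, qm, mm in zip(weights, qm_energies, mm_energies)
    )


# Toy 1D torsion scan in kcal/mol (made-up numbers, not project data)
qm = [0.0, 0.5, 2.0, 6.0]
mm = [0.0, 0.6, 2.5, 4.0]

unweighted = objective(qm, mm)
weighted = objective(qm, mm, boltzmann_weights(qm))
print(f"unweighted: {unweighted:.3f}  weighted: {weighted:.3f}")
```

In this toy scan the largest MM error sits at the highest-energy point, so the weighted objective is much smaller than the unweighted one. Two fits could rank differently depending on which objective is used to score them.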
LW: do we train to just torsion scans or a wider array of geometries? Ties into AF’s topic below.
CC: yes, we’ve discussed increasing our training data in a few different directions
MG: have we considered training to conformational preference data?
LS: is it correct that the torsional profiles you obtain from SMIRNOFF fits are very similar to AMBER? If the torsion profiles between fits are pretty similar with all the experiments you’ve been running, that would imply we might need fundamentally different inputs since we’re ending up in the same basin
MG: we discussed softening priors before
CC: we started from Sage initial values but haven’t run this experiment yet
LS: have we ruled out that this error is coming from the NBs?
CC: yes, we’ve run experiments swapping NB parameters with AMBER
CC: (shows some older slides on torsion profiles, around 30 min in). In general, AMBER deviates most from the QM profile, Sage-CC is roughly similar to AMBER, and the Protein SMIRNOFF re-fits match QM most closely
AF: it almost looks like the closer we fit to QM, the worse we do
MG: as CC points out, AMBER does fit to a different QM method
LW: have you ever run benchmarks on Sage-CC?
CC: on short peptides yes, not the longer ones. Could easily run
MG: would it be possible to manually tweak BBs to get better helices?
MS: reweighting would be the way to do this
MG: this is similar in spirit to what we’re currently doing, which is figuring out how to get the right answer from QM
LS: agree with all of above
MS: danger is we get it right for the wrong reasons. We don’t want to overstabilise, for example
LS: we currently have a great IDP force field
MS: happy to brainstorm reweighting approaches
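A minimal sketch of the kind of reweighting MS mentions, assuming standard single-state Boltzmann reweighting: frames sampled with the current parameters are reweighted by exp(-ΔU/kT) to estimate an observable under tweaked torsions. The function name, the temperature, and the toy per-frame values are hypothetical; no specific reweighting scheme was decided in the meeting.

```python
import math

KT = 0.593  # kcal/mol at roughly 298 K (assumed)


def reweight(observable, u_old, u_new, kt=KT):
    """Estimate <observable> under the new potential from frames sampled with the old one.

    Returns the reweighted mean and the Kish effective sample size.
    """
    du = [n - o for n, o in zip(u_new, u_old)]
    shift = min(du)  # subtract the minimum before exponentiating, for numerical stability
    raw = [math.exp(-(d - shift) / kt) for d in du]
    z = sum(raw)
    w = [x / z for x in raw]
    mean = sum(wi * a for wi, a in zip(w, observable))
    n_eff = 1.0 / sum(wi * wi for wi in w)
    return mean, n_eff


# Toy data: 1.0 = helical frame, 0.0 = unfolded frame (made-up values)
helicity = [1.0, 1.0, 0.0, 0.0]
u_old = [0.0, 0.0, 0.0, 0.0]
u_new = [-0.5, -0.5, 0.0, 0.0]  # hypothetical tweak lowering helical frames by 0.5 kcal/mol

mean, n_eff = reweight(helicity, u_old, u_new)
print(f"reweighted helicity: {mean:.3f}, effective frames: {n_eff:.2f}")
```

The effective sample size is a useful guard here: if a torsion tweak leaves only a handful of effective frames, the reweighted estimate is unreliable and a new simulation is needed.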
Fraction of native contacts
@Chapin Cavender
MS: can take stable segments of a simulation, e.g. 1 µs, and tweak torsions to stabilize the folded states.
CC: issue was we didn’t have enough folded states in the SMIRNOFF force fields. The only trajectory long enough was Null with OPC, and we had a constraint that we needed a 3-point water model
MS: FYI, if you look at densities and heats of mixing, OPC performs the worst with a systematic issue and TIP3P is not too bad
LS: did we ever try OPC3?
CC: have run benchmarks with OPC3
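For reference, a minimal sketch of a fraction-of-native-contacts calculation using a simple hard distance cutoff. The notes do not say which definition the benchmarks use (smoothed switching functions are also common), so the 8 Å cutoff, the sequence-separation filter, the one-point-per-residue representation, and the toy coordinates are assumptions.

```python
import math


def contact_pairs(coords, cutoff=8.0, min_seq_sep=3):
    """Residue pairs (i, j) within `cutoff` and at least `min_seq_sep` apart in sequence."""
    pairs = set()
    for i in range(len(coords)):
        for j in range(i + min_seq_sep, len(coords)):
            if math.dist(coords[i], coords[j]) < cutoff:
                pairs.add((i, j))
    return pairs


def fraction_native_contacts(frame, native_pairs, cutoff=8.0):
    """Fraction of the native contact pairs that are also formed in `frame`."""
    if not native_pairs:
        return 0.0
    formed = sum(
        1 for i, j in native_pairs if math.dist(frame[i], frame[j]) < cutoff
    )
    return formed / len(native_pairs)


# Toy coordinates in Angstrom, one point per residue (made-up geometry)
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0), (0.0, 0.0, 3.8)]
extended = [(3.8 * i, 0.0, 0.0) for i in range(5)]

native_pairs = contact_pairs(native)
q_native = fraction_native_contacts(native, native_pairs)
q_extended = fraction_native_contacts(extended, native_pairs)
print(q_native, q_extended)
```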
Null-0.0.3-OPC
@Anika Friedman
JW: for lysozyme BB, is there no data?
AF: we don’t have NMR data for lysozyme BB
CC: generally side-chain data is less accurate than BB
LS: did we just get unlucky picking GB3 as our target? Other targets look to perform within error
AF: a significant portion of GB3 is α-helix, so that contributes significantly to the error
LS: so is it because the other targets are less α-helical and therefore less sensitive? Or is it a convergence phenomenon, so that with more sampling BPTI would also unwind?
AF: BPTI is about the same size as GB3
MG: BPTI has multiple disulfides, which look like they’re anchoring the helices
LS: if BPTI is more stable than GB3 in the FF, it might give deceptively good performance
New QM data from PDB survey
@Anika Friedman
MG: what if we reduce or downweight the sidechain data to avoid it being used for BB fitting?
AF: CC has tried various weighting schemes. Doesn’t sound like the SC data is skewing the BB fits.
MG: so the oversampling in this region currently is not a problem?
CC: I think so
AF: the problem seems to be more that we’re not characterizing the space between points on the 15-degree grid
CC: we could take 4-mers and do hierarchical clustering to characterize the multiple phi/psi angles present in the peptides
AF: sounds like a good idea
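A minimal sketch of the clustering idea CC describes, assuming each conformer is represented by a tuple of backbone dihedrals in degrees (one phi/psi pair in this toy example; a 4-mer would contribute several pairs per tuple). The average-linkage choice, the periodic Euclidean distance, the 30-degree merge threshold, and the toy angles are assumptions, not decisions from the meeting.

```python
import math


def angle_diff(a, b):
    """Smallest absolute difference between two angles in degrees, respecting periodicity."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def dihedral_distance(x, y):
    """Euclidean distance over periodic dihedral differences."""
    return math.sqrt(sum(angle_diff(a, b) ** 2 for a, b in zip(x, y)))


def agglomerative(points, threshold):
    """Average-linkage clustering; stop once the closest clusters are farther apart than `threshold`."""
    clusters = [[i] for i in range(len(points))]

    def linkage(c1, c2):
        total = sum(dihedral_distance(points[i], points[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Find the closest pair of clusters.
        dist, a, b = min(
            (linkage(clusters[a], clusters[b]), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        if dist > threshold:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Toy (phi, psi) values in degrees (made-up conformers)
conformers = [
    (-60.0, -45.0), (-65.0, -40.0),    # alpha-like
    (-120.0, 130.0), (-125.0, 135.0),  # beta-like
    (179.0, -178.0), (-179.0, 178.0),  # neighbours across the periodic boundary
]
clusters = agglomerative(conformers, threshold=30.0)
print(clusters)
```

The periodic distance matters for the last two conformers: a naive difference would place them almost 360 degrees apart, while they are actually neighbours.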
AF: do we just want to focus on the α-basin? There are also regions in the β-basin that aren’t sampled as thoroughly.
MG, CC: agree.
PB: why do we need to sample closely-spaced points in each basin?
AF: we may be missing minima for certain residue configurations. Also, these are 4-mers, which give us more structural information