DM: is generating QC data mostly effort from you, or computational effort?
CC: mostly computational
DM: could we start the QC data in the pipeline now, while we work on other solutions, just in case we need it? We probably will eventually, even for small molecule force fields. Is the NMR benchmarking working?
CC: still in progress, nearly there
DM: could you do solution #2 while you’re fixing that then? If points 2 and 3 are mostly independent of your personal time, you could start firing away while trying to get 1 done
CC: agree – but is point 2 worth doing without point 3?
CC: takes 10 days for re-fit
DM: do we have any small molecule TorsionDrives that scan omega?
CC: yes, think it’s already in the training set I’m using. Should already be in the parameter fits I’ve done
MG: I thought the concern was that the small molecule training data doesn’t really have QC data for this amide bond?
CC: there’s no full scan of the amide bond – there’s only an out-of-plane motion to about ±30°. There’s nothing that moves from trans to cis
PB: not sure if we have a full scan of amides. I’ll check
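Note on checking scan coverage: a minimal sketch of how one might test whether a set of scanned omega angles spans trans to cis. The function names, the 15° grid spacing and the example angles are illustrative assumptions, not values from the training set, and the check treats +ω and −ω as equivalent, which is a simplification.

```python
def wrap_degrees(angle):
    """Map an angle in degrees onto the interval (-180, 180]."""
    wrapped = (angle + 180.0) % 360.0 - 180.0
    return 180.0 if wrapped == -180.0 else wrapped


def covers_trans_to_cis(scanned_angles, spacing=15.0):
    """True if some scanned point lies near every grid angle from 0 (cis) to 180 (trans).

    Assumption: +omega and -omega are folded together via abs().
    """
    folded = [abs(wrap_degrees(a)) for a in scanned_angles]
    grid = [i * spacing for i in range(int(180.0 / spacing) + 1)]
    return all(any(abs(f - g) <= spacing / 2 for f in folded) for g in grid)


# Hypothetical scan that only moves about 30 degrees away from trans.
partial_scan = [150, 165, 180, -165, -150]
# Hypothetical full scan on a 15 degree grid.
full_scan = list(range(-180, 180, 15))

print(covers_trans_to_cis(partial_scan))
print(covers_trans_to_cis(full_scan))
```

A scan limited to the region around trans fails this check because nothing lands near 0°, which is the gap CC describes.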
MG: was this raised because industry noticed flips in the amide bonds?
DM: more because our well depths aren’t accurate. We couldn’t figure out how to fix this without doing more science on it, e.g. at the time we weren’t fitting impropers
MG: if we have impropers operational now, how come Sage is getting those wrong and passing the errors onto the protein fit?
DM: we haven’t fit impropers yet
MG: do we need to fit impropers to fix this?
DM: we still have a very minimal number of them and as I recall they still haven’t been fit
MG: we may need more than any of the proposed solutions to get our torsions where we want them
CC: I’m not confident these will fix the problem; it’s only my intuition that they will help
PB: I think you’re already starting from a baseline FF that has re-fit impropers. No improper scans used in fitting those – just optimised to geometries etc
CC: I thought we had data with improper scans
DM: I agree with MG that there’s a science issue and we’re not sure if these will solve the problem
MG: how does Amber get it right?
CC: don’t think they did anything special for proteins there – it probably comes from inherited GAFF parameters
PB: Do they have extra terms beyond SMIRNOFF terms?
CC: don’t know off the top of my head – will check.
MG: given that we don’t have confidence in the other solutions, IMO we should push ahead with the NMR studies and pursue others in the meantime. At least it looks better than Sage. We shouldn’t make a science problem a roadblock if we can avoid it
DM: and set up QC data in the pipeline
JW: we have basically all our QC compute available for work with any kind of priority
CC: do we want to generate scans for peptides, or small molecule amides, and generalise those to proteins?
DM: prioritise the protein parameters you need for your MVP. If we have to, we can inherit small molecule params from those
PB: I think we do have scans for small molecule amides. It may be a typing issue
MG: so do we need to include impropers as fittable terms?
DM: unsure – we can work it out for proteins, and inherit the solution for small molecules
MG: what’s the issue with impropers? Why are they the problem?
PB: we’re not distinguishing between cis- and trans-configurations properly, and the barrier is pretty low
CC: it’s just a hunch. This solution was proposed to do a re-fit without generating new data
MG: so is there an issue with small-molecule omega amides? So were impropers not being fitted there? Were they fit in Sage? Why not?
PB: No, they weren’t re-fit in Sage
DM: because we have very few impropers that cover a lot of chemistry, so we were worried about messing with that. But when we had issues with torsions and studied those, we held impropers fixed, so now we’re wondering if fitting impropers would help
MG: would it help to add another torsional term?
DM: it’s possible
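Note on why an extra torsional term could matter: a minimal sketch using the periodic cosine form commonly used for proper torsions, E(φ) = Σ k·(1 + cos(nφ − phase)). The force constants below are made-up illustrative numbers, not fitted parameters. A single two-fold term gives cis (0°) and trans (180°) the same energy, while adding a one-fold term separates them.

```python
import math


def torsion_energy(phi_deg, terms):
    """Sum of periodic terms k * (1 + cos(n * phi - phase)); phi and phase in degrees."""
    phi = math.radians(phi_deg)
    return sum(
        k * (1.0 + math.cos(n * phi - math.radians(phase)))
        for k, n, phase in terms
    )


# Hypothetical parameters as (k, periodicity, phase in degrees); values are illustrative only.
two_fold_only = [(10.0, 2, 180.0)]
with_one_fold = two_fold_only + [(1.5, 1, 0.0)]

for label, terms in [("n=2 only", two_fold_only), ("n=2 plus n=1", with_one_fold)]:
    cis = torsion_energy(0.0, terms)
    trans = torsion_energy(180.0, terms)
    barrier = torsion_energy(90.0, terms)
    print(f"{label}: cis-trans gap = {cis - trans:.2f}, energy at 90 deg = {barrier:.2f}")
```

In this toy setup the cis-trans gap comes entirely from the one-fold term (twice its force constant), so the relative well depths cannot be tuned with the two-fold term alone.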
MG: how many torsions does Amber have?
CC: will check
CC: agree with MG that if we’re going to generate new QC data, we can directly refit torsions and skip re-fitting impropers
PB: can you do bespoke fits (using ForceBalance) on single molecules and check how the force constants vary from what’s in the protein-specific model? High variance could suggest we need to split torsions for different periodicities, etc. You can also check whether any angles are causing an issue, i.e. whether torsions are not the cause
CC: will do
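Note on the variance check PB suggests: a minimal sketch of how the bespoke-fit force constants could be summarised once they have been extracted. It does not call ForceBalance; the input dictionaries, parameter ids, threshold and numbers are all hypothetical placeholders for whatever the bespoke fits produce.

```python
from statistics import mean, stdev


def summarise_force_constants(bespoke_k, reference_k):
    """Per parameter: mean and spread of bespoke force constants, and shift from the reference."""
    summary = {}
    for param_id, values in bespoke_k.items():
        avg = mean(values)
        summary[param_id] = {
            "mean": avg,
            # Sample standard deviation needs at least two fits.
            "stdev": stdev(values) if len(values) > 1 else 0.0,
            "shift": avg - reference_k[param_id],
        }
    return summary


def flag_high_variance(summary, threshold):
    """Parameter ids whose spread across molecules exceeds the threshold."""
    return sorted(p for p, s in summary.items() if s["stdev"] > threshold)


# Hypothetical force constants from single-molecule fits (one value per molecule).
bespoke_k = {"t1": [1.0, 1.1, 0.9], "t2": [0.2, 2.5, 4.8]}
# Hypothetical values from the protein-specific model.
reference_k = {"t1": 1.0, "t2": 2.0}

summary = summarise_force_constants(bespoke_k, reference_k)
print(flag_high_variance(summary, threshold=1.0))
```

Parameters flagged this way would be the candidates for splitting into separate torsion types or periodicities; a large shift with low spread would instead point at a systematic offset in the shared parameter.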