Added new training and benchmarking data to support new parameters
TM FF fit is based on Sage 2.2 and performs as well as or better than 2.2 on the existing industry benchmark dataset. Hard to know what to make of the new benchmarks: the dataset is small, so the distributions are odd
AMI: found that one good way to visualize changes is plotting QM vs MM values
So far the new parameters just duplicate the parent torsion, but periodicity etc. may need to change for them
BS: Slide 21, top left figure is just the torsion parameter, but the rest are contributions from bonds, angles, etc.?
BW: yes
BS: the torsion energy term should be a small correction to the torsional profile that the nonbonded terms already provide, so sometimes the FF torsion term looks nothing like the actual torsion drive, which shows the full energy. Most of the torsion energy comes from 1,4 interactions, so the actual “torsion” term can be negative, have the wrong phase, etc.
BW: yeah, that’s been an issue with looking at these, because I don’t really know what to do with the torsion parameter plots
BS: you can look at the torsion energy through the nonbonded parameters
PB: are you going to look at residuals for next step? e.g. take out torsion and look at total energy (MM - torsion)
BW: yes, that’s next
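A minimal sketch of the residual check PB describes, not BW's actual workflow. It assumes the standard periodic torsion form k(1 + cos(nφ − phase)), that QM and MM energies are given on the same torsion-drive grid in the same units, and that both profiles are referenced to the QM minimum; the function names are hypothetical.

```python
import math

def periodic_torsion_energy(phi_deg, terms):
    """Sum of k * (1 + cos(n * phi - phase)) over (k, n, phase_deg) terms."""
    phi = math.radians(phi_deg)
    return sum(
        k * (1.0 + math.cos(n * phi - math.radians(phase_deg)))
        for k, n, phase_deg in terms
    )

def torsion_residual(angles_deg, qm, mm_total, terms):
    """QM minus (MM - torsion), both relative to the QM-minimum grid point.

    The result is the profile the explicit torsion term would have to
    supply, up to a constant offset.
    """
    # Remove the explicit torsion contribution from the total MM energy.
    mm_without_torsion = [
        e - periodic_torsion_energy(a, terms)
        for a, e in zip(angles_deg, mm_total)
    ]
    # Reference both profiles at the grid point where QM is lowest.
    i0 = qm.index(min(qm))
    return [
        (q - qm[i0]) - (m - mm_without_torsion[i0])
        for q, m in zip(qm, mm_without_torsion)
    ]
```

If the residual looks nothing like a low-periodicity cosine series, that hints the error sits in another term (bonds, angles, 1,4 interactions) rather than in the torsion parameter itself.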
AMI: I looked at this for some of the small ring torsions, it can also be hard to separate the error that a torsion term should correct, vs what is an error in another term
LW: what were the huge outliers in 2.1?
BW: showed pictures (in the recording at 26 min): one sulfamide and one phosphate. They are not outliers in 2.2 or TM
BS: has anyone looked at how many molecules provide training data for each parameter?
BW: yes, that’s how I came up with the new training data: I looked at torsions that had low coverage. There are still some with only one or a few molecules
BS: is there a rule of thumb for how many data points you need to really train it?
LW: also complicated because more data means the fit takes longer, so being conservative
PB: don’t train a parameter if it has fewer than maybe 5 records
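A rough sketch of the coverage count discussed above, assuming the parameter assignments have already been exported as (parameter ID, molecule ID) pairs. The function names and the input layout are hypothetical, and the default cutoff of 5 is PB's tentative rule of thumb, applied here to distinct molecules rather than raw records.

```python
from collections import defaultdict

def molecules_per_parameter(records, all_parameter_ids=()):
    """Count distinct molecules that exercise each parameter.

    records: iterable of (parameter_id, molecule_id) pairs.
    all_parameter_ids: optional full parameter list, so that parameters
    with no training data at all show up with a count of 0.
    """
    coverage = defaultdict(set)
    for parameter_id, molecule_id in records:
        coverage[parameter_id].add(molecule_id)
    counts = {pid: 0 for pid in all_parameter_ids}
    counts.update({pid: len(mols) for pid, mols in coverage.items()})
    return counts

def low_coverage(counts, min_molecules=5):
    """Parameter IDs with fewer than min_molecules distinct molecules."""
    return sorted(pid for pid, n in counts.items() if n < min_molecules)
```

Parameters returned by `low_coverage` would be candidates either for extra training data or for being held fixed during the fit.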
PB: gen 3 torsion training set constructed by hand to minimize number of molecules
BW: I fragmented ChEMBL for this; it would be cool to construct molecules like this
LW: similar to how XFF constructed their training data
AMI: yes, looking at the parameters added in 2.1, some of them (especially some of the bonds and angles) are also missing training data