2024-08-07 FF fitting meeting

Participants

@Chapin Cavender
@Alexandra McIsaac
@Brent Westbrook
@Pavan Behara
William Swope
@Lily Wang

Goals

Discussion topics

Item	Presenter	Notes

Torsion multiplicity update

Slides:

Recording: Video Conferencing, Web Conferencing, Webinars, Screen Sharing
Passcode: qX2$Nd7t

BW

Added new training and benchmarking data to support new parameters
TM FF fit is based on Sage 2.2 and performs as well as or better than 2.2 on the existing industry benchmark dataset. Hard to know what to make of the new benchmarks: the dataset is small, so the distributions look weird
AMI: one good way to visualize changes is to plot the QM value against the MM value
So far, new parameters just duplicate the parent torsion, but may need changes to periodicity etc.
BS: On slide 21, is the top-left figure just the torsion parameter, and the rest contributions from bonds, angles, etc.?
- BW: yes
- BS: the torsion energy term should be a small correction to the torsion profile provided by the nonbonded terms, so sometimes the FF term doesn’t look at all like the actual torsion drive, which shows the full energy. Most of the torsion energy comes from 1,4 interactions, so the actual “torsion” term can be negative, have the wrong phase, etc.
- BW: yeah, that’s been an issue with looking at these, because I don’t really know what to do with the torsion parameter plots
- BS: you can look at the torsion energy through the nonbonded parameters
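A minimal sketch of BS's point that the explicit torsion term is only one piece of the torsion profile. It assumes the usual periodic form k * (1 + cos(n * phi - phase)); the function name and the parameter values are hypothetical and not taken from any force field discussed here.

```python
import math

def torsion_energy(phi, terms):
    """Periodic torsion energy at dihedral angle phi (radians).

    terms: iterable of (k, periodicity, phase) tuples, phase in radians.
    """
    return sum(k * (1.0 + math.cos(n * phi - phase)) for k, n, phase in terms)

# Hypothetical parameter with a negative force constant: the term is
# never positive, even though the full torsion drive may look nothing like it.
terms = [(-0.5, 2, 0.0)]
angles = [math.radians(a) for a in range(-180, 181, 60)]
profile = [torsion_energy(phi, terms) for phi in angles]
```

The full MM profile along the scan also includes 1,4 electrostatics and vdW, bonds, and angles, so a plot of `profile` alone need not resemble the torsion drive.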
PB: are you going to look at residuals as the next step? E.g. take out the torsion and look at the total energy (MM - torsion)
- BW: yes, that’s next
- AMI: I looked at this for some of the small ring torsions; it can also be hard to separate the error that a torsion term should correct from an error in another term
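One possible reading of the residual check PB describes, as a sketch: subtract the torsion term from the total MM energy at each scan point, then compare what is left against QM. The function names and the numbers are hypothetical, and all energies are assumed to be in the same units on the same grid points.

```python
def relative(energies):
    """Shift a profile so its minimum is zero."""
    lowest = min(energies)
    return [e - lowest for e in energies]

def torsion_residual(qm, mm_total, mm_torsion):
    """QM minus (MM - torsion), both relative to their own minimum.

    This is what the torsion term would have to reproduce if every
    other MM term were exact.
    """
    mm_rest = [total - tors for total, tors in zip(mm_total, mm_torsion)]
    return [q - r for q, r in zip(relative(qm), relative(mm_rest))]

# Made-up three-point scan for illustration only
residual = torsion_residual(
    qm=[0.0, 3.0, 1.0],
    mm_total=[1.0, 4.0, 2.0],
    mm_torsion=[0.5, 1.5, 0.5],
)
```

As AMI notes, this residual also absorbs errors from the other terms, so a large value does not by itself mean the torsion parameter is at fault.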
LW: what were the huge outliers in 2.1?
- BW: showed pictures (in the recording at 26 min): one sulfamide and one phosphate. Neither is an outlier in 2.2 or TM
BS: has anyone looked at how many molecules provide training data for each parameter?
- BW: yes, that’s how I came up with the new training data: I looked at torsions that had low coverage. There are still some with only one or a few molecules
- BS: is there a rule of thumb for how many data points you need to really train it?
- LW: also complicated because more data means the fit takes longer, so being conservative
- PB: don’t train a parameter if it has fewer than maybe 5 records
- PB: the gen 3 torsion training set was constructed by hand to minimize the number of molecules
- BW: I fragmented ChEMBL for this; it would be cool to construct molecules like this
- LW: similar to how XFF constructed their training data
- AMI: yes, some of the parameters added in 2.1, especially some of the bonds and angles, are also missing training data
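A rough sketch of the coverage check discussed above: count how many training molecules exercise each parameter and flag those below a cutoff. The cutoff of 5 follows PB's tentative suggestion; the function name, the parameter IDs, and the input layout (molecule mapped to the parameter IDs it uses) are assumptions for illustration.

```python
from collections import Counter

def low_coverage(assignments, min_records=5):
    """Return {parameter_id: molecule_count} for under-covered parameters.

    assignments: dict mapping a molecule identifier to the parameter IDs
    assigned to it. Each molecule counts once per parameter, no matter
    how many times the parameter matches within that molecule.
    """
    counts = Counter()
    for param_ids in assignments.values():
        counts.update(set(param_ids))
    return {pid: n for pid, n in counts.items() if n < min_records}

# Hypothetical labelling: "t1" appears in six molecules, "t2" in two
assignments = {f"mol{i}": ["t1", "t1"] for i in range(6)}
assignments["mol0"] = ["t1", "t2"]
assignments["mol1"] = ["t1", "t2"]
flagged = low_coverage(assignments)
```

Parameters that match no molecule at all never appear in the counts, so they would need to be checked separately against the full parameter list.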
