2024-07-25 Protein FF meeting note

Participants

@Chapin Cavender
@Michael Gilson
@Brent Westbrook
@David Mobley
@Lily Wang
@Pavan Behara
Louis Smith
@Jeffrey Wagner

Goals

Update on QM parameter fits

Recording

https://drive.google.com/file/d/1pHHoklRH9QNaM-OR8l-VeJR0bQ0X2ZDO/view?usp=sharing

Discussion topics

Item	Presenter	Notes

QM parameter fits

@Chapin Cavender

LS – The frustrated exploration of parameter space is interesting to me. I thought you were doing something to allow for global optimization in parameter space. But are you just using priors to do a convex minimization to find a local minimum in parameter space?
- CC – We’re using L-BFGS.
- LS – Isn’t that local?
- MG – Line optimization can jump minima, and I think I’ve seen BFGS jump minima
- DM – But that’s not operating as intended.
- LS – Right
- CC – It’s not a greedy local minimizer, like steepest descent. I think BFGS is somewhat better at dealing with high-dimensional space and skipping over local minima
- LS – So it can skip over some saddles… I could see that. Is the best way to explore parameter space to relax your priors, to do a global search, to use a genetic algorithm, or to use basin-hopping?
- CC – I do think it’s optimal to do a basin-hopping algorithm, but that’s a substantial deviation from how we’re set up to do fits. PB had tried this.
- PB – I tried basin-hopping and it was too slow for me. TG used it a bit for his simple small-scale fit and had more luck. I just used the scipy optimizer in FB.
- LS – Gotcha, I understand if it’s too hard to try. Seems like it may be straightforwardish by doing some sort of parallel tempering.
- CC – It’s not so much a code problem as a walltime problem. Evaluating a single step of the objective function takes 12ish hours.
- LS – Can something be done about runtime?
- DM – Possibly couples to reweighting issues.
- MG – Relates to liquid optimizations too. But generally I think our opts are a bit too conservative, in terms of using priors.
- DM – I think 12 hours is without condensed phase. Have we also tried deriving starting parameters using modified Seminario and other methods?
- CC – No, the only other thing I’ve tried is using the ff14SB parameters as a starting point. That’s what I’ve been using for the “specific” fits, up until the “specific-003-sage-pair” FFs I showed for the first time today. Was also thinking that we could use GB3 trajs to train FF, but then we’d need another benchmark protein for testing.
- LS – That sounds cool to me.
- CC – We already have other small proteins available.
- DM – Another way to do it would be to not FIT to GB3 data, but instead reweight GB3 data to rapidly evaluate FF candidates.
- MG – That’s kinda what we’re doing. I think fitting to GB3 trajs could be really useful in getting us to a new region of parameter space.
- DM – I wonder why this is so hard
- MG – Possibly the issue is fitting to gas phase structures. This would be expected to introduce significant errors. I’m a little surprised it’s working at all.
- CC – Largely agree. ff14SB was successful because they moved away from using QM and used scalar couplings. And FF19 fit to solution phase QM. But I’m thinking that we could keep fitting to gas phase and fit to GB3 trajs to break into a new minimum.
- MG – Wouldn’t be a big sin to fit to condensed phase protein data, we already do this for small mols.
- LS – Can we do continuum QM?
- …
- MG – Downside would be that QM takes a long time.
- LS – I’ve used solvation models in QM before.
- JW (chat) – Josh Horton just added DDX (a solvent model) support to QCSubmit: https://github.com/openforcefield/openff-qcsubmit/pull/290
- LS – I’m not super confident about resource requirements, may be heavily implementation-specific.
- JW – Would this require implicit solvent in MM as well? I can’t recall how much mileage we’ve put on our GBSA implementation.
- (General) – Yes
- JW – I think we’d have room for this in the QC queue.
- CC – Could do a limited set of input confs, like a bunch of 1D torsiondrives.
- LS + MG – Could do a limited submission to estimate the difference
- CC – Could do a permutation of phi and psi ranges, with additional points emphasizing alpha and beta basins.
- JW – It’s likely we have the compute for this, but the serial nature may make big torsiondrives take a while.
- CC – Ok, I’ll try to queue up a submission for next week.
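
A toy illustration of the local-versus-global distinction discussed in this thread, not the actual fitting setup: the sketch below runs SciPy's L-BFGS-B on its own and then wrapped in `scipy.optimize.basinhopping` on a hypothetical one-dimensional double-well function. The function `toy_objective`, the starting point, `niter`, and `stepsize` are all assumptions chosen for illustration; the real objective is evaluated through FB and is far more expensive.

```python
import numpy as np
from scipy.optimize import basinhopping, minimize


def toy_objective(x):
    """Hypothetical double-well stand-in for a fitting objective.

    It has a shallower minimum at positive x and a deeper one at negative x.
    """
    x = np.asarray(x, dtype=float)
    return float(np.sum(x**4 - 4.0 * x**2 + x))


# Assumed starting point, placed in the basin of the shallower minimum.
x0 = np.array([1.5])

# Plain quasi-Newton minimization from x0.
local = minimize(toy_objective, x0, method="L-BFGS-B")

# Basin-hopping: random displacement, local minimization, Metropolis accept.
# The step size is an assumption, picked large enough to cross the barrier.
hopping = basinhopping(
    toy_objective,
    x0,
    niter=50,
    stepsize=2.0,
    minimizer_kwargs={"method": "L-BFGS-B"},
)

print(f"L-BFGS-B alone: x = {local.x[0]:.3f}, f = {local.fun:.3f}, nfev = {local.nfev}")
print(f"basin-hopping:  x = {hopping.x[0]:.3f}, f = {hopping.fun:.3f}, nfev = {hopping.nfev}")
```

Each basin-hopping iteration runs a full local minimization, so its `nfev` count is larger than that of the single L-BFGS-B run. That multiplier is the walltime concern CC raises when one objective evaluation takes 12ish hours. The run is stochastic because no random seed is fixed.
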
PB (chat) – @Chapin Cavender are the optimized torsion k values comparable to ff14SB values?
- CC – Hard to directly compare: ff14SB’s torsion k’s are all positive, but some of ours are negative. For periodicity=1 this can be equivalent, but not for other periodicities. But looking visually at plots, these seem comparable.
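
Returning to DM's suggestion in the previous thread of reweighting GB3 data to evaluate FF candidates: a minimal sketch of standard Boltzmann reweighting, assuming per-frame potential energies are available under both the reference FF and the candidate FF. The function name `reweight_observable`, the energy units (kcal/mol), and the synthetic arrays are assumptions for illustration only; none of this is the group's actual pipeline or data.

```python
import numpy as np

KB_KCAL_PER_MOL_K = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def reweight_observable(observable, u_ref, u_cand, temperature=298.15):
    """Estimate an ensemble average under a candidate FF from reference frames.

    observable, u_ref, u_cand: per-frame arrays of equal length.
    Returns the reweighted average and the Kish effective sample size.
    """
    observable = np.asarray(observable, dtype=float)
    delta_u = np.asarray(u_cand, dtype=float) - np.asarray(u_ref, dtype=float)
    log_w = -delta_u / (KB_KCAL_PER_MOL_K * temperature)
    log_w -= log_w.max()  # shift for numerical stability before exponentiating
    weights = np.exp(log_w)
    weights /= weights.sum()
    estimate = float(np.sum(weights * observable))
    n_eff = float(1.0 / np.sum(weights**2))
    return estimate, n_eff


# Synthetic stand-in data (not real GB3 results).
rng = np.random.default_rng(0)
n_frames = 1000
u_ref = rng.normal(0.0, 1.0, n_frames)            # reference-FF energies
u_cand = u_ref + rng.normal(0.0, 0.5, n_frames)   # candidate-FF energies
coupling = rng.normal(7.0, 1.0, n_frames)         # e.g. a per-frame scalar coupling

estimate, n_eff = reweight_observable(coupling, u_ref, u_cand)
print(f"reweighted average = {estimate:.3f}, effective frames = {n_eff:.1f} of {n_frames}")
```

The effective sample size shrinks as the candidate FF moves away from the reference FF, which is one way the reweighting issues DM mentions could show up in practice.
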
