2024-08-14 FF fitting meeting

Participants

Brent Westbrook
Alexandra McIsaac
Bill Swope
Jeffrey Wagner
Chapin Cavender
David Mobley
Pavan Behara
Lily Wang
Michael Shirts
Barbara Morales
Julianne Hoeflich
Patrick Frankel
Megan Osato
Trevor Gokey

Goals

Discussion topics

Item	Presenter	Notes
Diffusion and mixture properties	Shirts group	MS: Regressing to the previous infrastructure has been problematic for running SFEs.
Barbara (first presenter): Mixture property validation of water
    Looking at Hmix and density for 7 water models; wants to look at SFEs but has infra issues
    Binary mixtures with water, using Sage 2.1
    TIP3P is the “best” water model for alcohols, but still has high error
    TIP4P/TIP3P are best for amines and for both alcohols/amines
    DM: Confidence intervals look like they overlap for the different RMSEs, so it seems hard to tell which is “best” based on RMSE [MS/BM agree]
    MS: The errors are bootstrapped over molecules
    DM: Another analysis may be helpful, maybe a paired t-test, to see statistics on a by-molecule basis rather than aggregating over the whole dataset [recording around 10:30--not confident I captured this correctly]
    CC: Need a multiple hypothesis correction for a paired t-test of all pairs
    Want to re-train LJ with TIP3P_FB and OPC3, or maybe a 4-point model, and check OpenFF 1.0
    JW: I would think OPC would be better than TIP4P, but you see OPC is the worst--is that expected?
    MS: A bit of a surprise. TIP4P_FB and OPC have similar properties in pure water, but OPC is bad on mixtures
    DM: Why is r2 negative?
    BM: Happens when correlation is really bad. Using scikit; docs just said it represents very bad correlation [https://scikit-learn.org/stable/modules/generated/sklearn.metrics.r2_score.html ]
    DM: Thought this would be Pearson R, which should always be positive. Do you know which r measure it is?
    BM: I’ll look into it
    MS: May be worth doing the fit with your own code
    DM: Looks like this is a coefficient of determination
    MS: Reason we want to try Parsley 1.3 is that it’s from before we re-trained our LJ parameters with TIP3P, so other models might be better
    BS: Could also train charges
    CC: BCC in AM1BCC could be trained
    MS: Maybe after we do this, would want to co-tune it with LJ
    MS: Also want to try SFEs with nonpolar molecules
    MS: Goal of this is: should we change our water model?
        If TIP3P_FB and TIP4P are already close to TIP3P, maybe if we reopt LJ for these models, they’d perform even better
    MS: Doesn’t look like 4-point models are better
    BS: Depends what you want--TIP3P doesn’t have some properties
    MS: Yes, TIP4P is closer to real water. TIP3P_FB and OPC3 are closer to 4-point model performance than TIP3P
    BS: TIP4P doesn’t give the right density using Ewald methodologies
    MS: Some of that is corrected in TIP3P_FB and OPC3
    JW: I see switching the water model as something we want to do, but only once. Would make sense to align timeline-wise with release of the protein FF. Additionally, could think about making our own water model. I think the offxml environment is so alien to people that they won’t care if it’s an existing water model or not
    MS: Do we want to have an intermediate where we recommend an existing water model?
    DM: Don’t think it matters, as long as it’s good. Don’t think people will refuse to use our water model
    JW: Disagree, I think it’s much easier to release it as one release rather than going through intermediates
    DM: Sure, I mostly meant I don’t think it matters if we also support another water model. People will use whatever we say to use
    CC: Some molecules will exist as charged species in water (e.g. primary amines). Are you doing anything to account for that?
    BM: No, should look at that
    CC: I don’t think we did either for Sage, kind of a gap
    MS: Need to look at pKa and see if it’s near pH 7
    CC: I think for primary amines, they will be
    MS: Not just pKa but how it changes as a result of composition
    BS: Have some notes about treating this, who should I send them to?
    MS: Either put them in Slack for everyone or email them to me and I’ll forward them
    MS: Is there a good pKa predictor for small molecules, or do you have to do QM?
    BS: Probably has been measured
    DM: If not, pKa prediction is really hard
    PB: qupKake could work https://pubs.acs.org/doi/10.1021/acs.jctc.4c00328
Julianne: Slow diffusion in lipid simulations
    Overall slow diffusion in lipid simulations, much slower lateral diffusion than MacRog and Slipids
    Think it’s due to alkane tail behavior
    Lipid tails are 6-18 C
    Neither Slipids nor MacRog uses HMR
    Calculate D from simulations; with Sage 2.1, HMR is slower than non-HMR but both are slower than expt
    JW: To be clear, even with small tail length, you still have head groups with ~10 heavy atoms?
    MS/JH: No, we’re just looking at alkane tails
    MS: HMR is reducing the diffusion constant. COM is not affected by HMR, but dynamics of things twisting/rotating are affected/slowed down due to the moving moment of inertia
    JH: Amber’s most recent lipid FF mentions they had to fine-tune the C-C-C angle for alkanes, which drastically affected lipid diffusion. After tuning the angle they re-trained torsions, which helped a lot
    D is underestimated more as chain length grows; up to 20% of the diffusion constant for 15 C
    MS: Expect this if the barrier is too high
    Density is pretty accurate
    TIP3P has the results you’d expect for D and density, suggesting it’s not the problem
    BS: You left out TIP3P D, it’s 3, you predict 6…
    JH: Yeah, it’s true, but we’re looking at alkanes for now, shouldn’t affect it too much
    Diffusion does not always increase with box size as it would be expected to do, not sure if that’s OK
    MS: Maybe too uncertain? But that seems unlikely
    CC: Is that a finite size effect?
    MS: Yes https://pubs.acs.org/doi/10.1021/jp0477147
    Next steps: re-fit angles/torsions for CCC, then re-run and see if it increases D
        Maybe use QM data or expand the dataset; existing torsions aren’t really trained to linear alkanes
    BW: Expecting ~1 week for the new dataset
    TG: If you’re going to do angles, I’d suggest splitting C-C-C vs C-C-H. Currently combined
    JH: Why would those be together…?
    TG: Doesn’t look super different/worth splitting, but I’ve found it’s important
    BW: I think we tried splitting this and didn’t see much effect?
    LM: Maybe it didn’t affect RMSD/ddE but would affect other things?
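Note on the box-size dependence of D raised above: a minimal sketch of a hydrodynamic finite-size correction, assuming the linked reference (jp0477147) is the Yeh and Hummer correction for cubic periodic boxes, D0 = D_PBC + xi*kB*T / (6*pi*eta*L). The function names and the numbers in the usage lines are hypothetical and are not results from this meeting. The form assumes a bulk liquid in a cubic box (as in the alkane simulations), not lateral diffusion in a bilayer, and eta should be the shear viscosity of the simulated model.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K
XI = 2.837297       # lattice constant for a cubic periodic box (assumed form of the correction)


def finite_size_correction(temperature_k, viscosity_pa_s, box_length_m):
    """Return the correction term xi*kB*T / (6*pi*eta*L) in m^2/s."""
    return XI * K_B * temperature_k / (6.0 * math.pi * viscosity_pa_s * box_length_m)


def corrected_diffusion(d_pbc_m2_s, temperature_k, viscosity_pa_s, box_length_m):
    """Extrapolate a diffusion coefficient measured under PBC to infinite box size."""
    return d_pbc_m2_s + finite_size_correction(temperature_k, viscosity_pa_s, box_length_m)


# Hypothetical inputs for illustration only: 298.15 K, 1 mPa*s, 4 nm and 8 nm boxes.
small_box = finite_size_correction(298.15, 1.0e-3, 4.0e-9)
large_box = finite_size_correction(298.15, 1.0e-3, 8.0e-9)
print(small_box, large_box)  # the correction shrinks as 1/L
```

Under this form, D measured under PBC should rise toward D0 as the box grows, so a series of box sizes that does not show that trend may point to statistical uncertainty rather than to the correction itself.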
One torsion shape	BW	MS: If angles are so dominant, does it mean it’s not properly minimized?
    JW: Usually high angle/vdW would mean it’s a steric clash
    SMIRKS string is [#6X3:1]=[#7X2,#7X3+1:2]-[#6X4:3]-[#6X3,#6X4:4], C-NX3-C-C
    BS: Are you sure the dark blue dotted line is angle and not vdW 1-4?
    BW: Not 100% sure, but pretty sure
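Note on the negative r2 question from the mixture-property discussion: a minimal sketch, using made-up numbers, of why a coefficient of determination can be negative while a Pearson-based measure cannot. The helper functions below are hypothetical stand-ins written in plain Python; `r2` follows the definition 1 - SS_res/SS_tot given in the scikit-learn `r2_score` documentation linked above.

```python
import math


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def pearson_r(x, y):
    """Pearson correlation coefficient."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Made-up data: predictions track the reference perfectly but carry a constant offset.
reference = [1.0, 2.0, 3.0, 4.0]
predicted = [3.0, 4.0, 5.0, 6.0]

print(r2(reference, predicted))              # negative: worse than predicting the mean
print(pearson_r(reference, predicted) ** 2)  # close to 1: perfectly correlated
```

A systematic offset is enough to push the coefficient of determination below zero, so a negative value does not by itself mean the predictions are uncorrelated with experiment.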

Action items

Decisions