Participants

Goals

Discussion topics

Item

Presenter

Notes

Diffusion and mixture properties

Shirts group

  • MS: regressing to previous infrastructure has been problematic in running SFEs

  • Barbara (first presenter) Mixture property validation of water

    • Looking at Hmix and density for 7 water models; also wants to look at SFEs but is blocked by infra issues

    • Binary mixtures with water, using Sage 2.1

    • TIP3P “best” water model for alcohols, but still high error

    • TIP4P/TIP3P best for amines and both alc/amine

    • DM: Looks like the confidence intervals overlap for the different RMSEs, so it seems hard to tell which is “best” based on RMSE [MS/BM agree]

      • MS: The errors are bootstrapped over molecule

    • DM: May find that another analysis would be helpful, maybe paired t-test, to see statistics on by-molecule basis rather than aggregating over whole dataset [recording around 10:30--not confident I captured this correctly]

      • CC: would need a multiple-hypothesis correction when running paired t-tests over all pairs

    • Want to re-train LJ with TIP3P_FB and OPC3 or maybe 4-point model, and check OpenFF 1.0

    • JW: I would think OPC would be better than TIP4P, but you see OPC is the worst--is that expected?

      • MS: a bit of a surprise. TIP4P_FB and OPC have similar properties in pure water, but OPC is bad on mixtures

    • DM: why is r2 negative?

      • BM: happens when the correlation is really bad; using scikit-learn, and the docs just said it represents a very bad correlation [https://scikit-learn.org/stable/modules/generated/sklearn.metrics.r2_score.html]

      • DM: thought this would be Pearson R, which should always be positive. Do you know which r measure it is?

      • BM: I’ll look into it

      • MS: May be worth doing the fit with your own code

      • DM: looks like this is a coefficient of determination

    • MS: reason we want to try Parsley 1.3 is because it’s before we re-trained our LJ parameters with TIP3P, so other models might be better

    • BS: could also train charges

      • CC: BCC in AM1BCC could be trained

      • MS: maybe after we do this, would want to co-tune it with LJ

    • MS: also want to try SFEs with nonpolar molecules

    • MS: goal of this is: should we change our water model?

      • If TIP3P FB and TIP4P are already close to TIP3P, then re-optimizing LJ for these models might make them perform even better

      • MS: doesn’t look like 4 point models are better

      • BS: depends what you want--TIP3P doesn’t have some properties

      • MS: yes, TIP4P is closer to real water. TIP3P FB and OPC3 are closer to 4-point model performance than TIP3P

      • BS: TIP4P doesn’t give right density using Ewald methodologies

      • MS: some of that is corrected in TIP3P FB and OPC3

    • JW: I see switching the water model as something we want to do, but only once. Would make sense to align timeline-wise with release of protein FF. Additionally, could think about making our own water model, I think the offxml environment is so alien to people that they won’t care if it’s an existing water model or not

      • MS: do we want to have an intermediate where we recommend an existing water model?

      • DM: don’t think it matters, as long as it’s good. Don’t think people will refuse to use our water model

      • JW: Disagree, I think it’s much easier to release it as one release rather than going through intermediates

      • DM: sure, I mostly meant I don’t think it matters if we also support another water model. People will use whatever we say to use

    • CC: Some molecules will exist as charged species in water (e.g. primary amines); are you doing anything to account for that?

      • BM: no, should look at that

      • CC: I don’t think we did either for Sage, kind of a gap

      • MS: Need to look at pKa and see if it’s near pH 7

      • CC: I think for primary amines, they will be

      • MS: Not just pKa but how does it change as a result of composition

      • BS: have some notes about treating this; who should I send them to?

      • MS: either put in slack for everyone or email to me and I’ll forward it

      • MS: Is there a good pKa predictor for small molecules, or do you have to do QM?

      • BS: probably has been measured

      • DM: if not, pKa prediction is really hard

      • PB: qupKake could work https://pubs.acs.org/doi/10.1021/acs.jctc.4c00328
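
On the negative r2 question above: the linked scikit-learn r2_score is a coefficient of determination, 1 - SS_res/SS_tot, which goes negative whenever the predictions fit worse than simply predicting the mean of the reference data. It is not the square of Pearson r. A minimal pure-Python sketch of the difference follows; the function names and the data values are made up for illustration and are not from the presentation.

```python
def coefficient_of_determination(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot; can be negative for poor predictions."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def pearson_r(x, y):
    """Pearson correlation coefficient, bounded between -1 and 1."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical data: predictions track the reference perfectly
# but carry a constant offset of +3.
reference = [1.0, 2.0, 3.0, 4.0]
predicted = [4.0, 5.0, 6.0, 7.0]

r2 = coefficient_of_determination(reference, predicted)
r = pearson_r(reference, predicted)
print(f"coefficient of determination: {r2:.2f}")
print(f"Pearson r: {r:.2f}, squared: {r * r:.2f}")
```

In this made-up case the systematic offset drives the coefficient of determination below zero even though the two series are perfectly correlated, so a negative value can point to bias rather than to scatter.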

  • Julianne: Slow diffusion in lipid simulations

    • Overall slow diffusion in lipid simulations, much slower lateral diffusion than MacRog and Slipids

    • Think it’s due to alkane tail behavior

    • Lipid tails are 6-18 C

    • Neither Slipids nor MacRog uses HMR

    • Calculate D from simulations; Sage 2.1, HMR is slower than non-HMR but both are slower than expt

    • JW: to be clear, even with small tail length, still have head groups with ~10 heavy atoms?

      • MS/JH: no, we’re just looking at alkane tails

    • MS: HMR is reducing the diffusion constant; COM motion is not affected by HMR, but twisting/rotating dynamics are slowed down because the moments of inertia change

    • JH: Amber’s most recent lipid FF mentions they had to fine-tune the C-C-C angle for alkanes, which drastically affected lipid diffusion; after tuning the angle they re-trained torsions, which helped a lot

    • D underestimated worse as chain length grows; up to 20% of the diffusion constant for 15 C

      • MS: expect this if barrier is too high

    • Density is pretty accurate

    • TIP3P has results you’d expect for D and density, suggesting it’s not the problem

      • BS: you left out TIP3P D, it’s 3, you predict 6…

      • JH: yeah, it’s true. but we’re looking at alkanes for now, shouldn’t affect it too much

    • Diffusion does not always increase with box size as it would be expected to do, not sure if that’s OK

    • Next steps:

      • re-fit angles/torsions for CCC, then re-run and see if it increases D

      • maybe use QM data or expand dataset, existing torsions aren’t really trained to linear alkanes

        • BW: expecting ~1 week for new dataset

    • TG: If you’re going to do angles, I’d suggest splitting C-C-C vs C-C-H. Currently combined

      • JH: why would those be together…?

      • TG: doesn’t look super different/worth splitting, but I’ve found it’s important

      • BW: I think we tried splitting this and didn’t see much effect?
        • LM: Maybe didn’t affect RMSD/ddE but would affect other things?
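
For the "calculate D from simulations" and box-size points above, a minimal sketch of one common route: fit the slope of the mean-squared displacement (Einstein relation, MSD = 2·dims·D·t), then apply the Yeh-Hummer finite-size correction for a cubic periodic box, D_0 = D_PBC + ξ·k_B·T / (6π·η·L). This is not necessarily the workflow used in the presentation. It assumes a 3D bulk liquid (not lateral diffusion in a bilayer), and the MSD data, viscosity, and box lengths are hypothetical.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K
XI = 2.837297        # Yeh-Hummer constant for a cubic periodic box


def diffusion_from_msd(times, msd, dims=3):
    """Least-squares slope of MSD vs. time, divided by 2*dims."""
    n = len(times)
    mean_t = sum(times) / n
    mean_m = sum(msd) / n
    slope = (sum((t - mean_t) * (m - mean_m) for t, m in zip(times, msd))
             / sum((t - mean_t) ** 2 for t in times))
    return slope / (2 * dims)


def yeh_hummer_correction(temperature, viscosity, box_length):
    """Finite-size term xi*k_B*T / (6*pi*eta*L), all in SI units."""
    return XI * K_B * temperature / (6 * math.pi * viscosity * box_length)


# Hypothetical MSD (m^2) generated from D = 1e-9 m^2/s over 0.3 ns.
times = [0.0, 1e-10, 2e-10, 3e-10]
msd = [6 * 1e-9 * t for t in times]

d_pbc = diffusion_from_msd(times, msd)

# Hypothetical conditions: 298.15 K, viscosity 1e-3 Pa*s, two box sizes.
small_box = yeh_hummer_correction(298.15, 1e-3, 4e-9)
large_box = yeh_hummer_correction(298.15, 1e-3, 8e-9)

print(f"D_PBC = {d_pbc:.3e} m^2/s")
print(f"correction, 4 nm box: {small_box:.3e} m^2/s")
print(f"correction, 8 nm box: {large_box:.3e} m^2/s")
```

Because the correction scales as 1/L, the uncorrected D from a periodic box is expected to rise toward the infinite-box value as the box grows. Plotting D_PBC against 1/L for several box sizes is one way to check whether the simulations follow that trend.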

One torsion shape

BW

  • MS: if angles are so dominant, does it mean it’s not properly minimized?

  • JW: usually high angle/vdW would mean it’s a sterics clash

  • SMIRKS string is [#6X3:1]=[#7X2,#7X3+1:2]-[#6X4:3]-[#6X3,#6X4:4], C-NX3-C-C

  • BS: are you sure dark blue dotted line is angle and not vdW 1-4?

    • BW: not 100% sure, but pretty sure

Action items

  •  

Decisions
