...

Recording:

https://drive.google.com/file/d/1OlaR3c6-GWzFuPEN7ha9V1uKdtHlFZQ_/view

Discussion topics

Item: NAGL2 project plan
Presenter: AMI
Notes:

DM: (on testing error) error relative to what?
AMI: to the testing dataset at QM, at the level of theory we’re training to. Basically lowering whatever error NAGL1 has.
JR: you use SFEs in the testing, not training, right?
AMI: yes, we plan to train mostly to QM observables, possibly incorporating an actual charge model too. The first level is to reproduce the data we’re training on, and the second level is to evaluate the effect on simulations.
JR: not clear to me if we’re reproducing dipoles from QM if we take those same charges and use them in SFEs.
AMI: we plan to account for polarisation using a RESP2 style approach of interpolating between vacuum and implicit solvent.

DM (in chat): Very minor thing, but “charge assignment speed must be faster than AM1-BCC” is unimportant for very small to small (< 300 Da?) molecules; it’s the better scaling/better speed for larger molecules that becomes essential.
AMI: good point, could phrase in terms of scaling instead.

Virtual sites: Would be awesome if we essentially had two models, otherwise trained the same way, which do with/without virtual sites.
BS: just to be clear, current set goes to iodine?
AMI: yes.
JR: I think we really need vsites.
DM: There’s no question we really need them, but we can’t suddenly move to all vsite-based FFs since current free-energy infrastructure doesn’t support them, so we need to support 2 lines of FFs for a while.
JR: so basically the MD packages we have can’t handle vsites?
DM: RBFEs are complex with vsites – if you perturb an atom adjacent to a group with virtual sites, it’s pretty complex to nail down the effect. Vanilla MD is fine, free-energies are hard.

CC: could I ask for more details about RESP2 style incorporation of solvent?
AMI: we’re calculating molecules in vacuum, and one in PCM. We will follow the averaging scheme of RESP2 paper, where we train individual NNs to each and combine them in some way (either training the interpolation as well, or just straight-up averaging it).
BS: what was the resolution on the diffuse functions with PCM?
AMI: still waiting on dataset to compute, but help from Trevor helped convergence. If this last try doesn’t work I’ll drop the diffuse functions. Discussion: https://openforcefieldgroup.slack.com/archives/C8P54J2JY/p1728509641737989

JR: we only have side-chain analogues in the IPolQ paper, but would it be useful to look at the charges between the IPolQ model and the RESP2/NAGL2 versions? I’m concerned that water calculation is happening with implicit solvent, and the impact of H-bonding would not be present. Would be good to check against RESP2 at 50% or 70% delta, at same level of theory as IPolQ.
DM: I think this would be interesting. We would likely find cases where IPolQ would be better because it explicitly describes H-bonding. However, it’s not clear what the consequences would be. Scaling an IPolQ workflow would be a lot of work. Perhaps in the short-term we can say that AM1-BCC is bad, and improve our electrostatics, as an intermediate step before heading to IPolQ-like methods.
DM: My group is going to test out ABCG2.
(General): it’s an interesting check to determine the difference between explicit solvent and PCM.
JR: one other metric that might be helpful is the molecular dipole moment.

CC: it might be interesting to mix water PCM with other low-dielectric organic solvent PCM, which matches internal environment of proteins better. Some work from DCole shows that behaves better. Probably low priority but worth thinking of if we have extra QM.
AMI: do you mean actually mixing models from gas and low dielectric solvent, or … (see recording ~30 min)
CC: mean training one set of charges with low dielectric solvent and mixing with water.
JR: might stabilise overpolarised charges.
AMI: I think DCole’s work is doing the same approach with gas and water.
CC: I think they tried it with MBIS charges alone, not with a NN.

BS: I like that you’ll be generating a large training set of ESPs, that you’ll be using to fit vsites too.
CC: Could you consider training the NN to spit out a confidence assessment, e.g. with mean and variance, following on LW’s idea of a tool to assess similarity between training set and query molecules for NAGL1.
AMI: can look into that.
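
Illustrative sketch (not from the meeting): the RESP2-style mixing discussed above can be pictured as a linear interpolation between a vacuum charge set and an implicit-solvent charge set, with the molecular dipole moment JR suggested as one quantity to compare across values of delta. Everything in the snippet is an assumption made for illustration: the function names, the convention that delta is the weight on the solvent charges, and the toy three-site geometry and charges. The notes leave open whether the interpolation would be trained or a plain average, and the real models would be NAGL networks rather than fixed lists.

```python
from typing import List, Sequence


def mix_charges(q_gas: Sequence[float], q_solv: Sequence[float], delta: float) -> List[float]:
    """Linear mix: (1 - delta) * gas + delta * solvent (assumed convention)."""
    if not 0.0 <= delta <= 1.0:
        raise ValueError("delta must lie in [0, 1]")
    if len(q_gas) != len(q_solv):
        raise ValueError("charge sets must have the same length")
    return [(1.0 - delta) * g + delta * s for g, s in zip(q_gas, q_solv)]


def dipole_magnitude(charges: Sequence[float], coords: Sequence[Sequence[float]]) -> float:
    """|sum_i q_i * r_i| for point charges; units are e * (coordinate unit)."""
    mu = [sum(q * r[k] for q, r in zip(charges, coords)) for k in range(3)]
    return sum(c * c for c in mu) ** 0.5


# Hypothetical water-like toy system (coordinates in Angstrom); values are made up.
coords = [(0.0, 0.0, 0.0), (0.757, 0.586, 0.0), (-0.757, 0.586, 0.0)]
q_gas = [-0.70, 0.35, 0.35]   # stand-in for vacuum-trained charges
q_solv = [-0.84, 0.42, 0.42]  # stand-in for PCM-trained charges

# Compare the dipole at the delta values mentioned in the discussion (50% and 70%).
for delta in (0.5, 0.7):
    mixed = mix_charges(q_gas, q_solv, delta)
    print(delta, round(dipole_magnitude(mixed, coords), 4))
```

Because the mix is linear and both input sets sum to the same total charge, the mixed set keeps that total charge, and for this toy geometry the mixed dipole falls between the two endpoint dipoles.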

...

Version	Old Version 2	New Version Current
Changes made by	Lily Wang	Brent Westbrook
Saved on	Oct 30, 2024	Oct 30, 2024
