Recording:

https://drive.google.com/file/d/1OlaR3c6-GWzFuPEN7ha9V1uKdtHlFZQ_/view

Discussion topics

Item: NAGL2 project plan

Presenter: AMI

Notes:

  • DM: (on testing error) error relative to what?

    • AMI: relative to the QM testing dataset, at the level of theory we’re training to. Basically, lowering whatever error NAGL1 has.

  • JR: you use SFEs in the testing, not training, right?

    • AMI: yes, we plan to train mostly to QM observables, possibly incorporating an actual charge model too. The first level is to reproduce the data we’re training on, and the second level is to evaluate the effect on simulations.

    • JR: if we’re reproducing dipoles from QM, it’s not clear to me that we can take those same charges and use them in SFEs

    • AMI: we plan to account for polarisation using a RESP2-style approach of interpolating between vacuum and implicit solvent charges

  • DM (in chat):

    • Very minor thing, but “charge assignment speed must be faster than AM1-BCC” is unimportant for very small to small (< 300 Da?) molecules; it’s the better scaling/better speed for larger molecules that becomes essential.

      • AMI: good point, could phrase in terms of scaling instead

    • Virtual sites: Would be awesome if we essentially had two models, otherwise trained the same way, one with and one without virtual sites.

  • BS: just to be clear, current set goes to iodine?

    • AMI: yes

  • JR: I think we really need vsites

    • DM: There’s no question we really need them, but we can’t suddenly move to all vsite-based FFs since current free-energy infrastructure doesn’t support them, so we need to support 2 lines of FFs for a while.

    • JR: so basically the MD packages we have can’t handle vsites?

    • DM: RBFEs are complex with vsites – if you perturb an atom adjacent to a group with virtual sites, it’s pretty complex to nail down the effect. Vanilla MD is fine, free-energies are hard.

  • CC: could I ask for more details about RESP2 style incorporation of solvent?

    • AMI: we’re calculating each molecule once in vacuum and once in PCM. We will follow the averaging scheme of the RESP2 paper, where we train an individual NN to each and combine them in some way (either training the interpolation as well, or just straight-up averaging).

  • BS: what was the resolution on the diffuse functions with PCM?

  • JR: we only have side-chain analogues in the IPolQ paper, but would it be useful to compare the charges from the IPolQ model with the RESP2/NAGL2 versions? I’m concerned that the water calculation is happening with implicit solvent, so the impact of H-bonding would not be present. Would be good to check against RESP2 at 50% or 70% delta, at the same level of theory as IPolQ.

    • DM: I think this would be interesting. We would likely find cases where IPolQ would be better because it explicitly describes H-bonding. However, it’s not clear what the consequences would be. Scaling an IPolQ workflow would be a lot of work. Perhaps in the short-term we can say that AM1-BCC is bad, and improve our electrostatics, as an intermediate step before heading to IPolQ-like methods.

    • DM: My group is going to test out ABCG2.

    • (General): it’s an interesting check to determine the difference between explicit solvent and PCM.

  • JR: one other metric that might be helpful is the molecular dipole moment.

  • CC: it might be interesting to mix water PCM with PCM for a low-dielectric organic solvent, which better matches the internal environment of proteins. Some work from DCole shows that this behaves better. Probably low priority, but worth thinking about if we have extra QM.

    • AMI: do you mean actually mixing models from gas and low dielectric solvent, or … (see recording ~30 min)

    • CC: I mean training one set of charges with a low-dielectric solvent and mixing it with water.

    • JR: might stabilise overpolarised charges

    • AMI: I think DCole’s work is doing the same approach with gas and water.

    • CC: I think they tried it with MBIS charges alone, not with a NN

  • BS: I like that you’ll be generating a large training set of ESPs, which you’ll be using to fit vsites too.

  • CC: Could you consider training the NN to spit out a confidence assessment, e.g. with mean and variance, following on from LW’s idea of a tool to assess similarity between training set and query molecules for NAGL1?

    • AMI: can look into that
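
Illustration of the RESP2-style mixing and the dipole metric discussed above. This is a minimal sketch, not the NAGL2 implementation: it assumes the "straight-up averaging" option, read as a plain linear mix with a single weight delta (so "50% or 70% delta" means delta = 0.5 or 0.7), and it assumes the dipole metric is the point-charge dipole, sum of q_i * r_i. The function names, the three-site geometry, and all charge values are made up for illustration.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def mix_charges(q_gas: Sequence[float], q_solvent: Sequence[float],
                delta: float) -> List[float]:
    """Linear mix (assumed form): delta * q_solvent + (1 - delta) * q_gas."""
    if len(q_gas) != len(q_solvent):
        raise ValueError("charge sets must cover the same atoms")
    if not 0.0 <= delta <= 1.0:
        raise ValueError("delta must lie in [0, 1]")
    return [delta * qs + (1.0 - delta) * qg
            for qg, qs in zip(q_gas, q_solvent)]


def dipole_moment(charges: Sequence[float], coords: Sequence[Vec3]) -> Vec3:
    """Point-charge dipole, sum_i q_i * r_i, in e * Angstrom.

    Independent of the coordinate origin only for a neutral molecule.
    """
    if len(charges) != len(coords):
        raise ValueError("need one coordinate per charge")
    mu = [0.0, 0.0, 0.0]
    for q, (x, y, z) in zip(charges, coords):
        mu[0] += q * x
        mu[1] += q * y
        mu[2] += q * z
    return (mu[0], mu[1], mu[2])


# Hypothetical water-like molecule: O at the origin, two H atoms.
# These charges are placeholders, not output from any real model.
Q_GAS = [-0.70, 0.35, 0.35]   # stand-in for the vacuum-trained model
Q_PCM = [-0.80, 0.40, 0.40]   # stand-in for the PCM-trained model
COORDS = [(0.0, 0.0, 0.0), (0.757, 0.586, 0.0), (-0.757, 0.586, 0.0)]

if __name__ == "__main__":
    for delta in (0.0, 0.5, 0.7, 1.0):
        q = mix_charges(Q_GAS, Q_PCM, delta)
        mu = dipole_moment(q, COORDS)
        print(f"delta={delta:.1f} charges={q} dipole={mu}")
```

A linear mix preserves the total charge whenever both input sets carry the same total charge. If the interpolation is trained instead, delta would become a fitted parameter rather than a fixed input.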
