2023-08-03 Force Field Release Meeting notes

 Date

Aug 3, 2023

 Participants

  • @Lily Wang

  • @Michael Shirts

  • @Alexandra McIsaac

  • @Brent Westbrook

  • @Chapin Cavender

  • @Jeffrey Wagner

  • @Jeffry Setiadi

  • @David Mobley

  • @Christopher Bayly

  • Bill Swope

  • @Pavan Behara

  • @Matt Thompson

Recording: https://drive.google.com/file/d/1tbiXYk1VAEMmtAncnEaFEqECHrDbJSZv/view?usp=drive_link

 Discussion topics

Item

Notes

GNN update

 

  • CBy – AT does amazingly well on ESP RMSE. This is AT AM1BCC? That’s remarkably close.

    • LW – Yes, though do note how narrow the whole x range is. The difference may not be huge.

  • CBy – I’m pretty pleased by these results, not surprised you’re getting worse agreement with charges. When you train to all objectives, did you try training only to ESP and dipole moment?

    • LW – Not yet, but I do think it’s important to weight the training data equally.

    • CBy – My intuition is that I’d love to see what happens if you only train to ESP and dipole moment. The dipole moment and ESP are related, and the dipole has 3 dof whereas the direct point-charge comparison has n_atoms…

    • BS – You can think of the dipole moment as the first moment of the e- distribution. If the points at which you’re evaluating the ESP extend far from the mol, you have a lot of points that are sensitive only to the first moment, so ESP points that are far out will basically be dipole again.

    • CBy – Using Merz-Kollman shells, at something like 2x the vdW radius. So that could be dominated by the dipole.
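
The point about far-out ESP points can be checked numerically. The sketch below is a minimal illustration, assuming a made-up neutral three-site point-charge model (the charges and coordinates are hypothetical, not from the meeting data). It compares the exact Coulomb potential with the dipole-only term at increasing distance along one direction.

```python
import numpy as np

# Hypothetical neutral three-site model (charges in e, positions in angstrom).
charges = np.array([-0.8, 0.4, 0.4])
positions = np.array([
    [0.0, 0.0, 0.0],
    [0.76, 0.59, 0.0],
    [-0.76, 0.59, 0.0],
])


def esp_exact(point):
    """Coulomb potential of the point charges at `point` (units of e/angstrom)."""
    distances = np.linalg.norm(point - positions, axis=1)
    return float(np.sum(charges / distances))


def esp_dipole(point):
    """Dipole-only approximation, mu . r / |r|^3, with the origin at site 0."""
    dipole = charges @ positions
    r = np.linalg.norm(point)
    return float(dipole @ point / r**3)


def relative_error(scale, direction=np.array([0.0, 1.0, 0.0])):
    """Relative deviation of the dipole term from the exact ESP at scale * direction."""
    point = scale * direction
    exact = esp_exact(point)
    return abs(exact - esp_dipole(point)) / abs(exact)


# The dipole term accounts for more of the ESP as the grid point moves outward.
for scale in (2.0, 4.0, 8.0, 16.0):
    print(f"distance {scale:5.1f} A: relative error {relative_error(scale):.4f}")
```

For this toy model the relative error shrinks as the distance grows, which is the sense in which distant grid points mostly repeat the dipole information.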

  • CBy – Also, what do you do with formal charges? The embedding technique works well with neutral molecules, but once there’s a formal charge, that affects everything. So the metrics will be sensitive to how the formal charge is distributed over the charge centers. That may need some special attention.

    • LW – Right now the total formal charge is a constraint in the fitting scheme. I understand that one of the flaws of this scheme is that the charge can get smeared all over the mol. I have noticed that the model works far better on neutral mols, and that also comes down to the training data (most of the set is neutral, few are doubly charged).

    • CBy – The first moment of the ESP isn’t actually the dipole, it’s the center of charge. So getting that right will be essential to getting the ESP right. This reminds me of Riniker approach.

    • BS – Re: dipole moment - If you have a charged system, the dipole depends on the origin of the system. So it’s only when the system is neutral that the dipole is origin-independent…. need to consider center of charge…

    • LW – Agree, if we move this to higher level QM these are important considerations.

    • BS – Worth mentioning that OPC water was largely based on ESP fits.
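
The origin dependence mentioned above follows from mu(origin) = sum_i q_i (r_i - origin): moving the origin by s changes the dipole by -Q s, where Q is the net charge. A minimal sketch, using made-up charges and coordinates (all values here are hypothetical):

```python
import numpy as np


def dipole(charges, positions, origin):
    """Point-charge dipole moment evaluated about `origin`."""
    return charges @ (positions - origin)


def center_of_charge(charges, positions):
    """Charge-weighted mean position; only defined for a nonzero net charge."""
    return charges @ positions / charges.sum()


# Hypothetical three-site geometry (angstrom) with a neutral and a +1 charge set.
positions = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
neutral = np.array([-0.5, 0.25, 0.25])
cation = np.array([0.0, 0.5, 0.5])

origin_a = np.zeros(3)
origin_b = np.array([1.0, 2.0, 3.0])

# Neutral system: the two origins give the same dipole.
print(dipole(neutral, positions, origin_a), dipole(neutral, positions, origin_b))

# Charged system: the dipole changes by -Q * (origin_b - origin_a).
print(dipole(cation, positions, origin_a), dipole(cation, positions, origin_b))

# About the center of charge, the dipole of the charged system vanishes.
print(dipole(cation, positions, center_of_charge(cation, positions)))
```

This is why a dipole objective for charged molecules needs an agreed origin convention before values from different sources can be compared.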

  • CBy – The results you’re seeing with HFEs are similar to what I saw before I did two-stage RESP fits, and was considering each ESP point to be equal. So when you compute RMSE, are you having errors on the most polar parts of the surface where you need the highest fidelity? In my two-stage fits, my best-performing model on HFEs would NOT weight all the points equally. So it’d be worth training the charge model to do well on high-magnitude points. But this didn’t work as well as the two-stage fit did. In the two-stage fits, it took the highest and lowest magnitude points…

    • LW – It sounds like the problem being solved by that is a low RMSE ESP but a high RMSE HFE. Whereas here I picked things that score BADLY on ESP RMSE and checked their score in other metrics…

    • MS – These are molecules that are performing in SPICE for the GNN?

    • LW – Yes. Lots of Ps and Ss in the outliers. These are underrepresented in the training set.
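
To make the equal-weighting question concrete, here is a minimal sketch of an ESP RMSE with optional per-point weights. Weighting by the magnitude of the reference ESP is an assumption chosen for illustration; it is not the two-stage RESP scheme described above, and the function names and numbers are hypothetical.

```python
import numpy as np


def esp_rmse(reference, predicted, weights=None):
    """Root-mean-square ESP error; uniform weights when `weights` is None."""
    reference = np.asarray(reference, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    if weights is None:
        weights = np.ones_like(reference)
    weights = np.asarray(weights, dtype=float)
    squared_error = (predicted - reference) ** 2
    return float(np.sqrt(np.sum(weights * squared_error) / np.sum(weights)))


def magnitude_weights(reference, power=1.0):
    """Illustrative weights that grow with |reference ESP| (an assumption)."""
    return np.abs(np.asarray(reference, dtype=float)) ** power


# Made-up grid values: two weakly polar points, two strongly polar points.
reference = [0.01, -0.02, 0.30, -0.25]
errors_on_weak_points = [0.03, -0.04, 0.30, -0.25]
errors_on_polar_points = [0.01, -0.02, 0.32, -0.27]

weights = magnitude_weights(reference)
for label, predicted in [("weak", errors_on_weak_points), ("polar", errors_on_polar_points)]:
    print(label, esp_rmse(reference, predicted), esp_rmse(reference, predicted, weights))
```

Both made-up models have the same unweighted RMSE, but the weighted metric penalizes the one whose errors sit on the high-magnitude points, which is the distinction an equally weighted RMSE cannot show.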

  • DM – Thoughts on release timeline/criteria?

    • LW – One of the final steps we should do is retraining the vdW terms for use with NAGL.

    • DM – It’s not obvious to me that we should do that. If we’re using it as a replacement for AT, then the vdW terms should be transferable.

    • LW – That had been my thought before, but in terms of promising comparability to AT, it may be hard to get all chemistries within that window. And I think for our internal purposes, it would be worth retraining vdW to see whether it makes a big difference.

    • MS – If things are roughly the same, then I don’t anticipate we’ll see a big difference. There may be some outliers but I doubt they’ll change substantially.

    • DM – If we do retrain, then would we release a FF only intended for use with NAGL?

    • LW – If they’re substantially different, then we should release it as a different charge model.

    • MS – Since we don’t reoptimize the vdW for AT vs OE, we probably shouldn’t reoptimize here.

    • DM – It would be helpful to know whether the vdWs change substantially before we make this call, especially if it results in substantially better property prediction.

    • MS – Then we’d have to …

    • LW – Would it be hard on an infra level to have NAGL provide AM1BCC charges?

    • JW – No, should be straightforward to just line it up alongside OE and AT as an alternative charge provider. And then we wouldn’t need to specify how to define a NN in the SMIRNOFF spec.

    • JW – Also, on a product-management level, FFs using a nagl-AM1BCC backend wouldn’t live for long, since we’d move on to higher-quality charges quickly.

      • CBy – Agree, I’m excited to move on to training this on higher level QM.

      • MS – Is there enough QM ESP data to train the GNN?

      • LW – Two answers - I think this QM dataset is quite small for training a NN. Another thing is, if we’re training to QM, there’s no particular reason it has to be HF/6-31G*. I was talking to SBoothroyd about using the QMugs dataset, which is quite promising. That has wavefunctions stored so we can generate ESPs.

      • MS – Can we do this using transfer learning?

      • LW – Yes, that would be possible. The danger would be that we might get embedded in a minimum. I’ve been looking more into the vsite refit.

  • LW – Next steps to me are looking at more datasets - Get more chemistries that we don’t cover very well. Also thinking about putting in warning systems to tell the user if we don’t think the model would work. When Rosemary is in a good state to be passed on to me, I’d like to do PL binding benchmarks. So most of these are development and not so much performance.

    • MS – It would be good if it’s possible to release this as a plug-in replacement that is fast, even if it isn’t guaranteed to be an exact replacement for AM1BCC.

    • JW – I see three stages of releasing this

      • (where we are now) Have NAGL available behind a private API entry point

      • Put NAGL in the public API, but not use it in the default system generation

      • Have create_interchange reach for NAGL by default

    • (JW + LW) – Decision: We’ll move to “option 2” above - put the 0.3.0 release of NAGL in the public API, but not have create_interchange reach for it by default.

    • MT – Stage 3 will make a headache in Interchange - There’s already try/except logic in Interchange to handle the ToolkitAM1BCC tag (use ELF10 if OE is present, use AT single-conf if not). So putting that in and having some sort of “bad nagl chemistry guards” will be tough.

      • MT – Here’s the current code. IMO it scales poorly with the number of options if we want it to automagically try a bunch of options with opinionated, human-crafted behavior. Checking which version of NAGL is available, whether it’s loaded, whether it’s applicable to this molecule, whether it failed to assign charges, etc. will result in a lot of deeply nested logic.
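
One way to keep this kind of fallback logic flat is an ordered list of providers, each reporting its own availability and applicability. The sketch below is plain Python and does not use the Interchange or Toolkit APIs. Every name in it (ChargeProvider, assign_charges, the stub providers, and the toy "contains P or S" guard) is hypothetical and only illustrates the pattern.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


class ChargeAssignmentError(Exception):
    """Raised when no provider in the list could assign charges (hypothetical)."""


@dataclass
class ChargeProvider:
    name: str
    is_available: Callable[[], bool]         # e.g. is the backend installed?
    supports: Callable[[str], bool]          # e.g. chemistry guard for this molecule
    assign: Callable[[str], Sequence[float]]


def assign_charges(smiles: str, providers: List[ChargeProvider]) -> Tuple[str, List[float]]:
    """Try providers in order; return the first success and who produced it."""
    failures = []
    for provider in providers:
        if not provider.is_available():
            failures.append(f"{provider.name}: not available")
            continue
        if not provider.supports(smiles):
            failures.append(f"{provider.name}: molecule not supported")
            continue
        try:
            return provider.name, list(provider.assign(smiles))
        except Exception as error:  # record the failure and fall through
            failures.append(f"{provider.name}: {error}")
    raise ChargeAssignmentError("; ".join(failures) or "no providers configured")


def _dummy_charges(smiles: str) -> List[float]:
    # Placeholder value only; a real provider would return per-atom charges.
    return [0.0]


# Stub providers standing in for real backends (toy guards, not real checks).
providers = [
    ChargeProvider("nagl-stub", lambda: True,
                   lambda smi: not any(symbol in smi for symbol in ("P", "S")),
                   _dummy_charges),
    ChargeProvider("openeye-stub", lambda: False, lambda smi: True, _dummy_charges),
    ChargeProvider("ambertools-stub", lambda: True, lambda smi: True, _dummy_charges),
]

print(assign_charges("CCO", providers)[0])
print(assign_charges("CP(=O)(O)O", providers)[0])
```

With this layout, adding another backend or guard means adding one list entry rather than another level of try/except nesting. The opinionated ordering still has to be decided by a person.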

 Action items

 Decisions