2024-04-17 FF Fitting Meeting

Participants

  • @Christopher Bayly

  • @Pavan Behara

  • @Chapin Cavender

  • @Alexandra McIsaac

  • @David Mobley

  • Bill Swope

  • @Matt Thompson

  • @Jeffrey Wagner

  • @Lily Wang

  • @Willa Wang

  • @Brent Westbrook

Goals

  •  

Discussion topics

Item

Presenter

Notes

Sage 2.2 benchmarks

LM

Recording: Video Conferencing, Web Conferencing, Webinars, Screen Sharing
Passcode: y?=4^5?@

 

Slide “Algorithm differences”

  • PB - What is “old code”? We used to do some of the things you list only under “New code”.

    • LM - “Old code” is the initial code in the yammbs repo. We introduced some bugs in this repo, which are now fixed.

  • CB - Geometry optimization will only move to a new conformer if the energy surface is flat.

    • LM (next slide) - Here’s an example that shows this is a problem with the convergence criteria.

    • CB - This is a problem with your optimizer, not a problem with the convergence criterion. Your optimizer is not stopping at the local minimum.

    • LM - BFGS isn’t guaranteed to converge to the nearest local minimum, so it’s not expected to stop there.

    • BS - The plot on the left shows that there’s not a barrier, it’s just a long shoulder to a local minimum that’s far away.

    • CB - I agree that your previous criteria were too tight, but I’m worried that the default criterion is too loose.

    • LM - The default is the standard used in many software packages, e.g. Gaussian.

    • (All) - We’re comfortable with the new criteria.
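For reference, the tight-versus-loose question above can be made concrete with a minimal sketch of a Gaussian-style convergence test. This assumes that “the default” refers to thresholds on the maximum and RMS force and displacement, as in Gaussian's default optimization settings (values in atomic units). The function names, the 10x tighter set, and the sample step are hypothetical illustrations, not values taken from the benchmarking code.

```python
import math

# Gaussian's default optimization thresholds, in atomic units
# (forces in Hartree/Bohr, displacements in Bohr).
GAUSSIAN_DEFAULT = {
    "max_force": 4.5e-4,
    "rms_force": 3.0e-4,
    "max_disp": 1.8e-3,
    "rms_disp": 1.2e-3,
}

# Hypothetical tighter set (10x smaller), for comparison only; the notes
# do not record the values that were used previously.
HYPOTHETICAL_TIGHT = {key: value / 10 for key, value in GAUSSIAN_DEFAULT.items()}


def rms(values):
    """Root mean square of a flat list of numbers."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def is_converged(forces, displacements, thresholds=GAUSSIAN_DEFAULT):
    """Return True if all four force/displacement tests pass.

    `forces` and `displacements` are flat lists of components from the
    latest optimization step.
    """
    return (
        max(abs(f) for f in forces) < thresholds["max_force"]
        and rms(forces) < thresholds["rms_force"]
        and max(abs(d) for d in displacements) < thresholds["max_disp"]
        and rms(displacements) < thresholds["rms_disp"]
    )


# Made-up step on a flat "shoulder": small forces, small displacements.
forces = [2.0e-4, -1.0e-4, 5.0e-5]
displacements = [1.0e-3, -5.0e-4, 2.0e-4]
default_ok = is_converged(forces, displacements)
tight_ok = is_converged(forces, displacements, HYPOTHETICAL_TIGHT)
```

With the made-up step above, the default thresholds report convergence while the tighter set does not, so an optimizer using the tighter set would keep stepping along the shoulder.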

Slide “Whole dataset, problem molecules removed” with Log10 RMSD plots

  • CB - On the right plot, it looks like the Sage 2.0 peak is shifted to the left compared to Sage 2.1 and 2.2. Which is better?

    • LM - RMSD looks at whole-molecule geometry and doesn’t say much about the internal ring angles fixed in Sage 2.2.

    • JW & LM - This is on a log10 scale, so we shouldn’t look at details of the curves below Log10 RMSD = -1. CDF plots should be more informative.

    • LM - Main takeaway here is that the new approach to benchmarking (right plot) improves over the old approach (left plot).

    • LW - A quantitative metric for whether outliers are reduced is the standard deviation (last column), which is lower for Sage 2.2 relative to Sage 2.1.

    • BS & CB - RMSD might not be a good metric because of changes in conformers or lever arm effects.
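The lever arm concern can be illustrated with a toy model: the same 10 degree bend at the middle of a chain gives a larger aligned RMSD for a longer chain, even though the local angle error is identical. This is a planar sketch with unit atom spacing and a closed-form 2D superposition. All names here are hypothetical, and this is not the RMSD code used in the benchmarks.

```python
import math


def centered(points):
    """Translate 2D points so their centroid is at the origin."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    return [(x - cx, y - cy) for x, y in points]


def aligned_rmsd(ref, mob):
    """RMSD between two 2D point sets after optimal translation and rotation."""
    ref_c, mob_c = centered(ref), centered(mob)
    # Closed-form optimal rotation angle in 2D (planar analogue of Kabsch).
    dot = sum(rx * mx + ry * my for (rx, ry), (mx, my) in zip(ref_c, mob_c))
    cross = sum(ry * mx - rx * my for (rx, ry), (mx, my) in zip(ref_c, mob_c))
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    sq = 0.0
    for (rx, ry), (mx, my) in zip(ref_c, mob_c):
        x, y = c * mx - s * my, s * mx + c * my  # rotate the mobile point
        sq += (rx - x) ** 2 + (ry - y) ** 2
    return math.sqrt(sq / len(ref))


def bent_chain(n_atoms, bend_degrees):
    """Straight chain with unit spacing, bent at its middle atom."""
    hinge = n_atoms // 2
    phi = math.radians(bend_degrees)
    points = []
    for i in range(n_atoms):
        if i <= hinge:
            points.append((float(i), 0.0))
        else:
            arm = i - hinge  # distance along the bent arm
            points.append((hinge + arm * math.cos(phi), arm * math.sin(phi)))
    return points


# Same 10 degree bend, different arm lengths.
short_rmsd = aligned_rmsd(bent_chain(5, 0.0), bent_chain(5, 10.0))
long_rmsd = aligned_rmsd(bent_chain(11, 0.0), bent_chain(11, 10.0))
```

In this toy model `long_rmsd` exceeds `short_rmsd`, which is one way a whole-molecule RMSD can penalize large molecules for a small local error.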

Slide Log10 TFD

  • BS - What are the units for TFD?

    • LM - TFD is unitless, with values between 0 and 1.
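A simplified sketch shows why TFD is unitless and bounded: each torsion difference is wrapped into the range 0 to 180 degrees and divided by 180 before averaging. The published torsion fingerprint deviation also weights torsions by their distance from the molecule's center and treats symmetric torsions specially; both are omitted here, and the function names are hypothetical.

```python
def torsion_difference(a_deg, b_deg):
    """Smallest absolute difference between two angles, in degrees (0 to 180)."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)


def simple_tfd(torsions_a, torsions_b):
    """Unweighted mean torsion deviation, normalized by 180 degrees."""
    if len(torsions_a) != len(torsions_b) or not torsions_a:
        raise ValueError("need two non-empty torsion lists of equal length")
    total = sum(torsion_difference(a, b) for a, b in zip(torsions_a, torsions_b))
    return total / (180.0 * len(torsions_a))
```

Identical torsion lists give 0, and lists in which every torsion differs by 180 degrees give 1.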

Slide “3-membered rings” with Bonds and Angles

  • CB - Looks like a degradation of performance from Sage 2.0 to 2.2.

    • LM - Sage 2.0 has a weird equilibrium angle for epoxy rings.

    • DM - Sage 2.2 has a clear improvement in whole-molecule geometry, but this specific chemistry might be marginally worse.

    • LM - Sage 2.0 has problem chemistries that were fixed in 2.1 and 2.2. My focus in these slides is on the fixed problems in our standardized benchmarking code, not on making the case for why Sage 2.2 is better than 2.0.

Slide “Sulfamides” with Log10 RMSD and Log10 TFD

  • BS - These plots are bimodal. It would be interesting to know what separates molecules in each peak.

  • LM - Haven’t looked at this in detail, but my guess is that the change from Sage 2.0 involved tightening the prior so that the parameter stays close to the modified Seminario method (MSM) value. This might cause good behavior for some conformers but not others.

Action items

Decisions