2021-01-15 WBO/Impropers meeting notes

Participants

@Jessica Maat (Deactivated)

@Trevor Gokey

@Simon Boothroyd

@David Mobley

@Pavan Behara

Discussion

(missed taking notes on some discussion)

PB: Benchmarking the experimental fits doesn’t show a significant improvement over 1.3.0. I am currently looking at the ddE per parameter and at the molecules with larger ddE.

DLM: Yeah, a systematic way of analyzing it would be to look at the distribution of functional groups in both the small-error and large-error regions, and also at which torsion parameters are overrepresented in molecules with large error.
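
A minimal sketch of the overrepresentation check described above, assuming each molecule comes with a ddE value and the list of torsion parameter ids applied to it. The record layout, the threshold, the function name `torsion_enrichment`, and the data are all hypothetical and are not taken from the actual benchmarking scripts.

```python
from collections import Counter


def torsion_enrichment(records, dde_threshold):
    """Return {param_id: enrichment}.

    Enrichment is the fraction of large-error molecules that use a parameter
    divided by the fraction of all molecules that use it. Values well above 1
    hint that the parameter is overrepresented among large-error molecules.
    """
    all_counts = Counter()
    large_counts = Counter()
    n_all = 0
    n_large = 0
    for dde, params in records:
        unique = set(params)  # count each parameter once per molecule
        n_all += 1
        all_counts.update(unique)
        if abs(dde) > dde_threshold:
            n_large += 1
            large_counts.update(unique)
    if n_large == 0:
        return {}
    return {
        p: (large_counts[p] / n_large) / (all_counts[p] / n_all)
        for p in all_counts
    }


# Made-up illustrative data: (ddE in kcal/mol, torsion parameter ids)
records = [
    (0.3, ["t43", "t17"]),
    (2.5, ["t44", "t17"]),
    (3.1, ["t44"]),
    (0.1, ["t17"]),
]
enrichment = torsion_enrichment(records, dde_threshold=2.0)
```

The same counting could be applied to functional-group labels instead of parameter ids to compare the small-error and large-error regions.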

DLM&SB: To simplify the process, we can do a couple of fits that show proof of concept, like we did for Parsley. This would give us a minimum viable product.

JM: I finished the kval plots and the interactive plots for looking at the molecules. Next I want to see how the trends look when we have more data. For this I will go beyond the 1.2 training set and the substituted phenyl set and apply the same analysis to other datasets. Another suggestion from group meeting is to generate a new set of carefully designed molecules along the lines of Chaya’s substituted phenyl set.

DLM&SB: Yeah, it looks like we need these deliberate designs to reduce noise in the training.

TG: I have a general question about the red points on JM’s plots: how many times is the torsion parameter applied?

JM&DLM: For these points only one torsion is driven, and the generic SMARTS term is applied four times. JM has filtered the data so that, out of 1000+ points, only the few that match this SMARTS pattern are selected.

TG: Okay, I see the secondary y-axis has the QM-derived torsion barriers. Would bespoke fitting help in looking at the contribution of each k-value (red dot) to the total torsion barrier (the corresponding blue triangle)?

DLM: Yeah, it may help. Maybe we can check with JH about that, run one or two molecules from these datasets, and see what new information we get.

SB: Another thing we talked about in the last meeting was averaging the WBO, either with ELF10 or by using a diverse set of conformers. The current approach of using the openff-tk provided values calculates the WBO for only a single conformer. I think the error compared with the averaged WBO would be around 0.1, so it may not affect the trends much, but it would still be nice to have. I have a code snippet that I can pass on to JM to get this conformer-averaged WBO.
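
A minimal sketch of the averaging step only, assuming the per-conformer WBO values have already been computed by whatever toolkit is in use. This is not SB's snippet, and it takes a plain arithmetic mean over the supplied conformers rather than performing any ELF10 selection. The function name `average_wbo`, the data layout, and the numbers are hypothetical.

```python
def average_wbo(per_conformer_wbos):
    """Average Wiberg bond orders over conformers.

    per_conformer_wbos: list with one dict per conformer, each mapping a bond
    (atom_i, atom_j) to its WBO. All dicts are assumed to share the same keys.
    """
    if not per_conformer_wbos:
        raise ValueError("need at least one conformer")
    n_conformers = len(per_conformer_wbos)
    bonds = per_conformer_wbos[0].keys()
    return {
        bond: sum(conf[bond] for conf in per_conformer_wbos) / n_conformers
        for bond in bonds
    }


# Hypothetical WBO values for one central bond across three conformers
wbos = [{(1, 2): 1.05}, {(1, 2): 1.10}, {(1, 2): 1.15}]
averaged = average_wbo(wbos)
```

Comparing `averaged` against the single-conformer value for the same bond would give a direct estimate of the spread mentioned above.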

 

Action Items

@Pavan Behara will try out a couple of simple fits with very few interpolated parameters that are expected to do well, e.g. replacing t43, t44, and t45 with a single interpolated parameter, and other such instances.
@Pavan Behara will analyze the larger-ddE molecules and check for overrepresentation of torsion parameters that contribute to large errors.
@Jessica Maat (Deactivated) will propose chemical series to generate more deliberate datasets such as Chaya’s substituted phenyl.
@Jessica Maat (Deactivated) will expand current analysis to datasets other than 1.2.0 training and substituted phenyl to get a better trend.
@Jessica Maat (Deactivated) will touch base with Josh Horton to apply bespoke fitting to a few molecules of interest from our current datasets to fit force constants.
@Jessica Maat (Deactivated) will reach out to Chris Bayly & Chaya Stern for their thoughts on other chemical series that may be relevant to WBO interpolation.
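
For reference on the interpolated-parameter action item: an interpolated torsion parameter replaces several fixed force constants with one that varies linearly with the WBO of the central bond. A minimal sketch, assuming two reference force constants defined at bond orders 1 and 2. The function name and the numeric values are placeholders, not fitted parameters.

```python
def interpolate_k(wbo, k_bo1, k_bo2):
    """Linearly interpolate (or extrapolate) a torsion force constant.

    k_bo1 and k_bo2 are the force constants at bond orders 1 and 2.
    """
    return k_bo1 + (k_bo2 - k_bo1) * (wbo - 1.0) / (2.0 - 1.0)


# Placeholder values: k = 1.0 at bond order 1 and k = 5.0 at bond order 2
k = interpolate_k(1.25, k_bo1=1.0, k_bo2=5.0)
```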