2021-01-22 WBO/Impropers meeting notes

Participants

@Jessica Maat

@Simon Boothroyd

@Trevor Gokey

@David Mobley

@Pavan Behara

Discussion

PB: I did another fit starting from fit4, removing TIG1a and TIG1b (which filter the in-ring torsions) and making TIG0 an interpolated parameter instead of a general torsion parameter. The objective function values are here. Each fit (fit4 and fit7) performs better on its own training set, so the objective function value is not a good basis for comparison. I am now running benchmarks on a small set for these fits.

For the full refit of all parameters I have two experimental fits that sit at two extremes: one with only a single interpolated parameter (simple_fit1) and the other with a lot of interpolated parameters. Which one should I choose?

DLM: Okay. By refit I mean refitting only the valence parameters, not the LJ parameters. Depending on how fast you can get through it, we can do more iterations. And yes, it is better to check the performance on a benchmark set.

SB: I would add that you can check the performance on a specific series, like amides, to see whether these interpolated parameters solve the cis/trans issue or others.

DLM: Yeah, Chaya’s series would be a good start; you can remove the datapoints used in fitting and check against the rest. Amides are also a good one.

SB: There was some issue with the non-interpolated parameters before. Did you sort it out?

PB: Yeah, I had a very high objective function value with the non-interpolated fit, and that was an error due to a wrong phase in one of the parameters. I corrected it, and now the objective function value for fit4.1 (non-interpolated) is in between fit4 (interpolated) and 1.3.0, as expected.

PB: So, if I go for a complete refit of the valence parameters, how do I keep the parameters close to 1.3.0?

DLM/SB/TG: You can specify priors for each of the parameters. Priors do affect the final physical parameter values, and that might be a reason for the slightly degraded performance with respect to openff-1.3.0. Before going to a full refit of all parameters, check the fits while adjusting the priors. The value of 2 you are using right now might be too low for describing the in-ring torsions, which could be why TIG1a and TIG1b, which have a high barrier, are not reaching higher values, resulting in high error. Check in with HJ/LPW on how they choose priors. You can also turn off the priors, run another fit, and let the parameters fly.
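As a rough illustration of why the prior width matters, the sketch below uses a generic harmonic (L2) penalty that grows with the squared displacement of each parameter from its starting value, scaled by the prior width. This is an assumption made for illustration and not necessarily how the fitting software used here implements priors; the function name, the weight, and the numbers are all hypothetical.

```python
def harmonic_penalty(params, reference, prior_widths, weight=1.0):
    """Generic L2 regularization: weight * sum(((p - p0) / width) ** 2).

    Hypothetical helper for illustration only, not a library API.
    """
    total = 0.0
    for p, p0, width in zip(params, reference, prior_widths):
        total += ((p - p0) / width) ** 2
    return weight * total


# Hypothetical torsion force constant (arbitrary units) that would need to
# move from 1.0 to 9.0 to describe a high barrier.
reference = [1.0]
fitted = [9.0]

tight = harmonic_penalty(fitted, reference, [2.0])   # narrow prior width
loose = harmonic_penalty(fitted, reference, [10.0])  # wide prior width
```

With the narrow width the same displacement is penalized much more heavily, which is one way a small prior could keep a high-barrier parameter from reaching the value it needs.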

SB: What are the goal posts we are aiming at for this wbo work in general?

DLM: We want an interpolated FF that has either

same number of parameters but performs well
fewer parameters and performs comparably well or better in maybe two of the metrics (TFD, ddE, RMSD)
another case might be a FF with interpolated parameters that describes new chemistries
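For context, an interpolated torsion parameter can be pictured as a force constant that varies linearly with the Wiberg bond order (WBO) of the central bond, instead of one fixed value per torsion type. The sketch below assumes linear interpolation between force constants defined at bond orders 1 and 2; the function name and the numbers are hypothetical and not taken from any of the fits discussed.

```python
def interpolate_k(wbo, k_bo1, k_bo2, bo1=1.0, bo2=2.0):
    """Linearly interpolate a torsion force constant as a function of WBO.

    k_bo1 and k_bo2 are the force constants at bond orders bo1 and bo2.
    Values outside [bo1, bo2] are extrapolated along the same line.
    """
    slope = (k_bo2 - k_bo1) / (bo2 - bo1)
    return k_bo1 + slope * (wbo - bo1)


# Hypothetical values: k = 1.0 at bond order 1, k = 5.0 at bond order 2.
k_mid = interpolate_k(1.5, k_bo1=1.0, k_bo2=5.0)
```

A single interpolated parameter of this kind can stand in for several fixed-value parameters, which is what makes the "fewer parameters" goal plausible.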

PB: Here are some results on molecules with larger ddE. TIG1a, TIG1b and TIG3 are prominent in parameter usage.

DLM: By over-representation I mean comparing the overall distribution of a parameter across the whole dataset with its distribution in this subset of higher-ddE molecules, not just the counts in this subset.
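One way to sketch that comparison is to compute, for each parameter, the fraction of molecules that use it in the high-ddE subset and divide by the same fraction in the whole dataset. The helper names and the toy parameter assignments below are hypothetical; a ratio above 1 would suggest over-representation in the subset.

```python
from collections import Counter


def usage_fraction(molecule_params):
    """Fraction of molecules that use each parameter ID at least once."""
    counts = Counter()
    for params in molecule_params:
        counts.update(set(params))  # count each parameter once per molecule
    n = len(molecule_params)
    return {pid: c / n for pid, c in counts.items()}


def enrichment(subset, full):
    """Ratio of subset usage fraction to whole-dataset usage fraction."""
    sub = usage_fraction(subset)
    whole = usage_fraction(full)
    return {pid: sub[pid] / whole[pid] for pid in sub if pid in whole}


# Toy data: one list of parameter IDs per molecule (hypothetical).
full_set = [["TIG1a", "TIG3"], ["TIG3"], ["TIG0"], ["TIG3", "TIG0"]]
high_dde = [["TIG1a", "TIG3"], ["TIG3"]]

ratios = enrichment(high_dde, full_set)
```

In this toy example TIG1a appears in half of the subset but only a quarter of the whole set, so it is over-represented, while a raw count alone would rank TIG3 first.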

PB: Okay, I will update it.

JM: I am working on carefully designed datasets that will help with defining interpolated torsion parameters. I added a few more functional groups after suggestions from the ff-release call.

DLM: Okay, sounds good. Do you know the current status of the dataset submission pipeline?

TG: It is pretty much empty, so this is a good time to submit those sets. There are some example notebooks on qca-dataset submission; I will share more details on Slack.

JM: Is there a way to select a particular torsion to drive? I need only some specific torsions for each of the subsets/functional groups I am interested in. How does one go about it?

TG: There are ways to do that, but I agree it is the most challenging part. Once that is taken care of, the rest of the submission pipeline is smooth and automatic with qcsubmit.

JM: Another thing I want to discuss is the xtb calculations. The energy scales are different when I compare them to the AM1BCC calcs.

DLM: Yeah, a single energy doesn’t tell us anything. Generate a few conformers, set any one of them as the reference, and take the difference in energy with respect to that conformer. It may or may not be the minimum-energy conformer; you can pick any one of them.
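A minimal sketch of that suggestion: subtract the energy of one chosen reference conformer from every conformer's energy, separately for each method, and compare the resulting relative energies. The energy values below are made up, and the sketch assumes both methods report energies in the same units and for the same conformers in the same order.

```python
def relative_energies(energies, reference_index=0):
    """Energies relative to a chosen reference conformer."""
    ref = energies[reference_index]
    return [e - ref for e in energies]


# Hypothetical absolute energies for the same three conformers from two
# methods with very different absolute energy scales.
method_a = [-10.0, -9.5, -9.0]
method_b = [-2.0, -1.5, -1.0]

rel_a = relative_energies(method_a)
rel_b = relative_energies(method_b)
```

The absolute values differ widely between the two lists, but the relative energies can be compared directly as long as the same conformer is used as the reference in both.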

JM: Okay, sounds good. I will pass this on to the undergrad.

Action items

@Pavan Behara will refit fit7 and simple_fit1 with different priors and get input from HJ/LPW

@Pavan Behara will check the performance of fit7 vs fit4 on a small benchmark set, then extend it to a larger set

@Pavan Behara will update the larger-ddE analysis

@Jessica Maat will work on preparing the new datasets for submission