2021-03-19 WBO/Impropers meeting notes

Participants

@Christopher Bayly

@David Mobley

@Simon Boothroyd

@Trevor Gokey

@Jessica Maat (Deactivated)

@Pavan Behara

 

Disclaimer: These notes are a little incoherent in places; please correct them if you can. Sorry for the late posting: I wrote the notes but didn’t click on publish.

Discussion

JM: I made new scripts to plot the impropers.

CB: I see there are two chiralities indicated.
DLM: I think they’re output angle indices. So, JM’s question is what does this mean for the plots?
When you plot the one particular improper that you are driving, you can imagine taking those coordinates, feeding them into this, and recomputing the angles to get five extra horizontal axes. Somehow we want to look at the three that have the same handedness.

CB: How we want to visualize the data is how we want to do science. Is the improper parameter being applied to all six angles that come from the recipe of four atoms? The parameter gives us a four-atom SMARTS, and we want only the three with the same chirality.
DLM: Yes
CB: So, since it is applied to all three, I want to look at the average of the three.
What do we want to visualize, and what best represents that?

DLM: I think it is illegal to average; we want all three on the same graph.
Recompute the other two angles.
JM: So, the energies are the same but the angles are different for the other two.
SB: I agree we can't average
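
DLM's suggestion to recompute the other two angles can be sketched as follows. This is a minimal sketch, assuming the central atom sits in the second position of the four-atom tuple and that the three same-handedness impropers are the cyclic permutations of the three outer atoms. The function names and the toy geometry are hypothetical and are not taken from JM's scripts.

```python
import numpy as np

def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle p0-p1-p2-p3 in degrees, computed with arctan2."""
    b1, b2, b3 = p1 - p0, p2 - p1, p3 - p2
    n1, n2 = np.cross(b1, b2), np.cross(b2, b3)
    y = np.linalg.norm(b2) * np.dot(b1, n2)
    x = np.dot(n1, n2)
    return np.degrees(np.arctan2(y, x))

def same_handed_impropers(center, a, b, c):
    """Three improper angles about `center`, one per cyclic permutation of
    the outer atoms (assumed convention: central atom in second position)."""
    return [dihedral_deg(i, center, k, l)
            for i, k, l in ((a, b, c), (b, c, a), (c, a, b))]

def trigonal_center(height):
    """Hypothetical toy geometry: three neighbors at 120 degrees in the xy
    plane, with the central atom lifted `height` above that plane."""
    phis = np.radians([0.0, 120.0, 240.0])
    outer = [np.array([np.cos(t), np.sin(t), 0.0]) for t in phis]
    return np.array([0.0, 0.0, height]), outer

center, (a, b, c) = trigonal_center(0.3)
pyramidal = same_handed_impropers(center, a, b, c)
print(pyramidal)
```

For this symmetric toy pyramid the three values coincide. For a real, asymmetric center they generally differ, which is the argument for plotting all three on the same graph rather than averaging them.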

CB: All the lines will converge at zero and 180.

TG: Use arctan2 instead of arccos like qcengine?
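
TG's point about arctan2 versus arccos can be illustrated with a small example. This is a sketch with hypothetical function names and a made-up four-point geometry; it makes no claim about how qcengine computes the angle. The arccos of the normalized dot product of the two plane normals only returns values between 0 and 180 degrees, so it cannot tell a geometry from its mirror image, while arctan2 keeps the sign.

```python
import numpy as np

def _bonds_and_normals(p0, p1, p2, p3):
    b1, b2, b3 = p1 - p0, p2 - p1, p3 - p2
    return b1, b2, np.cross(b1, b2), np.cross(b2, b3)

def dihedral_arccos(p0, p1, p2, p3):
    """Unsigned dihedral in degrees: the handedness is lost."""
    _, _, n1, n2 = _bonds_and_normals(p0, p1, p2, p3)
    cos_phi = np.dot(n1, n2) / (np.linalg.norm(n1) * np.linalg.norm(n2))
    return np.degrees(np.arccos(np.clip(cos_phi, -1.0, 1.0)))

def dihedral_arctan2(p0, p1, p2, p3):
    """Signed dihedral in degrees: mirror images get opposite signs."""
    b1, b2, n1, n2 = _bonds_and_normals(p0, p1, p2, p3)
    y = np.linalg.norm(b2) * np.dot(b1, n2)
    return np.degrees(np.arctan2(y, np.dot(n1, n2)))

# Made-up geometry: the last point is placed above or below the xy plane.
p0 = np.array([1.0, 0.0, 0.0])
p1 = np.array([0.0, 0.0, 0.0])
p2 = np.array([0.0, 1.0, 0.0])
p3_up = np.array([0.5, 1.0, 0.5])
p3_down = np.array([0.5, 1.0, -0.5])

for label, p3 in (("up", p3_up), ("down", p3_down)):
    print(label, dihedral_arccos(p0, p1, p2, p3), dihedral_arctan2(p0, p1, p2, p3))
```

The two geometries give the same arccos value but arctan2 values of equal magnitude and opposite sign.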

DLM: That might be good for analyzing QM data, but it may not be good from the perspective of the FF, which applies the parameter to the three that have the same handedness. In the plots JM is showing, the angles are
133, 140, 119.

JM: One other way might be to look at the sum of the valence angles at the central atom, i.e. how flat it is.
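
JM's idea of using the valence angles at the central atom to judge how flat it is can be sketched by summing the three angles. This is a minimal sketch with hypothetical function names: a perfectly planar center gives 360 degrees, and an ideal tetrahedral arrangement gives three times about 109.47 degrees, roughly 328.4.

```python
import numpy as np

def valence_angle_deg(a, center, b):
    """Angle a-center-b in degrees."""
    u, v = a - center, b - center
    cos_t = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.degrees(np.arccos(np.clip(cos_t, -1.0, 1.0)))

def valence_angle_sum_deg(center, a, b, c):
    """Sum of the three valence angles around a trivalent center."""
    return (valence_angle_deg(a, center, b)
            + valence_angle_deg(b, center, c)
            + valence_angle_deg(c, center, a))

origin = np.zeros(3)

# Planar reference: three neighbors at 120 degrees in the xy plane.
planar = [np.array([np.cos(t), np.sin(t), 0.0])
          for t in np.radians([0.0, 120.0, 240.0])]

# Tetrahedral reference: three of the four corners of a regular tetrahedron.
tetrahedral = [np.array(v, dtype=float)
               for v in ((1, 1, 1), (1, -1, -1), (-1, 1, -1))]

planar_sum = valence_angle_sum_deg(origin, *planar)
tetrahedral_sum = valence_angle_sum_deg(origin, *tetrahedral)
print(planar_sum, tetrahedral_sum)
```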

CB: We're trying to figure out our parameters in the end. I think JM is right about looking at valence angles for planarity. I honestly don't know the most helpful way of looking at it, one that will help us understand the implications of a single parameter with a four-atom SMARTS. The three chiralities that are over 100, with indices 0, 2, 4: would they give the same energies as the other angles, 1, 3, 5?

DLM: In implementing this I went over that.

CB: When we look at that all our parameters are aiming at planarity.

DLM: Yeah, that's the way it works.

JM: I will try plotting the three impropers together with the energy profiles. The 1D torsion profiles are in order from pyramidal to planar; I am still working on the visualization.

DLM: Looking at this 1D scan with a sudden jump in the middle, something is bumping into something and then something gets unstuck. How many do you have in total?
JM: Around 200 of these 1D scan plots.

CB: This nitrogen doesn't want to be planar: it has a maximum at zero and two minima on either side, so it wants to be symmetric and non-planar.
I ask myself how pyramidal it is, or in other words, what is the hybridization of the nitrogen: is it sp2, sp3, sp2.6, or sp2.8?
I need a FF that represents this improper properly. There is a dihedral angle that controls this, plus a valence angle and an improper. If I fit just the valence angle, or just the improper or the torsion, how does this change?
What I am really interested in is this: consider the sp2 C - sp2 C bond. When I drive the angle, is the pyramidalization going to change?

This is like a slice of the multidimensional potential energy landscape spanned by all three terms: valence, improper, and proper torsions.

Imagine pyrimidine as I move the hydrogens. In summary, it's a partially pyramidalized nitrogen, so how do we embody that in the three different terms? I am certain that it is a combination of all three of them, even in a bespoke fit, so how do we translate that into parameters that are as general as possible? These are the questions we should focus on.

DLM: As with what JM did before, what if we capture these impropers and simultaneously optimize the valence terms?

How do we avoid getting lost in optimizing three parameters and making a decision?

CB: I think I really understand the nature of the problem here. The way to tackle it with ChemPer is that we are binning things together, so we have to figure out where to draw the lines and how to bin these together. The conjugation of the nitrogen is what leads to the Wiberg bond order;
it is the conjugation that controls the nitrogen.
Take the two nitrogens far away in pyrimidine,
and place a nitrogen in the para position: they would be planar.
We expect molecules that have the same hybridization to be in one bin, and then we need to go with the SMARTS definition of that bin.

DLM: Look at trends in improper angle versus Wiberg bond order.
Did we already ask: what if we bin those into flat and tetrahedral, and predict?

CB: Yes, in a way. As I have been talking about pyramidalization of the nitrogen and conjugation, I assert that valence and torsion also contribute, so it depends on how much the nitrogen is in plane or not. Whatever choice you made with valence and improper, you couldn't capture the behavior when you rotate the amine; the only way is through the dihedral. The moment you lump them all together,
we can't look at a single slice of data like this. Looking at the dihedrals of the nitrogens, each of them independently, there would be sensible chemistry that we mean to characterize.
JM already found that improper and valence are correlated, but we haven't yet found whether improper and dihedral are correlated.

We should structure our science first, then bin things, and then it comes to ChemPer. That's how I see it.

DLM: I am struggling a little bit with it. This involves investing a lot more time in understanding the details. In the long term it will pay off, but in the short term, while generating or planning to generate data, we need to focus on a niche area.

CB: So, this is the general category of aryl amino groups.
If we are trying to understand it, we already have a lot of data in our database. What if we use some tool to group together similar pyramidalizations of those nitrogens and fit parameters to those together? Just get the data and characterize it.

DLM: Pull all amino aryls, categorize them into three bins from planar to pyramidal, and come up with chemical perception that behaves that way.

CB: The improper angle or the torsional potential we are looking at might be just one of the three. For some of your other cases that showed asymmetry, we have to understand the behavior with respect to the improper, and somehow understand all three impropers as opposed to picking a single one.

DLM: The main action item is to analyze this in another way,
and in general to split up the molecules into flat, intermediate, and tetrahedral, pull out the amino aryls, and do the binning. JM, can you do that?

JM: Okay, I guess I will split the profiles based on their geometry and also look more specifically at the amino aryl groups.
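
The flat / intermediate / tetrahedral split could be sketched as below. This is only a sketch: it assumes the sum of the three valence angles at the nitrogen is used as the planarity measure, and the two cutoffs are hypothetical placeholders that were not discussed in the meeting and would need to be chosen from the data.

```python
from collections import defaultdict

# Hypothetical cutoffs in degrees. References: 360 is perfectly planar and
# about 328.4 is an ideal tetrahedral center.
FLAT_MIN_DEG = 355.0
TETRAHEDRAL_MAX_DEG = 335.0

def geometry_bin(angle_sum_deg):
    """Assign one center to a bin from its valence-angle sum."""
    if angle_sum_deg >= FLAT_MIN_DEG:
        return "flat"
    if angle_sum_deg <= TETRAHEDRAL_MAX_DEG:
        return "tetrahedral"
    return "intermediate"

def bin_profiles(angle_sums):
    """Group profile labels by bin; `angle_sums` maps label -> angle sum."""
    bins = defaultdict(list)
    for label, angle_sum in angle_sums.items():
        bins[geometry_bin(angle_sum)].append(label)
    return dict(bins)

# Usage with the two ideal reference geometries and a value in between.
print(bin_profiles({"ideal planar": 360.0,
                    "in between": 345.0,
                    "ideal tetrahedral": 328.4}))
```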

CB: Also track the WBO between the amino nitrogen and the carbon, and correlate it with the pyramidalization.
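
One simple way to quantify the correlation CB asks for is a Pearson coefficient between the N-C Wiberg bond order and a pyramidalization measure. This is a sketch under assumptions: the function name is hypothetical, and pyramidalization is taken here as 360 degrees minus the valence-angle sum at the nitrogen, which is one possible choice among several.

```python
import numpy as np

def wbo_pyramidalization_correlation(wbo, angle_sum_deg):
    """Pearson correlation between WBO and (360 - valence-angle sum).

    Both inputs are sequences with one entry per molecule.
    """
    wbo = np.asarray(wbo, dtype=float)
    pyramidalization = 360.0 - np.asarray(angle_sum_deg, dtype=float)
    if wbo.shape != pyramidalization.shape or wbo.size < 2:
        raise ValueError("need two equal-length inputs with at least 2 entries")
    return float(np.corrcoef(wbo, pyramidalization)[0, 1])
```

If stronger conjugation flattens the nitrogen, as argued in the discussion, the coefficient would be expected to come out negative; the data would have to confirm that.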

Exclude anything with an ortho nitrogen on the aryl.

An ortho nitrogen introduces very strong electrostatics, forcing all those nitrogens to be very planar.

Let's look at these other behaviors.

This means, in your valence angle definitions, pyrene not nitrogen,
i.e. exclude ortho nitrogen in your valence angle terms.

JM: Ortho nitrogen as well as oxygen?

CB: Just avoid the ortho substituents.

SB: So, I definitely agree we should remove any noise if we want to be targeted. Maybe design a dataset in the first place with all the filtering, and generate clean data that clearly shows the trends we want to see. That's not a waste of time either; any data we generate would go into a fit.

CB: But it still requires generating data.
SB: Even if we generate 30-40 torsiondrives, we should be good to go.

CB: Okay, I can think of

amino-6-aryl
amino-5-aryl

C1 is the NH2 bonded carbon
C2 is CH

C3,4,5 are C-R, or a nitrogen

Somebody came up with the Hammett electron-withdrawing groups.
Four substituents other than hydrogen:
strongly electron withdrawing,
weakly electron withdrawing,
weakly donating,
strongly donating.
They can go on carbons 3, 4, 5.
Carbon 6 in the six-membered ring is CH again.

And on the nitrogen we will have H, H2...
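
The six-membered-ring part of CB's recipe can be enumerated with a few lines of string building. This is a sketch under assumptions: the four concrete substituents (nitro, chloro, methyl, dimethylamino) are illustrative picks for the four classes and were not chosen in the meeting, the amino group is left as NH2, and the five-membered-ring scaffold is not covered.

```python
from itertools import product

# SMILES fragment for each option at ring positions 3, 4 and 5.
# The specific substituents are illustrative assumptions only.
SITE_OPTIONS = {
    "H": "c",
    "ring N": "n",
    "strongly withdrawing (nitro)": "c([N+](=O)[O-])",
    "weakly withdrawing (chloro)": "c(Cl)",
    "weakly donating (methyl)": "c(C)",
    "strongly donating (dimethylamino)": "c(N(C)C)",
}

def enumerate_six_ring_amino_aryls():
    """SMILES for every combination at positions 3, 4, 5.

    C1 carries the NH2 group; C2 and C6 stay CH, so there is never a
    substituent or ring nitrogen ortho to the amino group.
    """
    smiles = []
    for c3, c4, c5 in product(SITE_OPTIONS.values(), repeat=3):
        smiles.append("Nc1c" + c3 + c4 + c5 + "c1")
    return smiles

candidates = enumerate_six_ring_amino_aryls()
print(len(candidates), candidates[0])
```

With six options at each of three positions this gives 6**3 = 216 candidate strings, more than the 30-40 torsiondrives SB mentions, so a subset would still have to be selected.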

SB: That should generate a good dataset.

 

PB: TG suggested that the comparison with openff-1.3.0 is not fair since it was not trained on the substituted phenyl set. I did some fits where I refit the general parameters matching 1.3.0, and here are some plots that show a comparison with OpenFF-1.3.0 trained on the substituted phenyl set. It looks like the disparity is brought down in 6 out of 8 cases, and only 2 of them show an improvement with interpolated parameters.

Another experiment introduces only the TIG5 SMARTS, either as a new general parameter or as a new interpolated parameter in 1.3.0. This shows the interpolated parameter is doing well.
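
For context, the difference between a general and an interpolated parameter can be sketched as follows: a general parameter uses one force constant for every matched torsion, while an interpolated one scales it with the Wiberg bond order of the central bond. This is a minimal sketch, assuming a linear interpolation between reference values at bond orders 1 and 2 with extrapolation outside that range. The function name is hypothetical and the numbers in the usage are placeholders, not values from PB's fits.

```python
def interpolated_k(wbo, k_bondorder1, k_bondorder2):
    """Torsion force constant linearly interpolated in the bond order.

    `k_bondorder1` and `k_bondorder2` are the force constants assigned at
    bond orders 1 and 2; values of `wbo` outside [1, 2] are extrapolated
    along the same line.
    """
    return k_bondorder1 + (k_bondorder2 - k_bondorder1) * (wbo - 1.0) / (2.0 - 1.0)

# Placeholder values, only to show the call.
print(interpolated_k(1.5, k_bondorder1=2.0, k_bondorder2=10.0))
```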

DLM/CB: Okay, these look interesting. Sync up with SB on the blockers for inclusion in a general FF.