...
The second round of experiments aims to distinguish between H and non-H atoms.
Both force fields have the following modifications (in addition to those described in iteration 1):
New parameter: a4a: [*;r3:1]~;@[*;r3:2]~;!@[#1:3]
r3 atom - r3 atom - H
New parameter: a6a: [#1:1]-[*;r3:2]~;!@[*:3]
H - r3 atom - H
a13a: [*;r6:1]~;@[*;r5;x4:2]~;@[*;r5;x2:3]
-->
[*;r6:1]~;@[*;r5;x4,*;r5;X4:2]~;@[*;r5;x2:3]
New parameter: a14a: [#1:1]~!@[*;X3;r5:2]~;@[*;r5:3]
Version 1 expands a8 and a9 to distinguish nonring-ring-nonring from ring-ring-nonring angles, and to further separate H from non-H:
New parameter: a8a: [*;r4:1]@[*;r4:2]-;!@[!#1:3]
New parameter: a9a: [*;r4:1]@[*;r4:2]-;!@[#1:3]
New parameter: a9b: [#1:1]-[*;r4:2]-;!@[#1:3]
Version 2 has expanded a44 and a45 to distinguish between H/nonH:
New parameter: a44a: [#1:1]~[*;r4:2]~[#1:3]
New parameter: a45a: [*;r4:1]@[*;r4:2]~;!@[#1:3]
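Since every angle parameter has to tag exactly three atoms, a quick consistency check on the new SMIRKS can catch copy-paste slips before they reach the force field file. The following is a minimal sketch using only a regular expression; the names `NEW_ANGLES` and `tagged_indices` are made up for illustration, and the check does not validate the SMARTS syntax itself.

```python
import re

# New angle SMIRKS from this iteration, copied from the list above.
# a8a/a9a/a9b belong to Version 1 only; a44a/a45a belong to Version 2 only.
NEW_ANGLES = {
    "a4a": "[*;r3:1]~;@[*;r3:2]~;!@[#1:3]",
    "a6a": "[#1:1]-[*;r3:2]~;!@[*:3]",
    "a13a": "[*;r6:1]~;@[*;r5;x4,*;r5;X4:2]~;@[*;r5;x2:3]",
    "a14a": "[#1:1]~!@[*;X3;r5:2]~;@[*;r5:3]",
    "a8a": "[*;r4:1]@[*;r4:2]-;!@[!#1:3]",
    "a9a": "[*;r4:1]@[*;r4:2]-;!@[#1:3]",
    "a9b": "[#1:1]-[*;r4:2]-;!@[#1:3]",
    "a44a": "[#1:1]~[*;r4:2]~[#1:3]",
    "a45a": "[*;r4:1]@[*;r4:2]~;!@[#1:3]",
}


def tagged_indices(smirks):
    """Return the atom map indices (the ':n' before each ']') in order of appearance."""
    return [int(n) for n in re.findall(r":(\d+)\]", smirks)]


# Collect any pattern that does not tag atoms 1, 2, 3 in that order.
bad = {
    pid: tagged_indices(smirks)
    for pid, smirks in NEW_ANGLES.items()
    if tagged_indices(smirks) != [1, 2, 3]
}
print(bad or "all angle SMIRKS tag atoms 1, 2, 3 in order")
```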
3-membered rings
First iteration of experiments
...
We currently have no parameter for internal r5-r5-r5 ring angles, so I added a generic one, [*;r5:1]@[*;r5:2]@[*;r5:3], though we may want to break it down further. In the MSM parameter distribution, the non-aromatic rings clustered together, but the aromatic rings were scattered in a way that made it unclear how to split them.
New parameter: Added a new parameter to the end called a41.
...
I’ve also looked into splitting a13, as it currently covers both fused and spiro rings. Splitting them into two separate categories seems clear via the MSM parameters, so I added a13a ([*;r6:1]~;@[*;r5;x4:2]~;@[*;r5;x2:3]), which separates out the spiro rings. However, the split is less clear using Espaloma, as there is a lot of variation even within fused or spiro rings that is not present in the MSM data.
...
TO DO: The remaining blue dots in the top left corner are all molecules with a fused ring where the angle contains only single bonds, whereas those in the bottom right corner all have a double bond somewhere in the molecule. I haven’t been able to come up with a good SMIRKS to group the top left cluster.
...
New parameter: Added a new parameter after a13 called a13a.
Chris Bayly suggested looking into the ring-ring-nonring and nonring-ring-nonring parameters for 5-membered rings as well. I took a look and they didn’t look too different from the distributions they were a part of.
These external 5-membered ring angles appear in almost every angle parameter distribution and usually aren’t very distinct. I think it would require a lot of care to separate them, as the different parameters currently assigned to these angles capture a lot of diversity, and I don’t want to lose that by lumping them together. Left for later.
Second iteration of experiments
Found a SMIRKS that captures all the a13a molecules: [*;r6:1]~;@[*;r5;x4,*;r5;X4:2]~;@[*;r5;x2:3]
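One thing worth double-checking in the central atom expression is operator precedence. In Daylight SMARTS, `;` binds more loosely than `,`, so an atom expression is split on `;` first and each piece is then an OR of its comma-separated alternatives. The sketch below applies just that rule with plain string splitting; the helper `group_by_precedence` is hypothetical and ignores `&`, `!` and recursive `$(...)` expressions.

```python
def group_by_precedence(atom_expr):
    """Split on ';' (low-precedence AND), then on ',' (OR) within each piece."""
    return [piece.split(",") for piece in atom_expr.split(";")]


# Central atom of a13a as written, without the ':2' map index.
as_written = group_by_precedence("*;r5;x4,*;r5;X4")

# A possible alternative spelling, assuming the intent is
# "(r5 and x4) or (r5 and X4)". This is a suggestion, not a confirmed fix.
alternative = group_by_precedence("*;r5;x4,X4")

print(as_written)   # [['*'], ['r5'], ['x4', '*'], ['r5'], ['X4']]
print(alternative)  # [['*'], ['r5'], ['x4', 'X4']]
```

Under this reading, the written form groups as `*` AND `r5` AND (`x4` OR `*`) AND `r5` AND `X4`. The middle term is always true, so the expression reduces to `r5` AND `X4`. If the intent is an r5 atom that is either x4 or X4, `[*;r5;x4,X4:2]` may be the safer spelling, but that should be verified against the toolkit and the matched molecules before changing anything.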
...
a14 ([*:1]~!@[*;X3;r5:2]~;@[*;r5:3]) covers r5-r5-nonring angles. Split into H vs non-H by introducing a14a: [#1:1]~!@[*;X3;r5:2]~;@[*;r5:3]
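The H/non-H split relies on the SMIRNOFF rule that, when several parameters match the same angle, the one listed last is used, so the more specific a14a has to come after a14. Below is a toy sketch of that behaviour. The predicate functions are hypothetical stand-ins for real SMIRKS matching, and the angle dictionaries and their field names are invented for illustration.

```python
def matches_a14(angle):
    # Stand-in for [*:1]~!@[*;X3;r5:2]~;@[*;r5:3]: any first atom.
    return angle["ring_pattern_ok"]


def matches_a14a(angle):
    # Stand-in for [#1:1]~!@[*;X3;r5:2]~;@[*;r5:3]: first atom must be hydrogen.
    return angle["ring_pattern_ok"] and angle["atom1_atomic_number"] == 1


# Order matters: later entries override earlier ones.
PARAMETERS = [("a14", matches_a14), ("a14a", matches_a14a)]


def assign(angle):
    """Return the id of the last parameter whose predicate matches, or None."""
    assigned = None
    for param_id, matches in PARAMETERS:
        if matches(angle):
            assigned = param_id
    return assigned


h_angle = {"ring_pattern_ok": True, "atom1_atomic_number": 1}
c_angle = {"ring_pattern_ok": True, "atom1_atomic_number": 6}
print(assign(h_angle), assign(c_angle))  # a14a a14
```

If the two entries in `PARAMETERS` were swapped, a14 would shadow a14a and the hydrogen angle would fall back to the generic parameter.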
...
Issue with fused rings
One issue I have noticed with separating the small ring parameters is that there is no way to specify in a SMARTS pattern that a given atom is in a ring of a given size. The primitive r indicates the size of the smallest ring the atom is a part of, but if it is part of a fused or spiro ring, this may lead to issues. The primitive R denotes that an atom is part of a ring, but can only be modified by the number of ring bonds, not the size of the ring.
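To make the smallest-ring problem concrete, here is a small sketch on a toy graph: a 5-membered ring fused to a 3-membered ring (a bicyclo[3.1.0]hexane skeleton). It uses a plain breadth-first search and no cheminformatics toolkit; the atom numbering and helper names are invented for illustration.

```python
from collections import deque

# Toy fused skeleton: 5-ring (0-1-2-3-4) sharing the 0-1 bond with a 3-ring (0-1-5).
BONDS = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (0, 5), (1, 5)]


def neighbors(bonds):
    """Build an adjacency map from a list of undirected bonds."""
    adj = {}
    for a, b in bonds:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def ring_size_through_bond(adj, a, b):
    """Size of the smallest ring containing bond a-b, or None if the bond is acyclic."""
    # Breadth-first search from a to b without using the a-b bond itself.
    dist = {a: 0}
    queue = deque([a])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if (u, v) in ((a, b), (b, a)) or v in dist:
                continue
            dist[v] = dist[u] + 1
            if v == b:
                return dist[v] + 1  # path bonds plus the closing bond = ring atoms
            queue.append(v)
    return None


def ring_sizes(adj, atom):
    """Smallest ring size through each of this atom's bonds, as a set."""
    sizes = {ring_size_through_bond(adj, atom, nbr) for nbr in adj[atom]}
    sizes.discard(None)
    return sizes


adj = neighbors(BONDS)
for atom in sorted(adj):
    sizes = ring_sizes(adj, atom)
    print(atom, "smallest:", min(sizes, default=None), "all:", sorted(sizes))
```

Atoms 0 and 1 sit in the 5-membered ring, but their smallest ring has size 3. Under the smallest-ring reading of `r`, they would match `r3` rather than `r5`, so the internal 5-ring angle 4-0-1 would be missed by a pattern such as `[*;r5:1]@[*;r5:2]@[*;r5:3]`.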
...