...
Which are both too specific (central atom must be C) and too broad (first atom could be in-ring or out of ring). Need to figure out how to work these together for the right coverage.
Looking at the distributions for a8 and a9 below, it’s not clear whether specifying ring-ring-nonring and nonring-ring-nonring separately will make a difference.
...
For now, I am trying two approaches:
...
Add new parameters a44a: [#1:1]~[*;r4:2]~[#1:3] and a45a: [*;r4:1]@[*;r4:2]~;!@[#1:3] to split out H vs non-H parameters
...
5-member rings
First iteration of experiments
Currently we don’t have any internal r5-r5-r5 ring angles, so I made one. I just made a generic one, [*;r5:1]@[*;r5:2]@[*;r5:3], but we may want to break it down further. Looking at the MSM parameter distribution, it seemed like the non-aromatic rings were clustered together, but the aromatic rings were all over the place in a way that made it not obvious how to split them.
New parameter: Added a new parameter to the end called a41.
Additionally, five-membered rings with S typically have a 90-degree angle around the S, rather than ~105 degrees for other atoms. As a result, I added a new parameter a43a with the pattern [*;r5:1]@[#16;r5:2]@[*;r5:3].
New parameter: Added a parameter a41a after a41.
I’ve also looked into splitting a13, as it currently covers both fused and spiro rings. Splitting them into two separate categories seems clear via the MSM parameters, so I added a13a ([*;r6:1]~;@[*;r5;x4:2]~;@[*;r5;x2:3]), which separates out the spiro rings. However, the split is less clear using Espaloma, as there is a lot of variation even within fused or spiro rings that is not present in the MSM data.
...
New parameter: Added a new parameter after a13 called a13a.
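To illustrate why the x4 qualifier in a13a picks out spiro centers, here is a minimal pure-Python sketch that counts ring bonds per atom on two toy graphs. It is not how any cheminformatics toolkit implements ring perception; the function name ring_bond_counts and the hand-written edge lists (a spiro[4.5]decane-like and a fused 5-6 bicyclic skeleton, heavy atoms only) are assumptions made for this example.

```python
from collections import deque

def ring_bond_counts(edges):
    """Return {atom: number of ring bonds}, a toy stand-in for the SMARTS x<n> primitive."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    def connected_without(u, v):
        # Breadth-first search from u to v that may not use the direct u-v bond.
        seen, queue = {u}, deque([u])
        while queue:
            node = queue.popleft()
            for nbr in adj[node]:
                if node == u and nbr == v:
                    continue  # skip the bond being tested
                if nbr == v:
                    return True
                if nbr not in seen:
                    seen.add(nbr)
                    queue.append(nbr)
        return False

    counts = {atom: 0 for atom in adj}
    for u, v in edges:
        if connected_without(u, v):  # the bond lies on at least one cycle
            counts[u] += 1
            counts[v] += 1
    return counts

# Hypothetical skeletons: atom 0 is the spiro center in the first graph,
# and atoms 0 and 4 are the shared (fusion) atoms in the second.
spiro = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0),
         (0, 5), (5, 6), (6, 7), (7, 8), (8, 9), (9, 0)]
fused = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0),
         (4, 5), (5, 6), (6, 7), (7, 8), (8, 0)]

spiro_counts = ring_bond_counts(spiro)
fused_counts = ring_bond_counts(fused)
print(spiro_counts[0])                   # 4 ring bonds at the spiro center
print(fused_counts[0], fused_counts[4])  # 3 ring bonds at each fusion atom
```

In this toy picture the spiro center has four ring bonds while the fusion atoms have three, which is the distinction the x4 term relies on.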
TO DO: The remaining blue dots in the top left corner are all molecules with a fused ring where the angle involves only single bonds, whereas those in the bottom right corner all have a double bond somewhere in the molecule. I haven’t been able to come up with a good SMIRKS pattern that groups the top left cluster together.
Issue with fused rings
One issue I have noticed with separating the small ring parameters is that there is no way to specify in a SMARTS pattern that a given atom is in a ring of a given size. The primitive r indicates the size of the smallest ring the atom is a part of, but if it is part of a fused or spiro ring, this may lead to issues. The primitive R denotes that an atom is part of a ring, but can only be modified by the number of ring bonds, not the size of the ring.
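The limitation of r can be seen on a toy graph. The sketch below is plain Python, not a toolkit's ring perception: the helper ring_sizes and the hand-written edge list (a cyclobutane fused to a cyclohexane, heavy atoms only) are assumptions made for this illustration. It enumerates every simple cycle through an atom, so it also reports the 8-membered outer envelope, which SSSR-style perception would not count as a ring.

```python
def ring_sizes(adj, start):
    """Sizes of all simple cycles that pass through `start` (brute force, small graphs only)."""
    sizes = set()

    def walk(node, visited):
        for nbr in adj[node]:
            if nbr == start and len(visited) >= 3:
                sizes.add(len(visited))       # closed a cycle back to the start atom
            elif nbr not in visited:
                walk(nbr, visited | {nbr})

    walk(start, {start})
    return sizes

# Hypothetical fused 4-6 skeleton: atoms 0-3 form the four-membered ring,
# atoms 0, 3, 4, 5, 6, 7 form the six-membered ring (shared bond 0-3).
edges = [(0, 1), (1, 2), (2, 3), (3, 0),
         (3, 4), (4, 5), (5, 6), (6, 7), (7, 0)]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

fusion_atom = ring_sizes(adj, 0)   # atom shared by both rings
six_only = ring_sizes(adj, 4)      # atom only in the six-membered ring
print(sorted(fusion_atom), min(fusion_atom))  # [4, 6, 8] 4
print(sorted(six_only), min(six_only))        # [6, 8] 6
```

The fusion atom sits in both the four- and six-membered rings, but its smallest ring is 4, so a pattern written with r6 would skip it even though it is a legitimate member of the six-membered ring.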
...
After a lot of experimenting I haven’t been able to find a solution that involves a single elegant SMARTS pattern. To get these right, we may have to add a number of very specific parameters, and increase coverage for fused rings.
Results
First iteration of experiments
Benchmarks for both versions of the Small Ring FF are shown below. For DDE, Small Ring v1 improves performance over Sage, and this improvement persists regardless of whether or not small rings are present in the benchmark set. This suggests that appropriately treating the small ring parameters leads to improvement in other parameters, which were perhaps pulled in a non-optimal direction to overcompensate for the incorrectly treated small rings. RMSD and TFD performance either improves slightly over Sage or stays the same.
I believe the worse performance of Small Ring v2 is due to grouping together H and non-H angles in a44 and a45, which are treated separately in Small Ring v1.
...
Other parameters I’ve looked at
...