...
keep a8 and a9, but replace the central atom with a wildcard so it can be any 4-membered ring atom.
Remove a8 and a9, and add two new parameters to the end: a44 ([*;!r4:1]~;!@[*;r4:2]~[*;!@[*r4:3]) and a45 ([*;r4:1]@[*;r4:2]~;!@[*:3]). I specified !r4 instead of !@ in a44 because specifying !@ led to it not picking up fused rings. I left a45 with !@ because specifying !r4 led to it missing two attached 4-membered rings (e.g. connected by a single, non-ring bond).
New parameter: Renamed a7 to a42, due to moving it to the end. In (2) above, added a44 and a45 for the respective SMIRKS patterns listed.
...
New parameter: Added a new parameter to the end called a43.
I’ve also looked into splitting a13, as it currently covers both fused and spiro rings. Splitting them into two separate categories seems clear via the MSM parameters, so I added a13a ([*;r6:1]~;@[*;r5;x4:2]~;@[*;r5;x2:3]), which separates out the spiro rings. However, the split is less clear using Espaloma, as there is a lot of variation even within fused or spiro rings that is not present in the MSM data.
New parameter: Added a new parameter after a13 called a13a.
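The x4 on the central atom of a13a is what distinguishes spiro from fused centres: a spiro atom has four ring bonds, while a fused bridgehead has three. The following is a minimal, toolkit-independent sketch of that idea, not the toolkit's own matching code. The helper name ring_bond_counts and the hand-written bond lists (heavy-atom graphs of spiro[4.5]decane and a fused 6-5 bicycle) are assumptions made for illustration.

```python
from collections import deque


def ring_bond_counts(n_atoms, bonds):
    """Count, per atom, how many of its bonds lie in a ring.

    Hypothetical helper mirroring the meaning of the SMARTS `x` primitive.
    A bond is a ring bond if its two atoms stay connected without it.
    """
    neighbors = {i: set() for i in range(n_atoms)}
    for a, b in bonds:
        neighbors[a].add(b)
        neighbors[b].add(a)

    def connected_without(a, b):
        # Breadth-first search from a to b that never uses the direct a-b bond.
        seen = {a}
        queue = deque([a])
        while queue:
            cur = queue.popleft()
            for nxt in neighbors[cur]:
                if cur == a and nxt == b:
                    continue
                if nxt == b:
                    return True
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return False

    counts = [0] * n_atoms
    for a, b in bonds:
        if connected_without(a, b):
            counts[a] += 1
            counts[b] += 1
    return counts


# Spiro[4.5]decane: atom 0 is shared by a 5-ring (0-1-2-3-4) and a 6-ring (0-5-6-7-8-9).
spiro_bonds = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0),
               (0, 5), (5, 6), (6, 7), (7, 8), (8, 9), (9, 0)]
spiro_counts = ring_bond_counts(10, spiro_bonds)

# Fused 6-5 bicycle: atoms 0 and 1 are bridgeheads sharing the 0-1 bond.
fused_bonds = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0),
               (1, 5), (5, 6), (6, 7), (7, 8), (8, 0)]
fused_counts = ring_bond_counts(9, fused_bonds)

print("spiro centre ring bonds:", spiro_counts[0])
print("fused bridgehead ring bonds:", fused_counts[0], fused_counts[1])
```

Only the spiro centre reaches four ring bonds, so a central atom tagged x4 should not pick up fused bridgeheads in simple bicycles like these. More heavily fused systems can also contain x4 atoms, so this check is not sufficient on its own in general.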
Issue with fused rings (Lexie)
One issue I have noticed with separating the small ring parameters is that there is no way to specify in a SMARTS pattern that a given atom is in a ring of a given size. The primitive r indicates the size of the smallest ring the atom is a part of, so an atom shared between rings of different sizes (as in fused or spiro systems) is only matched by the size of the smaller ring. The primitive R denotes that an atom is part of a ring, but can only be modified by the number of ring bonds, not the size of the ring.
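To make the limitation concrete, here is a small, toolkit-independent sketch that compares "size of the smallest ring" (what r reports) with "is in some ring of size n" (what we would like to ask) for a cyclobutane fused to a cyclopropane. The function names and the hand-written bond list are assumptions made for illustration. The membership check enumerates all simple cycles, which is not necessarily the same ring set a cheminformatics toolkit uses.

```python
from collections import deque


def build_neighbors(n_atoms, bonds):
    """Adjacency sets for an undirected molecular graph."""
    neighbors = {i: set() for i in range(n_atoms)}
    for a, b in bonds:
        neighbors[a].add(b)
        neighbors[b].add(a)
    return neighbors


def smallest_ring_size(neighbors, atom):
    """Size of the smallest ring containing atom, or 0 if it is acyclic."""
    best = 0
    for nbr in neighbors[atom]:
        # Shortest path from atom to nbr that avoids the direct bond;
        # closing it with that bond gives the smallest ring through the bond.
        dist = {atom: 0}
        queue = deque([atom])
        found = None
        while queue and found is None:
            cur = queue.popleft()
            for nxt in neighbors[cur]:
                if (cur == atom and nxt == nbr) or nxt in dist:
                    continue
                dist[nxt] = dist[cur] + 1
                if nxt == nbr:
                    found = dist[nxt]
                    break
                queue.append(nxt)
        if found is not None and (best == 0 or found + 1 < best):
            best = found + 1
    return best


def in_ring_of_size(neighbors, atom, size):
    """True if atom lies on any simple cycle with exactly `size` atoms."""
    def walk(cur, visited):
        if len(visited) == size:
            return atom in neighbors[cur]
        return any(walk(n, visited | {n})
                   for n in neighbors[cur] if n not in visited)
    return size >= 3 and walk(atom, {atom})


# Cyclobutane fused to cyclopropane: 4-ring 0-1-2-4, 3-ring 2-3-4.
bonds = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (4, 2)]
nbrs = build_neighbors(5, bonds)

smallest = [smallest_ring_size(nbrs, i) for i in range(5)]
in_four_ring = [in_ring_of_size(nbrs, i, 4) for i in range(5)]

print("smallest ring size per atom:", smallest)
print("in a 4-membered ring:       ", in_four_ring)
```

Atoms 2 and 4 sit in the 4-membered ring but report a smallest ring size of 3, so a pattern relying on r4 alone would skip them.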
...
After a lot of experimenting I haven’t been able to find a solution that involves a single elegant SMARTS pattern. To get these right, we may have to add a number of very specific parameters, and increase coverage for fused rings.
Other parameters I’ve looked at (Lexie)
...