Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

We can verify this by inspecting the training dataset to see which molecules need t49 (results are shown in this notebook

View file
nameRedundant parameters.pdf
). Visualizing the molecules that use t49 shows that it applies to aromatic rings with a charged N, for example:

...

Therefore, ost aromatic N atoms are treated by t84, but t49 is not redundant, as it applies to aromatic charged N.

Code Block
languagepy
from openff.toolkit import Molecule, ForceField, Topology
forcefield = ForceField("openff-2.1.0.offxml")
import tqdm
from qcportal.client import FractalClient
from openff.qcsubmit.results import TorsionDriveResultCollection

# Create a client which allows us to connect to the main QCArchive server.
qcarchive_client = FractalClient()

td_result_collection = TorsionDriveResultCollection.parse_file(
    "sage-2.1.0/inputs-and-outputs/data-sets/td-set-for-fitting-2.1.0.json"
)

records_and_molecules = td_result_collection.to_records()

use_t49 = [] # list of molecules that use t49

for _, molecule in tqdm.tqdm(records_and_molecules, desc="checking"):
    all_labels = forcefield.label_molecules(molecule.to_topology())
    for mol_idx, mol_forces in enumerate(all_labels):
        for force_tag, force_dict in mol_forces.items():
            for atom_indices, parameter in force_dict.items():
                atomstr = ""
                for idx in atom_indices:
                    atomstr += "%3s" % idx
                # for some reason this adds each molecule 4 times for each appearance of t49
                if parameter.id == 't49': 
                    use_t49.append(molecule.to_smiles())

# Create unique list of molecules that use t49
set_list = set(use_t49)
use_t49_unique = list(set_list)

# Visualize the molecules
Molecule.from_smiles(use_t49_unique[0])
Molecule.from_smiles(use_t49_unique[1])
Molecule.from_smiles(use_t49_unique[2])

...

And shown with their SMILES here:

View file
namereport-1.pdf
.

The lingering questions are question is whether or not the SMARTS pattern restricts the torsion to aromatic rings of this form, where the large force constant might be reasonable, or if it applies anywhere else, and whether the large force constant is justified by the torsion potential energy surface.

PESs

To answer the second of thesethis, below are the molecules for which the TorsionDriveRecord dihedrals match the SMIRKS for t129 in the training data set:

...

From these data the force constant value of -20 kcal/mol looks reasonable and could even be a bit low for these cases.

Industry data

The final question then is whether this pattern can only apply to aromatic rings like these. The pattern covers 295 molecules in the industry benchmarking data set, as shown in the series of images below.

...

At least a few of these, shown below, are not aromatic.

...

TODO: is this force constant reasonable even for non-aromatic rings?

...