Constraints on non-driven atoms in TorsionDrives
Research question
To generate 2-D TorsionDrives of protein backbone dihedrals in dipeptides (2021-11-18-OpenFF-Protein-Dipeptide-2D-TorsionDrive), the heavy-atom dihedral angles of the amino acid sidechains (which were not driven in the TorsionDrive) were constrained to values corresponding to highly populated sidechain rotamers. These constraints prevent large sidechain rearrangements that lead to discontinuous jumps in the QM energy profile (https://openforcefield.atlassian.net/wiki/spaces/MEET/pages/2210889730). There is concern that using CONstraints instead of REstraints may introduce artificially high energy barriers during the torsion scan.
Atomic gradients of constrained atoms
If constraints on non-driven dihedral angles introduce artificially high energy barriers, we would expect that the atoms in these constrained dihedrals would experience much larger forces after the QM optimizations at each grid point when these constraints are present. Below we compare the atomic gradients for the TRP-rotamer-2
record for a phi/psi 2-D TorsionDrive with constraints on sidechain dihedrals (OpenFF Protein Dipeptide 2D TorsionDrive v1.1) and without constraints on sidechain dihedrals (OpenFF Protein Dipeptide 2D TorsionDrive v1.0).
The metric for comparison is the root mean square (RMS) force over all grid points in the TorsionDrive and over subsets of atoms.
The categories of atom subsets are all atoms, driven atoms (atoms in phi or psi), non-driven atoms (atoms in chi1 or chi2), non-driven and not driven atoms (atoms in chi1 or chi2 but not n phi or psi), and non-constrained atoms (atoms not in phi, psi, chi1 or chi2). The units for RMS force are (millihartree per bohr).
Atom category | N Atoms | RMS Force, no chi constraints | RMS Force, with chi constraints |
---|---|---|---|
All | 36 | 2.422 | 2.870 |
Driven | 5 | 6.497 | 7.108 |
Non-driven | 5 | 4.982 | 6.259 |
Non-driven and not driven | 3 | 0.083 | 3.822 |
Non-constrained | 28 | 0.043 | 0.041 |
The RMS force on non-constrained atoms is similar with and without constraints on sidechain dihedrals, and the RMS force on non-driven and not driven constrained atoms is 50 times higher with constraints on sidechain dihedrals than without.
Energy differences between TorsionDrives
We can also compare the QM energy profiles with and without constraints on sidechain dihedrals. First, we shift each energy profile by its own minimum energy so that the minimum energy is zero for both profiles. The minimum energy occurs at the same grid point for both profiles. Then, we compute an RMS error over grid points.
The RMS error is 2.97 kcal/mol−1. However, in a ForceBalance fit, grid points are weighted as a function of their shifted QM energies in kcal mol−1.
We can thus compute a weighted RMS error.
With the weights defined using the shifted QM energies of the TorsionDrive without constraints on sidechain dihedrals, the weighted RMS error is 0.0779 kcal mol−1. Thus, the energy differences occur almost entirely in high energy regions which do not contribute to the ForceBalance objective function during torsion fitting.
Code to reproduce
The results in the table above can be reproduced with the following python code. Arguments are
python get-opt-grads.py -d "OpenFF Protein Dipeptide 2-D TorsionDrive v1.0" -r "TRP-rotamer-2" -c "[6, 8, 10, 13, 14]"
and
python get-opt-grads.py -d "OpenFF Protein Dipeptide 2-D TorsionDrive v1.1" -r "TRP-rotamer-2"
import click
import json
import numpy
import os
from qcportal import FractalClient
@click.command()
@click.option(
"-c",
"--constrained",
"constrained_indices",
default="[6, 8, 10, 13, 14]",
show_default=True,
type=click.STRING,
help="The list of indices of unique atoms in non-driven constrained torsions.",
)
@click.option(
"-d",
"--dataset",
"dataset_name",
default="OpenFF Protein Dipeptide 2-D TorsionDrive v1.0",
show_default=True,
type=click.STRING,
help="The name of the QCArchive TorsionDrive collection to be interrogated.",
)
@click.option(
"-r",
"--record",
"record_index",
default="TRP-rotamer-1",
show_default=True,
type=click.STRING,
help="The name of the record index in the TorsionDrive collection.",
)
def main(constrained_indices, dataset_name, record_index):
client = FractalClient()
dataset = client.get_collection("TorsionDriveDataset", dataset_name)
query = dataset.query("default")
td_record = query[record_index]
opt_results = td_record.get_final_results()
grad = {
grid_id: opt_results[grid_id].dict()["return_result"]
for grid_id in opt_results
}
# Square of magnitude of force on each atom by grid point
sq_forces = numpy.array([
numpy.sum(numpy.square(grad[grid_id]), axis=1)
for grid_id in grad
])
# Mean over grid points of square mangitude of force on each atom
mean_sq_forces = numpy.mean(sq_forces, axis = 0)
# Root mean square force of all atoms
all_rms_force = numpy.sqrt(numpy.mean(mean_sq_forces))
# Root mean square force of atoms in driven torsions
driven_indices = numpy.unique(td_record.keywords.dihedrals)
driven_rms_force = numpy.sqrt(numpy.mean(mean_sq_forces[driven_indices]))
# Root mean square force of atoms in constrained non-driven torsions
if "constraints" in td_record.keywords.additional_keywords:
td_constraints = td_record.keywords.additional_keywords["constraints"]
non_driven_indices = numpy.unique([
constraint["indices"] for constraint in td_constraints["set"]
])
else:
non_driven_indices = numpy.array(json.loads(constrained_indices))
non_driven_rms_force = numpy.sqrt(
numpy.mean(mean_sq_forces[non_driven_indices])
)
# Root mean square force of atoms in non-constrained torsions
N_atoms = mean_sq_forces.size
constrained_indices = numpy.unique([driven_indices, non_driven_indices])
non_constrained_indices = [
i for i in range(N_atoms) if i not in constrained_indices
]
non_constrained_rms_force = numpy.sqrt(
numpy.mean(mean_sq_forces[non_constrained_indices])
)
out_str = "\n{:15s} {:2d} {:8.6f}"
print("# Category N_atoms RMS_force",
out_str.format("All", N_atoms, all_rms_force),
out_str.format("Driven", len(driven_indices), driven_rms_force),
out_str.format(
"Non-driven", len(non_driven_indices), non_driven_rms_force
),
out_str.format(
"Non-constrained", len(non_constrained_indices),
non_constrained_rms_force
),
)
if __name__ == "__main__":
main()