Constraints on non-driven atoms in TorsionDrives

Research question

To generate 2-D TorsionDrives of protein backbone dihedrals in dipeptides (2021-11-18-OpenFF-Protein-Dipeptide-2D-TorsionDrive), the heavy-atom dihedral angles of the amino acid sidechains (which were not driven in the TorsionDrive) were constrained to values corresponding to highly populated sidechain rotamers. These constraints prevent large sidechain rearrangements that lead to discontinuous jumps in the QM energy profile (https://openforcefield.atlassian.net/wiki/spaces/MEET/pages/2210889730). There is concern that using CONstraints instead of REstraints may introduce artificially high energy barriers during the torsion scan.

Atomic gradients of constrained atoms

If constraints on non-driven dihedral angles introduce artificially high energy barriers, we would expect that the atoms in these constrained dihedrals would experience much larger forces after the QM optimizations at each grid point when these constraints are present. Below we compare the atomic gradients for the TRP-rotamer-2 record for a phi/psi 2-D TorsionDrive with constraints on sidechain dihedrals (OpenFF Protein Dipeptide 2D TorsionDrive v1.1) and without constraints on sidechain dihedrals (OpenFF Protein Dipeptide 2D TorsionDrive v1.0).

The metric for comparison is the root mean square (RMS) force over all grid points in the TorsionDrive and over subsets of atoms.

Error rendering macro: No valid license found for LaTeX Math addon

The categories of atom subsets are all atoms, driven atoms (atoms in phi or psi), non-driven atoms (atoms in chi1 or chi2), non-driven and not driven atoms (atoms in chi1 or chi2 but not n phi or psi), and non-constrained atoms (atoms not in phi, psi, chi1 or chi2). The units for RMS force are

Error rendering macro: No valid license found for LaTeX Math addon
(millihartree per bohr).

Atom category

N Atoms

RMS Force, no chi constraints

RMS Force, with chi constraints

Atom category

N Atoms

RMS Force, no chi constraints

RMS Force, with chi constraints

All

36

2.422

2.870

Driven

5

6.497

7.108

Non-driven

5

4.982

6.259

Non-driven and not driven

3

0.083

3.822

Non-constrained

28

0.043

0.041

The RMS force on non-constrained atoms is similar with and without constraints on sidechain dihedrals, and the RMS force on non-driven and not driven constrained atoms is 50 times higher with constraints on sidechain dihedrals than without.

Energy differences between TorsionDrives

We can also compare the QM energy profiles with and without constraints on sidechain dihedrals. First, we shift each energy profile by its own minimum energy so that the minimum energy is zero for both profiles. The minimum energy occurs at the same grid point for both profiles. Then, we compute an RMS error over grid points.

Error rendering macro: No valid license found for LaTeX Math addon

The RMS error is 2.97 kcal/mol−1. However, in a ForceBalance fit, grid points are weighted as a function of their shifted QM energies in kcal mol−1.

Error rendering macro: No valid license found for LaTeX Math addon

We can thus compute a weighted RMS error.

Error rendering macro: No valid license found for LaTeX Math addon

With the weights defined using the shifted QM energies of the TorsionDrive without constraints on sidechain dihedrals, the weighted RMS error is 0.0779 kcal mol−1. Thus, the energy differences occur almost entirely in high energy regions which do not contribute to the ForceBalance objective function during torsion fitting.

Code to reproduce

The results in the table above can be reproduced with the following python code. Arguments are

python get-opt-grads.py -d "OpenFF Protein Dipeptide 2-D TorsionDrive v1.0" -r "TRP-rotamer-2" -c "[6, 8, 10, 13, 14]"

and

python get-opt-grads.py -d "OpenFF Protein Dipeptide 2-D TorsionDrive v1.1" -r "TRP-rotamer-2"

import click import json import numpy import os from qcportal import FractalClient @click.command() @click.option( "-c", "--constrained", "constrained_indices", default="[6, 8, 10, 13, 14]", show_default=True, type=click.STRING, help="The list of indices of unique atoms in non-driven constrained torsions.", ) @click.option( "-d", "--dataset", "dataset_name", default="OpenFF Protein Dipeptide 2-D TorsionDrive v1.0", show_default=True, type=click.STRING, help="The name of the QCArchive TorsionDrive collection to be interrogated.", ) @click.option( "-r", "--record", "record_index", default="TRP-rotamer-1", show_default=True, type=click.STRING, help="The name of the record index in the TorsionDrive collection.", ) def main(constrained_indices, dataset_name, record_index): client = FractalClient() dataset = client.get_collection("TorsionDriveDataset", dataset_name) query = dataset.query("default") td_record = query[record_index] opt_results = td_record.get_final_results() grad = { grid_id: opt_results[grid_id].dict()["return_result"] for grid_id in opt_results } # Square of magnitude of force on each atom by grid point sq_forces = numpy.array([ numpy.sum(numpy.square(grad[grid_id]), axis=1) for grid_id in grad ]) # Mean over grid points of square mangitude of force on each atom mean_sq_forces = numpy.mean(sq_forces, axis = 0) # Root mean square force of all atoms all_rms_force = numpy.sqrt(numpy.mean(mean_sq_forces)) # Root mean square force of atoms in driven torsions driven_indices = numpy.unique(td_record.keywords.dihedrals) driven_rms_force = numpy.sqrt(numpy.mean(mean_sq_forces[driven_indices])) # Root mean square force of atoms in constrained non-driven torsions if "constraints" in td_record.keywords.additional_keywords: td_constraints = td_record.keywords.additional_keywords["constraints"] non_driven_indices = numpy.unique([ constraint["indices"] for constraint in td_constraints["set"] ]) else: non_driven_indices = numpy.array(json.loads(constrained_indices)) non_driven_rms_force = numpy.sqrt( numpy.mean(mean_sq_forces[non_driven_indices]) ) # Root mean square force of atoms in non-constrained torsions N_atoms = mean_sq_forces.size constrained_indices = numpy.unique([driven_indices, non_driven_indices]) non_constrained_indices = [ i for i in range(N_atoms) if i not in constrained_indices ] non_constrained_rms_force = numpy.sqrt( numpy.mean(mean_sq_forces[non_constrained_indices]) ) out_str = "\n{:15s} {:2d} {:8.6f}" print("# Category N_atoms RMS_force", out_str.format("All", N_atoms, all_rms_force), out_str.format("Driven", len(driven_indices), driven_rms_force), out_str.format( "Non-driven", len(non_driven_indices), non_driven_rms_force ), out_str.format( "Non-constrained", len(non_constrained_indices), non_constrained_rms_force ), ) if __name__ == "__main__": main()