ForceBalance + NMR observables fitting

Participants

Goals

See if openff-forcebalance can be used for fitting torsions to NMR observables
- Will a new target need to be developed?
  - Related: this fit would not include optimized-geometry or TorsionDrive targets (but a larger fit in the future might)

Discussion topics

Item	Notes


I want to tune a subset of torsion parameters, already fit using our standard workflow, to NMR observables for short peptides. I would use priors to keep the parameters close to the values from the QM fit, so I wouldn't need to fit to QC optimized geometries or TorsionDrive targets during this optimization. The observables are simple differentiable functions of time averages of dihedral angles, and I already have Python scripts to compute these observables (but not their gradients) from trajectories. The trajectories take ~24 h on a 3090 for a single system, and it hasn't been decided yet how many systems I would use for fitting vs. validation.

Compute per optimization iteration:

~24h of simulation, with multiple proteins and multiple metrics per protein
- Each protein can be run in parallel (i.e. different GPUs)
- Each protein provides multiple metrics (from the same trajectories)

For the first iteration, get a gradient, then take a step in parameter space based on that gradient.

Can we get gradients of these observables from one set of simulations?
- Evaluator has an analogous solution for physical property simulations
  - Does it just perturb the initial parameters and run two simulations?
- Yes, as long as you can extract dihedral angles from simulations. Should be feasible.
- The observable is a scalar coupling given by a Karplus-type relation: J(phi) = A cos^2(phi) + B cos(phi) + C
  - phi = dihedral angle
  - A, B, C are complicated quantities derived from DFT, which are held fixed in these simulations
- The gradient of interest is dJ/dk, i.e. the derivative of the averaged coupling with respect to the force constants
  - k is a vector (many torsions, many force constants)
- Error function is \sum_{observables} (<J_calc(phi)> - J_exp)^2 / sigma_exp^2
- The gradient d<J_calc(phi)>/dk can be obtained from Eqs. 1 and 2 of the Evaluator paper: Open Force Field Evaluator: An Automated, Efficient, and Scalable Framework for the Estimation of Physical Properties from Molecular Simulation
Can we shuffle this around to get the loss as a function of k instead of phi? The relationship between phi and k is intuitively simple enough, but not analytically simple.
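To make the gradient discussion above concrete, here is a minimal sketch of how the loss and its gradient with respect to k might be estimated from a single trajectory, without an analytical phi(k). It assumes the referenced Evaluator equations reduce to the standard fluctuation identity d<J>/dk = -beta (<J dU/dk> - <J><dU/dk>) when J has no explicit dependence on k, and it assumes a periodic torsion term U = k (1 + cos(n phi - phase)), so that dU/dk = 1 + cos(n phi - phase). All function names are hypothetical, and energies are assumed to be in kcal/mol.

```python
import math

KB_KCAL_PER_MOL_K = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def karplus(phi, a, b, c):
    """Scalar coupling J(phi) = A cos^2(phi) + B cos(phi) + C, phi in radians."""
    cos_phi = math.cos(phi)
    return a * cos_phi**2 + b * cos_phi + c


def torsion_du_dk(phi, periodicity, phase):
    """dU/dk for an assumed torsion term U = k * (1 + cos(n*phi - phase))."""
    return 1.0 + math.cos(periodicity * phi - phase)


def mean(values):
    return sum(values) / len(values)


def dj_dk(j_frames, du_dk_frames, temperature=300.0):
    """Fluctuation estimate of d<J>/dk for each parameter.

    j_frames: per-frame J values for one observable.
    du_dk_frames: one list per parameter of per-frame dU/dk values.
    """
    beta = 1.0 / (KB_KCAL_PER_MOL_K * temperature)
    j_mean = mean(j_frames)
    gradient = []
    for du_dk in du_dk_frames:
        # Covariance of J with dU/dk over the frames of one trajectory.
        covariance = mean([j * d for j, d in zip(j_frames, du_dk)]) - j_mean * mean(du_dk)
        gradient.append(-beta * covariance)
    return gradient


def chi2_and_gradient(j_frames_by_observable, du_dk_frames, j_exp, sigma_exp,
                      temperature=300.0):
    """Return the error function and its gradient with respect to every k."""
    chi2 = 0.0
    gradient = [0.0] * len(du_dk_frames)
    for j_frames, target, sigma in zip(j_frames_by_observable, j_exp, sigma_exp):
        residual = mean(j_frames) - target
        chi2 += residual**2 / sigma**2
        # Chain rule: d(chi2)/dk = 2 * residual / sigma^2 * d<J>/dk
        for p, dj in enumerate(dj_dk(j_frames, du_dk_frames, temperature)):
            gradient[p] += 2.0 * residual / sigma**2 * dj
    return chi2, gradient
```

Under these assumptions, all observables computed from the same trajectory share one set of `du_dk_frames`. Each protein would contribute its own call, and the resulting losses and gradients would be summed.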

For prototyping, with only this NMR target, we don’t strictly need ForceBalance to run the optimization. Evaluator (or something similar) can provide the gradients (dJ/dk), and a more minimal off-the-shelf SciPy optimizer can take the steps.
- Later on, ForceBalance could provide value in doing a grander fit with more targets at the same time
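A rough sketch of what such a prototype loop might look like with `scipy.optimize.minimize`, assuming the prior is a harmonic restraint toward the QM-fit values (an assumed form, not necessarily what ForceBalance would use). The function `toy_nmr_loss_and_grad` is a hypothetical, cheap stand-in for the expensive step that runs simulations and returns the NMR loss and its gradient. The target values and prior width are made up for illustration.

```python
import numpy as np
from scipy.optimize import minimize


def objective(k, k_prior, prior_width, nmr_loss_and_grad):
    """NMR loss plus a harmonic prior; returns (value, gradient) for jac=True."""
    loss, grad = nmr_loss_and_grad(k)
    penalty = np.sum(((k - k_prior) / prior_width) ** 2)
    penalty_grad = 2.0 * (k - k_prior) / prior_width**2
    return loss + penalty, np.asarray(grad) + penalty_grad


def toy_nmr_loss_and_grad(k):
    """Hypothetical stand-in: a quadratic bowl instead of real simulations."""
    best_fit = np.array([1.5, -0.3])  # made-up values, illustration only
    diff = k - best_fit
    return float(diff @ diff), 2.0 * diff


k_prior = np.array([1.0, 0.0])  # made-up starting force constants
result = minimize(
    objective,
    x0=k_prior,
    args=(k_prior, 1.0, toy_nmr_loss_and_grad),
    jac=True,  # objective returns both the value and the gradient
    method="L-BFGS-B",
)
print(result.x)
```

With a real simulation-backed loss, the gradients would be noisy, and line-search methods such as L-BFGS-B may behave poorly. A fixed-step gradient descent might be a safer first prototype.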

What’s the smallest system that could be used for prototyping this J gradient?
- Chapin has been benchmarking a 3-mer with 500 ns simulations. Could probably get okay (not great) data from around 50 ns (maybe even less?), which is a couple of hours on a 3090. The data would probably be noisy, but might be enough to get the sign of the gradient correct. Could even use implicit solvent to cut more corners while debugging.
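One possible way to check whether a short, noisy trajectory at least gets the sign of the gradient right is to split it into contiguous blocks and see how often the per-block estimates agree. The sketch below assumes the gradient estimator is the fluctuation form -beta * cov(J, dU/dk). Function names are hypothetical. Blocks shorter than the dihedral correlation time would give biased and overly optimistic results, so this is a diagnostic, not a rigorous error estimate.

```python
import math


def block_gradients(j_frames, du_dk_frames, n_blocks, beta):
    """Estimate -beta * cov(J, dU/dk) separately on each contiguous block."""
    block_len = len(j_frames) // n_blocks
    estimates = []
    for b in range(n_blocks):
        j = j_frames[b * block_len:(b + 1) * block_len]
        d = du_dk_frames[b * block_len:(b + 1) * block_len]
        j_mean = sum(j) / len(j)
        d_mean = sum(d) / len(d)
        cov = sum((x - j_mean) * (y - d_mean) for x, y in zip(j, d)) / len(j)
        estimates.append(-beta * cov)
    return estimates


def sign_agreement(estimates):
    """Fraction of blocks whose sign matches the sign of the block mean."""
    mean_estimate = sum(estimates) / len(estimates)
    if mean_estimate == 0.0:
        return 0.0
    return sum(1 for e in estimates if e * mean_estimate > 0.0) / len(estimates)


def standard_error(estimates):
    """Standard error of the mean across blocks (needs at least two blocks)."""
    n = len(estimates)
    mean_estimate = sum(estimates) / n
    variance = sum((e - mean_estimate) ** 2 for e in estimates) / (n - 1)
    return math.sqrt(variance / n)
```

If the mean gradient is several standard errors away from zero and most blocks agree on the sign, a shorter simulation may be enough for the early optimization steps.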

Scientific question: can we get away with shorter simulations (<< 24h instead of ~24h) for NMR observables during fitting?

Action items

@Matt Thompson Learn more about the Evaluator-ForceBalance interface to understand how it uses physical property simulations in its existing finite difference algorithm.

@Chapin Cavender will check in around June 26 to sync up on this
