2020-12-03 OpenFF System/Espaloma Interface Meeting notes

Date

Dec 3, 2020

Participants

  • @Jeffrey Wagner

  • @Yuanqing Wang

  • @Matt Thompson

  • @Simon Boothroyd

Discussion topics

Item

Notes

Background

  • YW – So we want to have a system where we keep track of not JUST the FF parameters, but also the ML model parameters.

  • (General) The outputs from Espaloma would be multiplied by a fully diagonal m matrix

  • (General) – Do we want to be able to:

    • USE espaloma-derived parameters in an OpenFF system?

    • TRAIN espaloma models using analytical derivatives from OpenFF systems?

  • JC – Multiple tiers of goals. Could do different levels of integration on the “left” side of training (espaloma outputs → OpenFF systems) and “right” side of training (loss functions, getting analytical derivs from QM training, and possibly phys prop).

  • YW – Outputting calculated parameters from Espaloma into an OpenFF System should be straightforward. Propagating derivatives back into espaloma for training, however, could be done by making the OpenFF System act like a “lambda module”. A lambda module is something that can be integrated into automatic differentiation but doesn’t contain trainable parameters. In the latter case, we should stop thinking about the OpenFF System in a programmatic-object sort of way and start thinking about it in an ML sort of way.

  • JC – Currently we see it as a sort of data container. Aspirationally we want to have ways to represent it as lambda modules or other ways to integrate it into training frameworks. In the case of timemachine, we could imagine composing them together and putting them into a training loop.

  • (General) – Short term, we want to support “frozen” outputs from Espaloma. Medium term, we’ll want to think about having a fully differentiable representation that could link espaloma model weights into the OpenFF System.
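The “lambda module” idea discussed above can be sketched in plain Python with a hand-written backward pass. This is only an illustration under stated assumptions: `TrainableScale` and `HarmonicEnergyLambda` are hypothetical names, not espaloma or OpenFF APIs, and the harmonic energy is a toy stand-in for whatever an OpenFF System would actually evaluate. The point is that the energy step takes part in the chain rule while owning no trainable parameters.

```python
# Hypothetical sketch: none of these names come from espaloma or OpenFF.

class TrainableScale:
    """Stand-in for an ML model: one trainable weight per output parameter."""

    def __init__(self, weights):
        self.weights = list(weights)

    def forward(self, features):
        self.features = list(features)  # cached for the backward pass
        return [w * x for w, x in zip(self.weights, self.features)]

    def backward(self, grad_out):
        # d(out_i)/d(w_i) = x_i, so dL/dw_i = dL/d(out_i) * x_i
        return [g * x for g, x in zip(grad_out, self.features)]


class HarmonicEnergyLambda:
    """Parameter-free 'lambda module': force constants in, total energy out.

    It holds fixed data (displacements) but nothing that gets trained.
    """

    def __init__(self, displacements):
        self.displacements = list(displacements)

    def forward(self, force_constants):
        return sum(0.5 * k * d * d
                   for k, d in zip(force_constants, self.displacements))

    def backward(self, grad_energy):
        # dE/dk_i = 0.5 * d_i**2; gradients pass through to the upstream model
        return [grad_energy * 0.5 * d * d for d in self.displacements]


model = TrainableScale([1.0, 2.0])
energy_fn = HarmonicEnergyLambda([0.1, 0.2])
features = [3.0, 4.0]
target = 0.5  # made-up reference energy

# Forward pass: model -> lambda module -> loss
force_constants = model.forward(features)
energy = energy_fn.forward(force_constants)
loss = (energy - target) ** 2

# Backward pass: loss -> lambda module -> model weights
grad_energy = 2.0 * (energy - target)
grad_k = energy_fn.backward(grad_energy)
grad_w = model.backward(grad_k)
```

In an automatic-differentiation framework the `backward` methods would be derived for you; they are written out here only to make the flow of derivatives explicit.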

What ML packages we want to initially support (and what other packages we want to eventually support)

  • YW – Everything is in jax now, but it wouldn’t be super hard for me to convert to PyTorch or another framework. But the most general solution would be to use DGL, which is framework-agnostic.

  • YW – DGL is a unified syntax for using many different ML libraries. So instead of focusing on jax, we could do everything using DGL, and thereby inherently support a bunch of backends (like PyTorch, etc.).

    • YW is making a jax plugin for DGL

  • JC – How do we convert this into a differentiable function?

    • YW – graph.node.data.global_energy

  • DLPack is the interchange language for

What is the specification at the interface? (what gets produced by espaloma and ingested by the OpenFF System)

  • Short term: No special action needed. Espaloma currently produces an OpenMM System, so just a bit of tinkering will get us what we need (“frozen” parameters).

  • Medium term: Integrating openff evaluator to get the trajectory data, and then having espaloma calculate the loss

  • Long term: Once there’s a fully differentiable MD engine, do end-to-end differentiation

  • Long term: Standardize how differentiable representations can use the System object as a lambda module/object

  • (General) – What is it that the OpenFF System PROVIDES in the process of making a differentiable representation of loss as a function of input parameters/weights?

    • JC – Three things are needed to map from system energies to input parameters:

      • (this one related to system object) Parameter mapping function (kinda like a matrix)

        • (General) – Would the d/d_ff_parameters(loss, current_ff_params) object that is currently produced suffice?

          • (General) – We’ll need to talk to JF and YTZ to see if this works for them (could be used as input for timemachine/serves for their purposes)

      • Timemachine/differentiable MD

      • Way of ingesting things like torsiondrives into targets

    • MT – Does espaloma have a notion of “FF parameters” / the linkage between multiple occurrences of the same numbers calculated the same way?

      • JC – Not really, it’s just a big vector of parameters.

  • What will the System object need to be able to RECEIVE from espaloma?

    • Initially, just frozen parameter values

    • Eventually, an entire DGL graph for each molecule
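As a rough illustration of the “parameter mapping function (kinda like a matrix)” mentioned above, the sketch below expands a short vector of unique FF parameters into per-interaction values and uses the transpose of the same map to pull a gradient back. Everything here is assumed for the example: the function names `apply_mapping` and `pull_back`, the three-bond layout, and all numbers are hypothetical and are not taken from OpenFF, espaloma, or timemachine.

```python
# Hypothetical sketch: names, layout, and numbers are illustrative only.

def apply_mapping(mapping, ff_params):
    """Expand unique FF parameters into per-interaction parameters (M @ p)."""
    return [sum(m * p for m, p in zip(row, ff_params)) for row in mapping]


def pull_back(mapping, grad_per_term):
    """Chain rule: dL/dp = M^T @ dL/d(per-interaction parameters)."""
    n_params = len(mapping[0])
    return [sum(row[j] * g for row, g in zip(mapping, grad_per_term))
            for j in range(n_params)]


# Three bonds, two unique force constants: bonds 0 and 2 share parameter 0.
mapping = [
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 0.0],
]
ff_params = [300.0, 450.0]

# Per-bond values seen by the energy function
per_bond_k = apply_mapping(mapping, ff_params)

# Made-up per-bond loss gradients, as an energy/loss step might return them
grad_per_bond = [0.1, -0.2, 0.3]

# Gradients for shared parameters accumulate over every bond that uses them
grad_ff = pull_back(mapping, grad_per_bond)
```

The rows of `mapping` are what encode the linkage MT asks about: two interactions that reuse the same parameter point at the same column, so their gradient contributions are summed.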

Action items

Decisions