Pinging David Mobley and Michael Shirts for feedback. General comments are welcome, as are specific ones along the lines of:

  • How does this align with our (internal) objectives?

  • What should be described more thoroughly/in more detail?

  • Is there anything specified here that points down an unfruitful/headache-filled path?

  • Searching for “?” should point you to the clear decision points I need feedback on.

...

Aims

The aim of the OpenFF System is to enable the use of the SMIRNOFF specification in molecular simulation engines with minimal reliance on external converters and third-party libraries. This will let researchers use force fields developed by the Open Force Field Initiative in their simulation workflows, with a few lines of Python code and/or through a CLI, by producing parametrized systems that can be sent to molecular simulation engines.

The internal representation is designed to enable evaluation of the potential energy of a configuration of atoms as described by the SMIRNOFF specification. No single engine is designated as the target for these calculations, which avoids the limitations such an assumption would impose. Much of the internals of the System class will be built on top of existing infrastructure: in particular, the Open Force Field Toolkit already has a mature ForceField class that manages force field parameters and a Topology class that describes the cheminformatics molecular topology. These components will be heavily inspectable by the user; for example, the source of each force field parameter, tagged with units, will be accessible from within the parametrized system.

Simple Usage

This snippet demonstrates how a SMIRNOFF force field and a molecular topology (the ForceField and Topology classes in the OpenFF Toolkit, respectively) can be used to populate an OpenFF System.

Code Block
from system import System 
from openforcefield.topology import Molecule, Topology
from openforcefield.typing.engines.smirnoff import ForceField


# Load Parsley and populate a dummy topology (ignoring positions for the moment)
openff_forcefield = ForceField('openff-1.1.0.offxml')
openff_topology = Topology.from_molecules(10 * [Molecule.from_smiles('CCO')])

# Construct an OpenFF System with the force field and topology
openff_system = System(openff_topology, openff_forcefield)

openff_system.to_file('ethanol.top')
openff_system.to_file('ethanol.gro')

(I know this API differs from ForceField.create_openmm_system(…), but it seems intractable to me for the toolkit to depend on the System object in the same way that it depends on OpenMM, since the System will likely contain or be constructed from the toolkit’s topology and force field. Unfortunately, this might mean that the actual SMARTS-based parametrization would need to be duplicated internally here. I would like to avoid a dependency loop in which the two depend on each other.)

Features

Things stored

  • Some graph-like representation of the topology

  • Complete description of how to compute the potential energy (stored as a ForceField object?)

  • Atomic positions

  • Box vectors

  • Element information

  • Other tags for metadata and provenance
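As a rough illustration of the list above, the stored quantities could be grouped in a single container. This is only a minimal sketch: the class name `SystemSketch`, its field names, and the `neighbors` helper are all hypothetical and are not part of any existing OpenFF API. The description of the potential energy is deliberately left out, since how to store it is still an open question.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class SystemSketch:
    """Hypothetical container for the stored items; not the real System class."""

    # Element symbol of each atom, indexed by atom number
    elements: List[str]
    # Graph-like topology: each bond is a pair of atom indices
    bonds: List[Tuple[int, int]]
    # Atomic positions are optional so a system can exist before coordinates do
    positions: Optional[List[Vec3]] = None
    # Three box vectors, or None for a non-periodic system
    box_vectors: Optional[Tuple[Vec3, Vec3, Vec3]] = None
    # Free-form tags for metadata and provenance
    metadata: Dict[str, str] = field(default_factory=dict)

    def neighbors(self, atom_index: int) -> List[int]:
        """Return the sorted indices of atoms bonded to atom_index."""
        found = []
        for i, j in self.bonds:
            if i == atom_index:
                found.append(j)
            elif j == atom_index:
                found.append(i)
        return sorted(found)


# Heavy atoms of ethanol only, with a provenance tag naming the force field file
ethanol = SystemSketch(
    elements=["C", "C", "O"],
    bonds=[(0, 1), (1, 2)],
    metadata={"force_field": "openff-1.1.0.offxml"},
)
```

Keeping positions and box vectors optional mirrors the usage snippet, which builds a topology while ignoring positions for the moment.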

...

  • Lennard-Jones

  • “Lennard-Jones-like” (e.g. 14-7)

  • Buckingham (Exp-6)

  • Mie
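For reference, the functional forms listed above can be written down compactly. The sketch below uses the common textbook forms; it assumes that “14-7” refers to Halgren's buffered 14-7 potential with the usual buffering constants 0.07 and 0.12, and the function and parameter names are illustrative only, not taken from the SMIRNOFF specification or any OpenFF code.

```python
import math


def lennard_jones(r, epsilon, sigma):
    """12-6 Lennard-Jones: 4*eps*[(sigma/r)^12 - (sigma/r)^6]."""
    x = (sigma / r) ** 6
    return 4.0 * epsilon * (x * x - x)


def mie(r, epsilon, sigma, n=12.0, m=6.0):
    """Mie n-m potential; the prefactor keeps the well depth equal to epsilon."""
    prefactor = (n / (n - m)) * (n / m) ** (m / (n - m))
    return prefactor * epsilon * ((sigma / r) ** n - (sigma / r) ** m)


def buckingham(r, a, b, c):
    """Buckingham (Exp-6): A*exp(-B*r) - C/r^6."""
    return a * math.exp(-b * r) - c / r ** 6


def buffered_14_7(r, epsilon, r_min):
    """Buffered 14-7 (assumed Halgren form), with rho = r / r_min."""
    rho = r / r_min
    return epsilon * (1.07 / (rho + 0.07)) ** 7 * (1.12 / (rho ** 7 + 0.12) - 2.0)
```

With n=12 and m=6 the Mie prefactor evaluates to 4, so `mie` reduces to `lennard_jones`; this is one reason a single general form might be enough to cover more than one entry in the list.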

...

Some limited support will exist for converting to objects in other packages. No conversion will be lossless, but only in edge cases should conversions be prohibitively lossy, and in many cases only a partial view of the object is the target. Target objects may include any of the following:

...

  1. How much modification should be allowed? The software is much easier to implement if we force everything to be immutable, but user modifications (changing parameters, coordinates, connectivity, etc.) may be a valuable set of features. There are some options for a middle ground, like allowing mutability at some points but locking things down at certain API calls (e.g. writing out to disk).

    1. (My) general opinion is that some significant world-building should be enabled, but with clear guardrails in place.

  2. How to get an MM energy quickly? An internal evaluator would be tricky and would involve a lot of re-inventing the wheel, while writing to disk and calling an engine has some overhead.

    1. Exporting to and calling OpenMM is probably the path of least resistance, although exporting to other engines may be useful given other constraints. InterMol may be able to play a role here, if needed.

  3. Store data (OpenMM’s approach) or store instructions for getting data (just about every other engine out there)? Storing just the data arguably gives the richest information content, but most conversions then require guessing the instructions (or carrying the instructions along as metadata).

    1. Majority currently seems to favor the “instructions” option

  4. Should systems be combine-able? This is a nice feature of ParmEd (big_structure = structure1 + structure2) but may be technically tricky to actually implement here. Re-phrased: how valuable a feature would this be? Can probably come back to it later.

    1. Yes!

  5. Should a ForceField object be tracked, as distinct from just tracking the parameters? This could enable features like writing a “just the parameters used in this study” OFFXML. There are likely some complexities to deal with, like information loss when actually applying a force field to a system.
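The “middle ground” raised in the first question (mutable until a particular API call, locked afterwards) could look something like the sketch below. Everything here is hypothetical: the names `LockableParameters` and `LockedError` are invented for illustration, and `export` returns a plain dictionary as a stand-in for writing to disk.

```python
class LockedError(RuntimeError):
    """Raised when a locked object is modified (hypothetical name)."""


class LockableParameters:
    """Hypothetical parameter store that is mutable until it is exported."""

    def __init__(self, **parameters):
        self._parameters = dict(parameters)
        self._locked = False

    def set(self, name, value):
        # The guardrail: refuse any modification once the object is locked
        if self._locked:
            raise LockedError(f"cannot modify {name!r} after export")
        self._parameters[name] = value

    def get(self, name):
        return self._parameters[name]

    def lock(self):
        self._locked = True

    def export(self):
        # Stand-in for writing to disk: lock first, then hand out a copy
        self.lock()
        return dict(self._parameters)
```

A user could then call `set` freely while building a system, and any call to `set` after `export` would raise `LockedError` instead of silently diverging from what was written out.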