...
Some graph-like representation of the topology
Complete description of how to compute the potential energy (stored as a ForceField object?)
Atomic positions
Box vectors
Element information
Other tags for metadata and provenance
...
Lennard-Jones
“Lennard-Jones-like” (14-7)
Buckingham (Exp-6)
Mie
Electrostatics
Do we need to care about anything more than storing the partial charges?
Partial charges
(optional) formal charges
Valence potentials
Harmonic bonds
Harmonic angles
Proper torsions
Improper torsions
Exceptions
How to store? InterMol and ParmEd explicitly track non-bonded exceptions (i.e. scaled 1-4 interactions) for all particle pairs
Constraints
Combining rules
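One possible answer to the "How to store?" question above is a table keyed by particle pair, which is a minimal sketch and not a settled design. The class name `ExceptionTable`, its methods, and the scale factors used below are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class ExceptionTable:
    """Hypothetical store of per-pair non-bonded scale factors."""

    # Maps (i, j) with i < j to (vdw_scale, electrostatics_scale).
    scales: Dict[Tuple[int, int], Tuple[float, float]] = field(default_factory=dict)

    @staticmethod
    def _key(i: int, j: int) -> Tuple[int, int]:
        # Order the indices so (i, j) and (j, i) refer to the same entry.
        if i == j:
            raise ValueError("an exception needs two distinct particles")
        return (i, j) if i < j else (j, i)

    def add(self, i: int, j: int, vdw_scale: float, elec_scale: float) -> None:
        self.scales[self._key(i, j)] = (vdw_scale, elec_scale)

    def get(self, i: int, j: int) -> Tuple[float, float]:
        # Pairs without an explicit entry interact at full strength.
        return self.scales.get(self._key(i, j), (1.0, 1.0))


table = ExceptionTable()
table.add(0, 1, 0.0, 0.0)      # directly bonded pair: fully excluded
table.add(3, 0, 0.5, 0.8333)   # 1-4 pair: scaled (illustrative values only)
```

Keying by sorted pair keeps lookups symmetric, and storing only the pairs that deviate from full strength avoids a dense table over all particle pairs.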
...
CMAPs
Urey-Bradleys
Virtual Sites
Polarizability, Dipoles, Multipoles
Serialization
Lossless serialization is provided by exporting a system object to Python dictionaries, which can then be written to JSON, MessagePack, and other serialization formats.
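The dictionary-based round trip described above might look like the following sketch. `SerializableSystem`, its fields, and the `to_dict`/`from_dict` method names are assumptions made for this example, not the actual System API; units are omitted for brevity.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class SerializableSystem:
    """Hypothetical stand-in for a system object; fields are illustrative."""

    positions: List[List[float]] = field(default_factory=list)
    box_vectors: List[List[float]] = field(default_factory=list)
    partial_charges: List[float] = field(default_factory=list)

    def to_dict(self) -> dict:
        # Plain dicts, lists, and floats can be handed to any serializer.
        return asdict(self)

    @classmethod
    def from_dict(cls, data: dict) -> "SerializableSystem":
        return cls(**data)


system = SerializableSystem(
    positions=[[0.0, 0.0, 0.0], [0.1, 0.0, 0.0]],
    box_vectors=[[3.0, 0.0, 0.0], [0.0, 3.0, 0.0], [0.0, 0.0, 3.0]],
    partial_charges=[-0.5, 0.5],
)

payload = json.dumps(system.to_dict())                      # dict -> JSON text
restored = SerializableSystem.from_dict(json.loads(payload))  # JSON text -> object
```

Because the intermediate representation is an ordinary dictionary, the `json` calls could be swapped for another serializer without touching the class itself.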
...
Single, high-level container object (System) that “contains” everything and holds sufficient data to compute the potential energy
At a low level, things become Molecule objects
Molecule objects may be de-duplicated through some MoleculeType object
More specific Molecule subclasses can be used to (optionally?) encode physical meaning
Protein, Ion, Ligand
Biopolymers treated with existing conventions (residues and chains)
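The container hierarchy outlined above could be sketched as follows. This is only one way to arrange the pieces: the fields of `System`, `Molecule`, and `MoleculeType`, and the `molecule_types` helper, are assumptions for illustration rather than a proposed implementation.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class MoleculeType:
    """Shared, immutable description of a chemical species (hypothetical)."""

    name: str
    elements: Tuple[str, ...]
    bonds: Tuple[Tuple[int, int], ...]


@dataclass
class Molecule:
    """One copy of a species: a reference to its type plus its own positions."""

    molecule_type: MoleculeType
    positions: List[Tuple[float, float, float]]


class Ion(Molecule):
    """Optional subclass that only adds physical meaning, not new data."""


@dataclass
class System:
    molecules: List[Molecule] = field(default_factory=list)

    def molecule_types(self) -> List[MoleculeType]:
        # Unique types in first-seen order; MoleculeType is hashable because
        # it is a frozen dataclass with hashable fields.
        return list(dict.fromkeys(m.molecule_type for m in self.molecules))


water = MoleculeType("water", ("O", "H", "H"), ((0, 1), (0, 2)))
sodium = MoleculeType("sodium", ("Na",), ())

system = System(
    molecules=[
        Molecule(water, [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0)]),
        Molecule(water, [(1.0, 0.0, 0.0), (1.1, 0.0, 0.0), (1.0, 0.1, 0.0)]),
        Ion(sodium, [(2.0, 2.0, 2.0)]),
    ]
)
```

Here the two water molecules share a single `MoleculeType` instance, so topology data is stored once while positions remain per-molecule.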
...
How much cheminformatics data should be stored? Some data (bond orders?) may be lightweight but we don’t want to duplicate efforts that already exist in the toolkit and are not useful for MD engines.
Manipulation of systems
Combining systems: Systems will be combine-able in a similar manner to the popular ParmEd feature (new_structure = structure1 + structure2)
Interfaces with machine learning libraries
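The ParmEd-style addition mentioned under combining systems could be supported through Python's `__add__` hook. The sketch below uses a hypothetical `CombinableSystem` class holding only elements and bonds; a real system object would also need to merge positions, parameters, and box vectors, which are left out here.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class CombinableSystem:
    """Hypothetical minimal system: element symbols plus bonded index pairs."""

    elements: List[str] = field(default_factory=list)
    bonds: List[Tuple[int, int]] = field(default_factory=list)

    def __add__(self, other: "CombinableSystem") -> "CombinableSystem":
        if not isinstance(other, CombinableSystem):
            return NotImplemented
        # Atom indices of the second system are shifted past those of the first.
        offset = len(self.elements)
        shifted = [(i + offset, j + offset) for i, j in other.bonds]
        # Build new lists so neither operand is modified.
        return CombinableSystem(self.elements + other.elements, self.bonds + shifted)


water = CombinableSystem(["O", "H", "H"], [(0, 1), (0, 2)])
methane = CombinableSystem(
    ["C", "H", "H", "H", "H"], [(0, 1), (0, 2), (0, 3), (0, 4)]
)

combined = water + methane
```

Returning a new object rather than mutating in place matches the `new_structure = structure1 + structure2` usage, where both inputs remain usable afterwards.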
...
As such, the primary interface will be from the system object to various formats and objects, not the opposite direction. By contrast, reading input files is a desired feature, but is a low priority.
Important details about how molecular simulations are executed are not in scope. The OpenFF System object will fully describe the structure of the potential energy function, but not how it is used in the context of a molecular simulation, i.e. propagating a molecular dynamics trajectory. For example, the choices of barostat, timestep, and ensemble are left to the researcher.
Internal data structures will be remarkably general, but not infinitely so. The primary use cases will be in the domain of computational biophysics and organic chemistry, specifically implementing the SMIRNOFF format at the molecular scale. A number of scientifically interesting systems will not be supported initially, although efforts will be made to avoid prohibiting future extensions that support them. This includes coarse-grained models, multi-body potentials, anisotropic pair potentials, and rigid bodies.
...