Interoperability Requirements [WIP]
Essential features
Object model and all components can be serialized to (at minimum) dict and JSON
Object model and all components can be hashed
Atoms must
Have elements
Have masses (think “isotopes”)
Have a topology index
Be able to store a 4-character atom type name → JW – Can provide a “standardize” method that will cut atom names/types down to 4 characters
Molecule representations must not require that the molecule can be represented as SMILES
Multi-residue structures can be treated as individual molecules
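As a rough illustration of the serialization, hashing, and atom requirements above, the sketch below uses a hypothetical `Atom` dataclass. The class name, field names, and the `content_hash` and `standardized` helpers are assumptions made for this example, not part of any agreed API. It assumes that hashing the sorted-key JSON form is an acceptable way to hash a component, and that "standardize" means plain truncation to 4 characters.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, replace


@dataclass(frozen=True)
class Atom:
    """Hypothetical atom record; all field names are illustrative."""

    element: str         # element symbol, e.g. "C"
    mass: float          # stored explicitly so an isotope can differ from the element default
    topology_index: int  # position of this atom in the parent topology
    name: str            # atom type name, possibly longer than 4 characters

    def to_dict(self) -> dict:
        # Plain dict of built-in types, suitable for JSON.
        return asdict(self)

    def to_json(self) -> str:
        # sort_keys makes the string independent of field declaration order.
        return json.dumps(self.to_dict(), sort_keys=True)

    def content_hash(self) -> str:
        # Hash of the serialized form, so equal contents give equal hashes.
        return hashlib.sha256(self.to_json().encode("utf-8")).hexdigest()

    def standardized(self) -> "Atom":
        # Assumed meaning of "standardize": truncate the name to 4 characters.
        return replace(self, name=self.name[:4])
```

Deriving the hash from the serialized form keeps the two requirements consistent with each other: anything that survives a dict/JSON round trip hashes identically afterwards.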
Essential behaviors
These can either be explicitly part of the API or enabled by downstream code
Create iterators storing the indices of atoms of any valence term, indexed to the topology
i.e. a big list of tuples containing the indices of atoms in each bond - a little ugly at the moment since
Look up the following based only on an atom’s index
What molecule it is a part of
What residue (if any) it is a part of
What chain (if any) it is a part of
What other atoms it is bonded to
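The behaviors above could be enabled by something like the following sketch. `Topology`, `AtomRecord`, and every method name here are hypothetical, chosen only to show the shape of the lookups. It assumes atoms are addressed by a zero-based topology index and that residue and chain are optional.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional, Set, Tuple


@dataclass
class AtomRecord:
    """Hypothetical per-atom bookkeeping entry."""

    molecule: int
    residue: Optional[int] = None
    chain: Optional[str] = None


class Topology:
    """Hypothetical container supporting lookups by atom index alone."""

    def __init__(self) -> None:
        self._atoms: List[AtomRecord] = []
        self._neighbors: List[Set[int]] = []
        self._bonds: List[Tuple[int, int]] = []

    def add_atom(self, molecule: int, residue: Optional[int] = None,
                 chain: Optional[str] = None) -> int:
        # Returns the topology index assigned to the new atom.
        self._atoms.append(AtomRecord(molecule, residue, chain))
        self._neighbors.append(set())
        return len(self._atoms) - 1

    def add_bond(self, i: int, j: int) -> None:
        n = len(self._atoms)
        if i == j or not (0 <= i < n and 0 <= j < n):
            raise ValueError(f"invalid bond ({i}, {j})")
        self._bonds.append((min(i, j), max(i, j)))
        self._neighbors[i].add(j)
        self._neighbors[j].add(i)

    def iter_bonds(self) -> Iterator[Tuple[int, int]]:
        # Tuples of atom indices, indexed to the topology.
        yield from self._bonds

    def molecule_of(self, index: int) -> int:
        return self._atoms[index].molecule

    def residue_of(self, index: int) -> Optional[int]:
        return self._atoms[index].residue

    def chain_of(self, index: int) -> Optional[str]:
        return self._atoms[index].chain

    def bonded_to(self, index: int) -> Set[int]:
        # Copy so callers cannot mutate internal state.
        return set(self._neighbors[index])
```

For example, a water molecule plus a lone ion with no residue or chain:

```python
top = Topology()
o = top.add_atom(molecule=0, residue=0, chain="A")
h1 = top.add_atom(molecule=0, residue=0, chain="A")
h2 = top.add_atom(molecule=0, residue=0, chain="A")
ion = top.add_atom(molecule=1)
top.add_bond(o, h1)
top.add_bond(h2, o)
```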
Ambiguous behavior/open questions
When molecules are converted to residues (e.g. in a conversion to Amber files), is the information that each one was a molecule, not a residue, lost forever?
What information is it acceptable to lose in round-trips with
OpenMM
MDTraj
other objects?
PDB files
other files?
For all converters, JW and IP should make tables like those for the Molecule core properties, showing which data is preserved and how fields are converted.
What assumptions are made about each component?
Connectivity within/between molecules and residues?
SMILES-ability of molecules (JW: NetworkX graph hash – We will provide at least an atom-order-dependent solution for this)
Non-element/isotope/bead “atoms”
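To make the "atom-order-dependent" option concrete, here is one possible sketch. It is a stand-in that uses only the standard library, not the NetworkX graph hash mentioned above. The function name `order_dependent_hash` and its arguments are hypothetical. Labels are arbitrary strings, so non-element beads are handled the same way as elements.

```python
import hashlib
import json
from typing import Iterable, Sequence, Tuple


def order_dependent_hash(labels: Sequence[str],
                         bonds: Iterable[Tuple[int, int]]) -> str:
    """Hash atom labels and connectivity without any SMILES conversion.

    The result does not depend on the order in which bonds are listed or on
    the direction each bond is written, but it DOES depend on atom numbering.
    """
    canonical_bonds = sorted([min(i, j), max(i, j)] for i, j in bonds)
    payload = json.dumps(
        {"labels": list(labels), "bonds": canonical_bonds},
        separators=(",", ":"),
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

With this approach, two copies of water whose atoms are numbered differently hash differently, which is the limitation an order-independent graph hash would remove.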
Feature wishlist
Type annotations → JW – Will the return values of atom-typed molecules' inherited methods correctly indicate that they are atom-typed molecules, or will their return signatures just indicate the base class?
Atoms can have masses not equal to their element’s mass
Atoms can have non-physical elements
Residues know whether they have any bonds to other residues
Atom type names limited to 4 characters
Fast generators and membership checks for atoms, bonds, angles, dihedrals, propers, impropers
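The last wishlist item could look roughly like the sketch below, which derives angles and proper dihedrals lazily from a bond list and uses sets for constant-time membership checks. All function names are hypothetical, impropers are omitted for brevity, and the canonical form (lower-indexed end atom first) is an assumption made for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Set, Tuple


def build_neighbors(bonds: Iterable[Tuple[int, int]]) -> Dict[int, Set[int]]:
    # Adjacency map: atom index -> set of bonded atom indices.
    neighbors: Dict[int, Set[int]] = defaultdict(set)
    for i, j in bonds:
        neighbors[i].add(j)
        neighbors[j].add(i)
    return dict(neighbors)


def iter_angles(neighbors: Dict[int, Set[int]]) -> Iterator[Tuple[int, int, int]]:
    # Yield each angle once as (a, center, c) with a < c.
    for center in sorted(neighbors):
        bonded = sorted(neighbors[center])
        for pos, a in enumerate(bonded):
            for c in bonded[pos + 1:]:
                yield (a, center, c)


def iter_propers(neighbors: Dict[int, Set[int]]) -> Iterator[Tuple[int, int, int, int]]:
    # Yield each proper dihedral once as (a, b, c, d) with b < c.
    for b in sorted(neighbors):
        for c in sorted(neighbors[b]):
            if c <= b:
                continue  # visit each central bond only once
            for a in sorted(neighbors[b] - {c}):
                for d in sorted(neighbors[c] - {b}):
                    if a != d:  # a == d would be a three-membered ring
                        yield (a, b, c, d)


def canonical_angle(a: int, b: int, c: int) -> Tuple[int, int, int]:
    # Put the lower-indexed end atom first so lookups ignore direction.
    return (a, b, c) if a <= c else (c, b, a)
```

A membership check then reduces to a set lookup on the canonical form, for example `canonical_angle(2, 1, 0) in set(iter_angles(neighbors))`.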