Essential features
Can be serialized to dict and JSON
Atoms must
Have elements
Have masses
Have a topology index
Be able to store a 4-character atom type name
Molecule representations must not be required to be expressible as SMILES
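The atom requirements and the dict/JSON serialization requirement above can be sketched as follows. This is only a minimal illustration, not a proposed API: the class name Atom, its field names, and the to_dict/from_dict methods are all hypothetical, and the mass is assumed to be stored explicitly in daltons.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Atom:
    """Hypothetical atom record; every name here is illustrative only."""

    element: str         # element symbol, e.g. "C"
    mass: float          # assumed to be in daltons, stored explicitly
    topology_index: int  # position of this atom in the topology
    name: str = ""       # atom type name, at most 4 characters in this sketch

    def __post_init__(self) -> None:
        # Enforce the 4-character limit from the requirements list.
        if len(self.name) > 4:
            raise ValueError(f"atom name {self.name!r} exceeds 4 characters")

    def to_dict(self) -> dict:
        """Serialize to a plain dict of JSON-compatible values."""
        return asdict(self)

    @classmethod
    def from_dict(cls, data: dict) -> "Atom":
        """Rebuild an atom from the output of to_dict()."""
        return cls(**data)


atom = Atom(element="C", mass=12.011, topology_index=0, name="CA")

# Round trip through JSON text and back.
restored = Atom.from_dict(json.loads(json.dumps(atom.to_dict())))
```

Because only plain strings, floats and ints are stored, the dict form is already valid JSON input and no custom encoder is needed in this sketch.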
Essential behaviors
These can either be explicitly part of the API or enabled by downstream code
Create iterators over the atom indices of any valence term (bonds, angles, dihedrals), indexed to the topology
i.e. a big list of tuples containing the indices of atoms in each bond - a little ugly at the moment since
Look up the following based only on an atom’s index
What molecule (if any) it is a part of
What residue (if any) it is a part of
What chain (if any) it is a part of
What other atoms it is bonded to
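As a rough sketch of the behaviors listed above, the following stores bonds as a list of index tuples, derives angles from them, and answers the per-atom lookups from the atom index alone. All names (AtomRecord, iter_angles, bonded_to) are hypothetical, and the tiny system (one water molecule plus one atom that belongs to no molecule, residue or chain) is made up for illustration.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Dict, Iterator, List, Optional, Set, Tuple


@dataclass(frozen=True)
class AtomRecord:
    """Hypothetical per-atom membership data; None means "not part of one"."""

    molecule: Optional[int]
    residue: Optional[int]
    chain: Optional[str]


# Index in this list is the atom's topology index.
records: List[AtomRecord] = [
    AtomRecord(molecule=0, residue=0, chain="A"),  # O
    AtomRecord(molecule=0, residue=0, chain="A"),  # H
    AtomRecord(molecule=0, residue=0, chain="A"),  # H
    AtomRecord(molecule=None, residue=None, chain=None),  # unassigned atom
]

# Bonds as a plain list of tuples of topology indices.
bonds: List[Tuple[int, int]] = [(0, 1), (0, 2)]

# Adjacency map built once from the bond list.
neighbors: Dict[int, Set[int]] = {}
for i, j in bonds:
    neighbors.setdefault(i, set()).add(j)
    neighbors.setdefault(j, set()).add(i)


def bonded_to(index: int) -> List[int]:
    """Return the sorted indices of atoms bonded to the given atom."""
    return sorted(neighbors.get(index, ()))


def iter_angles(adjacency: Dict[int, Set[int]]) -> Iterator[Tuple[int, int, int]]:
    """Yield (i, j, k) for every pair of neighbors i < k of a central atom j."""
    for center in sorted(adjacency):
        for i, k in combinations(sorted(adjacency[center]), 2):
            yield (i, center, k)


angles = list(iter_angles(neighbors))
```

Here records[3].molecule is None, which is one way to express "if any" in the lookups; whether None, a sentinel, or an exception is preferable is left open.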
Ambiguous behavior/open questions
When molecules are converted to residues (e.g. in a conversion to Amber files), is the information that each one was a molecule, not a residue, lost forever?
What information can acceptably be lost in round trips with
OpenMM
MDTraj
other objects?
PDB files
other files?
Feature wishlist
Type annotations
Atoms can have masses not equal to their element’s mass
Atoms can have non-physical elements
Residues know whether they have any bonds to other residues
Atom type names limited to 4 characters
Fast generators and membership checks for atoms, bonds, angles, dihedrals,
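One possible way to get fast membership checks for valence terms is to store each term in a canonical orientation inside a set, so that (i, j) and (j, i) hit the same entry. The sketch below assumes terms are tuples of topology indices; the names canonical and ValenceTerms are hypothetical.

```python
from typing import Iterable, Iterator, Set, Tuple

Term = Tuple[int, ...]


def canonical(term: Term) -> Term:
    """Pick one orientation of a term: the smaller of it and its reverse."""
    return min(term, term[::-1])


class ValenceTerms:
    """Hypothetical container with O(1) average-case membership checks."""

    def __init__(self, terms: Iterable[Iterable[int]]) -> None:
        self._terms: Set[Term] = {canonical(tuple(t)) for t in terms}

    def __contains__(self, term: Iterable[int]) -> bool:
        # Either orientation of the same term is found.
        return canonical(tuple(term)) in self._terms

    def __iter__(self) -> Iterator[Term]:
        # Sorted so that iteration order is deterministic.
        return iter(sorted(self._terms))

    def __len__(self) -> int:
        return len(self._terms)


bonds = ValenceTerms([(0, 1), (2, 0)])
angles = ValenceTerms([(1, 0, 2)])
dihedrals = ValenceTerms([(3, 2, 1, 0)])
```

With this layout, (1, 0) in bonds and (2, 0, 1) in angles are both set lookups rather than scans over a list. Reversal is the only symmetry handled here; improper dihedrals would need a different canonical form.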