Gather software requirements from Perses team for Topology Refactor.
Discussion topics
JC – The important thing is to have a full chemical representation (currently OEMol), an MDTraj topology (currently mdtraj), and an OpenMM topology. Break residues out into molecules.
Aspirational API notebook demo
JC – PDB has been deprecated, so make sure that anything we can do with PDB can also be done with mmCIF or other supported formats.
JC – PDB substructures have some standard connectivity defined in the chemical component dictionary.
IP – Will use pattern matching using something like SMARTS
JC – When loading a PDB, remember that you can’t do SMARTS matching, so you need to do a more limited “template” matching instead. Use either the chemical component dictionary or another officially specified source from RCSB.
JC – If we load from a “common components dictionary”, then we don’t need to specify perceive_residues(nucleotides=True).
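The “template” matching JC describes can be sketched as a lookup on residue and atom names, since a PDB file generally carries those names but not bond orders. This is only an illustration: the `TEMPLATES` table and the `assign_bonds` function are hypothetical, the table is truncated to heavy-atom bonds for two residues, and a real table would be built from the chemical component dictionary.

```python
# Hypothetical, heavily truncated template table: residue name -> bonded atom-name pairs.
# Heavy atoms only; a real table would come from the chemical component dictionary.
TEMPLATES = {
    "GLY": [("N", "CA"), ("CA", "C"), ("C", "O")],
    "ALA": [("N", "CA"), ("CA", "C"), ("C", "O"), ("CA", "CB")],
}


def assign_bonds(residue_name, atom_names):
    """Return bonds as pairs of indices into atom_names, using name-based lookup."""
    if residue_name not in TEMPLATES:
        raise KeyError(f"no template for residue {residue_name!r}")
    index_of = {name: i for i, name in enumerate(atom_names)}
    bonds = []
    for first, second in TEMPLATES[residue_name]:
        # Skip template bonds whose atoms are absent from the loaded residue.
        if first in index_of and second in index_of:
            bonds.append((index_of[first], index_of[second]))
    return bonds


ala_bonds = assign_bonds("ALA", ["N", "CA", "C", "O", "CB"])
```

Unlike SMARTS matching, this needs no perceived chemistry on the input, but it fails for any residue without a template, which is the case DR raises below for unnatural amino acids.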
JC – Treat hierarchy perception with a factory convention. Maybe consistent residue “flavors” that are accepted by lots of different API points.
JC – There’s a Python convention where you don’t have to treat metadata as a dict, and can instead operate directly on attributes of molecule.metadata.
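One way to get the attribute-style access JC mentions is a dict subclass that forwards attribute lookups to its keys. This is a minimal sketch; the `Metadata` class and the keys used here are hypothetical and not part of any existing API.

```python
class Metadata(dict):
    """Dict whose keys can also be read and written as attributes (hypothetical helper)."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so dict methods still work.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        # Store attribute assignments as ordinary dict entries.
        self[name] = value


meta = Metadata(residue_name="ALA")
meta.chain_id = "A"                 # attribute-style write
residue_name = meta.residue_name    # attribute-style read
chain_id = meta["chain_id"]         # dict-style access still works
```

The standard library's `types.SimpleNamespace` offers attribute access without the dict interface, which may be enough if dict-style access is not needed.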
MH – Are you literally using MDTraj for the selection?
IP – This will drop the OpenFF topology to a copy as an MDTraj topology, and then allow
JC – Is the residue-attachment a realistic use case? I’m not sure that it is, and I’d use SMIRKS for this.
How to do chemical transformation?
JC – This is more complicated than what we do. We load the beginning and end point as separate SDFs, and then run a MCSS to get the mapping. So we don’t need in-house support for these sorts of modification.
DR – Are we sure? What if I’m transferring from one unnatural AA to another (not in the chemical component database)?
JC – So, if we have a representation of a molecule before attaching to an amino acid… This seems unlikely.
JC – Why not use SMIRKS?
DR – Would a single MCSS match provide all the info we need to call this method?
JC – Most common operation we need to do…
JC – hybrid_molecule seems to have the info we want, but we’d expect it from a much better API. So you’ll need to commit to exposing a much bigger API for this to be useful
IP and JW will follow up with DR on hybrid molecule API needs
JC – The thing I really want is “take this alanine and mutate it into glutamine”, and handle the MCSS and everything yourself.
IP – Since people will be all over the place for what they expect from residues/hierarchical substructures, we shouldn’t put it in here.
Two strategies for residue replacement
Minimal changes – Keep the backbone and make the minimal number of replacements
Maximal changes – Cut out entire residue and replace with the other
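The difference between the two strategies can be illustrated by listing which atoms each one keeps, removes, and adds. The sketch below is a simplification under stated assumptions: atoms are identified by name only, only heavy atoms are listed, “minimal” is approximated as keeping the shared backbone atoms (a real implementation would likely keep more, for example via MCSS), and `BACKBONE` and `plan_replacement` are hypothetical names.

```python
BACKBONE = {"N", "CA", "C", "O"}  # assumed heavy-atom backbone names


def plan_replacement(old_atoms, new_atoms, strategy):
    """Return which atom names to keep, remove, and add for a residue swap."""
    old, new = set(old_atoms), set(new_atoms)
    if strategy == "minimal":
        # Keep backbone atoms present in both residues; swap the rest.
        kept = old & new & BACKBONE
    elif strategy == "maximal":
        # Cut out the whole residue and insert the new one.
        kept = set()
    else:
        raise ValueError(f"unknown strategy {strategy!r}")
    return {
        "keep": sorted(kept),
        "remove": sorted(old - kept),
        "add": sorted(new - kept),
    }


ALA = ["N", "CA", "C", "O", "CB"]
GLN = ["N", "CA", "C", "O", "CB", "CG", "CD", "OE1", "NE2"]

minimal = plan_replacement(ALA, GLN, "minimal")
maximal = plan_replacement(ALA, GLN, "maximal")
```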
Use case for Perses
Coming in with two separate complete SDFs and a mapping between them
Take one Topology, remove the old ligand, put in the new ligand and possibly an atom mapping. As a bonus, get back a hybrid molecule that maps the old and new ligand.
Coming in with a protein, and switch to a different protomer/tautomer
Load a biopolymer, identify residues, ask OFFTop to mutate one residue into another, slice out the before- and after- residue, and use OEChem to do the MCSS so that Perses can control the atom mapping.
Coming in with a protein, switch one residue to another from the chemical component dictionary
Coming in with a protein, switch one residue to another NOT from the chemical component dictionary
Cut a residue out of a protein, with or without caps. Possibly with caps informed by “undoing” the polymerization reaction (and allow reverse trips).
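For the first use case above (two ligands plus a mapping between them), the information a hybrid molecule needs can be derived from the atom mapping alone. The sketch below assumes the mapping is a one-to-one dict from old-ligand atom indices to new-ligand atom indices, produced elsewhere (for example by an MCSS run in an external toolkit). The `partition_atoms` function is hypothetical.

```python
def partition_atoms(n_old, n_new, old_to_new):
    """Split atom indices into the mapped core and atoms unique to each ligand."""
    if len(set(old_to_new.values())) != len(old_to_new):
        raise ValueError("mapping must be one-to-one")
    for old_index, new_index in old_to_new.items():
        if not (0 <= old_index < n_old and 0 <= new_index < n_new):
            raise ValueError(f"index out of range: {old_index} -> {new_index}")
    # Core atoms exist in both ligands, listed as (old_index, new_index) pairs.
    core = sorted(old_to_new.items())
    # Atoms that exist in only one of the two ligands.
    unique_old = [i for i in range(n_old) if i not in old_to_new]
    mapped_new = set(old_to_new.values())
    unique_new = [j for j in range(n_new) if j not in mapped_new]
    return core, unique_old, unique_new


# Toy example: a 4-atom old ligand and a 5-atom new ligand sharing three atoms.
core, unique_old, unique_new = partition_atoms(4, 5, {0: 0, 1: 1, 2: 3})
```

Returning plain index lists rather than a hybrid topology object keeps control of the atom mapping with the caller.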
DR – We don’t have to supply hybrid topology. Return atom index mapping. OE doesn’t have this functionality.
DR – OpenMM topology – don’t know what happens with that – ended up using mdtraj tops.
DR said he could try to comment the Perses code to point out where they want drop-in replacements for the OpenFF topology object.
These places can be found by looking for parts of the code that could be simplified because they currently need to use and track OpenFF, OpenMM, and MDTraj Topology objects simultaneously.