Tool for loading substructures and reading proteins
CD – Made tool to load monomer info and assign chemistry to…
MS – So you’re showing how proteins could be done in this framework
CD – Yes
MS – Can this get out to an OpenMM simulation?
CD – I haven’t tried that yet
MS – A good test will be whether it crashes/simulates stably
CD – Agree, and I’d like to run this on the homopolymers that were provided.
MS – So if we did a protein with PEG grafted on, we’d just need to describe the residue where the joining happens, right?
CD – Mostly, yes.
(General) – It would be necessary to define the attachment point residue without one of the Hs
JW – Why is “Met-special” needed? (MRS: instead of a cap?)
CD – Normal Met has one connection point, actually has a nitrogen with 3 hydrogens.
MS – This seems more like a capping thing than a new residue. For any atom you can add an H and increase the formal charge.
JW – Seems tricky, since the C-terminal caps are either OH, O-, or nothing (where this capping definition works), but the N-terminal protonation would change the chemistry of the monomer itself.
MS – Also important to have both NH3 and NH2 be recognized, since both would be valid inputs. This could be seen as two separate things: Capping with an H (to become NH2), and then reacting to gain another H (to become NH3+)
CD – I don’t want to create protonation states
JW – NH2 and NH3 are reasonable caps. Do we want it to react after as well?
MS – Want a way to handle protonation and tautomerization. These are things that happen after polymerization.
CD – User could handle this using existing functionality. Substructure library would need to become 2-3x bigger (for different N and C terminal protonation cases)
MS – Does this fit in?
JW – As a plug-in. As a distant future. What will be helpful. Installed as a package. Here’s a method.
CD – Helper functions to be rewritten later. Relies on networkx, but maybe rdkit could as well.
MRS will generate a list of PDBs that we can experiment with loading+minimizing
JW – SMARTS look good. Not sure about separation of monomers and caps.
CD – We’ll probably want to answer “what are caps and are they necessary?”
JW – I know that “no caps, monomers only” works, because that’s what I have implemented, and it’s clear what’s broken when something breaks.
CD – Looking at an N term, it’ll either be NH3, NH2, or amide bond. The NH2 and the amide bond are identical as far as the nitrogen can tell. So we’d only need two forms of the N - Trivalent+neutral, and tetravalent+positive.
JW – Technically the NH2 and amide chemistries are different, but as far as kekule-land goes we can consider them to be the same.
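(Illustration) A minimal sketch of the "two forms of the N" argument above, assuming a backbone nitrogen with only single bonds. The function name `nitrogen_form` and its arguments are hypothetical and are not part of the tool discussed here.

```python
from typing import Tuple


def nitrogen_form(n_hydrogens: int, n_heavy_neighbors: int) -> Tuple[int, int]:
    """Return (valence, formal_charge) for a nitrogen with single bonds only.

    Hypothetical helper for illustration; a neutral nitrogen makes three bonds,
    so each extra bond adds +1 to the formal charge.
    """
    valence = n_hydrogens + n_heavy_neighbors
    formal_charge = valence - 3
    return valence, formal_charge


# N terminus capped with one extra H: NH2 bonded to the alpha carbon.
nh2_terminus = nitrogen_form(n_hydrogens=2, n_heavy_neighbors=1)
# Mid-chain amide nitrogen: one H, bonded to alpha carbon and carbonyl carbon.
amide = nitrogen_form(n_hydrogens=1, n_heavy_neighbors=2)
# Protonated N terminus: NH3+ bonded to the alpha carbon.
nh3_terminus = nitrogen_form(n_hydrogens=3, n_heavy_neighbors=1)

# NH2 and amide collapse to the same (valence, charge) pair; NH3+ is the other.
distinct_forms = {nh2_terminus, amide, nh3_terminus}
```

In this simplified view `distinct_forms` holds only two entries, trivalent+neutral and tetravalent+positive, which matches the point that NH2 and the amide look identical from the nitrogen's perspective.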
…
JW – As a thought experiment, why do we consider different C terminal protonations as caps of the same residue, but GLU and GLH as totally separate residues? The chemistries differ in the same way.
CD – If we treat sidechain protonations as caps, the algorithm would need to search through 2^3 permutations instead of 2^2
JW – True, so I think that either approach can handle anything we throw at it. It just seems arbitrary to say that an OH->O- transformation in one place makes it a different residue, but an OH->O- transformation somewhere else in the same monomer is just a different cap.
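(Illustration) A small sketch of the 2^2 versus 2^3 counting above, assuming each site has exactly two protonation states. The site labels and the function `protonation_variants` are hypothetical names chosen for this example.

```python
from itertools import product


def protonation_variants(sites):
    """Enumerate every combination of two states over the given sites.

    Hypothetical helper: each site is either 'protonated' or 'deprotonated'.
    """
    states = ("protonated", "deprotonated")
    return [dict(zip(sites, combo)) for combo in product(states, repeat=len(sites))]


# Only the two termini are treated as caps.
termini_only = protonation_variants(["N-term", "C-term"])
# A titratable sidechain (e.g. GLU vs GLH) is also treated as a cap.
with_sidechain = protonation_variants(["N-term", "C-term", "sidechain"])

print(len(termini_only), len(with_sidechain))
```

Each extra site treated as a cap doubles the number of combinations the matching step would have to consider for that residue.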
CD – The guiding principle here might be user convenience. Users could totally avoid using caps by instead defining each different form of a residue to be an entire monomer on its own (leaving the capping section blank).
JW – Ok, that’s convincing that the “no caps” approach would be able to handle anything that the “yes caps” approach could.
(General) So the “yes caps” version of a substructure dict is kinda a compressed version of the “no caps/all explicit permutations” version of the same substructure database.
CD – Chemistry assignment using a substructure dict with no caps would effectively use the same code paths as from_pdb
JW – Probably a good way forward is to ensure that any substructure dict that is made with caps can be “uncompressed” to not have caps and accomplish the same job.
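(Illustration) One way the "uncompress" idea could look, as a sketch only. The dict layout (monomer name mapped to its open connection points, connection point mapped to allowed caps), the cap labels, and the function `uncompress` are all assumptions made for this example, not the actual substructure dict format.

```python
from itertools import product


def uncompress(monomers, caps):
    """Expand a monomers+caps library into an explicit, caps-free library.

    monomers: {name: [port, ...]}      open connection points per monomer
    caps:     {port: [cap_name, ...]}  caps allowed on each kind of port
    Returns {explicit_name: {port: cap_name or None}}; None means the port
    stays open for bonding to a neighboring monomer.
    """
    explicit = {}
    for name, ports in monomers.items():
        # Every port is either left open (None) or closed by one allowed cap.
        options = [[None] + caps.get(port, []) for port in ports]
        for choice in product(*options):
            suffix = "".join(
                f"[{port}={cap}]" for port, cap in zip(ports, choice) if cap is not None
            )
            explicit[name + suffix] = dict(zip(ports, choice))
    return explicit


# Hypothetical inputs: glycine with N and C connection points, two caps each.
monomers = {"GLY": ["N", "C"]}
caps = {"N": ["H", "H2+"], "C": ["OH", "O-"]}
library = uncompress(monomers, caps)
```

With two caps per connection point plus the uncapped option, one compressed GLY entry expands to 3 x 3 = 9 explicit entries, which is the sense in which the "yes caps" dict is a compressed form of the "no caps" one.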
JW – Separately, we should kekulize monomer infos so that there aren’t any 1.5-order bonds. But that can be fixed in the future.