2022_07_28 Shirts/Madin/Davel/Wagner Check in


  • @Michael Shirts

  • @Owen Madin

  • @Connor Davel

  • @Jeffrey Wagner

Discussion topics





Tool for loading substructures and reading proteins

  • CD – Made tool to load monomer info and assign chemistry to…

  • MS – So you’re showing how proteins could be done in this framework?

    • CD – Yes

  • MS – Can this get out to an OpenMM simulation?

    • CD – I haven’t tried that yet

    • MS – A good test will be whether it crashes/simulates stably

    • CD – Agree, and I’d like to run this on the homopolymers that were provided.

  • MS – So if we did a protein with PEG grafted on, we’d just need to describe the residue where the joining happens, right?

    • CD – Mostly, yes.

    • (General) – It would be necessary to define the attachment point residue without one of the Hs

  • JW – Why is “Met-special” needed? (MRS: Instead of a cap?)

    • CD – Normal Met has one connection point, actually has a nitrogen with 3 hydrogens.

    • MS – This seems more like a capping thing than a new residue. For any atom, you can add an H and increase the formal charge.

    • JW – Seems tricky, since the C-terminal caps are either OH, O-, or nothing (where this capping definition works), but the N-terminal protonation would change the chemistry of the monomer itself.

    • MS – Also important to have both NH3 and NH2 be recognized, since both would be valid inputs. This could be seen as two separate things: capping with an H (to become NH2), and then reacting to gain another H (to become NH3+).

    • CD – I don’t want to create protonation states

    • JW – NH2 and NH3 are reasonable caps. Do we want it to react after as well?

    • MS – Want a way to handle protonation and tautomerization. These are things that happen after polymerization.

    • CD – User could handle this using existing functionality. Substructure library would need to become 2-3x bigger (for different N and C terminal protonation cases)

    • MS – Does this fit in?

    • JW – As a plugin, in the distant future. What would be helpful: installed as a package, with a method to call.

    • CD – Helper functions to be rewritten later. Relies on networkx, but maybe RDKit could be used as well.

    • MRS will generate a list of PDBs that we can experiment with loading+minimizing

    • JW – Dye-bound protein files are at the end of our meeting notes here: 2022-07-07 Davel/Madin/Wagner check in
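
The capping-versus-protonation split discussed in this thread (cap with an H to get NH2, then react to gain another H and become NH3+) can be sketched as two separate operations on the terminal nitrogen. This is only an illustration of the bookkeeping, not the tool's implementation; the `Atom` class and the functions `cap_with_h` and `protonate` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    """Hypothetical minimal atom record, for illustration only."""
    element: str
    n_bonds: int            # number of bonded neighbors, including Hs
    formal_charge: int = 0


def cap_with_h(atom: Atom) -> Atom:
    """Fill an open connection point with an H: one more bond, same charge."""
    return Atom(atom.element, atom.n_bonds + 1, atom.formal_charge)


def protonate(atom: Atom) -> Atom:
    """Gain an extra H after capping: one more bond and +1 formal charge."""
    return Atom(atom.element, atom.n_bonds + 1, atom.formal_charge + 1)


# Backbone N of an N-terminal residue with its connection point left open:
# bonded to the alpha carbon and one H.
n_open = Atom("N", n_bonds=2)
n_nh2 = cap_with_h(n_open)   # NH2: trivalent, neutral
n_nh3 = protonate(n_nh2)     # NH3+: tetravalent, +1
```

Keeping the two steps separate means both NH2 and NH3+ inputs map back to the same capped monomer, with protonation handled as a post-polymerization change.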

  • CD – Data formats?

    • JW – SMARTS look good. Not sure about separation of monomers and caps.

  • CD – We’ll probably want to answer “what are caps and are they necessary?”

    • JW – I know that “no caps, monomers only” works, because that’s what I have implemented, and it’s clear what’s broken when something breaks.

    • CD – Looking at an N term, it’ll either be NH3, NH2, or amide bond. The NH2 and the amide bond are identical as far as the nitrogen can tell. So we’d only need two forms of the N: trivalent+neutral and tetravalent+positive.

    • JW – Technically the NH2 and amide chemistries are different, but as far as kekule-land goes we can consider them to be the same.

    • JW – As a thought experiment, why do we consider different C terminal protonations as caps of the same residue, but GLU and GLH as totally separate residues? The chemistries differ in the same way.

      • CD – If we treat sidechain protonations as caps, the algorithm would need to search through 2^3 permutations instead of 2^2

      • JW – True, so I think that either approach can handle anything we throw at it. It just seems arbitrary to say that an OH->O- transformation in one place makes it a different residue, but an OH->O- transformation somewhere else in the same monomer is just a different cap.

      • CD – The guiding principle here might be user convenience. Users could totally avoid using caps by instead defining each different form of a residue to be an entire monomer on its own (leaving the capping section blank).

      • JW – Ok, that’s convincing that the “no caps” approach would be able to handle anything that the “yes caps” approach could.

      • (General) So the “yes caps” version of a substructure dict is kinda a compressed version of the “no caps/all explicit permutations” version of the same substructure database.

      • CD – Chemistry assignment using a substructure dict with no caps would effectively use the same code paths as from_pdb

      • JW – Probably a good way forward is to ensure that any substructure dict that is made with caps can be “uncompressed” to not have caps and accomplish the same job.
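
The "uncompress" idea from this thread could look roughly like the sketch below. The data layout (a template string with named cap sites, each with a few options) is an assumption made for illustration and is not the actual substructure dict format. The name `expand_caps` and the placeholder strings are hypothetical, and the strings are not valid SMARTS.

```python
from itertools import product


def expand_caps(compressed):
    """Expand a 'yes caps' dict into a flat 'no caps' dict.

    Each entry has a template with {site} placeholders and a dict of
    options per site. Every combination of options becomes its own
    explicit entry.
    """
    flat = {}
    for name, entry in compressed.items():
        site_names = list(entry["sites"])
        choices = [list(entry["sites"][s].items()) for s in site_names]
        for combo in product(*choices):
            labels = [label for label, _ in combo]
            fills = {site: frag for site, (_, frag) in zip(site_names, combo)}
            key = f"{name}[{','.join(labels)}]"
            flat[key] = entry["template"].format(**fills)
    return flat


# Hypothetical compressed entry: two cap sites with two options each.
compressed = {
    "GLU": {
        "template": "{nterm}-GLU_CORE-{cterm}",
        "sites": {
            "nterm": {"NH2": "[NH2]", "NH3": "[NH3+]"},
            "cterm": {"OH": "[OH]", "O-": "[O-]"},
        },
    }
}

flat = expand_caps(compressed)
for key, value in flat.items():
    print(key, value)
```

With two sites of two options each this produces 2^2 = 4 explicit entries; treating the sidechain protonation as a third site would give 2^3 = 8, matching the permutation counts mentioned above.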


  • JW – Separately, we should kekulize monomer infos so that there aren’t any 1.5-order bonds. But that can be fixed in the future.
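
As a rough illustration of what kekulizing a monomer info would involve, the sketch below replaces 1.5-order bonds with alternating single and double bonds using a small backtracking search. It is a simplified stand-in, not the planned implementation: the bond-dict layout and the function name `kekulize` are hypothetical, and it assumes every atom on a 1.5-order bond needs exactly one double bond (fine for a benzene-like ring, not for pyrrole-type nitrogens such as those in Trp or His).

```python
def kekulize(bonds):
    """Return a copy of `bonds` with each 1.5-order bond set to 1 or 2.

    bonds: dict mapping (i, j) atom-index pairs to a bond order.
    Assumption: each atom touching a 1.5 bond gets exactly one double bond.
    Raises ValueError if no such assignment exists.
    """
    aromatic = [b for b, order in bonds.items() if order == 1.5]
    atoms = frozenset(a for b in aromatic for a in b)

    def solve(unmatched, doubles):
        # Pair up atoms one at a time; each pair becomes a double bond.
        if not unmatched:
            return doubles
        atom = min(unmatched)
        for b in aromatic:
            if atom in b:
                other = b[0] if b[1] == atom else b[1]
                if other in unmatched:
                    found = solve(unmatched - {atom, other}, doubles | {b})
                    if found is not None:
                        return found
        return None

    doubles = solve(atoms, frozenset())
    if doubles is None:
        raise ValueError("no Kekule structure under the stated assumption")
    return {
        b: ((2 if b in doubles else 1) if order == 1.5 else order)
        for b, order in bonds.items()
    }


# Six-membered aromatic ring (atoms 0-5) plus one ordinary single bond to atom 6.
phe_ring = {
    (0, 1): 1.5, (1, 2): 1.5, (2, 3): 1.5,
    (3, 4): 1.5, (4, 5): 1.5, (0, 5): 1.5,
    (0, 6): 1,
}
kekulized = kekulize(phe_ring)
```

A cheminformatics toolkit would normally handle this step; the sketch only shows why the result has no fractional bond orders left.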

Action items
