| |
---|
General updates | CD – Code update: I’ve thought about it more, and I think caps are more of a convenience item and aren’t strictly required. The code supports them, but my current work doesn’t use caps. Instead, it enumerates all permutations of residue capping and adds those explicitly to the JSON. CD – (Matching algorithm is complex.) JW – Some of the complexity of the matching/assignment algorithm comes from the fact that we’re assuming one substructure can be a subset of another substructure. … JW – For example, if there’s an atom with two possible charge states but everything else around it is the same, the molecule may be totally unloadable from PDB. MS – The user does need to have an answer for which charges are where, e.g. must be able to provide a mol2. CD – Could use residue names as a tiebreaker if there are multiple matches. MS – I’m in favor of simplicity. If there are residue names, then the chemistry can already be matched to a template. Though it kind of depends on whether the residue is functionalized, and then on whether we require just the addition to have a different residue name or the entire modified AA to have a special residue name. JW – I don’t think we should ever consider residue name in this stack. Other ecosystems already do this. CD – Could make the user declare the charge state for every instance of an element. We’ll move forward assuming that residues CAN be substructures of other residues.
CD – I was able to make substructure libraries for all polymers except vulcanized rubber and polystyrene
|
Jupyter tool demo | Dye-bound AAAAACAAAAA: The algorithm initially recognized ALA on only one side of the cysteine, then recognized it on both sides once the CYX backbone was defined. (General) – It would be great for the visualization to show just the things that can’t be recognized. CD – Could do this, though it might take up a lot of space. Still need to run on Nate’s case. MS – Yeah, Nate’s case has 3 monomers but multiple ways to connect them. One of Suben’s(?) projects would be a good test too; those have lots of variable connection points as well.
MS – How can users interface with this tool? CD – User can click around to create new monomers, provide SMARTS strings, or load from SDF. JW – It would be great to extend the
CD – (Shows terminal group functionality - A terminal group is a monomer unit attached to an unrecognized substructure with no bonds to other recognized substructures. The notebook will currently identify these for users and ask them to fill in the chemical info)
|
| CD will extend the notebook to get all the way to simulation for one monomer. This will use Gasteiger charges and will warn users that these charges are not suitable for real work (it will provide a link to the LibraryCharges docs for further work). The substructure dict should be provided for that monomer, in case the user doesn’t want to run the previous steps. CD will make the “unrecognized atoms” visualization highlight only things that can’t be recognized by ANY tiling of substructures, not just those that aren’t recognized by the best tiling.
|
Previous to-dos | MRS will generate a list of PDBs that we can experiment with loading and minimizing. CD will try loading all existing homopolymer PDBs with two connection points (except vulcanized rubber). Check/fix what happens to the protein C-terminal oxygen charge (or make terminal group behavior more robust). Extend tools to help users “debug” PDBs that can’t be fully loaded: for example, if they try to load a PDB with a PEGylated amino acid, the program could output a view of the unrecognized atoms with some context and ask the user to fill in the missing info. It’s fine if this doesn’t use the interactive GUI and instead just outputs 2D images to PDF or something. (Kind of overlapping with the above:) Show a polymer that is partially recognized, with the unrecognized parts highlighted in a particular color. Then let the user assign some info to an unknown region, save that as a substructure, reload/rerender the polymer with the new monomer appended to the substructure library, and repeat.
|
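
The action item on the “unrecognized atoms” visualization distinguishes atoms missed by the best tiling from atoms missed by every possible tiling. Below is a minimal sketch of that distinction, assuming each substructure match is available as a set of atom indices. The function names, the brute-force search, and the example indices are all hypothetical and are not taken from the notebook or the matching code discussed above.

```python
from itertools import combinations


def best_tiling(matches):
    """Brute-force the set of non-overlapping matches that covers the most atoms.

    `matches` is a list of frozensets of atom indices (an assumed representation).
    Exponential in len(matches); only meant for illustration.
    """
    best, best_cover = (), frozenset()
    for r in range(1, len(matches) + 1):
        for combo in combinations(matches, r):
            covered = frozenset().union(*combo)
            # Matches are pairwise disjoint iff their sizes sum to the union size.
            disjoint = sum(len(m) for m in combo) == len(covered)
            if disjoint and len(covered) > len(best_cover):
                best, best_cover = combo, covered
    return best, best_cover


def unrecognized_atoms(n_atoms, matches):
    """Return (atoms missed by the best tiling, atoms missed by ANY tiling)."""
    all_atoms = set(range(n_atoms))
    _, best_cover = best_tiling(matches)
    missed_by_best = all_atoms - best_cover
    # A single match is itself a valid tiling, so an atom is covered by some
    # tiling exactly when it appears in at least one match.
    missed_by_any = all_atoms - set().union(*matches)
    return missed_by_best, missed_by_any


# Hypothetical 8-atom molecule: two overlapping matches, one separate match,
# and atom 7 that no substructure matches.
matches = [frozenset({0, 1, 2}), frozenset({2, 3, 4}), frozenset({5, 6})]
missed_by_best, missed_by_any = unrecognized_atoms(8, matches)
print(sorted(missed_by_best))  # [3, 4, 7]
print(sorted(missed_by_any))   # [7]
```

In this toy case, atoms 3 and 4 are missed by the chosen tiling only because a competing match overlaps at atom 2, so they already have a known substructure. Only atom 7 needs new chemical information from the user, which is the set the revised visualization would highlight.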