JW – Polymer loading functionality is implemented in OE, sitting in a PR right now. I’m hoping that I can look at this again after the annual meeting and get it merged. This doesn’t yet handle noncapturing atoms but we can add that in a later release.
Harder starting point: Same as above but only with AllDipeptides from the same folder
Input PDB:
Ideal output: something very similar to existing substructure dictionary, with each pattern looking like a single AA substructure
CD – Is it useful to generate substructures in the first place?
JW – In the context of PDB loading, they’re useful in a few ways
Inspectability
“Caching” expensive isomorphism computations: if we can train on a “minimal” polymer, then that may avoid really nasty scaling issues with isomorphisms on larger polymers
CD – The substructures that are generated for a small PDB are only guaranteed to work on the small PDB file.
Interoperating with existing substructures (eg, protein with one unnatural AA)
JW – If we get this working in a user-friendly way I’ll be really happy, even if we don’t end up with a substructure for the modified AA
Previous to dos
MS will send workflow from collaborator group that takes a mol2 and capping atom indices as inputs, and produces a polymer.
Done
CD will draft a project page with goals in order and specified milestones
CD will experiment with an automated process that handles “monomer information type 1”. Ultimate method signature will be make_substructures([monomer_info_sources], [pdbs_to_load]) -> substructure_information. The output format isn’t well specified yet, but should have equivalent information content to CD’s current substructure dict format (with noncapturing atoms allowed)
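A rough sketch of what the make_substructures signature could look like in Python, for discussion only. Everything except the function name and its two arguments is an assumption: the SubstructureDict layout (monomer name -> pattern string -> atom names), the SubstructureInformation container, and the idea that each monomer info source is already a dictionary are all hypothetical. The body only merges the sources and records the PDB paths; it does not do any of the actual "learning" from PDB files.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Dict, List, Sequence

# Hypothetical layout, not the real format:
# monomer name -> {pattern string -> atom names in pattern order}
SubstructureDict = Dict[str, Dict[str, List[str]]]


@dataclass
class SubstructureInformation:
    """Hypothetical container for the output of make_substructures."""
    substructures: SubstructureDict = field(default_factory=dict)
    # PDB files the substructures are intended to be used with
    source_pdbs: List[Path] = field(default_factory=list)


def make_substructures(
    monomer_info_sources: Sequence[SubstructureDict],
    pdbs_to_load: Sequence[str],
) -> SubstructureInformation:
    """Placeholder: merge monomer info sources into one substructure dict.

    A real implementation would derive patterns from the PDBs; this stub
    only combines already-known patterns, keeping the first definition
    seen when the same pattern appears in more than one source.
    """
    merged: SubstructureDict = {}
    for source in monomer_info_sources:
        for monomer_name, patterns in source.items():
            entry = merged.setdefault(monomer_name, {})
            for pattern, atom_names in patterns.items():
                entry.setdefault(pattern, list(atom_names))
    return SubstructureInformation(
        substructures=merged,
        source_pdbs=[Path(p) for p in pdbs_to_load],
    )
```

Keeping the return value as a small container rather than a bare dict leaves room to attach provenance (which PDBs a pattern came from), which may matter given the concern that substructures generated from a small PDB are only guaranteed to work on that PDB.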
To dos
CD will try to “learn” amino acids using instructions above
CD will determine whether there is a real need for substructures as an output, or whether it’ll always be fine to just go from PDB to SDF without a substructure dictionary as an intermediate
CD will try loading all existing homopolymer PDBs with two connection points (except vulcanized rubber)
Action items
Decisions