
Date

28 Sep 2021

Participants

Discussion topics

Item

Notes

Updates

JW
- HierarchySchemes and Elements merged
- Working on
- prototyped
- Working on TopologyMolecule deprecation
IP
- Checked out CCTBX. This DOES apply residue names and stuff, but it DOESN’T apply bond orders + formal charges. Largely aimed at crystallographers with messy data. One module handles PDB, another handles mmCIF.
  - GitHub: https://github.com/cctbx/cctbx_project
  - IP – Kinda tricky because it pulls in a bunch of dependencies
  - JW – Licensing issues?
    - (General) – It’s MIT or BSD3 licensed

Next steps

What loading pathways do we WANT to offer?
- Load from Element + bond existence
  - Element PDB w/ CONECT
  - mmcif w/ bonds
- Load from atomtyped representation with atom names matching a known typing scheme
  - Atomtyped PDB w/o CONECT (Perses entry point)
  - mmcif w/o bonds
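
As a rough illustration of the two entry points above, a loader could pick a pathway by checking whether a PDB file carries CONECT records. This is only a sketch: the function name and the pathway labels are hypothetical and not part of any existing toolkit API.

```python
def choose_pdb_pathway(pdb_text: str) -> str:
    """Return a (hypothetical) pathway label for the text of a PDB file."""
    # CONECT records start in column 1 with the literal record name.
    has_conect = any(
        line.startswith("CONECT") for line in pdb_text.splitlines()
    )
    # With CONECT records, bond existence is explicit, so elements + bonds
    # are enough. Without them, atom names have to match a known typing scheme.
    return "element_plus_bonds" if has_conect else "atom_names"
```

A real loader would also need to handle files with partial or messy CONECT records, which this check does not attempt.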
What loading pathways CAN we offer?
- (slow) Molecule.from_pdb (matching to residue templates)
  - Element PDB w/ CONECT → OFFMol
  - (would require more work) mmcif with bonds → OFFMol
- CCTBX
  - Element PDB → Atomtyped PDB w/ CONECT
What prep method will people have used beforehand?
- AMBER tleap protein prep → SDF that can probably be fixed
- Schrodinger protein prep → ?
- Chimera protein prep → ?
- CCTBX cleanup →
- PDBFixer → Atomtyped PDB w/ CONECT
- Pymol mutagenesis wizard output → Atomtyped PDB w/ sometimes-messy CONECT
IP – We could speed up subgraph matching by splitting at peptide bonds.
- JW – Agree. But how do we empower users to handle their own corner cases?
- IP – Could let users add new residue SMILES
- IP – Could let people match only a range of atoms for complex molecules
  - JW – This could work well, but then we may end up with partially-annotated molecules, and that could get really messy if people try to assign chemical information in different steps – When they try to convert to a full OFFMol, it’ll be hard to communicate which parts didn’t get bonds+formal charges.
JW will plan to have TypedMolecules optionally hold element, formal charge, stereo, and bond info, and potentially let them be upscaled to OFFMols if all info is present.
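
One way to picture the optional-info idea, and JW's concern about communicating what is missing, is a container whose fields may be unset plus a report of the gaps. This is a minimal sketch, not the planned design: the class layout, field names, and methods are all assumptions, and stereo is left out for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class TypedAtom:
    name: str                            # atom name from the input file
    element: Optional[str] = None        # None means "not yet assigned"
    formal_charge: Optional[int] = None  # None means "not yet assigned"


@dataclass
class TypedMolecule:
    atoms: List[TypedAtom] = field(default_factory=list)
    # (i, j) -> bond order; None means the bond exists but its order is unknown
    bonds: Dict[Tuple[int, int], Optional[int]] = field(default_factory=dict)

    def missing_info(self) -> List[str]:
        """List every atom or bond that still lacks chemical information."""
        problems = []
        for i, atom in enumerate(self.atoms):
            if atom.element is None:
                problems.append(f"atom {i} ({atom.name}): element")
            if atom.formal_charge is None:
                problems.append(f"atom {i} ({atom.name}): formal charge")
        for (i, j), order in self.bonds.items():
            if order is None:
                problems.append(f"bond {i}-{j}: bond order")
        return problems

    def can_upscale(self) -> bool:
        """True only when nothing is missing (hypothetical upscaling check)."""
        return not self.missing_info()
```

Reporting the gaps as a list, rather than only returning True or False, is one possible answer to the partially-annotated-molecule problem raised above.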
IP will try to speed up subgraph matching by splitting at peptide bonds. This will provide a prototype and early users to start providing feedback and finding corner cases.
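
The splitting idea can be sketched as follows. This is not IP's prototype: it assumes backbone atoms carry the standard PDB names "C" and "N", treats any bond between atoms with those two names as a peptide bond, and uses a hypothetical function name. Each resulting fragment could then be matched against residue templates on its own instead of matching against the whole protein graph.

```python
from collections import defaultdict


def split_at_peptide_bonds(atom_names, bonds):
    """Return sorted lists of atom indices, one list per fragment.

    atom_names: list of atom names, indexed by atom
    bonds: iterable of (i, j) index pairs
    """
    adjacency = defaultdict(set)
    for i, j in bonds:
        # Assumption: a bond between atoms named "C" and "N" is a peptide bond.
        if {atom_names[i], atom_names[j]} == {"C", "N"}:
            continue  # cut here
        adjacency[i].add(j)
        adjacency[j].add(i)

    # Depth-first search over the remaining bonds to collect fragments.
    seen, fragments = set(), []
    for start in range(len(atom_names)):
        if start in seen:
            continue
        seen.add(start)
        stack, fragment = [start], []
        while stack:
            node = stack.pop()
            fragment.append(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        fragments.append(sorted(fragment))
    return fragments
```

Name-based cutting only works for inputs with standard atom names, and disulfide bridges or other cross-links would still join fragments, so corner cases like those would need separate handling.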

IP – I spoke with DHahn the other day. I’ll be contributing a bit to the PLBenchmarks repo, and will probably also be involved in the continuous benchmarking efforts. I’m thinking about making a PLBenchmarks conda package.
JW – PLBenchmarks has a bunch of protein structures prepared in Schrodinger, so that will be a great source of example input data.
- IP – CCTBX refused to read these, they probably violate the PDB spec in some way.
JW – I don’t think the biopolymer stuff will be in a major OpenFF Toolkit release in 2021. Instead, we should either direct people to do development builds from the branch, or I can make omnia conda packages from the topology-refactor branch.
