Discussion topics

Item

Notes

Updates

  • JW

    • HierarchySchemes and Elements prototyped and merged

    • Working on TopologyMolecule deprecation

  • IP

    • Checked out CCTBX. It DOES apply residue names and similar metadata, but it DOESN’T apply bond orders or formal charges. It is largely aimed at crystallographers with messy data. One module handles PDB, another handles mmCIF.

      • GitHub: https://github.com/cctbx/cctbx_project

      • IP – Kinda tricky because it pulls in a bunch of dependencies

      • JW – Licensing issues?

        • (General) – It’s MIT or BSD3 licensed

Next steps

  • What loading pathways do we WANT to offer?

    • Load from Element + bond existence

      • Element PDB w/ CONECT

      • mmCIF w/ bonds

    • Load from atomtyped representation with atom names matching a known typing scheme

      • Atomtyped PDB w/o CONECT (Perses entry point)

      • mmCIF w/o bonds
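For the "Element + bond existence" pathway, a minimal sketch of pulling elements and CONECT connectivity out of PDB text could look like the following. The function name `read_elements_and_bonds` is hypothetical and is not part of the OpenFF Toolkit or CCTBX. It assumes standard fixed-width PDB columns (serial in columns 7-11, element in columns 77-78, up to four bonded serials per CONECT record) and plain integer serial numbers. Repeated CONECT entries collapse into a single bond, so only bond existence is recorded, not bond order.

```python
def read_elements_and_bonds(pdb_text):
    """Return ({serial: element}, {frozenset({a, b}), ...}) from PDB text.

    Hypothetical helper for illustration only: it records bond existence
    and ignores bond order, formal charge, and residue information.
    """
    elements = {}
    bonds = set()
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record in ("ATOM", "HETATM"):
            serial = int(line[6:11])
            # Element symbol is right-justified in columns 77-78.
            elements[serial] = line[76:78].strip().capitalize()
        elif record == "CONECT":
            serial = int(line[6:11])
            # Up to four bonded serials, five characters each.
            for start in range(11, 31, 5):
                field = line[start:start + 5].strip()
                if field:
                    bonds.add(frozenset((serial, int(field))))
    return elements, bonds
```

The returned dictionary and bond set are enough to build an element-plus-connectivity graph. Bond orders and formal charges would still have to come from a later step such as residue template matching.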

  • What loading pathways CAN we offer?

    • (slow) Molecule.from_pdb (matching to residue templates)

      • Element PDB w/ CONECT → OFFMol

      • (would require more work) mmCIF with bonds → OFFMol

    • CCTBX

      • Element PDB → Atomtyped PDB w/ CONECT

  • What prep method will people have used beforehand?

    • AMBER tleap protein prep → SDF that can probably be fixed

    • Schrodinger protein prep → ?

    • Chimera protein prep → ?

    • CCTBX cleanup →

    • PDBFixer → Atomtyped PDB w/ CONECT

    • Pymol mutagenesis wizard output → Atomtyped PDB w/ sometimes-messy CONECT

  • IP – We could speed up subgraph matching by splitting at peptide bonds.

    • JW – Agree. But how do we empower users to handle their own corner cases?

    • IP – Could let users add new residue SMILES

    • IP – Could let people match only a range of atoms for complex molecules

      • JW – This could work well, but we may end up with partially annotated molecules. That could get really messy if people assign chemical information in different steps: when they try to convert to a full OFFMol, it’ll be hard to communicate which parts didn’t get bonds + formal charges.
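A rough sketch of the "split at peptide bonds" idea, assuming only elements and bond existence are known (no bond orders). The function names and the detection heuristic are illustrative assumptions, not the planned implementation. A C–N bond is treated as a peptide bond when the carbon carries a terminal oxygen and the nitrogen is bonded to at least two carbons, which excludes side-chain amides such as Asn and Gln. Unusual chemistry may defeat this heuristic.

```python
def find_peptide_bonds(elements, bonds):
    """Heuristically pick out backbone amide C-N bonds.

    elements: {atom_index: element symbol}
    bonds: set of frozenset({a, b}) pairs (bond existence only)
    """
    neighbors = {atom: set() for atom in elements}
    for bond in bonds:
        a, b = tuple(bond)
        neighbors[a].add(b)
        neighbors[b].add(a)

    peptide = set()
    for bond in bonds:
        a, b = tuple(bond)
        for c, n in ((a, b), (b, a)):
            if elements[c] != "C" or elements[n] != "N":
                continue
            # Carbonyl-like oxygen: bonded to nothing but this carbon.
            has_terminal_o = any(
                elements[x] == "O" and len(neighbors[x]) == 1
                for x in neighbors[c]
            )
            # Backbone N sits between two carbons (C of residue i, CA of i+1).
            n_carbons = sum(elements[x] == "C" for x in neighbors[n])
            if has_terminal_o and n_carbons >= 2:
                peptide.add(bond)
    return peptide


def split_at_peptide_bonds(elements, bonds):
    """Return residue-sized fragments (sets of atom indices)."""
    kept = bonds - find_peptide_bonds(elements, bonds)
    neighbors = {atom: set() for atom in elements}
    for bond in kept:
        a, b = tuple(bond)
        neighbors[a].add(b)
        neighbors[b].add(a)

    fragments, seen = [], set()
    for start in elements:
        if start in seen:
            continue
        # Depth-first walk over the remaining bonds.
        stack, fragment = [start], set()
        while stack:
            atom = stack.pop()
            if atom in fragment:
                continue
            fragment.add(atom)
            stack.extend(neighbors[atom] - fragment)
        seen |= fragment
        fragments.append(fragment)
    return fragments
```

Each fragment could then be matched against residue templates on its own, which keeps every subgraph search small instead of matching against the whole protein graph.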

  • JW will plan to have TypedMolecules optionally hold element, formal charge, stereo, and bond info, and potentially let them be upscaled to OFFMols if all info is present.
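One way to picture the optional-information idea, and the concern about communicating which parts are missing chemical information, is the sketch below. All class and method names (`SketchAtom`, `SketchBond`, `SketchTypedMolecule`, `missing_info`, `can_upscale`) are hypothetical and do not reflect the actual TypedMolecule design. Stereo is left out of the completeness check here because an absent stereo label can be legitimate.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SketchAtom:
    name: str
    element: Optional[str] = None        # unknown until assigned
    formal_charge: Optional[int] = None  # unknown until assigned
    stereo: Optional[str] = None         # None may simply mean "not a stereocenter"


@dataclass
class SketchBond:
    atom1: int
    atom2: int
    order: Optional[int] = None          # unknown until assigned


@dataclass
class SketchTypedMolecule:
    atoms: List[SketchAtom] = field(default_factory=list)
    bonds: List[SketchBond] = field(default_factory=list)

    def missing_info(self):
        """List human-readable reasons why upscaling is not yet possible."""
        problems = []
        for index, atom in enumerate(self.atoms):
            if atom.element is None:
                problems.append(f"atom {index} ({atom.name}): no element")
            if atom.formal_charge is None:
                problems.append(f"atom {index} ({atom.name}): no formal charge")
        for bond in self.bonds:
            if bond.order is None:
                problems.append(f"bond {bond.atom1}-{bond.atom2}: no bond order")
        return problems

    def can_upscale(self):
        """True only when every atom and bond is fully annotated."""
        return not self.missing_info()
```

Returning the full list of gaps, rather than a bare failure, would let a user see exactly which atoms and bonds were skipped when information was assigned in several steps.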

  • IP will try to speed up subgraph matching by splitting at peptide bonds. This will provide a prototype so that early users can start providing feedback and finding corner cases.

  • IP – I spoke with DHahn the other day. I’ll be contributing a bit to the PLBenchmarks repo, and will probably also be involved in the continuous benchmarking efforts. I’m thinking about making a PLBenchmarks conda package.

  • JW – PLBenchmarks has a bunch of protein structures prepared in Schrodinger, so that will be a great source of example input data.

    • IP – CCTBX refused to read these; they probably violate the PDB spec in some way.

  • JW – I don’t think the biopolymer stuff will be in a major OpenFF Toolkit release in 2021. Instead, we should either direct people to do development builds from the branch, or I can make omnia conda packages from the topology-refactor branch.

Action items

  •  

Decisions