2021-05-25 Dalke/Wagner Onboarding

Participants

  • @Jeffrey Wagner

  • @Andrew Dalke

    Here’s CN1C=NC2=C1C(=O)N(C(=O)N2C)C

Discussion topics

Item

Notes

General onboarding

First tasks

  • JW – We generally want to make our interfaces with OpenEye and RDKit more reliable and better documented. This would mean clearly documenting what each toolkit expects (implicit Hs? Aromaticity? Mol fragments/radicals/dangling bonds? Metadata?).

    • I think the best start to this work would be to find a better molecule test set than MiniDrugBank.sdf, round-trip the molecules through both OpenEye and RDKit, and check for differences and whether they’re significant.

      • AD – Could get mols from PubChem (OE-generated), ChEMBL (RDKit-generated), and ChEBI (other sources)

      • Hit time limit here; will continue speccing the first task tomorrow
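A minimal sketch of the "check for differences" step, assuming each round trip has already been reduced to a small per-molecule summary. The function name `diff_summaries` and the property keys are hypothetical, and the sketch deliberately calls neither OpenEye nor RDKit; filling the summaries from each toolkit is left open.

```python
from typing import Dict, List, Tuple

# Hypothetical per-molecule summary: property name -> value.
Summary = Dict[str, object]

MISSING = "<missing>"


def diff_summaries(a: Summary, b: Summary) -> List[Tuple[str, object, object]]:
    """Return (key, value_in_a, value_in_b) for every property that differs."""
    diffs = []
    for key in sorted(set(a) | set(b)):
        value_a = a.get(key, MISSING)
        value_b = b.get(key, MISSING)
        if value_a != value_b:
            diffs.append((key, value_a, value_b))
    return diffs


if __name__ == "__main__":
    # Invented placeholder values, not output from any real toolkit.
    via_toolkit_a = {"n_atoms": 24, "n_aromatic_atoms": 9, "n_sd_tags": 3}
    via_toolkit_b = {"n_atoms": 24, "n_aromatic_atoms": 5}
    for key, value_a, value_b in diff_summaries(via_toolkit_a, via_toolkit_b):
        print(f"{key}: {value_a!r} vs {value_b!r}")
```

Reporting the differing keys rather than a single pass/fail flag makes it easier to judge afterwards whether a given difference is significant.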

  •  

    • AD – There’s one file, toolkits.py, that holds both the toolkit wrappers and the toolkit registries. I’d prefer to split them into separate files/modules and have users explicitly choose which one they want. I also want to add better logging and provenance reporting.

    • Also, it would be good to have some way to determine whether SMARTS matching could differ between toolkits

      • AD – The meaning of “RN” differs between toolkits. This is already documented in the issue tracker, and I tokenize this in code I’ve worked on.

    • Final notes on this would live in method docstrings and the developer documentation

      • Source

      • Rendered https://open-forcefield-toolkit.readthedocs.io/en/latest/developing.html
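One rough way to flag SMARTS patterns whose matches might differ between toolkits, assuming “RN” above refers to the SMARTS `R<n>` primitive (an uppercase R followed by a number). The function name `find_ring_count_primitives` is hypothetical, and this is a regex heuristic rather than a full SMARTS tokenizer.

```python
import re
from typing import List

# Uppercase R followed by one or more digits, e.g. "R2" in "[C;R2]".
# Element symbols starting with R (Rb, Ru, Rh, ...) have a lowercase
# second letter, so they are not matched by this pattern.
_RING_COUNT = re.compile(r"R\d+")


def find_ring_count_primitives(smarts: str) -> List[str]:
    """Return every R<n> token in the pattern, in order of appearance."""
    return _RING_COUNT.findall(smarts)


if __name__ == "__main__":
    for pattern in ["[#6;R2]-[C;!R0]", "[Rb+].[Ru]", "c1ccccc1"]:
        hits = find_ring_count_primitives(pattern)
        status = "check across toolkits" if hits else "no R<n> primitive"
        print(f"{pattern}: {status} {hits}")
```

A pattern with no hits is not guaranteed to match identically everywhere; the check only surfaces this one documented source of divergence.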


Action items

Decisions