Discussion topics

Item

Notes

Create molecules PR

  • AD – Opened

  • JW – Approved

Molecule test set planning

  • What were the ChEMBL failures?

    • SDF-format 2D mols with implicit Hs

      • Should these be loadable into OFFTK?

        • If they encode a specific stereoisomer with an unambiguous protonation state, then yes (even if we don’t have coordinates for the protons)

  • Characteristics of a good test set:

    • 100 - 300 molecules

    • Exercises stereo differences from issue #146

    • No implicit Hs in 3D file?

      • Implicit Hs are OK as long as stereo is defined

    • Stereo in connection table vs. 3D

    • Different stereo features

      • https://github.com/openforcefield/openff-toolkit/issues/146

      • https://github.com/openforcefield/openff-toolkit/issues/725

    • Multiple components (should lead to error)

    • 2D structures? Only if they unambiguously define stereo in connection table

    • Coordinate-less structures?

    • Structures where connection table stereo contradicts 3D?

Performance in test_forcefields

  • AD – The biggest time sink is the conversion to OEMols and RDMols. Caching on the toolkit wrappers is possible, but it may be better to have a ChemicalEnvironmentMatcher object that a molecule/topology can generate. This would work because molecules are immutable once they’re in a topology.
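
A minimal sketch of the caching idea above, assuming a matcher object that is built once per molecule and then reused. Only the name ChemicalEnvironmentMatcher comes from the notes; its interface, FrozenMolecule, and slow_convert are hypothetical stand-ins, and no real toolkit conversion is performed.

```python
class ChemicalEnvironmentMatcher:
    """Hypothetical matcher that holds one converted copy of a molecule."""

    def __init__(self, molecule, convert):
        # The expensive toolkit conversion happens exactly once, here.
        self._converted = convert(molecule)

    def count(self, symbol):
        # Stand-in for a substructure query against the converted object.
        return self._converted.count(symbol)


class FrozenMolecule:
    """Toy immutable molecule; only the cached matcher is filled in later."""

    def __init__(self, smiles):
        self._smiles = smiles
        self._matcher = None

    @property
    def smiles(self):
        return self._smiles

    def matcher(self, convert):
        # Reuse is safe because the molecule never changes after creation.
        if self._matcher is None:
            self._matcher = ChemicalEnvironmentMatcher(self, convert)
        return self._matcher


conversion_log = []  # one entry per conversion actually performed


def slow_convert(molecule):
    """Stand-in for a conversion to an OEMol or RDMol."""
    conversion_log.append(molecule.smiles)
    return list(molecule.smiles)


mol = FrozenMolecule("CCO")
for _ in range(3):
    n_carbons = mol.matcher(slow_convert).count("C")
```

Repeated queries go through the same matcher, so the stand-in conversion runs once no matter how many times the loop asks for it. If molecules could be mutated after the matcher is built, the cache would need invalidation, which is why the immutability point in the notes matters.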

Decision tree for loading mols

  • Molecule has no coordinates (reading from SMILES, mapped SMILES, SDF or mol2 with all-zero coords) or molecule has 2D coordinates (reading from 2D SDF or mol2)

    • Hydrogens are explicit

      • Stereochemistry fully defined in connection table according to OFF definition

        • Load molecule

      • Stereochemistry incompletely defined in connection table according to OFF definition

        • UndefinedStereochemistryError unless allow_undefined_stereo=True

    • Some hydrogens are implicit, but the existence of all hydrogens is unambiguous

      • Stereochemistry fully defined in connection table according to OFF definition

        • Load molecule, with implicit hydrogens getting (0,0,0) as coords

      • Stereochemistry incompletely defined in connection table according to OFF definition

        • UndefinedStereochemistryError unless allow_undefined_stereo=True

    • Hydrogens are implicit and ambiguous

      • AmbiguousProtonationError

  • Molecule has 3D coordinates (reading from 3D SDF or mol2)

    • Hydrogens are explicit

      • Stereochemistry IS defined in connection table according to OFF definition

        • 3D stereochemistry AGREES with connection table stereochemistry

          • Load molecule

        • 3D stereochemistry DISAGREES with connection table stereochemistry

          • UndefinedStereochemistryError unless allow_undefined_stereo=True. If molecule IS loaded, set stereochemistry data fields to the connection table stereochemistry.

      • Stereochemistry IS NOT defined in connection table according to OFF definition

        • Stereochemistry can be guessed from 3D coordinates

          • Load molecule, setting stereo to what’s indicated by 3D coords

    • Some hydrogens are implicit, but the existence of all hydrogens is unambiguous

      • Stereochemistry fully defined in connection table according to OFF definition

        • Load molecule, making hydrogens explicit and setting their coordinates to (0,0,0)

      • Stereochemistry incompletely defined in connection table according to OFF definition

        • UndefinedStereochemistryError unless allow_undefined_stereo=True

    • Hydrogens are implicit and ambiguous

      • AmbiguousProtonationError
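
The decision tree above can be sketched as a single function. This is an illustration under stated assumptions, not the toolkit's implementation: InputSummary, Hydrogens, decide, and the returned strings are hypothetical, and the two exception classes are local stand-ins that only borrow the names used in the notes. Two cases are not spelled out in the notes and are marked as assumptions in the comments.

```python
from dataclasses import dataclass
from enum import Enum


class Hydrogens(Enum):
    EXPLICIT = "explicit"
    IMPLICIT_UNAMBIGUOUS = "implicit, unambiguous"
    IMPLICIT_AMBIGUOUS = "implicit, ambiguous"


class UndefinedStereochemistryError(Exception):
    """Local stand-in for the error named in the notes."""


class AmbiguousProtonationError(Exception):
    """Local stand-in for the error named in the notes."""


@dataclass(frozen=True)
class InputSummary:
    """Hypothetical pre-computed facts about a molecule being read."""
    has_3d_coords: bool            # False covers no-coordinate and 2D inputs
    hydrogens: Hydrogens
    table_stereo_defined: bool     # fully defined per the OFF definition
    stereo_3d_agrees: bool = True  # only consulted for 3D input with explicit Hs
    stereo_guessable_from_3d: bool = False


def decide(mol: InputSummary, allow_undefined_stereo: bool = False) -> str:
    # Ambiguous implicit hydrogens fail in both coordinate branches.
    if mol.hydrogens is Hydrogens.IMPLICIT_AMBIGUOUS:
        raise AmbiguousProtonationError("implicit hydrogens are ambiguous")

    # Only 3D input with explicit hydrogens consults the 3D stereochemistry.
    if mol.has_3d_coords and mol.hydrogens is Hydrogens.EXPLICIT:
        if mol.table_stereo_defined:
            if mol.stereo_3d_agrees:
                return "load"
            if allow_undefined_stereo:
                return "load, stereo from connection table"
            raise UndefinedStereochemistryError("3D and connection table disagree")
        if mol.stereo_guessable_from_3d:
            return "load, stereo from 3D coordinates"
        # Assumption: the notes do not cover stereo that cannot be guessed.
        raise NotImplementedError("case not covered by the notes")

    # Every remaining branch relies on the connection table alone.
    if not mol.table_stereo_defined and not allow_undefined_stereo:
        raise UndefinedStereochemistryError("connection table stereo incomplete")
    if mol.hydrogens is Hydrogens.IMPLICIT_UNAMBIGUOUS:
        # Assumption: the (0,0,0) placement also applies when undefined
        # stereo is allowed; the notes state it only for defined stereo.
        return "load, implicit hydrogens at (0,0,0)"
    return "load"
```

For example, decide(InputSummary(True, Hydrogens.EXPLICIT, True, stereo_3d_agrees=False)) raises UndefinedStereochemistryError, while the same call with allow_undefined_stereo=True loads the molecule and keeps the connection table stereochemistry.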

Action items

  •  

Decisions