2021-11-10 Industry benchmarks meeting notes

Participants

  • @Lorenzo D'Amore

  • @Jeffrey Wagner

  • @David Dotson

  • @David Hahn

Goals

  • Manuscript needs

  • Sage results status

  • Season 1 feedback survey announcement

  • Updates from team

Discussion topics

Item

Notes

Manuscript needs

  • LD – good news, Thomas Fox had an issue with the toolkit returning multiple molecules when reading an SDF, likely because one of his files has multiple conformers in a single SDF

    • JW – what was he doing that there was a file containing multiple molecules?

    • LD – good question; not clear, though he may have concatenated some SDFs himself and had them in the directory

  • Once we have his results, we’ll have Sage results from 10/10 partners!

  • LD – OPLS data for public data is complete!

    • one open issue: the RMSD comparison between the OPLS custom results and the b3lyp-d3bj/dzvp cases is not succeeding

  • JW – GNT-00245 has a SMILES mismatch because one representation is shown as an aromatic SMILES, while the other is kekulized. (Also for some reason the aromatic version has a negative charge on an N, instead of the neighboring O)

  • Other issues are with sulfurs, which show up as [S2+]([O-])([O-]) instead of [S](=O)(=O)

  • JW – think we have 3 categories of things here

    • warnings from differing SMILES

      • think these are okay; what would really scare me is the molecules having different total charge. But these are just different kekulizations.

    • cases with no rotatable bonds from WCS molecules

    • the hard error in match minima

  • JW – think there is something like a 40% chance that when we’re reading we’re using a different aromaticity model than when we’re writing

    • aromaticity perception is the only thing that’s allowed to move bond orders and formal charges around

    • the toolkit uses an older aromaticity model; if we load a molecule without specifying that model, e.g. by just using RDKit directly, that will move things around
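
As a rough illustration of the total-charge check JW describes (differing kekulizations are acceptable, differing total charge is not), the sketch below sums bracket-atom formal charges directly from a SMILES string. The function name `net_formal_charge` and the dimethyl sulfone SMILES are hypothetical, chosen only for illustration, and the charge-separated form is written with the standard `[S+2]` spelling. This is a crude text-level check under those assumptions, not a substitute for parsing with the toolkit.

```python
import re

# Bracket atoms such as [O-], [NH4+], [S+2]; only these can carry a formal charge.
_BRACKET = re.compile(r"\[([^\]]+)\]")
# A run of '+' or '-' signs, optionally followed by a magnitude (e.g. "+2", "--").
_CHARGE = re.compile(r"(\++|-+)(\d*)")


def net_formal_charge(smiles: str) -> int:
    """Sum the formal charges written on bracket atoms in a SMILES string."""
    total = 0
    for body in _BRACKET.findall(smiles):
        match = _CHARGE.search(body)
        if match is None:
            continue  # neutral bracket atom, e.g. [C@@H]
        signs, digits = match.groups()
        magnitude = int(digits) if digits else len(signs)
        total += magnitude if signs[0] == "+" else -magnitude
    return total


# Hypothetical example: two representations of the same sulfone.
charge_separated = "C[S+2]([O-])([O-])C"
double_bonded = "CS(=O)(=O)C"
same_total_charge = (
    net_formal_charge(charge_separated) == net_formal_charge(double_bonded)
)
```

If `same_total_charge` is true, the mismatch is a representation difference of the kind discussed above; if it is false, the pair deserves a closer look.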

Updates

  • LD – fragmenter usage for torsiondrives on problematic torsions

    • have a mapping of indices from parent molecule to fragment

    • as a next step need to put into torsiondrive input

    • question: need to select additional atoms around rotatable bond to define full torsion

      • JW – use find_rotatable_bonds, choose first that features the central bond you care about

      • can also call Molecule.propers, take the first one that has the central bond you are interested in

  • LD – seeing two fragments that are the same coming out of fragmenter, but with different indices

    • unclear why these are being produced

      • reason: the dataframe LD is using has one row per violating rotatable bond, and a single fragment may apply to several such bonds

    • also see a few examples of two fragments with identical indices

    • looking at a case where the indices are not the same: a large molecule with two instances of the same fragment, probably with canonicalized indices on the parent molecule
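
JW's suggestion above (take the first proper torsion that contains the central bond of interest) can be sketched without the toolkit. The helper `first_proper_for_bond` and the butane-like bond list are hypothetical; with the toolkit, the candidate tuples would come from `Molecule.propers` rather than being enumerated by hand as done here.

```python
from collections import defaultdict


def first_proper_for_bond(bonds, central):
    """Return one (i, j, k, l) torsion whose middle atoms are the central bond.

    `bonds` is an iterable of (atom_index, atom_index) pairs and `central`
    is the (j, k) rotatable bond. Returns None if either end of the central
    bond has no other neighbor.
    """
    neighbors = defaultdict(set)
    for a, b in bonds:
        neighbors[a].add(b)
        neighbors[b].add(a)

    j, k = central
    if k not in neighbors[j]:
        raise ValueError(f"atoms {j} and {k} are not bonded")

    # Sorted so the choice of "first" torsion is deterministic.
    for i in sorted(neighbors[j] - {k}):
        for l in sorted(neighbors[k] - {j}):
            if i != l:  # skip three-membered rings, where i and l coincide
                return (i, j, k, l)
    return None


# Hypothetical heavy-atom skeleton of butane: 0-1-2-3
butane_bonds = [(0, 1), (1, 2), (2, 3)]
torsion = first_proper_for_bond(butane_bonds, (1, 2))
```

The four indices returned are what a torsiondrive input needs for the dihedral; they would still have to be translated through the parent-to-fragment index mapping mentioned above.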

Survey

  • DD – Any blockers/objections to sending this out?

    • JW – This looks good to me

  • DD – LD, can you spearhead this? Anything else to announce?

    • LD – We should announce that we have 10/10 results submitted

    • DD – Let’s do them as separate announcements so people don’t gloss over the survey.

Action items

@Lorenzo D'Amore will try to reproduce the conformer-matching bug; schedule a working session with Jeff for Friday morning (PT)
@David Dotson will issue announcement on Season 1 feedback survey
@Lorenzo D'Amore will announce 10/10 Sage results once we’ve received final results from BRI

Decisions