Discussion topics

Item

Presenter

Notes

General updates

  • JW – No meeting next week – Thanksgiving

  • CD – Finals coming up, will be largely unavailable in Dec.

    • We’ll keep our meeting on Dec 9, organize by DM if needed

  • JW – I’ve merged the parameter deduplication PR and made a new release of the AMBER FF repo. Thanks!

“Next steps” from last meeting

  • Why does the data range end at exactly -1.0? Is something getting filtered out by the workflow? Maybe try putting cyanate in (C#N-)

  • CD will make a minimal reproducing example

    • CD will send SMILES of good and bad mols to OM

  • CD will try running just AM1 optimizations to see which deviations persist (if the parallel offset lines disappear, it’s a BCC issue)

  • CD will turn off symmetrization to see if the vertical lines disappear.

  • JW will look into whether substituted indoles (indoles with extra Ns) are known cases of disagreement between aromaticity models/chemical representations.
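For reference, a minimal sketch of what symmetrization is assumed to mean in the items above: averaging the charges of topologically equivalent atoms. The function name, the symmetry-class labels, and the charge values are hypothetical, and this is not the OE or AmberTools implementation.

```python
from collections import defaultdict


def symmetrize_charges(charges, symmetry_classes):
    """Average charges over atoms that share a symmetry class.

    charges: per-atom partial charges.
    symmetry_classes: per-atom labels; atoms with the same label are
    treated as equivalent (hypothetical input, e.g. from a graph
    canonicalization step).
    """
    if len(charges) != len(symmetry_classes):
        raise ValueError("charges and symmetry_classes must have equal length")

    # Group atom indices by symmetry class.
    groups = defaultdict(list)
    for index, label in enumerate(symmetry_classes):
        groups[label].append(index)

    # Replace each atom's charge by the mean of its class.
    result = list(charges)
    for indices in groups.values():
        mean = sum(charges[i] for i in indices) / len(indices)
        for i in indices:
            result[i] = mean
    return result


# Hypothetical three-atom example: the last two atoms are equivalent.
raw = [-0.40, 0.18, 0.22]
classes = [0, 1, 1]
symmetrized = symmetrize_charges(raw, classes)
```

Averaging within a class leaves the total charge unchanged, so any OE vs AT difference that appears only with symmetry on would come from how the equivalent atoms are grouped or averaged.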

Project updates

OE vs AT AM1-mulliken, never using optimization, with symmetry on and off

[Image]

Only taking BCC contributions from each toolkit

[Image]

  • CD: looks like BCC corrections are a problem. They vary a lot between OE and AT by molecule.

    • JW – It’s weird that this looks like a continuum along the y=x line. I’d expect a really clumpy distribution that looks like either a grid or a set of vertical lines.

      • Simon reverse-engineered BCCs here:

        https://github.com/openforcefield/openff-recharge/blob/master/openff/recharge/data/bcc/original-am1-bcc.json

    • LW – There are 141 unique BCC values (charge values) for SMIRKS involving N in some way (either including or excluding), so it may look smoother than JW thinks. There are 251 unique BCC values in total.

  • We get perfect agreement between OE and AT with no-symmetry and no-optimization and no BCC (i.e. AM1). A small error (difference) comes in when we tell OE to perform symmetrization. A large error comes in when we introduce BCCs.

    • JW hypothesis: AMBER is not loading the molecules properly. It’s not reading the symmetry properly and not applying BCCs correctly. Or OE is changing the molecular graph after we give it to QUACPAC.

    • LW: I wonder if RDKit is assigning bond orders correctly.

      • “While it is possible for RDKit to erroneously label these properties for some molecules (particularly for nuanced concepts such as aromaticity, as shown by the rightmost “aromatic” molecule as classified by RDKit, which is in fact not aromatic) […] As an example, aromaticity is a concept that RDKit acknowledges is difficult to capture algorithmically, and thus may be misclassified for some molecules (molecules which have their aromaticity broken by steric strain is an intuitive example).” (doi.org/10.1021/acs.jcim.1c00519)

  • JW – How does this tie into the overall goals of this project?

    • Give Connor research experience

      • If this is a technical bug, then it’s not a great research project.

      • JW and/or LW should probably write the bug down somewhere searchable.

    • Get the toolkit to get more consistent charges between backends

      • Looks like we could use ChargeIncrementHandler BCCs with the AM1 keyword (and no symmetrization if possible) to get more nearly identical outputs across backends

    • Determine whether ELF1 is better than random conformer selection.

      • We should be able to do this using the toolkit change above.
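A minimal sketch of the "only taking BCC contributions" comparison discussed above, assuming each backend's BCC contribution is recovered as the per-atom AM1-BCC charge minus the AM1-Mulliken charge. The function name and all charge values are hypothetical; real inputs would come from the OE and AT runs.

```python
def bcc_contributions(am1bcc_charges, am1_charges):
    """Per-atom BCC contribution: AM1-BCC charge minus AM1-Mulliken charge."""
    if len(am1bcc_charges) != len(am1_charges):
        raise ValueError("charge lists must have equal length")
    return [full - base for full, base in zip(am1bcc_charges, am1_charges)]


# Hypothetical per-atom charges for one molecule from the two backends.
# The AM1 charges agree exactly, mirroring the no-symmetry, no-optimization case.
oe_am1 = [-0.30, 0.10, 0.20]
at_am1 = [-0.30, 0.10, 0.20]
oe_am1bcc = [-0.45, 0.15, 0.30]
at_am1bcc = [-0.40, 0.12, 0.28]

oe_bcc = bcc_contributions(oe_am1bcc, oe_am1)
at_bcc = bcc_contributions(at_am1bcc, at_am1)

# With identical AM1 charges, the whole OE vs AT difference sits in the BCC term.
bcc_difference = [o - a for o, a in zip(oe_bcc, at_bcc)]
```

BCCs only move charge along bonds, so each backend's contributions should sum to zero per molecule; a nonzero sum would point to a bookkeeping problem rather than a chemistry one.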

Using no optimization, no symm, random confs

[Image]

Using no optimization, no symm, ELF1 confs

  • Difficult to draw conclusions yet because this graph has far fewer molecules than the one above

[Image]

JW – This looks great. The RMSE has dropped a ton.

CD – The second plot only has part of the dataset complete. I’ll ping you when it’s complete.
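For reference, a minimal sketch of the RMSE being compared, assuming it is taken over per-atom charge differences between the two backends. The function name and the charge values are hypothetical, and the plots may pool atoms across molecules differently.

```python
import math


def charge_rmse(charges_a, charges_b):
    """Root-mean-square error between two equal-length lists of charges."""
    if len(charges_a) != len(charges_b):
        raise ValueError("charge lists must have equal length")
    if not charges_a:
        raise ValueError("charge lists must not be empty")
    squared = [(a - b) ** 2 for a, b in zip(charges_a, charges_b)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical per-atom charges from the two backends.
oe_charges = [-0.45, 0.15, 0.30]
at_charges = [-0.40, 0.12, 0.28]
rmse = charge_rmse(oe_charges, at_charges)
```

Because the ELF1 plot currently covers only part of the dataset, the two RMSE values are only directly comparable when computed over the same set of molecules.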

Action items

Decisions