2021-06-30 Meeting notes

Date

Jun 30, 2021

Participants

  • @Jeffrey Wagner

  • @Connor Davel

  • @Owen Madin

Discussion topics

Item

Notes

Reproducing Simon’s failure cases

  • CD – I wasn’t able to reproduce this problem, though I haven’t checked for isomorphism (just animated trajectories).

  • JW – RDKit’s conformer generation method changed between the 2020 and 2021 versions.

    • CD – I wasn’t able to reproduce the problem from the SMILES that SB sent using the 2021 RDKit, so I’ll try using the 2020 RDKit.

    • CD – I’m also not sure which version of the script SB used – I’m finding better results with the RDKit-based backend than the OE one that I sent.

RDKit based connectivity checks

  • CD – I’ve found a way to have AmberTools write out an SDF from sqm (in addition to writing out every step of the optimization trajectory).

    • CD – After SQM is run, I run antechamber to go from the sqm.out file to a .ac file. The .ac file contains a guess at the bonds.

    • JW – Three questions:

      • Does this recognize actual connectivity rearrangements?

        • CD – Yes, I’ve seen cases of connectivity rearrangements that this workflow caught.

      • Does this work for non-protein molecules? (AmberTools might be assuming things because these are proteins)

      • Does this disagree frequently with RDKit’s guesses, and what’s happening in those cases?

      • JW – This is really promising. AmberTools has had sneaky problems for me in the past, so I’d like to run both kinds of connectivity checks (RDKit and AmberTools) to watch for disagreements.

      • CD – This method of rearrangement checking would be nice because it only requires AmberTools

      • JW – The tool being used on the backend here may be bondtype.

      • (General) – This isn’t using any information about the electrons, it’s just using elements+coordinates+total charge to guess the molecule graph

  • CD – I found cases where I could set sqm’s maxcyc to 0 (and include all the other keywords), and have OE read the resulting structure, and perceive a connectivity change, but RDKit reading the same file DOESN’T have that connectivity change.

    • CD – I noticed this first at the end of last week’s meeting, where RDKit and OpenEye were finding different connectivity for the same starting structure. I could go further into the cause of this problem if we want.
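The idea of guessing a molecule graph from elements and coordinates, then checking whether that graph changed, can be sketched as follows. This is a minimal illustration only and is not how antechamber, bondtype, RDKit, or OpenEye perceive bonds. The function names, the covalent radii table, and the 0.4 Å tolerance are all assumptions made for the example. The comparison also assumes atom ordering is preserved between the two structures, so it compares bonds by atom index rather than testing graph isomorphism, and it ignores total charge because it does not assign bond orders.

```python
from itertools import combinations
from math import dist

# Approximate covalent radii in angstroms (illustrative values, not a toolkit's table).
COVALENT_RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66, "S": 1.05}
TOLERANCE = 0.4  # hypothetical slack added to the summed radii, in angstroms


def guess_bonds(elements, coords, tolerance=TOLERANCE):
    """Return a set of frozenset({i, j}) for atom pairs close enough to be bonded."""
    bonds = set()
    for i, j in combinations(range(len(elements)), 2):
        cutoff = COVALENT_RADII[elements[i]] + COVALENT_RADII[elements[j]] + tolerance
        if dist(coords[i], coords[j]) <= cutoff:
            bonds.add(frozenset((i, j)))
    return bonds


def connectivity_changes(bonds_before, bonds_after):
    """Return (broken, formed) bonds as sorted lists of (i, j) index tuples."""
    broken = sorted(tuple(sorted(b)) for b in bonds_before - bonds_after)
    formed = sorted(tuple(sorted(b)) for b in bonds_after - bonds_before)
    return broken, formed


# Water before "optimization", and a made-up final geometry with one H pulled away.
elements = ["O", "H", "H"]
start = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]
final = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (3.0, 0.0, 0.0)]

broken, formed = connectivity_changes(
    guess_bonds(elements, start), guess_bonds(elements, final)
)
print("broken:", broken, "formed:", formed)
```

Running the same comparison on the bond lists reported by two different toolkits for one file would flag the kind of RDKit/OpenEye disagreement described above, provided both toolkits keep the same atom order.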

Initial partial charge plot review

  • CD – Passing only -ek maxcyc=0 as an argument to antechamber overwrites all the other sqm keywords. This showed up as the really odd partial charges presented in the previous meeting: either no AM1 calculation was actually happening, or some really inaccurate numbers were coming out. The previous plots are invalid.

    • We’re fairly confident that setting maxcyc=0 when we DO specify all the other keywords will produce valid results. SB also does this in his prototype sqm-geomeTRIC interface.

    • CD found very close agreement between AmberTools with maxcyc=0 and the final OE results for all conformers.

    • CD will regenerate the plots from two weeks ago using the fix for maxcyc=0 inputs.

 

  • Are formal charges for input molecules being assigned correctly?

    • CD – I’m loading these from the mol2 files in the amber_ff_porting repo

    • JW showed that the carboxylates are being misinterpreted on file read, and how to use the carboxylate fixer utility.
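For the maxcyc=0 issue discussed earlier in this section, one way to avoid dropping the other sqm keywords is to always build the full -ek string in one place and append maxcyc to it. The sketch below only assembles the argument list and does not run anything. The keyword values in SQM_KEYWORDS, the helper names, the mol2 file formats, and the "bcc" charge method are assumptions for illustration and should be replaced with whatever the real workflow uses.

```python
# Assumed sqm keyword values; replace them with the settings the workflow actually uses.
SQM_KEYWORDS = {
    "qm_theory": "'AM1'",
    "grms_tol": "0.0005",
    "scfconv": "1.d-10",
    "ndiis_attempts": "700",
}


def build_ek_string(keywords, maxcyc=0):
    """Join every sqm keyword, plus maxcyc, into one string for antechamber's -ek flag."""
    merged = dict(keywords, maxcyc=str(maxcyc))
    return ", ".join(f"{name}={value}" for name, value in merged.items())


def build_antechamber_command(infile, outfile, net_charge,
                              charge_method="bcc", keywords=SQM_KEYWORDS):
    """Return the argument list for an antechamber call that skips sqm optimization."""
    return [
        "antechamber",
        "-i", infile, "-fi", "mol2",
        "-o", outfile, "-fo", "mol2",
        "-c", charge_method,
        "-nc", str(net_charge),
        "-ek", build_ek_string(keywords),
    ]


# Hypothetical file names, shown only to illustrate the resulting command.
command = build_antechamber_command("input.mol2", "charged.mol2", net_charge=-1)
print(" ".join(command))
```

Keeping the keywords in a single dictionary means maxcyc=0 is added to the existing settings rather than replacing them, which is the failure mode that invalidated the earlier plots.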

Action items

Decisions