2021-08-03 SB/OM/CD/TG/JW Reference values check in

2021-08-03 SB/OM/CD/TG/JW Reference values check in

Participants

  • @Simon Boothroyd

  • @Owen Madin

  • @Connor Davel

  • @Trevor Gokey

  • @Jeffrey Wagner

Discussion topics

Item

Notes

Item

Notes

Overview of current progress

  • CD

    • Major goal is to fix/prevent molecule graph changes during AM1 optimizations. The common case of this is a proton migrating to a different part of the molecule. Plan was to look into restraint schemes that will prevent these in the first place. Goal was to find a compromise between speed and accuracy.

    • Two simple restraint schemes have been tried so far:

      • “maxcyc=0”: Not letting positions update at all in AM1 calculations (inner loop converges/makes electron clouds, but outer loop never iterates)

      • “smirnoff”: First parameterize molecule (with charges from maxcyc=0), then minimize parameterized molecule in MM, then run maxcyc=0 on the final geometry

    • Also trying out using GeomeTRIC, eg restraining bonds/angles.

    • Another big issue right now is finding “reference” values / what the partial charges SHOULD be. We had initially used OpenEye, but there’s evidence that some cases still experience proton migration/graph change.

      • Yesterday we met with TG to see whether there’s a good reference dataset that already exists for this, but we started going in circles over:

        • Should AM1 calculations try to match QM-RESP charges? What if the more detailed QM optimization also has a proton migration?

        • If we’re given a single conformer of a molecule, should we aim to reproduce the charges for that conformer, or would it be better to reproduce the charges for an ensemble of conformers?

        •  

  • (CD presents notebook)

  • CD – If anyone knows how to get the final geometries from OE, please let me know

    • SB – OpenEye has stated that they won’t expose this

  • CD – Yesterday I tried looking at the bond orders of the original bonds in the “bad” AM1 operations, and none of them seem to have dropped

    • SB – Good idea to look at that. I doubt OE is actually having proton migrations. They do torsion restraints, and some light bond/angle restraints. It could be that we’re seeing strong electrostatic interactions forming. But I’d dig into the extreme outliers on this plot and see if the resulting charges/bond orders are changing more than in other structures.

  • SB – Regarding the “reference valaues”, I agree that OE is a good starting point, but in some ways we should feel free to think that stuff is going wrong there and report outliers to them.

    • You could “validate” antechamber vs. OpenEye by running the equivalent of maxcyc=0 in OE. I’ll send you a code snippet for this. (OEAM1(optimize=False, symmetrize=False))

    • (SB from slack chat)

      oequacpac.OEAssignCharges( oe_molecule, oequacpac.OEAM1Charges( optimize=settings.optimize, symmetrize=settings.symmetrize ), )
  • SB – In terms of “good reference charges”:

    • Anything that AMBER spits out when you’re not restraining the conformer and you’re not getting proton transfers IS the gold standard. Actually, I’d like OpenEye to adopt this as the reference.

    • When there IS a proton transfer when we don’t use restraints, I’m interested to see a comparison of different methods.

      • In principle we could do hydration free energy/solvation free energy calcs from each set of charges, and use that to guide us.

      • Alternatively, we could bring in CBayly to review the results of using different schemes.

  • (CD shows histograms of partial charge differences)

    • SB – This is really great. In the “no geometry rearrangement” plots, I wonder if the “original” results should be the gold standard.

  • OM – Should we plan to implement “If there’s no connectivity change, then

    • SB – It may be good to analyze “number of hbonds formed”, and consider that as a criterion for applying restraints. Could do this using MDTraj/baker-hubbard rules.

    • CD – How much would we expect charges-from-a-hbonded-AM1-optimized-structure to affect the final results?

    • SB – DMobley studied this at one point, with the positioning of protons on carboxylic acids. They found that it has up to 1kcal/mol difference in hydration free energies. OE has built-in logic to sample both protonation states and select the least electrostatically interacting one.

  • JW – One question is “do we want to make a scheme that makes charges that reproduce HFEs, or do we want to make the “correct' partial charges for a given input conformer?

    • SB – I’d rephrase it as “what do we do when we find that a connectivity change has occurred?”

      • “How do we find connectivity changes?”

      • “How do we fix them?”

      • “How do we know that the fixed charges are still good?”

        • Could compare to OpenEye

          • SB – It may not make sense to compare with high-level methods - We’re looking for optimal AM1 charges, not DFT derived charges

        • Could compare to DFT-RESP

          • (General) – We probably don’t want this

        • Do they accurately predict HFEs?

          • SB – Not compared to absolute HFE accuracy, but rather “HFE accuracy compared to using charges from OpenEye”

    • TG – Should we look at “how many AM1 conformers are needed to reproduce DFT-RESP charges?”

    • CD – I look at it as “which charges most accurately reflect the geometries/connectivities that the simulation will experience?”. Because sometimes every single conformer undergoes the same change, and then you’re in trouble

      • SB – Agree. For this reason I don’t think we should pursue a strategy where we discard conformers.

    • SB – I think that OE’s reasoning was largely to use the input conformer, let it relax a bit to remove interactions that would introduce systematic error, and then accept the final charges. I largely agree with this philosophy.

  • SB – We should be looking into the sweet spot between “complete connectivity change” and “no movement at all”.

  • CD – So, then the goal isn’t necessarily to reproduce the averaged charge assigned to many conformers, but rather to do the best job we can on one conformer.

    • SB – I think that’s a good thing to focus on. For now I’d avoid the question of throwing conformers out. So we’d assume that other parts of the pipeline have carefully decided which conformers to feed in - Things get complex if we let this step double-guess the logic of another part of the pipeline.

    • CD – Agreed, and if someone is manually feeding in a conformer, we should assume they know what they want. So we could integrate this logic, and warn the user if a rearrangement happened.

    • SB – Agree – I’d like that, and it should print a warning saying that it’s automatically fixed the problem, and somehow store the detailed information if possible.

  • TG – Is the idea to switch our internal implementation to generate multiple conformers?

    • JW – Right now our workflows only make one conformer, though this could change if we adopt ELF in the future.

    • SB – There are some interesting quirks of OE’s ELF, where it seems to do the wrong thing sometimes if given lots of conformers.

  • SB – We can look deeper into methods for detecting connectivity changes - Basically, trying different automated methods, and reviewing disagreements by eye.

    • CD – I looked at a few ways to detect connectivity changes - One using QCElemental (like is used in QCSubmit) and another using RDKit. They agree 99% of the time.

    • SB – Those are both great. It may be good to see if WBOs (Wiberg Bond Orders) can be more informative that distance changes. Antechamber can also produce these.

    • CD – We’d looked at this earlier and decided to go with geometric analysis initially. Some concerns about charge transfers/tautomerization.

    • SB – Toolkit can pull out WBOs - I’d recommend not comparing exact values, but rather detecting the existence of bonds by whether they’re over a threshold. If molecule tautomerizes during optimizations then you’ve got bigger problems.

    • CD – If we have something like electron transfer over a long-ish distance, should we try to detect/stop that?

    • JW – If an electron moves, I think it’s analagous to different kekule structures representing the same molecule. That transition happens so much more quickly than 3D changes that we should WANT our partial charges to represent the result of that electron movement. But largely I agree with SB, that we don’t want to do a detailed analysis of WBO values, but rather just see if they cross under a threshold that shows a bond has broken.

  • JW – It’d be good to work this toward a end-of-summer summary presentation or poster

    • CD – I have a Shirts group presentation this Friday. I plan to continue pursuing this question past Aug 20, but that could be a good starting point for a project summary.





Action items

Decisions