
Meeting recording - https://drive.google.com/file/d/1u6xsZRB3tBzitZFmTvcSfTQJMWB7wqNw/view?usp=sharing

Attached files: ProblemCompoundsSDF.tar, OFFChallengeCompounds-Version7.pptx

...

Item

Presenter

Notes

Representative poor relative conformer energy profiles in Genentech benchmarking with Sage 2.0.0

Bill Swope

Post slides here and recording here

  • (slide 6) MG – One thing I’ve noticed is that the approach to the minimization strongly affects the final structure. Did you consider using restraints?

    • BS – That’s kinda an intended outcome - If the force field doesn’t recognize the QM minimum that’s a bad thing.

  • BS – Proton transfer cases made it through the benchmark workflow.

    • JW – That’s unfortunate - something must be up with the connectivity filter.

  • (slide 17) AG – Seems like this could be the effect of angle terms being a little too stiff.

    • DM – CBayly fretted a lot about angle values in rings, to ensure we get the right stiffness.

    • AG – Are angle terms different for atoms in rings?

      • DM – I don’t recall. But I do remember that we wondered about different parameters for different sized rings.

      • AG – I’d hope that the FF could figure that out. If you assume a ring, the ring is enforced by connectivity + bond length.

      • DM –

      • AG – I’m thinking that the out of ring angles may be too stiff

      • LW – For the molecule shown on this slide, it wouldn’t get any ring-specific angle parameters.

      • DM – So maybe for flexible terms we should differentiate that.

      • PB – I think we have specific angle parameters for 3- and 5-membered rings.

  • (mol 00971)

    • Possible sources of error are:

      • large dipole on molecule

      • stiffness of angles in 5-membered ring

  • (mol 00553)

    • AG – Aromatic sulfur-amide question comes up a lot in other FFs as well. This is clearly challenging.

  • 00946

    • MG – I wonder if we just have a shortage of sulfur training data

  • 00066

    • PB – We saw one of these where CCSC was almost linear on one side, and then out of plane on the other

  • 00660

    • Maybe something related to CCSC torsion where some Cs are aromatic?

  • DM – I was expecting to see more sulfonamides

    • BS – These were to act as a showcase - We have lots of examples of sulfonamides as well. We also had targets that were all-orphan which might arguably be worse.

    • MG – It’d be good to do an analysis of which parameters appear in these bad structures.

  • DM – I can think of two broad approaches to this - 1) parameter splitting/chemical perception based refinement, and 2) Training/testing dataset generation. Those will feed in well to our FF creation/fitting pipeline.

  • BS – You’re pointing out that there are point problems that could be fixed - I’m pointing out that these metrics should be included in training.

  • MG – I know that we’d been talking about this quite a lot - basically, in things like torsion drives, how do we measure energy/structure deviation?

  • BS – I wonder how these metrics are factoring into training.

  • PB – When we use the torsion profile target, we try to match to the minimized profile. JH pointed out that we could match….

  • DM – What about opt geo?

  • PB – We can improve that if we include the internal coordinate differences, it kinda improves the

  • DM – I don’t think we fit to ddEs at all.

  • PB – Right, we fit to relative energies from torsion profiles, not optimized geometries. For optgeo we train on geometry but not energies.

  • AG – Is that an opportunity to add, or would that make it more complicated?

    • PB – We could add that.

    • AG – Would also need to make a database of test cases.

    • DM – We have lots of data already, but this may help us construct a good, well-defined test set.

    • AG – I’m not sure if a “well defined” set is possible - This is where the FF has to consider “diversity” - And you could use the entire benchmarking set for that.

    • PB – I think we could test the idea to fit to rel energies among optgeo.

    • BS – So one thing would be to fit to rel energies in matched conformers. And to

    • PB – Right.

    • AG – I think this can bring in a few things that are not represented in torsion profiles, like through-space interactions and ring conformations.

  • DM – Could we get the list of orphans as well? I’d want to check for parameter enrichment for those.

  • LW – I’d made a method that tries to figure out which energy errors can be attributed to which parameters, so I’d be interested to try running it on the chemistries that make lots of orphans.

  • DM – We generally try to avoid structures with a lot of steric congestion since those make it hard to figure out the sources of error.

    • PB – I’ll send DCole my tool for finding sterically congested structures.

  • AG – Could we do a sensitivity analysis to see which parameters are related to high ddEs? For example, to find situations where different instances of a parameter’s application pull its value in different directions.

  • PB + LW – We can work with BSwope to support this analysis.
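
The parameter-enrichment idea raised above (which parameters show up disproportionately in the badly reproduced structures) could be prototyped along the following lines. This is a minimal sketch, not the benchmark workflow itself. It assumes ddE means the MM relative conformer energy minus the QM relative conformer energy, with both referenced to the QM-minimum conformer, and it uses an arbitrary 2.0 kcal/mol cutoff. All molecule names, energies, and parameter IDs (`a1`, `t5`, `t9`) are made-up placeholders.

```python
from collections import Counter


def dde(qm_energies, mm_energies):
    """Per-conformer ddE: MM relative energy minus QM relative energy.

    Assumption: both profiles are referenced to the conformer with the
    lowest QM energy, and conformers are matched by list position.
    """
    ref = min(range(len(qm_energies)), key=qm_energies.__getitem__)
    return [
        (mm - mm_energies[ref]) - (qm - qm_energies[ref])
        for qm, mm in zip(qm_energies, mm_energies)
    ]


def worst_dde(molecule):
    """Largest absolute ddE over the conformers of one molecule."""
    return max(abs(x) for x in dde(molecule["qm"], molecule["mm"]))


def parameter_enrichment(molecules, threshold=2.0):
    """Flag molecules with |ddE| above threshold and compute, per parameter,
    (fraction of flagged molecules using it) / (fraction of all molecules using it).
    """
    flagged = [m for m in molecules if worst_dde(m) > threshold]
    all_counts = Counter(p for m in molecules for p in set(m["params"]))
    bad_counts = Counter(p for m in flagged for p in set(m["params"]))
    enrichment = {}
    for param, n_all in all_counts.items():
        frac_all = n_all / len(molecules)
        frac_bad = bad_counts[param] / len(flagged) if flagged else 0.0
        enrichment[param] = frac_bad / frac_all
    return flagged, enrichment


# Hypothetical data: energies in kcal/mol, parameter IDs are placeholders.
molecules = [
    {"name": "mol_a", "qm": [0.0, 1.0, 2.5], "mm": [0.0, 1.2, 2.0], "params": ["a1", "t5"]},
    {"name": "mol_b", "qm": [0.0, 3.0], "mm": [0.5, 0.0], "params": ["a1", "t9"]},
    {"name": "mol_c", "qm": [1.0, 0.0], "mm": [4.5, 0.0], "params": ["t9"]},
    {"name": "mol_d", "qm": [0.0, 0.5], "mm": [0.0, 0.7], "params": ["t5"]},
]

flagged, enrichment = parameter_enrichment(molecules)
for param, ratio in sorted(enrichment.items(), key=lambda kv: -kv[1]):
    print(f"{param}: enrichment {ratio:.2f}")
```

A ratio above 1 means the parameter is over-represented among the flagged molecules. In practice the per-molecule parameter lists would presumably come from labeling each molecule with the force field (for example with the OpenFF Toolkit's `ForceField.label_molecules`), and the same function could be run on the orphan list instead of a ddE cutoff.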

Update on current status of charge projects

Lily Wang

  • LW – Been working on a few different charge projects: GCN charge models, vsites, and refitting BCCs. Began with a refactor of Simon’s former code, which had evolved over time to the point that older scripts no longer worked. There weren’t many docs either, so refactoring was a good way to learn the code and methods. That’s basically done and I’ll make it public at some point, which is necessary so that other researchers can try it out. I’ve recently been able to get the entire workflow running without critical bugs or crashes. I need to iterate a bit, since we’ve found that throwing a lot of data at the problem doesn’t generally end well. I think I’ve figured out good hyperparameters for fitting, including handling resonance structures. Hard to give a timeline, but I’m hoping to see better results by November. I expect to benchmark “better” using solvation free energies. While I’ve mostly focused on GCNs, I’ve also worked a bit on vsites, but there’s a fitting data problem there too.

    • MG – Training against QM ESPs?

    • LW – Vsites yes. For GCNs we’re training directly to AM1 charges as discussed before.

    • AG – Have you looked at comparing to QM dipole moments?

      • LW – Yes, but we don’t have that implemented yet.

    • MG – You think the model will be able to figure out the resonance forms eventually?

      • LW – Yes. Right now we’re enumerating all resonance forms which is somewhat expensive.

      • MG – What method for generating resonance structures?

      • LW – I think we’re using the solution from vcharge. This is what SB had implemented in NAGL and that’s what I’m continuing from.

      • MG – There are certain molecules where resonance gets really complex, but I think that’s rare.

      • LW – It only adds a few hours per iteration, which isn’t that much in the grand scheme of things.

    • JW – I’d be cautious about splitting effort between GCNs and vsites before either is in production - Would be good to go fully toward one before the other.

    • LW – I’m interested to get GCNs implemented in the OFF toolkit.

      • JW – Happy to work with you on this, should be straightforward.

    • MS – Happy to provide assistance with testing, just let me know.

      • LW – Thanks.
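
The dipole comparison AG asked about is not implemented yet according to the notes. As a rough illustration of what such a check involves, the sketch below computes the dipole of a set of atom-centred partial charges and its distance from a reference dipole. It assumes charges in units of e, coordinates in Å, and a neutral molecule (so the result does not depend on the origin). The two-site "molecule" and the QM reference value are invented for the example.

```python
import math

# 1 e*Angstrom expressed in debye.
E_ANGSTROM_TO_DEBYE = 4.80320


def dipole_from_charges(charges, coords):
    """Dipole vector in debye for point charges (e) at coords (Angstrom)."""
    return tuple(
        E_ANGSTROM_TO_DEBYE * sum(q * xyz[axis] for q, xyz in zip(charges, coords))
        for axis in range(3)
    )


def dipole_error(model_dipole, reference_dipole):
    """Length of the difference vector between two dipoles, in debye."""
    return math.dist(model_dipole, reference_dipole)


# Hypothetical two-site example: +0.5 e at the origin, -0.5 e at x = 1 Angstrom.
charges = [0.5, -0.5]
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model = dipole_from_charges(charges, coords)

# Placeholder standing in for a QM dipole; not a real calculation.
qm_reference = (-2.0, 0.0, 0.0)
print(model, dipole_error(model, qm_reference))
```

Virtual sites would enter the same sum as extra charge positions, so one function of this kind could in principle score both the GCN charges and the vsite models.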

Water models, non-mainline FFs, and calendar versioning in openff-forcefields?

Matt Thompson (backup Jeff Wagner)

continuing from 2022-09-14 Meeting notes

  • Seems to be strong consensus for switching the repo “version” to “calendar versioning”, where releases are of the form 2022.09.14 instead of 2.0.0. The file names would remain unchanged (Sage = openff-2.0.0, Rosemary = openff-3.0.0, etc), but the git tags, python module, and conda packages would reflect the “calendar version” instead of the “force field version”.

  • How to version “ports” of external FFs/FFs whose reference implementations are defined elsewhere?

  • Would we want to add different ion models as well? Would we bundle them with compatible water parameters? How would we define compatible?
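
To make the proposed split concrete, here is a minimal sketch of how a calendar version for the package could sit alongside unchanged force field file names. The helper names are hypothetical and nothing here reflects an agreed implementation; the file names are the ones quoted in the notes above.

```python
from datetime import date


def calendar_version(release_date):
    """Format a release date as a calendar version, e.g. 2022.09.14."""
    return release_date.strftime("%Y.%m.%d")


def version_key(version):
    """Turn '2022.09.14' into (2022, 9, 14) so releases sort chronologically."""
    return tuple(int(part) for part in version.split("."))


# Force field file names keep the force field version (from the notes).
FORCE_FIELD_FILES = {"Sage": "openff-2.0.0", "Rosemary": "openff-3.0.0"}

# The package/tag version only says when the bundle was released.
package_version = calendar_version(date(2022, 9, 14))
print(package_version, FORCE_FIELD_FILES["Sage"])
```

One detail worth checking before adopting this scheme: Python packaging tools normalize release segments as integers, so a package tagged `2022.09.14` would likely be reported as `2022.9.14`.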

...