2021-07-22 Force Field Release meeting notes

Date

Jul 19, 2021

Participants

  • @Pavan Behara

  • @Jeffrey Wagner

  • @David Mobley

  • @David Dotson

  • @Lorenzo D'Amore

  • @Simon Boothroyd

  • @Joshua Horton

  • @John Chodera

  • @Matt Thompson

Goals


Discussion topics

Item

Presenter

Notes

Sage rc1 industry benchmark failure rate

@Lorenzo D'Amore @David Dotson

  • DD: Around a 5% error rate in MM optimizations was observed with Sage RC1 on the Janssen internal set; I found 3.2% on the open burn-in set, which is higher than for previous FFs.

  • LD – The initial failure rate on the Janssen set was about 5% for MM molecule optimizations.

  • DD – Digging into it, we found that the failures were due to geomeTRIC failing to converge.

  • JC: Is geomeTRIC a blocker on many of our optimizations because of its tighter convergence criteria?

  • DD: The problem is not actually geomeTRIC-related in this case.

  • LD – Even in DFT, we got these to converge. But it is generally known that linear molecules are troublesome.

  • DD: There is wiggling in the angles of atoms that are supposed to be linear, and the a16 parameter is the root cause.

  • JC: Is the onus on the FF parameters to handle singularities, or should the tools we use, like the MM engine, handle these? Also, does ForceBalance enforce value ranges during optimization?

  • DM – We must have missed this in preparing the input.

  • JC – Is this sufficient? Could engine developers implement a change that would resolve this?

    • JW – I wouldn’t want engine developers to make assumptions about torsion linearity. But maybe, if an angle equilibrium value were set to more than 180 degrees, it could be “reflected”, so that something like 183 becomes 177 (see the sketch after this topic’s notes).

  • DD – I checked, and this fixes all the errored cases.

  • JC – So, the solution seems to be that:

    • we should introduce a mathematical constraint in the ForceBalance fits to keep this parameter from exceeding 180 degrees

    • the geomeTRIC optimizer is also too sensitive

  • JC: The parameters are trained on a dataset and we shouldn’t keep changing them manually; shouldn’t there be something within the optimizer to handle these?

    • JW: I am not quite sure whether we should police these on the energy evaluation end or during FF optimization.

  • JW: I am glad that we had this new feedback loop with the industry benchmark.
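
A minimal sketch of JW’s “reflection” idea applied to the a16 angle parameter is below. It assumes a recent openff-toolkit/openff-units API and that Sage RC1 ships as openff-2.0.0-rc.1.offxml (the file name is an assumption); it is an illustration, not the agreed fix.

```python
# Illustration only, not the agreed fix: inspect the Sage RC1 "a16" angle
# parameter and "reflect" an equilibrium angle above 180 degrees
# (e.g. 183 -> 177), as JW suggested above.
# Assumptions: the RC1 file name and a recent openff-toolkit / openff-units API.
from openff.toolkit.typing.engines.smirnoff import ForceField
from openff.units import unit

ff = ForceField("openff-2.0.0-rc.1.offxml")  # file name assumed
angle_handler = ff.get_parameter_handler("Angles")

# get_parameter() returns a list of parameters matching the given attributes
a16 = angle_handler.get_parameter({"id": "a16"})[0]

theta = a16.angle.m_as(unit.degree)
if theta > 180.0:
    # Reflect about 180 degrees, e.g. 183 becomes 177
    a16.angle = (360.0 - theta) * unit.degree
    print(f"a16 equilibrium angle reflected: {theta:.2f} -> {360.0 - theta:.2f} deg")
else:
    print(f"a16 equilibrium angle is {theta:.2f} deg; no reflection needed")
```

As discussed above, the preferred longer-term fix is to keep the equilibrium angle from exceeding 180 degrees within the ForceBalance fit itself rather than editing released parameters by hand.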

Possible outcomes of free energy benchmarking



  • DM – If things are about the same or better, we should go ahead with the current release candidate. But if they’re significantly worse, then we should NOT release.

    • SB – Agree

  • JC – Should we use Sage RC1 for the COVID Moonshot?

    • SB – Let’s base this decision on how Vyutas’s benchmark looks.

  • DM: You might be more interested in the bespoke workflow

    • JC – Is ANI ready for use on the bespokefit backend, or do we still need to do complete QM torsion scans?

    • SB – We’re trying ANI out; it looks promising.

Openff-interchange

 

  • DM: Chris & Gaetano from OE might be interested in trying out openff-interchange as soon as it’s out, so we can loop them in on testing.

    • MT: Are they looking for proteins? It may take some more time to support this.

    • DM: I think it’s fine; we can get Gaetano involved in testing current features (see the sketch after this topic’s notes).

    • MT – I’m confident about behavior of small molecules, but not biopolymers

    • SB – I think it’s worthwhile to get users on proteins, just ensuring that they know that they won’t get metadata.

    • JW: OE may need some more intermediate data output for analysis; CBy or GC can let us know what they’re looking for.
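
A minimal sketch of the kind of small-molecule testing Gaetano could start with, assuming the documented openff-interchange API (Interchange.from_smirnoff, to_openmm) and the Sage RC1 file name; the ethanol ligand is a placeholder.

```python
# Minimal small-molecule test of openff-interchange, as discussed above.
# Assumptions: the documented Interchange.from_smirnoff / to_openmm API and
# the RC1 force field file name; ethanol is just a placeholder ligand.
from openff.toolkit.topology import Molecule
from openff.toolkit.typing.engines.smirnoff import ForceField
from openff.interchange import Interchange

ligand = Molecule.from_smiles("CCO")  # placeholder small molecule
ligand.generate_conformers(n_conformers=1)

force_field = ForceField("openff-2.0.0-rc.1.offxml")  # file name assumed
interchange = Interchange.from_smirnoff(
    force_field=force_field,
    topology=ligand.to_topology(),
)

# Export to an engine object so results can be compared with existing tooling
openmm_system = interchange.to_openmm()
print(f"OpenMM system created with {openmm_system.getNumParticles()} particles")
```

As MT notes above, this path is only well exercised for small molecules; biopolymer support and richer intermediate output for analysis would come later.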

Action items

Decisions