2024-10-10 Force Field Release Meeting notes

Date

Oct 10, 2024

Participants

@Lily Wang
@Alexandra McIsaac
@Brent Westbrook
@David Mobley
Chris Bayly
@Chapin Cavender
@Michael Shirts
@Pavan Behara
@Anika Friedman

Recording:

https://us06web.zoom.us/rec/share/_kBV3MPSXAU0aOhs4i7aEe2ne14DzLfIqPNjAD5q2bqMmmGDoY-0G54DA5uxMxXr.eInIMmdTcylfG585
Passcode: n#4^LCD?

Discussion topics

Item	Presenter	Notes

NAGL1	LW	Plan to release NAGL 1
Industry benchmark HFEs (slide 3)
CB: This is just a correlation; it’s easy to get a correlation when you span such a large range.
CB: Would be useful to zoom in on the non-zwitterion range; there are a few outliers off by 10-15 kcal/mol.
CB: We should look into the outliers and characterize the functional groups present. Is it behaving poorly for certain groups?
LW: Makes sense, I’ll do that.
Diastereomer HFEs
CB: The OpenEye graph shows how close Diastereomer 1 is to Diastereomer 2, both from OE. Did you compare Diastereomer 1 (OE) vs Diastereomer 1 (NAGL)?
CB: What matters is how well NAGL vs OE gives HFE differences, since the difference isn’t just from the charges.
CB: I was more wondering about comparing the same structure with different charge methods. We want NAGL to give the same HFE for diastereomer 1 as AM1BCC does. E.g. how much of the difference in HFE is due to the difference in charge, vs other things?
LW: Good point, I’ll look into that.
LW: For specific molecules I have already looked into it, specifically those with a large difference between Diastereomer 1 and Diastereomer 2 (slides 10-11).
MS: This is kind of problematic for NAGL. Would expect it to give an average between the two diastereomers, but it’s different from both for some atoms.
DM: Do you have AmberTools charges for these?
LW: No.
CB: Looks like the differences are ~1 kcal/mol; I’d consider that acceptable.
DM: So your plan is to do a NAGL release soon-ish, and this would be a future research topic?
LW: Yes, I want to benchmark the stereoisomer problem, but it’s a whole research topic, so I don’t want it to block NAGL.
CB: I think NAGL looks good, the HFEs look good.
Similarity tool for identifying how close your molecule is to NAGL training data
Approach 1: molecule fingerprints
MS: What’s the use case?
LW: People want to see how similar their molecule is to the training set.
MS: Would think the low end is more important than the high end.
LW: It’s true, but we also don’t want false positives, because people will think their molecule is well described when it isn’t.
CB: As someone who would want this tool, if it comes back with a low similarity, I don’t care what the most similar molecule is, since it’s not similar. If the similarity is high, I might be interested in seeing the most similar molecule, but would generally take the metric as my info rather than the specific most similar molecule. From that perspective, it looked like the Morgan fingerprint was better.
CB: In the low-similarity case, would you agree that the most similar molecule isn’t helpful? Would NAGL be using charges from that kind of molecule?
CB: Do you think this tool is useful?
LW: A bit skeptical of its utility. For example, if we use charge ESP to determine “similarity”, it has no correlation with fingerprint similarity. E.g. the training data may contain a very similar molecule, but only one, so it’s not well defined, whereas many very similar molecules in the training data would make it very well defined.
CB: If you were a customer and wanted to have confidence in your molecule, is there something else you’d suggest?
LW: In an ideal world, would like something that is trained to and correlates with molecule and benchmark results, like HFE or charge RMSE, but I haven’t been able to get good results, and it’s hard to interpret.
CB: Sounds like this might not be the easiest to use.
DM: In a perfect world where we had a cheap way to measure the accuracy of the final charges, I’d hope that it wouldn’t matter how you measure similarity, that the conclusion would be the same. Is that not the case?
DM: Would you draw very different conclusions about how good the charges are depending on which similarity metric you’re using?
LW: Using the two fingerprint methods, no, the conclusion is basically the same.
DM: Is there any way to look at charge goodness? E.g. maybe pick a molecule where NAGL does poorly, and see if you can find a metric that identifies it. Can you break NAGL?
LW: Yes, there is an existing set of very small molecules that NAGL doesn’t handle well, and other areas where it isn’t trained on certain chemistries, but unfortunately they do show up as having high similarity.
Approach 2: Fragment the molecule before comparing to the training set, since molecular size is a challenge
Approach 3: Could make up our own fingerprint. Didn’t try (yet)
CB: Do an HFE comparison between OpenEye and NAGL, choose a fingerprint similarity, and show the error in free energy vs fingerprint similarity.
MS: Agree it’s a prioritization question, e.g. there are no tools to check whether a molecule is treated well by AM1BCC.
LW: Good suggestion, should be fast to code, and I suspect it will show that no easy-to-compute or existing similarity metric correlates with HFE.
LW: I worry that after doing some finagling with HFE or a customized similarity, it still won’t show anything useful. Do we want to invest time in that, or just give them the fingerprint tool?
MS: Would be nice, but many things would be nice… Maybe just tell them to calculate AM1BCC for comparison if the molecules are really different.
DM: Giving users a similarity score that doesn’t really mean anything isn’t useful. Maybe we should just flag molecules containing functional groups we don’t describe well.
CC: Alternative solution: lots of work, but I think it should be possible to get the NN to spit out a confidence. May have to change the architecture, but it could give users a confidence score.
DM: I’d worry that the uncertainty would be really wrong outside the training set.
CC: Thinking of the pLDDT score from AlphaFold.
MS: What’s our prioritization?
LW: With NAGL1 very low priority, but something to think about for NAGL2, since we haven’t started training it.
Release plan for NAGL?
LW: Want to release it soon, because it’s starting to block people’s workflows. Is there anything else you’d like to see before we release it?
MS, DM: No, let’s release it.
CB: Will we also release an FF that’s parameterized for NAGL?
DM: First we just release it, in case people want it. Once we make NAGL the default charge model, we’d re-fit. It will be easier to re-fit if it’s released and integrated in our workflow.
MS: NAGL is so close to OE charges that I’m not sure we’d need to re-parameterize; people already use AmberTools instead of OE, which is more different. It would be very different from library charges, e.g. for proteins.
PB: Would the pre-print be released with the NAGL release?
MS/LW: Let’s write it soon, will have more time soon.
LW: Will move toward releasing NAGL as a full product in the next couple of weeks, together with the infra team. Will prepare info for the ad board regarding the similarity tool.
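A minimal sketch of the check CB suggested for the similarity tool: score each molecule by its highest Tanimoto similarity to the training set, then see whether that score correlates with the absolute HFE error. It assumes fingerprints are already available as sets of on-bit indices (for example from a Morgan fingerprint). The function names and the toy values below are hypothetical placeholders, not NAGL code or benchmark data.

```python
import math
from typing import FrozenSet, List, Sequence

Fingerprint = FrozenSet[int]  # assumed representation: indices of the "on" bits


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto similarity: shared on-bits divided by the union of on-bits."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def max_similarity(query: Fingerprint, training: Sequence[Fingerprint]) -> float:
    """Highest similarity between one query molecule and any training molecule."""
    return max(tanimoto(query, fp) for fp in training)


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Toy placeholder inputs, invented purely to exercise the functions.
training_set: List[Fingerprint] = [frozenset({1, 2, 3, 4}), frozenset({10, 11, 12})]
queries: List[Fingerprint] = [
    frozenset({1, 2, 3, 4}),
    frozenset({1, 2, 5, 6}),
    frozenset({20, 21}),
]
abs_hfe_errors = [0.2, 0.8, 1.5]  # kcal/mol, made-up numbers

similarities = [max_similarity(q, training_set) for q in queries]
r = pearson(similarities, abs_hfe_errors)
print("max similarity per molecule:", similarities)
print("Pearson r (similarity vs |HFE error|):", r)
```

A strongly negative r on real benchmark data would support using the similarity score as a confidence proxy. An r near zero would match LW’s expectation that no easy-to-compute metric correlates with HFE error.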
Cofactors	CB	CB: How does OpenFF plan to handle metal cofactors?
DM: We’re not really set up for this. For example, how would OpenEye handle a haem?
CB: Will come back with an answer to this.
MS: Would be nice to just take the parameters from somewhere so it could just go through.
DM: We dabbled recently with ZAFF and it was painful.
CB: How can I elevate this as a priority? Make a pitch at the ad board? Talk to the governing board?
DM: The question would be, what would you have us stop doing to work on this?
CB: In the spirit of the original OpenFF, I don’t want anything fancy, just something very minimal.
DM: My concern is that my group has done this recently with ZAFF and it was painful and didn’t really work. It seems to me that any “minimal” solution would take ~1 year of 1 science team member’s effort and ~1 year of 1 infra team member’s effort.
CB: I see, that’s more effort than I anticipated. Maybe we can have another meeting where we brainstorm and see if there’s another approach that doesn’t require so much person time.
DM: Maybe you, I, and LW can talk further.
LW: Maybe someone from DM’s group can give a short presentation about the problems with ZAFF?
DM: Not sure it needs to be a formal presentation, but we can go into more detail offline.
CB: The problem with zinc is that the “billiard ball” approach leads to zinc unbinding from the protein.
