2021-08-18 Industry benchmarks meeting notes

Participants

Goals

Benchmarking Workshop on 9/1
Sage + OPLS coordination
Public dataset status and needs
Updates from team

Discussion topics

Item	Presenter	Notes
Sage + OPLS coordination		LD: announcement sent yesterday on S1; introduced Sage and OPLS as remaining items. Thinking I’ll send a follow-up pushing for Sage + OPLS optimizations across partners.
		DD: sounds good! Would be good to prod folks periodically for these remaining items.
		LD: Sage should be more mandatory, OPLS more discretionary, but aiming to get as much as we can for the manuscript. Sage results for all partners are ideal from Gary’s perspective.
		JW: Sage is just released, but only parameter names have changed from RC2; the physics is the same, so think we can just proceed with RC2 results as “Sage” for the manuscript.
Public dataset status		DD: backward-and-forward behavior on the QM public industry dataset; have an indication of what the problem is, but not a solution yet.
		LD: exported about 60,000 QM results at Janssen for analysis so far.
		DD: MM dataset practically complete; can start export at Janssen for downstream analysis.
Updates from team		DH: new postdoc in Mobley’s group, who will work on relative binding free energy calculations, will give a presentation on this tomorrow.
		DD: I would like to attend!
		DH: Can the free energy benchmarking refactor overlap with this?
		DD: Thanks for letting me know, I hope so. I’m trying to make the FE workflows general so they can be used in situations like this as well.
		LD: now doing torsion scans of violating torsions from analysis using `openff-benchmark torsiondrive execute-single`. Hitting errors and working through them, but think it’s related to `geometric`; may try to do torsion scans with another tool. Example errors: “Cannot continue constrained optimization, please use cartesian coordinates”; “Unknown error”.
		LD: For previous executions, resubmitting a torsion scan resumes the calcs from the last point. Will torsiondrive single executions also allow this sort of restarting?
		DD: The torsiondrive-launch script uses some of the same code as openff-benchmark torsiondrive (e.g. td_api), which is a JSON-ified entry point to some of the other functionality, but it also has significant differences. td_api doesn’t have many of the same entry points as torsiondrive-launch, and I wouldn’t expect this to be accessible.
		LD: It’s interesting because sometimes resubmission of a failed point makes it succeed.
Benchmarking workshop prep		DD: LD, do you have what you need? What can we provide?
		LD: Is there a template that we should use?
		JW: we have complete flexibility in how this is run. Want to plan for: at the end of the workshop, what should attendees know, and what sentiment should they have?
		JW: want them to know: significant findings from public and private sets; a recap of the methods we used; least important: future plans for benchmarking, Season 2, etc.
		JW: want them to feel engaged, be able to put hands to keyboard, do something cool. A bit different with benchmarking, because they’ve already done that by participating in Season 1. Will need to do some thinking on structure for this; perhaps an analysis like what Thomas Fox did. Problem is we haven’t been able to reproduce the issues he was seeing. Something interactive is valuable.
		JW: if we were going to tell a story, what story should we tell?
		JH: how about the optimization issue with the big angle that was causing the crashes?
		JW: that is a good one! Could cast that into an activity.
		DD will share the Jupyter notebook from the ff-release call where we found this issue.
		DH: are calendar invites sent?
		JW: not yet, will follow up with Karmen to get the event out.
		DH: is this too short notice?
		JW: usually an ad board meeting there.
		DH: it’s an hour longer though; Gary won’t be able to join, however.
		JW: think it should be okay, will be recorded; partners notified via multiple channels, and ad board meetings are usually in this timeslot.
		JW: fine if this is only an hour; the extra hour is for overflow or the interactive component.
		DD: Could start with an introduction presentation (goals, workflow details), and then run the interactive session at the end. If interactivity ends up being really hard, then there’s not a big problem with dropping it. But if we DO plan for the interactive session, the talk should max out at ~30 minutes.
		JW: could choose a set of molecules, zipped up for ease of distribution. Perhaps distribute a simple analysis that gives them ideas they can expand on.
		LD: perhaps can draw on work Pavan is doing?
		JW: maybe choose an analysis where one part of it is poorly defined, e.g. cluster (using some method; lots of choices here) the bad RMSD cases.
		DH: Could do analysis of the “most offending violations”, where we have people look at sets of molecules and find trends.
		DD: Can we ship a non-psi4 single-file installer for this? That way people can easily run it locally and continue using it afterwards.
		General: Which “simple analysis” should we showcase before we let them loose?
		JW: Correlate outliers / RMSDs with number of rotatable bonds?
		DD: Would be good to say what we want people to learn from the interactive session.
		JW: timeline: the day before, send them a zipped-up dataset, single-file installer, and Jupyter notebook (possibly also a script version of the notebook analysis for folks on remote hosts). Could have a Google Colab setup for folks who can’t use the installer. For folks that can’t run a Jupyter notebook easily, could have a series of scripts they can run that produce equivalent file outputs.
		LD: For dataset, random molecules, or from some partner?
		JW: random dataset may not produce any informative trends; could choose specific chemical space subsets.
		DH: Number of rotatable bonds? Rings? Hbond donors? Maybe random (maximize diversity)?
		JW: molecules that contain sulfur?
		DD: Let’s make a Slack channel for this discussion.
		JW made #benchmarking-workshop-prep.
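
Possible starting point for the “simple analysis” idea above (correlating RMSDs with number of rotatable bonds). This is only a minimal sketch using the Python standard library, not an agreed workshop deliverable. The record layout, the field names (`name`, `n_rotatable`, `rmsd`), the cutoff value, and the functions `pearson` and `flag_outliers` are all hypothetical, and the values are made-up placeholders rather than benchmark results. In practice the rotatable-bond counts and RMSDs would come from the distributed dataset.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def flag_outliers(records, rmsd_cutoff):
    """Return the names of records whose RMSD exceeds the cutoff."""
    return [r["name"] for r in records if r["rmsd"] > rmsd_cutoff]


# Placeholder records (hypothetical; NOT results from the benchmark).
records = [
    {"name": "mol-a", "n_rotatable": 1, "rmsd": 0.2},
    {"name": "mol-b", "n_rotatable": 3, "rmsd": 0.5},
    {"name": "mol-c", "n_rotatable": 5, "rmsd": 0.9},
    {"name": "mol-d", "n_rotatable": 8, "rmsd": 1.6},
]

r = pearson(
    [rec["n_rotatable"] for rec in records],
    [rec["rmsd"] for rec in records],
)
print(f"Pearson r (rotatable bonds vs RMSD): {r:.2f}")
print("Outliers above cutoff:", flag_outliers(records, rmsd_cutoff=1.0))
```

Attendees could swap the rotatable-bond count for another descriptor (rings, Hbond donors, sulfur content) to explore the other subsets discussed.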

Action items

Decisions