2021-08-04 Industry benchmarks meeting notes

Participants

@David Dotson
@Lorenzo D'Amore
@Simon Boothroyd
@Jeffrey Wagner

Goals

Updates from team
openff-benchmark release
openff-gopt executors
Public dataset MM compute status

Discussion topics

Item	Presenter	Notes

Updates from team
  JH: QCSubmit can now convert datasets directly to QCEngine tasks or to QCFractal inputs, which should help with any future benchmark work we do if we use the qcsubmit dataset objects.
    DD: this is awesome, and should make it easier to take advantage of validation components in e.g. `openff-gopt` and `openff-benchmark` when we aren’t using a QCFractal server at all!
    SB: can only do single-points and optimizations at the moment; waiting on the QCEngine torsiondrive executor for that support.
      Could use a push on Ben to consider where in QCElemental the required models for the torsiondrive procedure should live.
      DD: can do, will follow up with Ben.
    SB: been wanting to do orthogonal benchmarking; would want this to make its way into the benchmarking infrastructure.
      JW: would like to get your thoughts down on this; after this week’s season 1 release we can finally focus on next-gen functionality.
  LD: overall, benchmarking of Sage RC2 (with the a16 fix) appears to work well.
    One issue relayed in Slack.
  Re: Thomas Fox’s analysis
    Left it at the point where we get back the number of times a parameter is involved in a violation.
    We see that a parameter is producing many violations, but this is often a function of how often it is exercised; was then looking for a way to normalize the counts for better comparability.
    [LD shares current results via screenshare]
    Normalization gives usable information on violation frequency for torsions, angles, and bonds.
    Most of the violations to examine are in torsions, with very few in angles and bonds.
    `t133` and `t156` are the top violations among torsions.
    Next steps?
      DD: LD suggested running torsiondrives of those particular torsions and comparing energy profiles. LD, do you require any assistance running this?
      LD: I should be able to do this. Also thinking of quantifying the disagreement between MM and QM, and adjusting plots based on the “severity” of violations.
      LD: Can I regenerate the coverage report using Sage?
      JW: Sage may have different parameter IDs.
      DD: You should be able to use Sage RC2 for the coverage report (see "Optimization Benchmarking Protocol - Season 1 | 3. Coverage report").
      SB: There are also water parameters in Sage, so those could be collisions. We are planning to change parameter ID names in the full Sage release, but we’ll provide a dictionary to map from 1.Y.Z to 2.Y.Z.
      SB: May be good in the long run to print parameter IDs in the coverage report.
  Re: 1.3 failing molecule; guessed connectivity
    JW: there are two checks in the connectivity checking code.
      One checks that the final molecule has the same number of bonds as the original.
      The other checks that every bond in the original is also in the final.
  Next issue seems to be that error cycling requires a ton of memory. I probably need to do more aggressive memory cleanup on error cycling machines.
  JW: Taking a closer look at failures.
    [Figures: failed case, successful case]
  DD: Worked with BP on adding specs to existing sets. The first problem was the size of the HTTP request. The next issue (which we found last week) was a limit in tornado.
  DD: Other dataset is ready to go.
  XL is very interested in getting an ANI torsiondrive executor.
    We need to tread carefully here about getting gobbled up by maintenance costs, and about setting expectations correctly.
    XL has until the end of August to request software on the new cluster; otherwise he’ll be waiting until next year.
    I know this is kind of on our roadmap, but there are other issues with ANI outside our organization that we will need pressure on.
  JW: BCP failures: connectivity changes were detected. Maybe increase QCElemental threshold? Or use RDKit?
    Looks like for BCP cases, the distance between nonbonded carbons gets down to about 1.7 Å in failed cases.
    This gets misidentified by the connectivity check as a bond, which yields a molecule that has too many bonds, which fails the connectivity rearrangement check.
    One workaround is not running the connectivity checker on MM export; not sure if that’s a straightforward approach, but it might be the least disruptive.
    Three action items:
      1) Let’s try reducing the threshold for QCElemental connectivity guessing.
      2) Let’s open an issue to consider different connectivity checkers for the benchmark refactor.
      3) I’ll work with Connor to compare different connectivity guessing methods.
  Sage RC2 should be ready to go later today.
    Have one more blocker to resolve: a feedstock release issue.
    DD: we’ll meet today to resolve blockers and aim for release.
  Arjun is interested in comparing 1.2.1 to the new FFs (2.0RC).
    SB: producing comparisons of 2.0RC to 1.2.1 and 1.3.0 would be valuable. Showing the regression from 1.2.1 → 1.3.0, and then an improvement in 2.0 beyond 1.2.1, would be valuable.
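For reference, the BCP failure mode and the two connectivity checks JW describes can be sketched in a few lines of Python. This is an illustrative sketch only, not the `openff-benchmark` or QCElemental code: the function names, the covalent radii, the 1.2 scaling factor, and the three-carbon geometry are all assumptions chosen to reproduce a nonbonded C...C distance of 1.7 Å.

```python
import itertools
import math

# Assumed covalent radii in angstrom (illustrative, not read from QCElemental).
COVALENT_RADII = {"H": 0.31, "C": 0.76}


def guess_bonds(symbols, coords, threshold=1.2):
    """Distance-based guess: atoms i and j are called bonded when their
    distance is below threshold * (radius_i + radius_j)."""
    bonds = set()
    for i, j in itertools.combinations(range(len(symbols)), 2):
        cutoff = threshold * (COVALENT_RADII[symbols[i]] + COVALENT_RADII[symbols[j]])
        if math.dist(coords[i], coords[j]) < cutoff:
            bonds.add((i, j))
    return bonds


def connectivity_unchanged(original_bonds, final_bonds):
    """Apply the two checks from the notes: same number of bonds, and every
    original bond still present in the final molecule."""
    same_count = len(final_bonds) == len(original_bonds)
    all_kept = original_bonds <= final_bonds
    return same_count and all_kept


# Hypothetical fragment: C0-C1 and C1-C2 are bonded (about 1.54 angstrom),
# while C0 and C2 are nonbonded but only 1.7 angstrom apart.
symbols = ["C", "C", "C"]
coords = [(0.0, 0.0, 0.0), (0.85, 1.2842, 0.0), (1.7, 0.0, 0.0)]
original_bonds = {(0, 1), (1, 2)}

loose = guess_bonds(symbols, coords, threshold=1.2)  # cutoff 1.824 angstrom
tight = guess_bonds(symbols, coords, threshold=1.1)  # cutoff 1.672 angstrom

print(connectivity_unchanged(original_bonds, loose))
print(connectivity_unchanged(original_bonds, tight))
```

Under these assumed radii, the looser factor treats the 1.7 Å contact as a third bond, so the bond-count check fails, while the tighter factor still recovers the two real bonds. Whether a reduced threshold is safe across the full dataset is exactly what the action items are meant to establish.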

Action items

@David Dotson will poke Ben for review on QCElemental#268

@Lorenzo D'Amore will run torsiondrives locally on proprietary molecules with high violations in torsion profiles

@Jeffrey Wagner will work with @Connor Davel to compare different connectivity guessing methods

@Jeffrey Wagner will create an issue for reducing the threshold for QCElemental connectivity guessing, see if this is feasible

@Jeffrey Wagner and @David Dotson will release next openff-benchmark
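
As a note for choosing which torsions to drive: the count normalization LD described in the discussion could look something like the sketch below. The notes do not record the exact normalization used, so dividing violations by the number of times a parameter is exercised is an assumption, and the parameter IDs and counts here are made-up placeholders, not benchmark results.

```python
def normalized_violation_rates(violations, usage):
    """Return violations per use for each parameter ID.

    violations: parameter ID -> number of violations it is involved in
    usage:      parameter ID -> number of times the parameter is exercised
    Parameters that are never exercised are skipped.
    """
    rates = {}
    for param_id, n_used in usage.items():
        if n_used > 0:
            rates[param_id] = violations.get(param_id, 0) / n_used
    return rates


def rank_by_rate(rates):
    """Sort parameter IDs from highest to lowest violation rate."""
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)


# Placeholder counts for illustration only.
usage = {"tA": 1000, "tB": 20, "aC": 500}
violations = {"tA": 50, "tB": 10, "aC": 5}

rates = normalized_violation_rates(violations, usage)
ranked = rank_by_rate(rates)
for param_id, rate in ranked:
    print(f"{param_id}: {rate:.3f} violations per use")
```

With raw counts, the frequently exercised `tA` would look worst; after normalization, the rarely used `tB` ranks first, which is the comparability issue the normalization is meant to address.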
