Participants

Goals

- Updates from team
- openff-benchmark release
- openff-gopt executors
- Public dataset MM compute status

Discussion topics

Item

Presenter

Notes

Updates from team

JH: QCSubmit can now convert datasets directly into QCEngine tasks or QCFractal inputs, which should help with any future benchmark work that uses the QCSubmit dataset objects
DD: this is awesome, and should make it easier to take advantage of validation components in e.g. openff-gopt and openff-benchmark when we aren’t using a QCFractal server at all!
SB: can only do single-point calculations and optimizations at the moment; torsiondrive support is waiting on a QCEngine torsiondrive executor
- could use a nudge to Ben to decide where in QCElemental the models required for the torsiondrive procedure should live
- DD: can do, will follow up with Ben
SB: been wanting to do orthogonal benchmarking, would want this to make its way into the benchmarking infrastructure
- JW: would like to get your thoughts down on this; after this week’s season 1 release can finally focus on next-gen functionality
LD: overall benchmarking of Sage RC2 (with a16 fix) appears to work well
- one issue relayed in Slack
- Re: Thomas Fox’s analysis
  - left it at the point where we get back the number of times a parameter is involved in a violation
  - can see when a parameter is producing many violations, but this is often a function of how often it is exercised; was then looking for a way to normalize the counts for better comparability
  - [LD shares current results via screenshare]
  - Normalization gives usable information on violation frequency for torsions, angles, bonds
    - most of the violations worth examining are in torsions; very few are in angles or bonds
  - t133 and t156 are the top violators among torsions
    - next steps?
    - DD – LD suggested running torsiondrives of those particular torsions and comparing energy profiles. LD, do you require any assistance running this?
    - LD – I should be able to do this. Also thinking of quantifying the disagreement between MM and QM, and adjusting plots based on the “severity” of violations.
  - LD – Can I regenerate the coverage report using Sage?
    - JW – Sage may have different parameter IDs
    - DD – You should be able to use Sage RC2 for the coverage report. https://openforcefield.atlassian.net/wiki/spaces/PS/pages/971898891/Optimization+Benchmarking+Protocol+-+Season+1#3.-Coverage-report
    - SB – There are also water parameters in Sage, so there could be ID collisions. We are planning to change parameter ID names in the full Sage release, but we’ll provide a dictionary to map from 1.Y.Z to 2.Y.Z.
    - SB – May be good in the long run to print parameter IDs in the coverage report
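A minimal sketch of the count normalization LD describes above, assuming the raw data is two tallies per parameter ID: how often the parameter is involved in a violation, and how often it is exercised at all. The function name, the parameter IDs, and every count below are hypothetical placeholders, not results from the benchmark.

```python
from collections import Counter


def normalized_violation_rates(violations, exercised):
    """Return violations per exercise for every parameter that was exercised.

    Both arguments map parameter ID -> count (hypothetical input layout).
    """
    rates = {}
    for param_id, n_exercised in exercised.items():
        if n_exercised > 0:
            rates[param_id] = violations.get(param_id, 0) / n_exercised
    return rates


# Placeholder IDs and counts, for illustration only.
violations = Counter({"tA": 30, "tB": 12, "aC": 2})
exercised = Counter({"tA": 60, "tB": 20, "aC": 200, "bD": 500})

rates = normalized_violation_rates(violations, exercised)

# Rank by rate rather than by raw count.
ranked = sorted(rates, key=rates.get, reverse=True)
```

With these placeholder numbers, tA has the most raw violations, but tB ranks first once the counts are divided by how often each parameter is exercised, which is the comparability problem the normalization is meant to address.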
- Re: 1.3 failing molecule; guessed connectivity
  - JW: there are two checks in the connectivity checking code
    - Code: https://github.com/openforcefield/openff-benchmark/blob/5b26466a42a86fadcca2ab005bb940536a4fcb2f/openff/benchmark/geometry_optimizations/compute.py#L664-L685
    - One checks that the final molecule has the same number of bonds as the original
    - The other checks that every bond in the original is also in the final
    - Next issue seems to be that error cycling requires a ton of memory. I probably need to do more aggressive memory cleanup on error cycling machines.
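The two connectivity checks JW describes can be sketched as below. This is an illustration only, not the openff-benchmark code at the link above; it assumes bonds are available as pairs of atom indices, and the function names are hypothetical.

```python
def normalize_bonds(bonds):
    """Store each bond as an unordered pair so (0, 1) and (1, 0) compare equal."""
    return {frozenset(bond) for bond in bonds}


def connectivity_preserved(original_bonds, final_bonds):
    """Apply both checks: same number of bonds, and every original bond retained."""
    original = normalize_bonds(original_bonds)
    final = normalize_bonds(final_bonds)
    same_count = len(final) == len(original)
    all_retained = original <= final  # subset test
    return same_count and all_retained


original = [(0, 1), (1, 2)]
unchanged = connectivity_preserved(original, [(1, 0), (2, 1)])
extra_bond = connectivity_preserved(original, [(0, 1), (1, 2), (0, 2)])
lost_bond = connectivity_preserved(original, [(0, 1)])
```

In this sketch a spurious extra bond in the final molecule fails the bond-count check even though every original bond is still present.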
- JW – Taking a closer look at failures
  - Failed case [image]
  - Successful case [image]
DD:
- Worked with BP on adding specs to existing sets. The first problem was the size of the HTTP request. The next issue (which we found last week) was a limit in Tornado.
- DD – Other dataset is ready to go.
- XL is very interested in getting an ANI torsion drive executor. We need to tread carefully here to avoid getting gobbled up by maintenance costs, and to set expectations correctly. XL has until the end of August to request software on the new cluster, otherwise he’ll be waiting until next year. I know this is kinda on our roadmap, but there are other issues with ANI outside our organization that we will need to apply pressure on.
JW
- BCP failures: Connectivity changes were detected. Maybe increase QCElemental threshold? Or use RDKit?
  - looks like for BCP cases, the distance between nonbonded carbons gets down to about 1.7 Å in failed cases
    - this gets misidentified by connectivity check as a bond, which yields a molecule that has too many bonds, which fails connectivity rearrangement check
    - one workaround is not running connectivity checker on MM export; not sure if that’s a straightforward approach, but might be the least disruptive
  - Three action items:
    - 1) Let’s try reducing the threshold for QCElemental connectivity guessing
    - 2) Let’s open an issue to consider different connectivity checkers for the benchmark refactor
    - 3) I’ll work with Connor to compare different connectivity guessing methods.
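To illustrate why a tighter threshold could help with the BCP cases above, here is a minimal sketch of distance-based connectivity guessing. It is not QCElemental's implementation: it assumes a simple rule (a pair is bonded if its distance is below a scale factor times the sum of covalent radii), uses a commonly tabulated carbon radius of 0.76 Å, and the two scale factors are illustrative choices rather than values from the meeting.

```python
import math
from itertools import combinations

# Covalent radii in Å (commonly tabulated values; only C and H included here).
COVALENT_RADIUS = {"C": 0.76, "H": 0.31}


def guess_bonds(symbols, coords, scale):
    """Guess bonds: a pair is bonded if distance < scale * (r_i + r_j)."""
    bonds = set()
    for i, j in combinations(range(len(symbols)), 2):
        cutoff = scale * (COVALENT_RADIUS[symbols[i]] + COVALENT_RADIUS[symbols[j]])
        if math.dist(coords[i], coords[j]) < cutoff:
            bonds.add((i, j))
    return bonds


# Two nonbonded carbons 1.7 Å apart, as in the failed BCP cases.
symbols = ["C", "C"]
coords = [(0.0, 0.0, 0.0), (1.7, 0.0, 0.0)]

loose = guess_bonds(symbols, coords, scale=1.2)  # C-C cutoff 1.824 Å
tight = guess_bonds(symbols, coords, scale=1.1)  # C-C cutoff 1.672 Å
```

Under these assumptions the looser factor reports a spurious C-C bond at 1.7 Å while the tighter one does not. The trade-off is that a tighter factor may also miss genuinely stretched bonds, which is one reason to compare against other connectivity guessing methods.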
- Sage RC2 should be ready to go later today.
  - have one more blocker to resolve, feedstock release issue
  - DD: we’ll meet today to resolve blockers and aim for release
- Arjun is interested in comparing 1.2.1 to new FFs (2.0RC*)
  - SB: producing comparisons of 2.0RC* to 1.2.1 and 1.3.0 would be valuable: they would show the regression from 1.2.1 → 1.3.0, and then any improvement in 2.0 beyond 1.2.1
