2021-08-25 Industry benchmarks meeting notes

Participants

@Lorenzo D'Amore
@David Hahn
@Joshua Horton
@David Dotson

Goals

Benchmarking Workshop on 9/1
- Live session
- Aggregated results to show
- Public dataset: which analysis / off method
- Pavan results with Fox's analysis
Sage + OPLS coordination
Public dataset status and needs
Updates from team

Discussion topics

Item	Presenter	Notes
Benchmarking Workshop on 9/1	Lorenzo
- Live session
  - Working on the live session component: chemistry from the public dataset
  - Example is a torsiondrive that is out of sync (violation)
  - The molecule indices reported for violations don’t seem to match what I observe from the SDF in Avogadro
  - JH: looks like atom indices are being reordered?
  - DH: perhaps try out pymol and see if it gives the same behavior? Did you at some point convert the mol to SMILES and back? That could reorder the atom indices
  - JH: instead of `Molecule.from_smiles` use `Molecule.from_mapped_smiles`
  - DH: looks like you can avoid the conversion to/from SMILES; best to do that if possible
- Aggregated results to show
  - Relative violation analysis with seaborn
  - Executing a torsiondrive will take too long during the session, so perhaps have a pre-computed one?
  - BRI-00593-00
  - JH: seen some things like this; one moiety turns into a weird triangle from the FF
- Public results with Fox’s analysis
  - DH: thresholds for torsions must be carefully chosen. For some torsions, however, a violation beyond 30deg is impossible given the multiplicity
  - LD: providing notebook + dataset
- For slide deck
  - Season 1 recap of small molecule benchmarking, leading into and motivating the live session
  - LD: would it be better to do the intro and show some season 1 results in the live session, then give a talk on protein-ligand benchmarking at the end?
  - Question: put the protein-ligand benchmarking talk before or after the live session?
  - Proposed schedule:
    - Intro slides: season 1 recap
    - Sage performance
    - protein-ligand systems
    - small molecules
    - analyzing problem cases from small molecules
    - live session: remainder of time
  - DH: aiming to have slides done by Monday at the latest
  - DH: do we have any partners attending that didn’t participate in the season 1 benchmark?
  - DD: Maybe AbbVie? Otherwise there wouldn’t be too many
  - Perhaps include enough in the season 1 recap so folks aren’t completely lost
  - LD: thinking to show aggregated results from the season 1 benchmark; have Sage results from 2 partners (Janssen, Roche) so far. Should we include this?
  - DD: I’d say yes, can use it as a motivator for additional partners to submit their Sage results!
  - LD: which analyses? Have four to choose from
  - Thinking of going forward with just one: compare-forcefields, due to speed
  - Worried about errors we might hit for these large datasets
  - DD: I’ll stand by for on-the-fly fixes
  - LD: aggregated results from the public set? Pulled the specs Gary wants for the publication: smirnoff, openff-1.3.0, gaff, and openff Sage
  - DH: perhaps just focus on the public dataset; that keeps it fairly simple for the presentation
  - LD: will include mention of Sage and OPLS for internal results to motivate submission
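DH's remark about thresholds and multiplicity can be illustrated with a little arithmetic. The sketch below is one possible reading, not the analysis code used in the benchmark: it assumes a violation is measured as the angular distance to the nearest symmetry-equivalent reference angle, so a torsion with multiplicity n can never deviate by more than 180/n degrees. The function names are hypothetical.

```python
def wrapped_difference(angle_a, angle_b, multiplicity=1):
    """Smallest absolute difference in degrees between two dihedral angles.

    Angles that differ by a multiple of 360/multiplicity are treated as
    equivalent (assumption: the torsion has that rotational symmetry).
    """
    period = 360.0 / multiplicity
    diff = (angle_a - angle_b) % period  # always in [0, period)
    return min(diff, period - diff)


def max_possible_violation(multiplicity):
    """Largest deviation that wrapped_difference can ever return."""
    return 180.0 / multiplicity


# A six-fold torsion cannot be more than 30 degrees away from an
# equivalent minimum, so a 30-degree threshold would never trigger.
six_fold_limit = max_possible_violation(6)
example_deviation = wrapped_difference(10.0, 80.0, multiplicity=6)
```

Under this reading, a fixed threshold only makes sense if it sits below 180/n for the torsion in question.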

Action items

@Lorenzo D'Amore will remove to/from SMILES conversion from torsion analysis

@Lorenzo D'Amore and @David Hahn will work to complete slide deck for Monday (8/30)

@Lorenzo D'Amore will work with @David Dotson on interactive workshop components
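The SMILES round trip in the first action item matters because torsions are stored as tuples of atom indices, and those tuples only make sense in the atom ordering they were computed in. The pure-Python sketch below shows the bookkeeping that would be needed if the ordering did change; it does not call the OpenFF toolkit, and the helper names and the example permutation are hypothetical.

```python
def remap_torsion(torsion, old_to_new):
    """Translate a torsion (four atom indices) into a new atom ordering."""
    return tuple(old_to_new[i] for i in torsion)


def invert_mapping(old_to_new):
    """Build the new-to-old mapping; requires a one-to-one permutation."""
    new_to_old = {new: old for old, new in old_to_new.items()}
    if len(new_to_old) != len(old_to_new):
        raise ValueError("mapping is not one-to-one")
    return new_to_old


# Hypothetical permutation: atom 0 in the SDF becomes atom 2 after a
# round trip that reorders atoms, and so on.
sdf_to_reordered = {0: 2, 1: 0, 2: 1, 3: 3, 4: 4}

torsion_in_sdf = (0, 1, 2, 3)
torsion_reordered = remap_torsion(torsion_in_sdf, sdf_to_reordered)

# Going back through the inverse mapping recovers the SDF indices.
round_trip = remap_torsion(torsion_reordered, invert_mapping(sdf_to_reordered))
```

Avoiding the conversion altogether, as DH suggested, removes the need for this bookkeeping.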