2021-09-01 Benchmarking followup workshop discussion notes

Discussion topics

Item	Notes

	AGobbi – When running OPLS3 with FEP+, did you use ffbuilder?
	DH – Yes. Not all the FEP+ calcs were run by me. I took them from the published results, including from C Schindler and the Janssen people.
	XHou – Re: benchmarking metrics slide (27) – Have you considered comparing the shape of the curve? At Pfizer we compare the shape of the entire torsion scans.
	LD – We are looking to include torsion scans in the openff-benchmark command tree, but it’s not yet fully implemented.
	SB – We’ve started to do some crude analysis of torsion energies, both in terms of shape and magnitude. Right now we’re looking at normalizing torsion barrier heights and then comparing RMSE.
	DH – Work so far has been single points on the torsion energy surface.
	AG – Also slide 27 – Do you have plots for these alternative comparisons?
	LD – The histograms shown before are from the “compare forcefields” path. I haven’t had time to make histograms for the Swope and Lucas analyses. I do have the plots for the Janssen proprietary datasets.
	AG – Re slide 29, did these reports of failures in 1.3 lead to direct changes in the way that openff-2.0 was trained?
	LD + DM – No. And there are still more reports from Thomas Fox that we haven’t fixed.
	AG – Can we run the torsion deviation code on our own structures?
	LD – Yes, I’ll find T Fox’s code and share it. Also, today’s interactive session will show some of his
	DH – Additional analyses?
	TFox – Torsion profiles. Conformational ensembles (basically whether the FF is good enough that I don’t need to do expensive QM things).
	AG – Don’t we have the conformational ensemble data from the current datasets?
	TF – I’m not sure that the current dataset has all the energy info that would be needed to do the comparisons I want.
	DM – More details?
	TF – I’d like to see a ranking metric for relative conformer energetics.
	JW – Something like a rho or a tau would be a great idea.
	DH – Agree. Also, the use of the word “ensemble” makes me think about “MD”, and maybe that could be a good analysis, to see if the clusters from ligand MD capture the energy landscape well.
	TF – That would be good, though it does seem challenging.
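The "rho or tau" suggestion above can be sketched in a few lines. This is a minimal illustration, assuming one list of relative conformer energies from QM and one from the force field for the same conformers in the same order. The function names and the energies are hypothetical and are not taken from the benchmark datasets or from openff-benchmark. The tau here is the simple tau-a variant (no tie correction).

```python
# Sketch only: rank-agreement metrics for relative conformer energies.
# All names and numbers are illustrative assumptions, not benchmark data.

def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman rho: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


def kendall_tau(x, y):
    """Kendall tau-a: (concordant - discordant) / total number of pairs."""
    n = len(x)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            sign = (x[i] - x[j]) * (y[i] - y[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


# Hypothetical relative energies (kcal/mol) for five conformers of one molecule.
qm_energies = [0.0, 0.8, 1.5, 2.9, 4.2]
ff_energies = [0.0, 1.1, 0.9, 3.5, 3.9]

rho = spearman_rho(qm_energies, ff_energies)
tau = kendall_tau(qm_energies, ff_energies)
print(f"rho = {rho:.2f}, tau = {tau:.2f}")
```

In this made-up case the force field swaps only the second and third conformers, so both metrics stay high while still registering the misordering.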
Interactive session	TF – Is the interpretation that t17 has 260 violations, and that in 35% of the cases where the parameter is assigned there will be a deviation of more than 30 degrees?
	LD – We can’t exactly interpret this number directly. But it’s a weighted metric.
	DM – Could we assume that “1” is a baseline for regular frequency of violation?
	JW – Two things – Note that this is a biased dataset. Also, a single central bond can have many uses of a parameter, so we can’t attach a special meaning to a value of 1.
	TFox – t157 is a `---` torsion. What’s going on there?
	(General) – We’ll look further into this.
	AGobbi – It would be great to have a commandline tool that runs over a whole dataset.
	LD – Agree.
	JW – We’ll aim to put this in season 2.
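To make the "deviation of more than 30 degrees" question concrete, here is a plain, unweighted tally per torsion parameter. It is a sketch under stated assumptions and is not T Fox's code or the weighted metric shown in the session. The record layout (parameter id, QM dihedral, MM dihedral), the function names, and the example angles are all hypothetical.

```python
# Sketch only: unweighted count of torsion deviations per parameter.
# This is NOT the weighted metric from the session; names and data are assumed.
from collections import defaultdict


def angle_deviation(a_deg, b_deg):
    """Smallest absolute difference between two dihedrals, in degrees (0 to 180)."""
    diff = (a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)


def tally_violations(records, threshold_deg=30.0):
    """Map parameter id to (violations, assignments, fraction violated).

    records: iterable of (parameter_id, qm_dihedral_deg, mm_dihedral_deg).
    """
    assigned = defaultdict(int)
    violated = defaultdict(int)
    for parameter_id, qm_deg, mm_deg in records:
        assigned[parameter_id] += 1
        if angle_deviation(qm_deg, mm_deg) > threshold_deg:
            violated[parameter_id] += 1
    return {
        pid: (violated[pid], assigned[pid], violated[pid] / assigned[pid])
        for pid in assigned
    }


# Hypothetical records; the angles are made up for illustration.
records = [
    ("t17", 175.0, -170.0),  # 15 degrees apart across the +/-180 wrap
    ("t17", 60.0, 100.0),    # 40 degrees apart: counts as a violation
    ("t17", -60.0, -65.0),   # 5 degrees apart
    ("t157", 0.0, 45.0),     # 45 degrees apart: counts as a violation
]

summary = tally_violations(records)
for pid, (n_violated, n_assigned, fraction) in summary.items():
    print(f"{pid}: {n_violated}/{n_assigned} ({fraction:.0%})")
```

As JW notes, one central bond can use the same parameter several times, so a per-assignment fraction like this still needs care before it is read as a per-bond failure rate.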
Followups	Jeff will upload video
	Jeff will upload slides to Zenodo
	Notes for other workshops? Good to have a non-presenter for the interactive session to manage pacing, since it’s hard to do a live demo and keep track of time.
Josh session prep	What do we want people to know by the time this is over?
	Say bespokefit exists - it’s for bespoke parameter derivation, here’s some details/math about how it works
	Currently focused on torsions, planning to expand to other terms in the future. But you’ll get the most reliable/highest value use from bespoke torsions.
	Possible backends (psi4, ANI, XTB, etc)
	Does it produce accurate results? JH has previous slides about this
	CAN I use this in production? No, it still errors a lot, largely due to use of QCFractal. This session should focus on a one-off use of it for a molecule/under conditions that we know work
	SHOULD I use this in production?
	HOW do I use this in production?
	How do we want to allocate time/effort?
	Post notice on website
	Interactive demo
	Make a repo like the benchmarking repo
	Run locally or on Binder? If local install is likely to fail, let’s do Binder
	Return at 26 past
	Use QCA data? Or XTB? Could do two parts of demo:
	one using QCA data. Maybe predownload JSONs into repo
	one using XTB. Could be failure-tolerant / have other part of notebook work even without.
	Show output OFFXML file, and run gas phase MD
	Could look at torsion RMSE. Make plots that show improvement in outcome compared to Parsley/Sage parameters.
	Could reuse JACS ligands in QCArchive, show improved torsion profiles compared to Parsley/Sage
	Jeff will set up a blank repo for Josh to start populating
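For the torsion RMSE idea in the demo plan, a minimal sketch of the comparison might look like the following. It assumes each torsion profile is a list of energies on the same dihedral grid, shifts each profile so its minimum is zero, and optionally normalizes by barrier height so that shape and magnitude errors can be separated. The function names and all energies are hypothetical; "general_ff" and "bespoke_ff" stand in for a general force field and a bespoke fit and are not real Parsley, Sage, or bespokefit results.

```python
# Sketch only: torsion-profile RMSE with and without barrier-height normalization.
# All profiles below are made-up numbers on a shared dihedral grid (kcal/mol).
import math


def relative_profile(energies):
    """Shift a profile so that its lowest point is zero."""
    lowest = min(energies)
    return [e - lowest for e in energies]


def normalized_profile(energies):
    """Shift to zero minimum, then scale so the highest barrier equals 1."""
    relative = relative_profile(energies)
    barrier = max(relative)
    if barrier == 0:
        raise ValueError("flat profile: barrier height is zero")
    return [e / barrier for e in relative]


def rmse(a, b):
    """Root-mean-square error between two profiles on the same grid."""
    if len(a) != len(b):
        raise ValueError("profiles must share the same dihedral grid")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


qm = [0.0, 1.0, 4.0, 1.0, 0.0, 1.0, 4.0, 1.0]
general_ff = [0.0, 0.5, 2.0, 0.5, 0.0, 0.5, 2.0, 0.5]  # right shape, half the barrier
bespoke_ff = [0.1, 1.1, 3.9, 1.0, 0.0, 0.9, 4.1, 1.1]  # close to QM

raw_general = rmse(relative_profile(qm), relative_profile(general_ff))
raw_bespoke = rmse(relative_profile(qm), relative_profile(bespoke_ff))
shape_general = rmse(normalized_profile(qm), normalized_profile(general_ff))

print(f"raw RMSE, general FF:   {raw_general:.3f}")
print(f"raw RMSE, bespoke FF:   {raw_bespoke:.3f}")
print(f"shape RMSE, general FF: {shape_general:.3f}")
```

In this toy case the general profile has a large raw RMSE but zero shape RMSE, which is the distinction between magnitude and shape errors raised in the discussion; a demo plot could show both numbers side by side.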
