2021-07-14 Industry benchmarks meeting notes

Participants

  • @Jeffrey Wagner

  • @Joshua Horton

  • @Lorenzo D'Amore

  • @David Dotson

Goals

  • Updates from team

Discussion topics

Item

Notes

Updates

  • DD –

    • Fixed RDKit backend bug, LD reviewed

    • I reviewed LD’s PR

    • Cut a release of openff-benchmark that includes Sage rc1 as season 1:3, OPLS tree. There is some complexity in packaging the Sage release candidate with it in the single-file installer. Would like to resolve that today.

  • LD –

    • Update on Sage benchmarking: not yet done. We (Janssen) may not be the best people to do this because of technical issues with our workstation setups, which lead to a large error rate on these runs.

      • In the first MM run, we didn’t get all the molecules optimized. Out of 10,000 conformers, there were 500-600 incompletes. After resubmitting a bunch of times, I’m still seeing 350 incompletes. With other FFs, I got 99%+ complete in 10 resubmission cycles. I don’t have a sense of whether this is due to technical issues.

      • Summary: the first optimization pass had a high error rate, and even after lots of resubmitting the error rate is still higher than with other FFs.

      • JW: perhaps the resources that these are running on have different memory requirements than previously?

        • JW – the AM1BCC calculation might be more demanding and use more resources, and something about Janssen’s compute resources may have changed

          • LD – I ran these over the weekend, when the cluster should be free. Though one thing that could have changed is that, months ago, we asked to have better SLURM resource management. But some people submit jobs directly/using something other than SLURM. So we asked IT to limit resource usage or apply some sort of restriction to prevent this. But that was months ago, so I’ll ask if these restrictions got put into place.

          • DD – The message that LD saw for the failing jobs indicates that QCEngine failed to perform the optimization. But a JSON should still have been written out.

            • JH – There’s a typo in compute.py at line 858, where it should be resultjson=result.json

            • DD and JW will make a new release that fixes this

        • JW – Or, it could be that the Sage parameters are strange, and are leading to something like vdW crashes.

        • DD: will run the Sage RC against the burn-in set on local resources, to see whether failures still occur without a network FS, other users, etc.

    • LD – I’ve been working on fixing a benchmarking bug; some minor fixes are needed. https://github.com/openforcefield/openff-benchmark/pull/93

    • LD – I’ll try running these locally on my workstation, to remove any variables related to cluster behavior.

    • LD : working with Thomas Fox’s analysis to correlate shortcomings on molecules with specific FF parameters

      • preliminary

      • a tool to run torsion scans is needed to compare directly to e.g. QM

      • JW: think there’d be value in you joining the FF release calls; this work is very relevant as a feedback loop

  • JW : no big updates, will work with DD on release

    • can we pin down a date for the bespoke-fit workshop, JH? August 8?

    • JH: that works!

  • JH: holiday!!!
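
For reference, the completion numbers LD quoted can be put on the same footing as the "99%+" figure for other FFs. This is only a rough sketch: the 10,000 total and the incomplete counts are the approximate figures from the discussion, and the helper name completion_rate is illustrative, not part of openff-benchmark.

```python
def completion_rate(total: int, incomplete: int) -> float:
    """Return the percentage of conformers whose optimization completed."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= incomplete <= total:
        raise ValueError("incomplete must be between 0 and total")
    return 100.0 * (total - incomplete) / total


# Approximate figures quoted in the discussion (assumed, not measured here).
TOTAL_CONFORMERS = 10_000
CASES = [
    ("Sage, first MM run (lower bound of incompletes)", 500),
    ("Sage, first MM run (upper bound of incompletes)", 600),
    ("Sage, after repeated resubmission", 350),
]

for label, incomplete in CASES:
    rate = completion_rate(TOTAL_CONFORMERS, incomplete)
    print(f"{label}: {rate:.1f}% complete")
```

On these figures Sage sits at roughly 94-95% complete after the first run and 96.5% after resubmission, compared with 99%+ for the other FFs.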

2021.07.08 release single file installer

Decided to proceed with building a new release of openforcefields, manually built by JW

Action items

@Jeffrey Wagner and @David Dotson will make a new release of openff-benchmark with fixes to JSON output for errored cases applied
@Lorenzo D'Amore will try running errored Sage optimizations on local workstation, rule out network filesystem issues
@David Dotson will run Sage optimizations on the burn-in set locally, observe and report error rate; will also test single-file installer built by @Jeffrey Wagner from manually-built openforcefields package

Decisions