2022-07-26 Meeting notes

 Date

Jul 26, 2022

 Participants

  • @Pavan Behara

  • @Jeffrey Wagner

  • @David Dotson

  • @Chapin Cavender

 Goals


 Discussion topics

Item

Notes

Updates from MolSSI

  • BP is offline today.

Infrastructure advances



Throughput status

  • OpenFF Protein Capped 3-mer Backbones v1.0

    • 2/54 TDs complete. Completed optimizations rose from 220051 to 232048 over the last week.

    • CC – This seems like steady enough progress for this dataset. It seems a bit odd that the ones that have finished are ALA and LYS, so I think progress is driven by random failures rather than by molecule size.

  • RNA Single Point Dataset v1.0: 4489 calculations x two QC specifications (our default, and wb97m-d3bj/def2-tzvppd)

    • default - 8864/8978 complete; the remaining 114 have SCF convergence issues

    • wb97m-d3bj/def2-tzvppd - 682/4489

      • DD – PRP pods are failing due to over-consumption of cores.

      • PB – I tried running a few UCI workers at 48 cores/180GB and 36 cores/180GB, with a zero success rate. One calculation ran out of scratch space with a tmp-dir size of 100GB, so I increased it to 200GB, raised memory to 360GB as well, and am testing again.

      • DD – Did you see CPU or memory usage spiking leading to the crash? When PRP breaks, it usually does so with “OutOfCPUError”, but I’m not able to watch it live.

      • JW – Is it possible that memory requirements increase with the number of CPUs?

      • PB – I’ve seen the queueing system adjust other parameters based on the number of CPUs

      • JW – What are your Lilac job settings for the successful runs, DD?

        • DD – QCEngine is told it has 10 cores while I request 12 from the scheduler, and it is told about 70GB of RAM while I request 84GB from the scheduler. Things seem to be progressing, but I’m not sure what the Lilac error rate is.

        • PB – I don’t see any errors from Lilac, so this looks clean.

      • We’ll just run these on Lilac since they’re such difficult jobs.
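
The Lilac settings DD describes request 20% more from the scheduler than QCEngine is told about (12 vs. 10 cores, 84GB vs. 70GB). A minimal Python sketch of that headroom arithmetic follows. The `Resources` class, the `scheduler_request` function, and the fixed 20% margin are illustrative assumptions inferred from these numbers; they are not part of QCEngine or of DD's actual submission scripts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """Cores and memory (in GB) for one worker. Hypothetical helper type."""
    cores: int
    memory_gb: int


def _ceil_div(numerator: int, denominator: int) -> int:
    """Integer ceiling division, avoiding floating-point rounding."""
    return -(-numerator // denominator)


def scheduler_request(told: Resources, headroom_pct: int = 20) -> Resources:
    """Return what to request from the scheduler, given what the QC
    program is told it may use, padded by `headroom_pct` percent.

    The 20% default is an assumption inferred from the meeting notes.
    """
    factor = 100 + headroom_pct
    return Resources(
        cores=_ceil_div(told.cores * factor, 100),
        memory_gb=_ceil_div(told.memory_gb * factor, 100),
    )


# Values quoted in the notes: the program is told 10 cores / 70GB.
told = Resources(cores=10, memory_gb=70)
request = scheduler_request(told)
print(f"told: {told.cores} cores, {told.memory_gb}GB")
print(f"request: {request.cores} cores, {request.memory_gb}GB")
```

Padding the scheduler request above what the QC program is told leaves slack for usage spikes, which may be relevant to the core over-consumption failures seen on PRP, though that connection was not confirmed in the meeting.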

User questions/issues

 

Science support needs

  • JW – CC, do you have a rough plan for when you’ll run ForceBalance to make the initial FF?

    • CC – Hoping to run tests of those in the next two weeks, on TSCC

    • PB – If you use the most recent OpenFF Toolkit (OFFTK 0.10.6) with ELF10, it will take a while.

    • CC – Understood, I’ll monitor how it goes on TSCC.

    • PB – Great. I’ll send you the new FF with my proposed changes before then.

    • JW – There could be ways to speed this up if needed, but it will require getting into the weeds of ForceBalance, so let me know if the current state is unworkably slow.

 Action items

 Decisions