2022-07-05 QC Meeting notes

 Date

Jul 5, 2022

 Participants

  • @Pavan Behara

  • @Jeffrey Wagner

  • @BenPritchard

  • @David Dotson

 Discussion topics

Item

Notes

Updates from MolSSI and infrastructure advances

  • JW – Wondering about units for total walltime from two weeks ago

    • BP – I think it’s seconds, but it doesn’t take into account how many cores were used.

  • BP – Talked with Colton Hicks last week about a variety of topics. One idea was to use Celery in QCF. That could let us strip down the logic for distributed compute and claiming tasks.

    • DD – It makes sense if you need a lot of backend workers for a web service, but that may not be what we need

    • BP – Yeah, our tasks are pretty compute-heavy and can take days. Not sure that it’ll be well-suited to the heterogeneity of our workers (in terms of env setup).

    • DD – Also means that you’d need to get rid of the idea of Dask/Parsl managers.

    • BP – I wouldn’t miss those.

    • DD – At OpenFF, we don’t use dask or parsl. Just pools instead

      • BP – I don’t think anyone really uses those, it’s more common to just submit several managers using pool. The one limitation is that a lot of clusters don’t let compute nodes access the outside world. So for those you’d need DASK or something to route the traffic through the head node.

  • BP – CHicks also talked about a package called “traffick” and made a containerized server using it.

  • BP – Next release server code seems to be done, needs to have more testing and docs.

  • BP – Working on a good animation for how QCFractal/QCPortal works. Could use in presentations/on the main webpage.

  • BP - QCF-next is feature-ready; working on documentation and beta testing. Expect a release by the end of September or later.

    • DD - Cool, that’d be great! I can put in the changes in qcsubmit to match with qcf-next changes.

  • BP - We have two new PDs, and one of them will work on getting MOPAC to gel with our QC infrastructure.

  • DD – I worked with the MPI cluster folks. They expanded our walltime limit to 7 days, but in exchange our jobs are pre-emptible.
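On the walltime question at the top of this section: if the total walltime is in seconds and ignores core count, as BP suspects, a rough core-hours figure needs the core count per task. A minimal sketch of that arithmetic, assuming each task held all of its cores for its full walltime. The function name and the sample records are hypothetical and not taken from QCFractal.

```python
def core_hours(walltime_seconds: float, ncores: int) -> float:
    """Convert one task's walltime (seconds) to core-hours.

    Assumes all ncores were held for the entire walltime.
    """
    return walltime_seconds * ncores / 3600.0


# Hypothetical records: (walltime in seconds, cores used)
records = [(7200.0, 8), (3600.0, 16)]

# Plain sum of walltime, which ignores how many cores each task used
total_walltime_s = sum(w for w, _ in records)

# Core-weighted total
total_core_hours = sum(core_hours(w, n) for w, n in records)

print(f"total walltime: {total_walltime_s} s")
print(f"total core-hours: {total_core_hours}")
```

The two totals diverge as soon as tasks use different core counts, which is why the plain walltime sum is hard to compare across managers.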

Throughput status

Chapin’s sets

  • OpenFF Protein Capped 3-mer Backbones v1.0 - 0/54 TDs complete. Optimizations up to 160065 from 91884 in the last two weeks.

Jessica’s set

  • OpenFF multiplicity correction torsion drive data v1.1 - remaining are consistent errors, will move to end-of-life.

SPICE sets: around 92K calcs in the last two weeks

  • SPICE PubChem Set 4 Single Points Dataset v1.2: 20 persistent errors, 6 stale jobs.

  • SPICE PubChem Set 5 Single Points Dataset v1.2: 122976 complete (up from 80892), around 21 errors and 153 stale jobs (incomplete).

  • SPICE PubChem Set 6 Single Points Dataset v1.2: 122641 complete (up from 31041), < 1K remaining.

  • PB – BP, could you run the stale job script?

    • BP – Yes, will do.

  • JW – Looks like there’s a trinucleotide set coming from Chodera lab. Not sure when this is coming and whether it’s higher-priority than SPICE set, but it’s good to know that we have some slack in the compute queue.

  • DD – Should we tell the SPICE submission folks that we’re ready for more work?

    • PB – Let’s do it next week. PEastman watches the jobs so he may contact us sooner.

  • DD – Was there a SPICE optimization set as well?

    • PB – They were considering doing one, but given how long the single point jobs are taking, PE suggested that it wouldn’t be worth the time.
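The throughput figures above can be turned into per-dataset deltas with a few lines of arithmetic. This is a rough sketch that reads each "X from Y" entry as (current, previous) completed counts and assumes a 14-day reporting window. Both readings are assumptions, and the dictionary keys are shortened labels, not official dataset names.

```python
# Counts copied from the notes, stored as (previous, current).
# Interpreting them as completed-record counts is an assumption.
counts = {
    "Protein Capped 3-mer Backbones v1.0 (opts)": (91884, 160065),
    "SPICE PubChem Set 5 v1.2": (80892, 122976),
    "SPICE PubChem Set 6 v1.2": (31041, 122641),
}

WINDOW_DAYS = 14  # assumed: "last two weeks"

# Change in count over the reporting window for each dataset
deltas = {name: cur - prev for name, (prev, cur) in counts.items()}

for name, delta in deltas.items():
    print(f"{name}: +{delta} (~{delta / WINDOW_DAYS:.0f} per day)")
```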

 Action items

 Decisions