2021-04-16 QCFractal User Meeting notes

Date

Apr 16, 2021

Participants

  • @David Dotson

  • @Jeffrey Wagner

  • @Trevor Gokey

  • @Simon Boothroyd

  • @Pavan Behara

  • @Joshua Horton

  • Ben Pritchard

Goals

  • Updates from MolSSI

  • Compute status

  • User questions, scientific needs

Discussion topics

Item

Presenter

Notes

Compute

David

  • DD: now handling both Lilac and PRP, Trevor on UCI

  • Completed some datasets this week

    • TorsionDrive benchmark update #173

    • OpenFF Aniline Para Opt v1.0 #200

  • Almost done with Amide Torsion Set v1.0 #196

User questions



  • JH: being able to get data into a QCFractal Server without using a manager

    • DD: been on my wishlist as well, need to pull out components of manager into a usable library

    • DD: would we want an uploader of existing OptimizationResult objects?

      • JH: that is of interest as well, as another possibility

      • DD: could do both, including submission of existing OptimizationResult objects; this functionality would also work for an executor

      • JH: bespoke-fit and QubeKit would both benefit from this

    • SB: would it be possible to use e.g. SQLite instead of Postgres with a Fractal?

      • DD: I believe we are using some postgres-specific features in the server, so may not be possible

      • BP: confirmed, jsonb usage (could be worked around), do have some column properties, but not a whole lot of other postgres-specific features

      • wouldn’t be impossible to use SQLite, but currently not a priority

  • TG: how close would having different storage options be to a client with persistent caching?

    • DD: the client refactor includes caching locally of complete results, so repeated calls to server of mostly-complete datasets won’t be slow

    • TG: could you take those cached results, or results period, and ship them to another server?

    • DD: since we’re moving in a direction away from working directly with ids for the user interface, this becomes more feasible
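
The caching behavior DD describes (complete results are kept locally, so repeated calls for a mostly-complete dataset only hit the server for what is still outstanding) can be sketched roughly as follows. This is a minimal illustration, not the client refactor itself: the `CompleteResultCache` class, the `fetch` callable, and the record shape (a dict with `id` and `status` keys) are all hypothetical stand-ins.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical record shape: a dict with "id" and "status" keys.
Record = Dict[str, str]


class CompleteResultCache:
    """Keep records locally, but only once they are complete."""

    def __init__(self, fetch: Callable[[List[str]], List[Record]]):
        # `fetch` is a hypothetical stand-in for a query against the server.
        self._fetch = fetch
        self._complete: Dict[str, Record] = {}

    def get(self, ids: Iterable[str]) -> List[Record]:
        ids = list(ids)
        # Only ask the server for records that are not cached as complete.
        missing = [i for i in ids if i not in self._complete]
        fetched = {r["id"]: r for r in self._fetch(missing)} if missing else {}
        # Incomplete records are returned but never cached, so they are
        # re-queried on the next call and can pick up a new status.
        for rid, rec in fetched.items():
            if rec["status"] == "COMPLETE":
                self._complete[rid] = rec
        return [self._complete[i] if i in self._complete else fetched[i] for i in ids]
```

With a cache like this, a second request for the same dataset only queries the ids that were incomplete the first time.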

Infrastructure

 

  • Compute tagging via GitHub labels

  • Dataset index status

  • Industry Benchmark set submission too large

    • BP: any way you can batch this?

    • JH: think it’s bailing on .save()

    • DD: could getting process pool-based submission working address this, since it will include chunked saves?

      • JH: I think so

    • BP: limit is currently 50MB

      • The limit is imposed by nginx

      • will raise it to 100MB

    • DD: [committed] complete process pool batching of submissions

  • DD: on performance, client.modify_tasks appears to take only one id at a time, though it suggests it should work otherwise

    • BP: working on fat object ability to avoid two-pass queries of optimization, then molecule

    • dovetails with performance and caching on client side

    • DD: we’re also working to e.g. make it so getting result status for a collection doesn’t require pulling the whole result; so lightening some calls as well
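
The chunked saves discussed above for the oversized Industry Benchmark submission could look something like the sketch below. It is not the QCSubmit process-pool implementation: `chunk_by_payload_size`, `submit_in_chunks`, and the `save` callable are hypothetical names, and the size estimate (summing the JSON-encoded size of each entry) is only an approximation of what would actually go over the wire. The 50MB figure is the current limit quoted in these notes.

```python
import json
from typing import Any, Callable, Iterator, List

# Current request-size limit quoted in the notes (imposed by nginx).
CURRENT_LIMIT_BYTES = 50 * 1024 * 1024


def chunk_by_payload_size(entries: List[Any], max_bytes: int) -> Iterator[List[Any]]:
    """Yield consecutive batches whose summed JSON size is at most max_bytes.

    The sum ignores list brackets and separators, so it slightly
    underestimates the real payload. An entry that is larger than
    max_bytes on its own is still yielded, as a batch of one.
    """
    batch: List[Any] = []
    batch_bytes = 0
    for entry in entries:
        size = len(json.dumps(entry).encode("utf-8"))
        if batch and batch_bytes + size > max_bytes:
            yield batch
            batch, batch_bytes = [], 0
        batch.append(entry)
        batch_bytes += size
    if batch:
        yield batch


def submit_in_chunks(
    entries: List[Any],
    save: Callable[[List[Any]], None],
    max_bytes: int = CURRENT_LIMIT_BYTES,
) -> int:
    """Call the hypothetical `save` once per batch; return the batch count."""
    count = 0
    for batch in chunk_by_payload_size(entries, max_bytes):
        save(batch)
        count += 1
    return count
```

In practice one would likely leave some headroom below the server limit, since the estimate does not account for request overhead.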

Action items

@David Dotson will prototype QCFractal submission pathway for optimizations and point calculations completed via QCEngine; likely requires pulling some manager code into a reasonable library
@David Dotson will prototype pull-push pathway via QCPortal client from one server into another
@David Dotson will finish process-pool implementation for QCSubmit submission
@David Dotson will create a PR against the existing FractalClient so that client.modify_tasks takes multiple base_result IDs; it currently doesn’t appear to handle them properly

Decisions