2022-11-08 QC Meeting notes

 Date

Nov 8, 2022

 Participants

  • @Pavan Behara

  • @Jeffrey Wagner

  • @David Dotson

  • Benjamin Pritchard

  • @Chapin Cavender

 Discussion topics

Item

Notes

Updates from MolSSI

  • BP: Running tests for qcf-next and everything seems good!

    • DD:

    • BP: I will cut a release earlier than I thought and may run the old and new servers simultaneously. Still contemplating options for hosting the server long term.

    • JW: Yeah, I started the conversation within OMSF. If we host our own we’ll start EOLing datasets, happy to start this conversation sooner.

    • BP – New version has more support for views, you can dump datasets into a SQLite file. Forward compatibility will still be complicated. New version will also allow for deleting datasets.

    • DD: Do you have a granular permissions system for users?

      • BP: Not exactly, but deletions, etc., are restricted.

      • DD: I would prefer to have an additional user ID just for dataset modifications like deletions, so that we won’t mess things up by accident.

    • BP – QCEl will also need updates. Not many other maintainers there, there’s LBurns and one other person who sometimes helps. The new postdoc, Mars, should also be working on it. We’ve got both postdocs now, working on onboarding. One will help with QCSchema, other will be using QCA on some applications.

    • BP – Was there a complaint about server performance?

      • PB – We’re comparing dipole moments from the benchmarking molecules.

      • DD – Ah, that’s right, but the exporter is slow. So there are 150k optimizations and it takes a long time to download. But we’re not committed to maintaining/updating openff-benchmark, so it took some configuring. Then BSwope asked whether there’s a faster way to get this and I said yes, but I don’t have the time to make custom solutions for them. So I looped PB in to this since he has some familiarity with the area.

      • PB – Right, I need to write a script for this, will aim to do it this morning.

      • DD – …

      • PB – Does the molecule data contain the company identifier?

      • BP – It may be in the entry data.

      • DD – So for each molecule, take the entry on the collection, and it will point toward the other objects that may have the name.

      • BP – Could be cool to offer a method that just pulls down all the final molecules. I could do this server-side pretty efficiently.
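The lookup DD describes (entry on the collection, then the objects it points to) and the bulk "final molecules" helper BP suggests can be sketched roughly as follows. This is only an illustration on plain dictionaries: the layout (`entries`, `records`, `record_id`, `attributes`, `trajectory`) and the function name `final_molecules` are hypothetical and are not the QCFractal API. The `attributes` field stands in for the entry data where, per BP, an identifier may live.

```python
from typing import Dict, Optional


def final_molecules(collection: dict) -> Dict[str, Optional[dict]]:
    """Return the last trajectory frame for every entry in a toy collection.

    Assumed (hypothetical) layout:
      collection["entries"]: entry name -> {"record_id": ..., "attributes": {...}}
      collection["records"]: record id  -> {"status": ..., "trajectory": [molecule, ...]}
    Entries whose optimization is missing, incomplete, or empty map to None.
    """
    finals: Dict[str, Optional[dict]] = {}
    for name, entry in collection["entries"].items():
        record = collection["records"].get(entry["record_id"])
        if record and record["status"] == "COMPLETE" and record["trajectory"]:
            # The final molecule is the last frame of the optimization trajectory.
            finals[name] = record["trajectory"][-1]
        else:
            finals[name] = None
    return finals


# Toy data standing in for a dataset; none of these names come from a real server.
toy_collection = {
    "entries": {
        "mol-a": {"record_id": 1, "attributes": {"source": "partner-1"}},
        "mol-b": {"record_id": 2, "attributes": {"source": "partner-2"}},
    },
    "records": {
        1: {"status": "COMPLETE", "trajectory": [{"step": 0}, {"step": 1}]},
        2: {"status": "ERROR", "trajectory": []},
    },
}

finals = final_molecules(toy_collection)
```

Doing this in one pass server-side, as BP proposes, would avoid downloading every intermediate geometry for the 150k optimizations just to keep the last one.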

Infrastructure advances

  • JW – Industry user asked for a specific change to qcsubmit to allow running without psi4. So there will be a release soon. https://github.com/openforcefield/openff-qcsubmit/pull/206

  • CC – Advice on making a test for this PR: https://github.com/openforcefield/openff-qcsubmit/pull/202

    • CC – Want to do a test on this, but many of the inputs are too large to run in a test. So should I pull down an existing dataset or make a toy dataset? This is a blocker for the protein work.

    • PB – Are you already using this to do the 2D scans?

    • CC – Not sure I understand. This is only a problem when I try to retrieve results.

    • DD – Oh, I see. Sorry this sat for so long. This should be good to merge. I’ll merge it now.

Throughput status

  1. OpenFF Protein Capped 3-mer Backbones v1.0

    • Opts: 310894 → 311229 → stuck at this, might be a longer run

    • TDs: 20 → 22 → stuck at this

  2. OpenFF Protein Capped 1-mer Sidechains v1.2

    • TDs: 44 → 45 (remaining 1)

    • Stuck on 155215 for a week - might be a longer run or not

    • CC – I’ll take a look at these

    • DD – Should I leave lilac workers on? Or would it be better to have everything running on TSCC?

      • PB – Agree, let’s shut down all workers on lilac.

  3. RNA Trinucleotide Single Point Dataset v1.0 - Almost complete!

    • 57299 → 24362 → 7 remaining (might be longer runs, no status updates for the last three days)

    • DD – Let’s bump this to high-priority since we may be able to finish it off

    • (Later) DD – I tried running a worker locally and it’s not pulling down any of these jobs. BP, could you run your unsticking script?

    • BP – (Runs script)

    • DD – Awesome, a job just came through for me.

User questions/issues

(Copied notes from that meeting)

* BP – From torsiondrives and optimizations, could I delete Mayer+Wiberg data except for the first and last conformer of each?

* TG – Yes, any time we’re looking at trajectories or torsiondrives, we don’t need intermediate bond info except at beginning and end.

* PB – I think it’s fine to delete Mayer+Wiberg info from those places.
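The pruning rule agreed above (keep bond-order data only on the first and last frame of each trajectory) can be sketched as below. This is a minimal illustration on lists of dictionaries, not the server-side cleanup BP would actually run; the key names in `BOND_ORDER_KEYS` and the function `prune_bond_orders` are hypothetical.

```python
from typing import List

# Hypothetical key names for the stored bond-order data.
BOND_ORDER_KEYS = ("mayer_indices", "wiberg_lowdin_indices")


def prune_bond_orders(trajectory: List[dict]) -> List[dict]:
    """Return a copy of the trajectory with bond orders kept only at both ends.

    The first and last frames are copied unchanged; every intermediate frame
    loses the keys listed in BOND_ORDER_KEYS. The input is not modified.
    """
    last = len(trajectory) - 1
    pruned = []
    for i, frame in enumerate(trajectory):
        if i in (0, last):
            pruned.append(dict(frame))
        else:
            pruned.append(
                {k: v for k, v in frame.items() if k not in BOND_ORDER_KEYS}
            )
    return pruned


# Toy three-frame optimization trajectory.
toy_trajectory = [
    {"energy": -1.0, "mayer_indices": [0.9], "wiberg_lowdin_indices": [1.0]},
    {"energy": -1.5, "mayer_indices": [0.95], "wiberg_lowdin_indices": [1.1]},
    {"energy": -2.0, "mayer_indices": [1.0], "wiberg_lowdin_indices": [1.2]},
]

pruned_trajectory = prune_bond_orders(toy_trajectory)
```

For a torsiondrive, the same function would be applied to each constituent optimization separately, so that every optimization keeps its own first and last frame.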

Science needs

  • PB – Once the next release is made, we should delete the iodine-containing molecules from gen2 set, since they were done with the wrong basis, and we have the corrected versions run in a different dataset.

    • DD – We should be able to do that once the next release is out

    • JW – Can we not supersede this with a higher-version dataset?

    • PB – That could work too.

    • JW – Let’s discuss at a later meeting.

    • PB – OK

    • BP – We can discuss deleting older datasets as well.

 Action items

 Decisions