2021-11-19 QCA Submission meeting notes

Participants

Goals

Updates from MolSSI
Compute
- updated prod envs to use latest qcelemental, qcengine, qcfractal
- QM workers on Lilac
- QM workers on Newcastle
- QM, ANI, XTB workers on PRP
New submissions
- submission issues
- dipeptide dataset
- imposed electric field in QM - wavefunction not stored for optimizations
- ML datasets for OpenMM
  - multi-molecule issues with QCElemental
User questions/issues
Science support needs
- new openff-qcsubmit release
Infrastructure needs / advances
- psi4 on conda-forge

Discussion topics

Item	Presenter	Notes
Updates from MolSSI	Ben	BP – Been deep in refactor-land; excited about directions for things. Working on single-point components, and it’s going better than I thought it would. Definitely pleased to have the space to do this kind of work now through the beginning of the new year.
JW – Rough ETA? We have a big API break coming in Feb.
BP – Anticipating release in Feb, alpha in Jan.
BP – There was a QCEl release last night, but this should be OK since I was careful with pinning.
DD – Sounds good. We just updated prod envs and pinned to 0.23.
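As a rough illustration of the pinning point (exact pins versus `<=` pins), the sketch below shows which releases each kind of pin would allow. The release list and version numbers are hypothetical, and the helper names `parse_version` and `allowed` are made up for this example; real environment solvers handle many more cases.

```python
def parse_version(text):
    """Turn '0.23.0' into (0, 23, 0). Handles purely numeric versions only."""
    return tuple(int(part) for part in text.split("."))


def allowed(releases, op, pin):
    """Return the releases that a pin permits, for op in {'==', '<='}."""
    target = parse_version(pin)
    if op == "==":
        return [r for r in releases if parse_version(r) == target]
    if op == "<=":
        return [r for r in releases if parse_version(r) <= target]
    raise ValueError(f"unsupported operator: {op}")


# Hypothetical release history of some dependency.
releases = ["0.22.0", "0.23.0", "0.24.0"]

# An exact pin admits one release, so a new upstream release changes nothing.
print(allowed(releases, "==", "0.23.0"))

# An upper-bound pin also blocks the newer release, but lets older ones in.
print(allowed(releases, "<=", "0.23.0"))
```

In both cases the newer release is excluded; the difference is that an exact pin also fixes the environment to a single known version.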
Compute update		DD – Updated the prod envs, bumped lots of QC* and openff-* versions. The only old package we’re still using is pytorch; we’re staying old on this since torchani hasn’t had a release in the past 1.5 years.
JW – Wanted to check if we should use <= pins, since we know that OpenFF and MolSSI will have big API breaks in Feb.
DD – The prod envs all use exact pins.
DD – JH, PB, CC, did you update managers recently?
JH – Yes
PB – Yes
CC – I can do that this afternoon.
DD – Where all do we have production workers running? I know we have Lilac, PRP, Newcastle.
PB – We have public QCA workers on UCI HPC3 as well, though they’re pre-emptible so they may not finish stuff. I’ll note this on the #qca-compute channel.
DD – Do we need more of any type of workers?
PB – The XTB jobs aren’t proceeding.
DD – I don’t understand what’s going on with these. I’ll investigate these moving forward. (PR #237)
JH – May be worth turning off error cycling, in case there are some jobs that will always fail and “clog” up the compute every time they get resubmitted.
DD – Sounds great. I’ve turned off error cycling on the XTB dataset for now, will turn it back on in a few days. I’ve also started more XTB workers on PRP.
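The “clogging” concern about error cycling can be sketched with a toy model. This is not QCFractal's scheduler or the lifecycle automation; the `simulate` function and the task names are hypothetical, and the model only counts how many worker executions each task consumes when errored tasks are restarted every cycle.

```python
def simulate(tasks, cycles, cycle_errors=True):
    """Toy model of error cycling.

    tasks: dict mapping task name -> True if the task can ever succeed.
    Returns (executions, status): worker runs spent per task, and final status.
    """
    executions = {name: 0 for name in tasks}
    status = {name: "waiting" for name in tasks}
    for _ in range(cycles):
        # Workers pick up every waiting task once per cycle.
        for name, can_succeed in tasks.items():
            if status[name] == "waiting":
                executions[name] += 1
                status[name] = "complete" if can_succeed else "error"
        # Error cycling: restart everything that errored.
        if cycle_errors:
            for name in status:
                if status[name] == "error":
                    status[name] = "waiting"
    return executions, status


tasks = {"good": True, "always_fails": False}  # hypothetical tasks
with_cycling, _ = simulate(tasks, cycles=5, cycle_errors=True)
without_cycling, final = simulate(tasks, cycles=5, cycle_errors=False)
print(with_cycling)     # the failing task is re-run every cycle
print(without_cycling)  # the failing task is run once and left in error
```

In this model a task that can never succeed costs one execution per cycle while cycling is on, which is the compute that turning it off temporarily frees up.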
New submissions		submission issues
OpenMM datasets
PB – We have issues with datasets being very large. The PubChem set has 100k+ molecules for single-points. So I’ve chopped this up in different ways, but even the smaller datasets aren’t submitting.
DD – Let me try manually accessing the dataset from #243 on my computer. I’d previously run `ds.compute` locally to get it submitted. But there’s some issue with batching where the GHA isn’t connecting/able to access it.
PB – Do you think there’s a problem with the GitHub Actions user API limit that’s causing this?
DD – I don’t think so. Previously we’d had an issue where we hit the GH API too often using the lifecycle management, and I’ve fixed that. In this case, though, we’re calling `compute` on a dataset. On GH Actions, I consistently get a gateway timeout. On my local box there’s no problem. This just started happening recently on some datasets.
JW – Are these datasets abnormally large?
PB – No
DD – Are any datasets succeeding?
PB – Yeah, the recent dimer/DES/torsionnet submissions went through just fine.
DD – (the #243 dataset records were retrieved locally) I’ve accessed these and they all have result records. Maybe they’ve just never been prioritized in the queue because they’re medium priority?
(General) – The #243 dataset should have had its basis and functional split. BP made a fix in the last fractal release that makes sure the first steps run BEFORE calling the d3 step; if there is no other step, the d3 step is run immediately. However, the dataset was submitted with an old version of QCF that didn’t do the split/submission ordering.
DD – Ideas for next steps?
BP – If you go to recompute that dataset with the new version of QCPortal, then the problem should solve itself. This will be kinda awkward because two different versions of the software will be talking to each other. I’d thought about having the server ask about the client version and restrict access for some versions.
(General) – We’ll resubmit the dataset using a new version of QCF/QCPortal, and the new dataset will also go up on QCA. Then, any client with the new versions of everything will “see” the dataset correctly, though a client with an “old” version of QCPortal will not be able to access it correctly. But this would be the case even if we made a new dataset: the problem is rooted in how the client “sees” the dataset, not the dataset itself.
dipeptide dataset
CC – One holdup was how to avoid deduplication on rotamers for torsiondrives; I had to make the dataset manually, but not sure if there’s a better solution?
JH – Are you thinking of just the initial deduplication step?
CC – Yes. There’s a flag in another step called `assume_unique_molecules`, and if we exposed that in the dataset factory that would help.
JH – One hard thing is that there’s a part where inputs are hashed by their fixed-hydrogen InChI key. So if they weren’t deduplicated then there are places where they’d overwrite/clash. So I don’t think this would be totally straightforward.
CC – I’ll open an issue to discuss this.
imposed electric field in QM - wavefunction not stored for optimizations
DD – I met with WWang to get QCF running on UCSD resources. Some additional questions that I haven’t answered yet; can discuss now.
WW – I have things running both on TSCC and locally. Not sure how to access result records.
DD – You’ll want to connect to QC server using the client… (I’ll write this up later). If you try to transfer the data by copying the database file you’ll have a bad time. You’ll want to access the results using the API, same as if you were accessing the public QCA. BP, is there a way to store the state of the database on file?
BP – The database has a backup function. Alternatively, postgres in general has a command, `pg_basebackup`, that does things safely.
WW – TSCC really likes rebooting things, so I’m really interested in ways to make safe+transferrable backups.
DD – I could work with you on this. Alternatively, PB does a lot of work with this; he treats his databases as disposable and just pulls results off of them all the time.
WW – I’d like to make a single point dataset using starting geometries and get it running on TSCC. Could we meet after the Thanksgiving holiday?
DD – Yes, that works for me!
ML datasets for OpenMM
multi-molecule issues with QCElemental
JH – PE was canonicalizing
JW – Would these be solved by e.g. `Topology.to_qcschema`?
JH – Yes, that would make sense: some way to get the mapped SMILES for the entire system, canonicalizing individual molecules separately instead of all together. `Topology.to_qcschema()`, `Topology.canonical_order_atoms`, and `Topology.to_smiles(mapped=True)` methods would be nice for this.
JW – Are the multi-molecule datasets recoverable?
JH – They are all failing in compute, so they are unrecoverable; would need new submissions. Can try out solutions locally, then report back to Peter on what changes to make to get these through.
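The idea of canonicalizing individual molecules separately, rather than the whole system at once, can be sketched as follows. This is only an illustration under stated assumptions: it is not how OpenFF or QCElemental order atoms, the function names are hypothetical, and the per-molecule sort key (element symbol, then original index) is a placeholder standing in for a real canonical ranking.

```python
def split_molecules(n_atoms, bonds):
    """Group atom indices into molecules (connected components of the bond graph)."""
    neighbors = {i: set() for i in range(n_atoms)}
    for a, b in bonds:
        neighbors[a].add(b)
        neighbors[b].add(a)
    seen = set()
    molecules = []
    for start in range(n_atoms):
        if start in seen:
            continue
        # Depth-first walk over everything bonded to `start`.
        stack = [start]
        seen.add(start)
        component = []
        while stack:
            atom = stack.pop()
            component.append(atom)
            for nb in neighbors[atom]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        molecules.append(sorted(component))
    return molecules


def per_molecule_order(symbols, bonds):
    """Return an atom permutation that keeps each molecule contiguous.

    Within a molecule, atoms are sorted by a PLACEHOLDER key, not a true
    canonical ranking.
    """
    order = []
    for molecule in split_molecules(len(symbols), bonds):
        order.extend(sorted(molecule, key=lambda i: (symbols[i], i)))
    return order


# Two water molecules with interleaved atom indices (made-up example).
symbols = ["O", "H", "O", "H", "H", "H"]
bonds = [(0, 1), (0, 4), (2, 3), (2, 5)]
print(split_molecules(len(symbols), bonds))
print(per_molecule_order(symbols, bonds))
```

Because each molecule is ordered on its own, atoms from different molecules are never interleaved in the result, which is the property the discussion above is after.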
		User questions/issues
Science support needs
new `openff-qcsubmit` release
JH – New functionality will mostly be around checking inputs, trying to head off issues with multi-molecule inputs. Release ETA next week.
Infrastructure needs / advances
`psi4` on `conda-forge`: MT is still scoping this, no firm commitment or timeline yet.

Action items

Decisions
