Item | Presenter | Notes |
---|---|---|
Updates from MolSSI | Ben | BP – if you submit an optimization, the protocols part of a QC spec is ignored. BP – do we want torsiondrive protocols? DD – presumably yes; e.g. we may want the wavefunction for the final gradient of each optimization. JH – if we’re calculating the wavefunction for every gradient and only saving the last one, we could use the wavefunction of the last gradient as the SCF guess for the next gradient. BP – we can also accommodate torsiondrives as a procedure now that it’s in QCEngine.
BP – working to make the database more constrained and closer to normal form. We have several torsiondrives that are missing optimizations (about 2 dozen); don’t have a good solution and can’t just drop nulls in. We have a table that links each torsiondrive to its optimizations. DD – I think it’s fine if we drop the tables that store the minimum, and instead determine the minimum either client-side or server-side through a REST endpoint; not much argument for storing this.
BP – also have new statuses. PB – can we make collections invalid, or just records? BP – thinking of holding a virtual workshop to show off the way the new QCArchive works, tentatively in February. DD – how quickly is storage filling up due to wavefunction storage? We have the PubChem set going and want to ensure we won’t overwhelm capacity over the break. DD – I’ll notify John that we will trim off orbitals and eigenvalues from these submissions, then begin submitting them.
|
Compute | | |
Task submission slowness | | DD – will make a PR to QCFractal for submission optimizations, with a corresponding PR to QCSubmit to take advantage of them. BP – restarting the database with Postgres logging enabled; will see if we’re missing a key index on a table; if so, fixing task creation slowness may be easy. DD – running a submission with tasks now. BP – in the new version we will query for the spec only once per set of tasks submitted; right now a query on the spec hits 5 indexes for each and every task. Right now the time-consuming part is _create_task; considering moving this to when the manager requests a task, not on client submission. DD – that would actually be a pretty great optimization, with no downside as far as I can tell; the manager can afford to wait on its first call, and in steady-state operation it calls for X tasks at a time, so it wouldn’t really see a slowdown; it would save the client an immense amount of time submitting.
|
Walkthrough of current work on QCFractal | Ben | Code is organized by entity, not by system component; e.g. molecules have everything in the same directory, including models, REST routes, Postgres storage schema, etc.
Molecules have a mutable identifier field that can be queried. The client separates get_* and query_* methods: getters retrieve things by id only and return them in order; queries filter by field, but order is not guaranteed, there is a maximum number of results, and results can be paginated.
Tasks are hidden from users and attached to records, owing to their 1:1 relationship. Records keep their series of errors, allowing for in-server error cycling. User management can be done from the client, using RBAC (role-based access control) as the basis, with admin, read, monitor, compute, and submit roles.
Switched from Tornado to Flask, using JSON Web Tokens (JWT) instead of sending credentials with every request. JH – can you query by molecule identifiers? BP – currently can do by id; if I can query by e.g. InChI, I can then get, say, all optimizations that use that molecule. DD – perhaps we can stack some methods on Molecule itself that let you do this; might require either subclassing or monkey-patching QCElemental’s Molecule object.
BP – deduplication is tricky, in particular for the trajectory of an optimization. For optimizations, we now create entirely new gradients; this avoids the issue of not being able to resubmit calculations that have the same set of hashed attributes but, e.g., a different Psi4 version that fixes a bug.
|
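
The get_* versus query_* split described in the walkthrough can be illustrated with a small toy model. This is only a sketch under stated assumptions: ToyClient, get_molecules, query_molecules, and the limit/skip parameters are hypothetical names chosen for illustration and are not taken from the actual QCFractal or QCPortal client. It shows getters returning results by id in the order requested, and queries filtering by field with a capped, paginated result set.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Molecule:
    """Toy stand-in for a stored molecule (hypothetical, not QCElemental's)."""
    id: int
    formula: str
    identifiers: Dict[str, str] = field(default_factory=dict)


class ToyClient:
    """Hypothetical in-memory client illustrating get_* vs query_*."""

    def __init__(self, molecules: List[Molecule], query_limit: int = 2):
        self._by_id = {m.id: m for m in molecules}
        self._query_limit = query_limit  # server-side cap on results per query

    def get_molecules(self, ids: List[int]) -> List[Molecule]:
        # Getter: lookup by id only; results follow the order of `ids`.
        return [self._by_id[i] for i in ids]

    def query_molecules(
        self,
        formula: Optional[str] = None,
        limit: Optional[int] = None,
        skip: int = 0,
    ) -> List[Molecule]:
        # Query: filter by field. Callers should not rely on result order.
        # The effective page size never exceeds the server-side cap.
        cap = self._query_limit if limit is None else min(limit, self._query_limit)
        matches = [
            m for m in self._by_id.values()
            if formula is None or m.formula == formula
        ]
        return matches[skip:skip + cap]


client = ToyClient(
    [
        Molecule(1, "H2O"),
        Molecule(2, "CH4"),
        Molecule(3, "H2O"),
        Molecule(4, "H2O"),
    ],
    query_limit=2,
)

ordered = client.get_molecules([3, 1])                 # by id, in request order
page1 = client.query_molecules(formula="H2O")          # first page, capped at 2
page2 = client.query_molecules(formula="H2O", skip=2)  # next page
```

Paging with skip until a short page comes back collects every match, while the getter is the one to use when the caller already holds ids and needs results aligned with them.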