Updates

- DH: test calculations (non-burn-in) are hitting a lot of "Unknown error" cases; may try option (c) and see if this produces different results
- DD: if there are any persistent error cases after several rounds of error cycling, please share the input molecule (if you can) and we can take it to the psi4 devs to get a better solution in place (or at least better error messaging)
- JW: worked on deployment last week; making the conda package (automation) and implementing the single-file installer (automation)
  - hit a tricky bug in RDKit; have seen references to it by Greg Landrum
  - in validation, when a mol passes the other checks, it later gets written out and read back in as SDF; there are cases where this write/read roundtrip does not have full fidelity, which is exactly why we make the roundtrip part of our validation
  - we write to a StringIO object to avoid hammering the filesystem, but there are cases where roundtripping through StringIO fails while roundtripping through a file does not; the failure is consistently reproducible for a given molecule, and it is not clear what about these molecules is causing it
  - so, switched to filesystem writes instead of StringIO to avoid the issue for now; will (later) pursue creating an RDKit issue with an example (a sketch of the roundtrip check is below)
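A minimal sketch of the kind of in-memory roundtrip check described above, assuming RDKit; the example molecule and the coarse canonical-SMILES fidelity comparison are illustrative, not the actual validation logic in openff-benchmark:

```python
from io import StringIO

from rdkit import Chem
from rdkit.Chem import AllChem

def roundtrips_in_memory(mol):
    """Write a molecule to SDF text in memory, read it back, and report
    whether the roundtrip preserved it (coarsely, by canonical SMILES)."""
    buf = StringIO()
    writer = Chem.SDWriter(buf)  # SDWriter accepts file-like objects
    writer.write(mol)
    writer.flush()  # flush rather than close, so the buffer stays readable
    sdf_text = buf.getvalue()

    supplier = Chem.SDMolSupplier()
    supplier.SetData(sdf_text, removeHs=False)  # keep Hs for a fair comparison
    roundtripped = next(iter(supplier), None)
    if roundtripped is None:
        return False  # parse failure on read-back
    return Chem.MolToSmiles(mol) == Chem.MolToSmiles(roundtripped)

mol = Chem.AddHs(Chem.MolFromSmiles("CCO"))
AllChem.EmbedMolecule(mol, randomSeed=0)  # give it 3D coordinates to serialize
print(roundtrips_in_memory(mol))
```

Swapping the StringIO buffer for a real file (`Chem.SDWriter("out.sdf")` plus `Chem.SDMolSupplier("out.sdf")`) is the workaround described above.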
- JW (continued): also found a mangling of the atom indices from QM output
  - working to make the parameter coverage output (step 3) be the input for optimizations (step 4)
  - we have two places where we use arbitrary RMS cutoffs:
    - minimum allowable RMS between generated conformers (step 2): 1.0 Å
    - deduplication of input conformers (step 1): 0.1 Å
  - the proposed changes would reduce the number of conformers in a dataset, speeding up the QM stage of the benchmark globally; the existing choices give more conformers, and the differences between them are perhaps not meaningful at the level of our forcefield
  - [decision] change the input conformer deduplication cutoff from 0.1 Å to 1.0 Å heavy-atom RMS; change the conformer generation cutoff from 1.0 Å to 1.5 Å heavy-atom RMS (see the sketch after this list)
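To make the decision concrete, here is a minimal sketch of heavy-atom RMS deduplication with the new cutoffs, using RDKit; the function name and the use of `GetConformerRMS` are illustrative assumptions, not the openff-benchmark implementation:

```python
from rdkit import Chem
from rdkit.Chem import AllChem

DEDUP_CUTOFF = 1.0  # new input-conformer deduplication cutoff, heavy-atom RMS (Å)

def deduplicate_conformers(mol, cutoff=DEDUP_CUTOFF):
    """Return the conformer IDs to keep: a conformer survives only if its
    heavy-atom RMS to every previously kept conformer exceeds the cutoff."""
    heavy = Chem.RemoveHs(mol)  # heavy-atom-only copy; conformers carry over
    kept = []
    for conf in heavy.GetConformers():
        cid = conf.GetId()
        # GetConformerRMS aligns the pair before computing the RMS
        if all(AllChem.GetConformerRMS(heavy, cid, k, prealigned=False) > cutoff
               for k in kept):
            kept.append(cid)
    return kept

mol = Chem.AddHs(Chem.MolFromSmiles("CCCCOC(=O)N"))
# pruneRmsThresh plays the role of the generation-time cutoff (now 1.5 Å)
AllChem.EmbedMultipleConfs(mol, numConfs=50, pruneRmsThresh=1.5, randomSeed=0)
print(deduplicate_conformers(mol))
```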
- JH: had a go at the burn-in set; will use it to investigate issues with Jeff
- DD: need to create group codes for each partner
  - working to add set-tag functionality, e.g. for marking mistaken datasets on a server as defunct
  - need to get test coverage up, along with docstring coverage of the optimize command tree
  - need to also add export of detailed qcvars (as we get from `openff-benchmark optimize execute`) when using a server-based approach; there is a desire for dipole moment data, which is in there but not currently exported by our export command (a sketch of pulling qcvars from a server is below)
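A minimal sketch of what pulling detailed qcvars (including the dipole) back from a QCFractal server could look like via qcportal; the server address, record ids, and the "SCF DIPOLE" key are assumptions, and the exact location of qcvars in a stored record may differ:

```python
from qcportal import FractalClient

# Placeholder address; substitute the benchmark season's server
client = FractalClient("localhost:7777", verify=False)

# Record ids here are illustrative
records = client.query_results(id=["1", "2"])

for record in records:
    # QCEngine's psi4 harness reports detailed qcvars in the result extras;
    # which keys appear depends on the method and driver
    qcvars = (record.extras or {}).get("qcvars", {})
    dipole = qcvars.get("SCF DIPOLE")  # key name is an assumption
    print(record.id, dipole)
```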
Burn-in (David D.)

- DD: how do we poke partners to get feedback?
- JW: we'll run out of bandwidth if we resort to DMs; best to keep it on the channels
- DD: will draft a "poke" message for #benchmarks-partners and send it to JW, JH, and DH as a draft
- DD: will prepare a spreadsheet of all partners, with status indicators like "burn-in complete", "started production run", "choice of optimization approach", and "variation notes"