| | |
---|
PCM-based implicit solvent | Simon | PCM appears to be working on the COH submission. JH: This is also the first dataset storing wavefunctions/eigenvalues, so another first. SB: Storage and retrieval are working just fine! DD: Would be worth showing this off at the next show-and-tell; I’ll find out from Jeff.
|
Submissions
|
| SB: COH is about 50% complete. PB: Genentech optimization; working on the first submission. Only 20% of the dataset would be submitted in this first run; is this acceptable? Yes, we’ll proceed with the smaller, 127-molecule (20%) subset for the first submission. DD: Feel free to reach out to me when desired; we’ll re-roll the PR off of master (DD messed up long-lived branches with squash merges).
PB: protomers/tautomers. JH: There are fewer tasks than conformers because the QCF index is not case sensitive, and some of the SMILES clash when reduced to lowercase. Do we have another solution? Do we drop the use of SMILES for the index? TG: For torsiondrives, this is still useful. JH: We still want to be able to group molecules that are just peer conformers.
JH: Change how we index molecules: just use molecule-0, conformer-0, i.e. avoid SMILES for OptimizationDatasets and basic Datasets; keep SMILES as the index on TorsionDrives. TG: We may still run into issues with this on TorsionDrives, but I like this because we tag the driven torsion. DD+BP: Could also go with removing the lowercase-casting on indices; this would be an almost trivial change, and non-destructive for database access (we’ll pursue this).
DD: PEPCONF. We’re getting some user pressure; why is it proceeding slowly? Decide on a rebalancing of priorities for datasets. TG: Many of these molecules will take a lot of memory (> 50 GiB). DD: Perhaps it is time to scale up all our nodes to a minimum amount of memory for QM jobs. Do we know if there are ways to reduce the memory usage of Optimizations? BP: Psi4 can write to disk if needed when memory gets constrained. DD: I will reduce the memory offered to the manager to below the constraints given to each worker; this may trigger writing to local storage. Also increase the total memory of each replica to 64 GiB. Could also scale the CPUs to 32, perhaps even 64.
We’ll increase the priority of PEPCONF to high. TG: Will reduce the number of workers deployed to see if this reduces pre-emption frequency.
Phenyl Dataset - will start to starve others
|
Strategies for user timelines, expectations | David | JH: I think we can be faster in merging datasets now, especially with STANDARDS coming into place. DD: We’re already defaulting to ‘high’ priority for fitting datasets; it is more discretionary for others. JH: Some of the datasets came from PI pressure to get things running; these could be re-tagged to ‘low’ priority. DD: Compute tags are an avenue for controlling flows, but dangerous if we park tasks in a compute tag for which we have no managers.
|
Dataset index | Josh | Probably good to merge; can’t find the script used to generate it. DD: We can merge and manually curate for now, and add automation later.
|
Error Cycling | David | TG: Restarts of SCF convergence and optimization convergence failures appear to clear often enough; we probably don’t want to exclude these. DD: We’ll close for now; we can chew on more ways to utilize compute tags for routing, and on how we want to filter error cycling.
|
Enforced C1 symmetry | Josh | |
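
Illustration for the protomers/tautomers item under Submissions: a minimal sketch of how a case-insensitive index can merge distinct SMILES-based entry names, and how SMILES-free positional names avoid the clash. This is plain Python, not QCFractal code. The `-0` conformer suffix, the `molecule-N-conformer-M` name format, and both function names are assumptions made for illustration only. Lowercase atoms in SMILES denote aromaticity, so cyclohexane (`C1CCCCC1`) and benzene (`c1ccccc1`) collapse to the same string once lowercased.

```python
from collections import defaultdict


def find_lowercase_clashes(names):
    """Return groups of distinct names that collide once lowercased."""
    groups = defaultdict(set)
    for name in names:
        groups[name.lower()].add(name)
    # Keep only keys shared by more than one distinct original name.
    return {key: sorted(group) for key, group in groups.items() if len(group) > 1}


def positional_names(conformer_counts):
    """Map hypothetical SMILES-free entry names to their SMILES.

    conformer_counts: dict of SMILES -> number of conformers.
    """
    names = {}
    for mol_idx, (smiles, n_conf) in enumerate(conformer_counts.items()):
        for conf_idx in range(n_conf):
            names[f"molecule-{mol_idx}-conformer-{conf_idx}"] = smiles
    return names


# Hypothetical SMILES-based entry names: cyclohexane, benzene, ethanol.
entries = ["C1CCCCC1-0", "c1ccccc1-0", "CCO-0"]
clashes = find_lowercase_clashes(entries)
print(clashes)

# Positional names stay unique under lowercasing.
names = positional_names({"C1CCCCC1": 1, "c1ccccc1": 2})
print(find_lowercase_clashes(names))
```

The positional scheme drops the chemical meaning from the name, so the SMILES has to be kept alongside it (here as the dict value) to group peer conformers. The alternative raised by DD+BP, removing the lowercase-casting on indices, would keep the SMILES-based names as they are.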