Participants
Goals
Discussion topics
| Item | Presenter | Notes |
|---|---|---|
| Updates on server migration | BP | |
| SPICE 2 | PE, JC | See notes below. |
SPICE 2 notes:

- BP – SPICE v2 next steps? PE and Marcus will be setting up some datasets; then we need to figure out how to do the QCSubmit part.
- JR – Wavefunctions or densities? It could be impractical to store all wavefunctions. We could come up with a strategy to choose some to save, e.g. minima, or high-fidelity calculations.
- PE – We were initially going to store wavefunctions in SPICE 1, but they would have filled up QCArchive in 3 days. Storage is definitely the issue here.
- BP – We could store everything, but do we want to? How big do you anticipate it being?
- PE – v1 has ~1.1 million conformations. v2 will have roughly the same number of additional conformations with 40-50 atoms, so about the same size as v1.
- BP – We estimated 5-6 TB for wavefunctions for v1 (roughly 5 MB per conformation).
- JC – Could we retrieve parts of the dataset instead of everything? Downloading 5-6 TB could be a lot. Could we split up the dataset?
- BP – We can certainly store 6 TB.
- JR – For xtb calculations we can throw the wavefunctions away. For some DFT, and for anything higher, we should store the density and/or wavefunction.
- JR – For the CBS limit, we could do two levels with different basis sets for extrapolation.
- JR – Not aware of good work on CBS extrapolation of densities and wavefunctions.
- JC – We will have to do multiple calculations on subsets anyway, so we might as well?
- JR – Agree.
- JR – The best place to start is to store some subset of DFT wavefunctions, picking the last and most-optimised structure, or, for a series of conformers, saving the wavefunction of the lowest-energy one. I can volunteer to sketch out these heuristics (see the sketch after this list).
- JC – Would we want to figure out how to do this on SPICE 1.0, since we already have that data?
- JR – Sure.
- JC – Are ESPs of interest as well?
- JR – We should only save the wavefunction and re-compute the ESP. It should be relatively easy to package code that does this for users.
- BP – I can sit down and give you a storage quota. The new server has ~140 TB of space. We can also add an attached storage box, probably for an additional ~200 TB; it would be spinning disk.
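As a purely illustrative sketch of the heuristics JR volunteers to write up (hypothetical `records` objects with `molecule_id`, `level_of_theory`, `energy`, and `record_id` attributes; not QCFractal's actual schema), discarding xtb wavefunctions and keeping only the lowest-energy conformer per molecule at the archived levels might look like:

```python
# Hypothetical labels for the levels of theory whose wavefunctions are
# worth archiving; xtb results are discarded, per the discussion above.
LEVELS_TO_ARCHIVE = {"dft", "coupled-cluster"}

def select_wavefunctions_to_keep(records):
    """Return the IDs of records whose wavefunctions should be archived.

    For each (molecule, level-of-theory) pair at an archived level, keep
    only the lowest-energy conformer's record.
    """
    best = {}  # (molecule_id, level_of_theory) -> lowest-energy record so far
    for rec in records:
        if rec.level_of_theory not in LEVELS_TO_ARCHIVE:
            continue  # e.g. xtb: throw the wavefunction away
        key = (rec.molecule_id, rec.level_of_theory)
        current = best.get(key)
        if current is None or rec.energy < current.energy:
            best[key] = rec
    return {rec.record_id for rec in best.values()}
```

Everything here, names included, is an assumption for illustration; the actual rule (e.g. "last, most-optimised structure" for optimisations) would need access to the underlying record metadata.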
General data storage and distribution:

- BP – Storing and archiving data permanently is an open question with respect to QCFractal.
- JC – Are there general storage and distribution solutions for OpenMM, OpenFF, and MolSSI? https://qcarchive.molssi.org/apps/ml_datasets/
- LW – OpenFF doesn't have a formal solution written down yet.
- BP – A good option is to move to a static website generated from JSON blobs in a repository (a minimal sketch follows below).
- BP – PubChemQC is hosted on a personal SharePoint. We're still interested in re-formatting these datasets to make them even easier to use, but we'll leave that to the ML guy at MolSSI. We're quite interested in PubChemQC.
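A minimal sketch of that static-site idea, assuming each dataset is described by a JSON blob with hypothetical `name` and `download_url` keys (the real schema and layout would be whatever the repository settles on):

```python
import json
from pathlib import Path

def build_index(blob_dir="datasets", out_file="index.html"):
    """Render a static HTML index page from per-dataset JSON blobs."""
    items = []
    for path in sorted(Path(blob_dir).glob("*.json")):
        meta = json.loads(path.read_text())  # assumed keys: name, download_url
        items.append(f'<li><a href="{meta["download_url"]}">{meta["name"]}</a></li>')
    Path(out_file).write_text(
        "<html><body><h1>Datasets</h1><ul>\n"
        + "\n".join(items)
        + "\n</ul></body></html>"
    )
```

The generated page could then be served from any static host, with the JSON blobs versioned alongside it in the repository.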
- JC – What about online datasets?
- BP – Other datasets: we want to compute across the periodic table to make a basis-set recommender. We can't necessarily use density fitting, since that ruins the dataset.
- BP – Doing some work with the tmQM dataset, but the molecules are quite big. Also looking at a MOPAC reference dataset.
- PE – It would be good to standardise on some levels of theory.
- BP – We are compute-starved at the moment.
Action items
- Add Bill Swope to training
Decisions