Updates from MolSSI
- BP – Growth in server usage has stopped. We're at 88% utilization. I could move some things to the spinning-disk storage, but I can't move the wavefunction table.
- BP – I'm looking at solutions here - speccing out a machine/storage setup that will handle this better. The base results table is about 1 TB, which seems large.
- BP – I'm running through a stress test/demo with the new release, which is going well.
- JW – Are you viewing the storage issue as something separate from the next-generation machine?
- BP – I'm curious about the number of calculations you're planning to run. I'm seeing about 3 million wavefunctions requested currently, and I think you're planning on running PubChem, which is another million. I also see 30 million optimization procedures. The idea of external records, like output from Gaussian, will also use a lot of space.
- JW – From the fitting side: do you foresee a lot of wavefunction demand?
- PB – For charge-model work, we may see an uptick in wavefunctions requested from the Gilson group.
- CC – Agree, Willa will need those for polarizability work.
- PB – But those won't be as big as the PubChem sets.
- DD – On Friday we did a hard stop of SPICE submissions and resubmitted them without wavefunctions attached.
- BP – Current total QCA capacity is 5.2 TB SSD and roughly 2 TB spinning disk (rough headroom arithmetic below).
- DD – We could provision a lot of SSDs in the next generation - have a storage rack in the server, basically a box of SSDs.
- BP – VT has a transportation institute, VTTI, and they're paying $100k to provision a database, so we could see how they do it. But I'm not familiar with server construction/management, so I'll either need to set aside time to learn or work with a vendor/center to get it set up.
- PB – Can we delete the old PubChem dataset with the new QCF release?
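A minimal sketch of the storage headroom implied by the figures quoted above (5.2 TB SSD at ~88% utilization plus ~2 TB spinning disk). Since the wavefunction table has to stay on SSD, the SSD number is the binding constraint; this is illustrative arithmetic only, not a measured figure.

```python
# Back-of-envelope storage headroom from the figures quoted in the meeting.
# The wavefunction table cannot move to spinning disk, so SSD is the limit.
ssd_total_tb = 5.2
ssd_used_fraction = 0.88   # ~88% utilization as reported
spinning_tb = 2.0

ssd_free_tb = ssd_total_tb * (1 - ssd_used_fraction)
print(f"SSD headroom:   ~{ssd_free_tb:.2f} TB")                # ~0.62 TB
print(f"Total headroom: ~{ssd_free_tb + spinning_tb:.2f} TB")  # ~2.62 TB
```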
New submissions
- OpenFF ESP Industry Benchmark Set v1.0 - 56K molecules with wavefunctions, 25 or more heavy atoms.
- DD – The PR in question requests wavefunctions on an optimization dataset, which will get ignored. But I know that he'll run a single-point dataset at the end.
- BP – My rough estimate is that this would be about 300 GB of wavefunctions. That's about half of our remaining space (see the sketch below).
- DD – We could put the optimization set through, which would buy us some time before calculating wavefunctions. Is there anything I can do to help get the new QCF out so that we can do deletion?
- BP – I don't think there's an easy place where you could jump in. The scary thing will be the data migration, and I don't know that it will work on the first try. I've collected representatives of old data formats and tricky things to test the migration code, but I suspect things like in-progress torsiondrives will be messy.
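For scale, a sketch of what BP's ~300 GB / 56K-molecule estimate implies per wavefunction, and what the ~3 million wavefunctions currently requested (see the MolSSI updates above) would add up to at a similar per-record size. Purely illustrative; real record sizes vary with basis set and molecule size.

```python
# Implied per-wavefunction size from the ~300 GB / 56K-molecule estimate above,
# extrapolated (very roughly) to the ~3M wavefunctions currently requested.
est_set_gb = 300
n_molecules = 56_000
mb_per_wfn = est_set_gb * 1024 / n_molecules
print(f"~{mb_per_wfn:.1f} MB per wavefunction")            # ~5.5 MB

n_requested = 3_000_000
total_tb = est_set_gb / n_molecules * n_requested / 1024
print(f"~{total_tb:.1f} TB for {n_requested:,} wavefunctions at that size")  # ~15.7 TB
```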
Modified submissions: SPICE sets v1.2 (Huge thanks to David!!)
- PubChem sets are still pretty large to submit within GitHub Actions' 6-hour job execution limit.
- DD – I created a PR on QCF with a small optimization that should enable this submission, and I'll follow that up with a PR to QCSubmit. That will make submission a lot more efficient.
- JW – For a self-hosted GitHub Actions runner on AWS: if we can keep it below $5k for a year, it's an easy ask (cost sketch below).
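To put JW's $5k/year ceiling in context, a quick calculation of the sustained hourly spend that budget allows. The actual cost depends on the AWS instance type and whether the runner only spins up on demand, neither of which was specified here.

```python
# What a $5k/year budget allows per hour if the runner stays up continuously.
# Actual cost depends on instance type, region, and on-demand vs. reserved
# pricing; a runner started only for jobs would cost proportionally less.
annual_budget_usd = 5_000
hours_per_year = 24 * 365
print(f"~${annual_budget_usd / hours_per_year:.2f}/hour for a 24/7 runner")  # ~$0.57/hour
```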
Throughput status
- OpenFF dipeptides torsiondrives v1.1: 5/5 TD COMPLETE!!
- OpenFF dipeptides torsiondrives v2.0: 19/26 TD complete, up from 1/26 last week.
- SPICE DES370K Single Points Dataset v1.0: 234K+ records, ~67% complete (small molecules).
- SPICE Dipeptides Single Points Dataset v1.2: started yesterday but with a high error rate - 1,400 complete and 12K errors (rough error-rate arithmetic below), a mix of known PRP issues and "qcengine: unknown error".
  - Are we running PRP workers now? Seeing pycpuinfo errors.
- DD – I've made a release of QCEngine that should fix this, but it's dependent on a QCF release that I hope we can put out in the next few days.
- PB – Should we stop those workers until we update the production env?
- DD – I don't think the pycpuinfo issues are a complete blocker - I think the majority of jobs are still completing successfully.
- May need a full-node/high-memory worker config, 40 cores / 180 GB or similar.
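A snapshot of the SPICE Dipeptides error rate implied by the counts above; errored records get retried, so this reflects the current state rather than a final failure rate.

```python
# Current error rate on SPICE Dipeptides Single Points v1.2 from the counts
# above (1,400 complete vs ~12K errored). Errors are retried, so this is a
# snapshot, not a final failure rate.
complete = 1_400
errored = 12_000
pct_errored = 100 * errored / (complete + errored)
print(f"~{pct_errored:.0f}% of attempted records are currently errored")  # ~90%
```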
User questions/issues