Problem Statement
OpenFF needs a dedicated host for a QCFractal Server of its own to host and orchestrate quantum chemical calculations. The MolSSI QCFractal Server (QCArchive) has served this purpose to date, but this arrangement has placed the burden of OpenFF data hosting on MolSSI and has at times strained the storage and network capacity of the MolSSI instance.
To alleviate this imbalance, OpenFF will provision its own host and deploy QCFractal Server to it for future calculations. This will also give OpenFF operational independence in dataset lifecycle and management, allowing the organization to adjust its usage patterns as its needs change.
Server Requirements
The recommended server specifications for an OpenFF QCFractal Server are as follows:
cpu : 2 x 2.8GHz, 16C/32T, 256M cache
memory : 16 x 32GB RDIMM, 3200MT/s, Dual Rank 16Gb BASE x8
total memory: 512GB
storage : 12 x 3.84TB SSD SAS ISE Read Intensive 12Gbps 512e 2.5in Hot-Plug, AG Drive (or equivalent)
46TB raw total, or 23TB of usable RAID10 storage
we would target 50% utilization of storage at all times, setting dataset lifecycle / retention policy accordingly
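The storage and memory totals above follow from simple arithmetic, which the short sketch below reproduces. It assumes RAID10 yields half of raw capacity (mirroring) and treats the 50% utilization target as a multiplier on usable space. The function and constant names are illustrative only and are not part of any OpenFF or QCFractal tooling.

```python
# Figures are taken from the specification list above.
DRIVE_COUNT = 12
DRIVE_TB = 3.84
DIMM_COUNT = 16
DIMM_GB = 32

# Assumptions: RAID10 mirrors every drive, so half of raw capacity is usable;
# the utilization target is the 50% figure stated above.
RAID10_EFFICIENCY = 0.5
TARGET_UTILIZATION = 0.5


def storage_budget(drive_count, drive_tb, raid_efficiency, target_utilization):
    """Return (raw, usable, target) storage capacities in TB."""
    raw = drive_count * drive_tb
    usable = raw * raid_efficiency
    target = usable * target_utilization
    return raw, usable, target


raw_tb, usable_tb, target_tb = storage_budget(
    DRIVE_COUNT, DRIVE_TB, RAID10_EFFICIENCY, TARGET_UTILIZATION
)
total_memory_gb = DIMM_COUNT * DIMM_GB

print(f"raw: {raw_tb:.2f} TB, usable: {usable_tb:.2f} TB, target: {target_tb:.2f} TB")
print(f"memory: {total_memory_gb} GB")
```

Under these assumptions, the 50% target corresponds to roughly 11.5TB of stored data on 23TB of usable space, which is the figure a retention policy would need to hold to.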
This recommendation draws on Ben Pritchard’s experience managing the MolSSI QCFractal Server for over two years, as well as on his own testing of the upcoming refactor of QCFractal.
A host with at least the specs noted above is expected to cost $25k - $40k through an institutional VAR (value-added reseller).
Network Requirements
The server will need to be internet-accessible via the following ports:
http : 80
https : 443
ssh : 22
The server should be reachable via qcfractal.openforcefield.org.
Network interface capacity should be 10Gb/s if possible, to avoid bottlenecks in inbound and outbound data transfer.
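Once the host is provisioned, the port requirements above could be verified from an outside machine with a small reachability check. The following is a minimal sketch using only the Python standard library. The helper names `port_is_open` and `check_ports` are hypothetical, and the 3-second timeout is an arbitrary choice.

```python
import socket

# Hostname and ports are taken from the requirements above.
HOST = "qcfractal.openforcefield.org"
REQUIRED_PORTS = {"http": 80, "https": 443, "ssh": 22}


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


def check_ports(host, ports, timeout=3.0):
    """Map each service name to whether its port accepted a TCP connection."""
    return {name: port_is_open(host, port, timeout) for name, port in ports.items()}
```

Calling `check_ports(HOST, REQUIRED_PORTS)` from outside the hosting network returns a mapping such as service name to `True` or `False`. A successful TCP connection only shows that the port is reachable, not that the service behind it is configured correctly.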
Backup Requirements
We will set up automated full and incremental backups. These will need to be shipped to a remote resource, though offsite storage is not critical. An NFS- or SMB/CIFS-mounted network filesystem would be ideal. We will likely use pgBackRest (https://pgbackrest.org/) for backup generation.
Backup storage allocation should at least equal the working space of the production host. If the production host has 23TB of working storage space, then the backup solution should have ~24TB of space. This will allow for some accumulation of incremental backups over time as storage utilization rises.
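If pgBackRest is adopted, the full-plus-incremental scheme could be driven by a scheduled job that chooses the backup type by day. The sketch below only builds the command line and does not run it. The stanza name `qcfractal` and the weekly-full schedule (full on Sunday, incremental otherwise) are assumptions for illustration, not decisions recorded in this document. The `--stanza` and `--type` options and the `backup` command are part of the pgBackRest CLI.

```python
import datetime

STANZA = "qcfractal"      # hypothetical stanza name
FULL_BACKUP_WEEKDAY = 6   # assumed schedule: Sunday (date.weekday() counts Monday as 0)


def backup_type_for(day):
    """Return 'full' on the weekly full-backup day and 'incr' on all other days."""
    return "full" if day.weekday() == FULL_BACKUP_WEEKDAY else "incr"


def backup_command(day, stanza=STANZA):
    """Build the pgBackRest argument list for the given date."""
    return [
        "pgbackrest",
        f"--stanza={stanza}",
        f"--type={backup_type_for(day)}",
        "backup",
    ]


print(" ".join(backup_command(datetime.date.today())))
```

The resulting argument list could be handed to `subprocess.run` from a cron job or systemd timer. Retention of older full backups would be set separately in the pgBackRest configuration so that the repository stays within the backup allocation described above.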