OpenFF QCFractal Server Deployment

Problem Statement

OpenFF needs a dedicated host for its own QCFractal Server to host and orchestrate quantum chemical calculations. The MolSSI QCFractal Server (QCArchive) has served this purpose to date, but this arrangement has placed the burden of OpenFF data hosting on MolSSI and has at times strained the storage and network capacity of the MolSSI instance.

To alleviate this imbalance, OpenFF will provision its own host and deploy QCFractal Server to it for future calculations. This will also give OpenFF operational independence in dataset lifecycle and management, allowing the organization to adjust its usage patterns as required.

Server Requirements

The recommended server specifications for an OpenFF QCFractal Server are as follows:

  • cpu : 2 x 2.8GHz, 16C/32T, 256M cache

  • memory : 16 x 32GB RDIMM, 3200MT/s, Dual Rank 16Gb BASE x8

    • total memory: 512GB

  • storage : 12 x 3.84TB SSD SAS ISE Read Intensive 12Gbps 512e 2.5in Hot-Plug, AG Drive (or equivalent)

    • 46TB raw total, or 23TB of usable RAID10 storage

    • we would target 50% utilization of storage at all times, setting dataset lifecycle / retention policy accordingly
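The storage figures above follow from simple arithmetic, sketched below as a check. The function and constant names are illustrative only, and the calculation ignores filesystem and RAID controller overhead, so real usable space will be somewhat lower.

```python
from typing import Tuple

# Values taken from the recommendation above.
DRIVE_COUNT = 12
DRIVE_TB = 3.84
TARGET_UTILIZATION = 0.50  # steady-state utilization target


def raid10_capacity_tb(drive_count: int, drive_tb: float) -> Tuple[float, float]:
    """Return (raw, usable) capacity in TB for a RAID10 array of identical drives."""
    if drive_count < 4 or drive_count % 2 != 0:
        raise ValueError("RAID10 needs an even number of drives, at least 4")
    raw = drive_count * drive_tb
    usable = raw / 2  # every drive is mirrored, so half the raw space is usable
    return raw, usable


raw_tb, usable_tb = raid10_capacity_tb(DRIVE_COUNT, DRIVE_TB)
target_tb = usable_tb * TARGET_UTILIZATION
print(f"raw: {raw_tb:.2f} TB, usable: {usable_tb:.2f} TB, target data: {target_tb:.2f} TB")
```

With the recommended drives this gives roughly 46 TB raw, 23 TB usable, and about 11.5 TB of data at the 50% utilization target.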

This recommendation draws on Ben Pritchard’s experience managing the MolSSI QCFractal Server for over two years, as well as on his own testing of the upcoming refactor of QCFractal.

A host with the minimum specs noted above is expected to cost $25k to $40k through an institutional value-added reseller (VAR).

Network Requirements

The server will need to be internet-accessible via the following ports:

  • http : 80

  • https : 443

  • ssh : 22

The server should be reachable via qcfractal.openforcefield.org.
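Once the host is provisioned, the port requirements above could be verified from an outside machine with a short script. The following is a minimal sketch using only the Python standard library; the function names are illustrative, and a successful TCP connection only shows that the port accepts connections, not that the service behind it is configured correctly.

```python
import socket

HOST = "qcfractal.openforcefield.org"
PORTS = {"http": 80, "https": 443, "ssh": 22}


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


def check_ports(host: str, ports: dict, timeout: float = 3.0) -> dict:
    """Map each service name to whether its port is reachable on the host."""
    return {name: port_is_open(host, port, timeout) for name, port in ports.items()}
```

Calling `check_ports(HOST, PORTS)` from outside the hosting network returns a dictionary such as `{"http": ..., "https": ..., "ssh": ...}` with a boolean per service, which makes firewall gaps easy to spot.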

Network interface capacity should be 10 Gb/s if possible, to avoid bottlenecks in inbound and outbound data transfer.

Backup Requirements

We will set up automated backups, both full and incremental. These will need to be shipped to a remote resource, though offsite storage is not critical. An NFS- or SMB/CIFS-mounted network filesystem would be ideal. We will likely use pgBackRest (https://pgbackrest.org/) to generate the backups.
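As a rough illustration of how full and incremental backups might be automated, the sketch below builds and runs pgBackRest backup commands from Python. The stanza name `qcfractal` and the weekly-full, daily-incremental schedule are assumptions made for the example, not decisions recorded here. It also assumes that pgBackRest is already installed and configured, with its repository path pointing at the mounted network filesystem.

```python
import subprocess
from datetime import date
from typing import List, Optional

STANZA = "qcfractal"      # hypothetical stanza name
FULL_BACKUP_WEEKDAY = 6   # hypothetical policy: full backup on Sundays (Monday is 0)


def backup_type_for(day: date) -> str:
    """Pick 'full' on the designated weekday and 'incr' on every other day."""
    return "full" if day.weekday() == FULL_BACKUP_WEEKDAY else "incr"


def backup_command(stanza: str, backup_type: str) -> List[str]:
    """Build the pgBackRest command line for one backup run."""
    if backup_type not in ("full", "incr"):
        raise ValueError(f"unsupported backup type: {backup_type}")
    return ["pgbackrest", f"--stanza={stanza}", f"--type={backup_type}", "backup"]


def run_backup(day: Optional[date] = None) -> None:
    """Run the backup appropriate for the given day; raises if pgBackRest fails."""
    day = day or date.today()
    subprocess.run(backup_command(STANZA, backup_type_for(day)), check=True)
```

A daily cron job or systemd timer could call `run_backup()` as the database user. Retention of old backups would still be set in the pgBackRest configuration itself.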

Maximum backup storage allocation should at least equal the working space of the production host. If the production host has 23TB of working storage space, then the backup solution should have ~24TB of space. This will allow for some accumulation of incremental backups over time as storage utilization rises.
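The sizing argument can be made concrete with a small calculation. This sketch assumes a single retained full backup that is as large as the data on the production host (no compression or deduplication), which is a deliberately pessimistic simplification; the function name is illustrative.

```python
WORKING_TB = 23.0  # usable production storage, from the example above
BACKUP_TB = 24.0   # proposed backup allocation, from the example above


def incremental_headroom_tb(utilization: float,
                            working_tb: float = WORKING_TB,
                            backup_tb: float = BACKUP_TB) -> float:
    """Space left for incremental backups after one uncompressed full backup."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    full_backup_tb = utilization * working_tb
    return backup_tb - full_backup_tb


for utilization in (0.50, 0.75, 1.00):
    headroom = incremental_headroom_tb(utilization)
    print(f"{utilization:.0%} utilization: {headroom:.2f} TB left for incrementals")
```

Under these assumptions, the 50% utilization target leaves about 12.5 TB for incrementals, while a completely full production host would leave only about 1 TB.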

Dell quote

Here’s a Dell quote for a first-pass version of this configuration (perhaps overkill, and configured very quickly) that could be used for proposals. The first several pages have formatting problems and could perhaps be removed.