Starting to participate | NS – What level of calculations are being used for these molecules?
DD – The main challenge for most people is setting up the compute infrastructure for the QM.
CH – AbbVie is multi-site, and we have some common compute facilities, but other sites may have local compute. We have access to NCSA at U Illinois, and we’ve since been split off to a separate queue.
NS – At my center, we have workstations with several GPUs and 40-50 CPUs. But we also have access to NCSA.
CH – It’ll be easier to run locally. Working with NCSA will be a bit more complex since we’ll need admins to get involved on software deployment/validation.
DD – Are local boxes connected to a queueing system, or would it be SSHing into workstations?
CH – SSHing, we don’t have a queueing system for local workstations any more.
NS – That’s also the situation at my site.
NS – We’ll also need to figure out which dataset we’ll use for calculations.
CH – Thinking about the composition of the set, is there chemistry that OpenFF particularly wants? Like more/fewer nitros?
DM – We have a lot of nitros in our public sets.
CH – What about ortho-substituted rings like biphenyl ethers?
(General) – We want molecules that are “relevant”, even if they’re really hard for FFs.
CH – To get a view of timing, how long would something like toluene take?
DD – It varies a lot by compute resources.
DM – We can extrapolate from the results of the burn-in set.
DD – I’ll send the burn-in set to NS and CH.
NS – Are there molecules to avoid?
Molecules with iodine (issue with QM), silicon and boron (not supported by FF).
Things where you don’t want RDKit to generate conformers (tricky macrocycles). For these, you can also just pass in 10 conformers of the macrocycle, so that RDKit doesn’t generate any.
DD – Issues with conda-installing software on NCSA cluster?
CH – May not be difficult, I’ve just never tried it.
DD – We support two installation routes in user-space.
CH – I should be able to do either one technically, but it’s a security policy thing.
DD – 40-50 cores on a local machine is probably insufficient for a large set, so it may be good to start getting a single-file installer reviewed by IT.
CH – We could also do it on AWS; that may be easier policy-wise.
(General) – We’ll get more information on how AWS compute is put together.
DD – There are a few different ways to set up the distributed compute. The best one for AWS distribution may be to set up a small QCFractal server and anticipate having short-lived managers.
CH – Sounds good. I’ll work on assessing the feasibility of these options.
NS – What’s the timeline for this first round of benchmarking?
(OpenFF) – We’re not sure what the final deadline will be, but we can let you know as more partners return results.
CH + NS – Great. We’ll work on getting this started, and will expect to hear from you about hard deadlines.
CH + NS were added to the benchmarks-support Slack channel.
DD added CH + NS to the email list.
|
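
Note on the "molecules to avoid" item above: the element-based exclusions (iodine, silicon, boron) could be applied as a pre-filter before a set is submitted. The following is only a minimal sketch under stated assumptions, not OpenFF tooling. It assumes the candidate set is available as SMILES strings, and it uses a simplified regex to pull out element symbols rather than a real cheminformatics toolkit such as RDKit, which would be the more robust choice in practice. The function names and the example molecules are hypothetical.

```python
import re

# Elements flagged in the discussion: iodine (issue with QM),
# silicon and boron (not supported by the FF).
EXCLUDED_ELEMENTS = {"I", "Si", "B"}

# Simplified SMILES atom matcher (an assumption, not a full parser):
#  - bracket atoms such as [Si], [nH], [13CH3]: optional isotope, then the symbol
#  - bare organic-subset atoms; two-letter symbols are listed first so that
#    "Br" is not misread as boron and "Cl" is not misread as carbon
_ATOM_PATTERN = re.compile(
    r"\[\d*([A-Z][a-z]?|[a-z]{1,2})[^\]]*\]"
    r"|(Cl|Br|[BCNOPSFI]|[bcnops])"
)


def elements_in_smiles(smiles):
    """Return the set of element symbols found in a SMILES string."""
    elements = set()
    for bracket_symbol, bare_symbol in _ATOM_PATTERN.findall(smiles):
        symbol = bracket_symbol or bare_symbol
        # Aromatic atoms are lowercase in SMILES; normalise "c" -> "C".
        elements.add(symbol.capitalize())
    return elements


def split_by_supported_chemistry(smiles_list, excluded=EXCLUDED_ELEMENTS):
    """Split molecules into (kept, rejected) based on excluded elements."""
    kept, rejected = [], []
    for smiles in smiles_list:
        if elements_in_smiles(smiles) & set(excluded):
            rejected.append(smiles)
        else:
            kept.append(smiles)
    return kept, rejected


# Hypothetical candidate set for illustration only.
candidates = [
    "Cc1ccccc1",             # toluene
    "Ic1ccccc1",             # iodobenzene (contains I)
    "C[Si](C)(C)C",          # tetramethylsilane (contains Si)
    "OB(O)c1ccccc1",         # phenylboronic acid (contains B)
    "Brc1ccccc1",            # bromobenzene (Br must not be read as B)
    "[O-][N+](=O)c1ccccc1",  # nitrobenzene
]
kept, rejected = split_by_supported_chemistry(candidates)
```

This sketch does not cover the macrocycle case; that depends on conformer handling rather than on elemental composition, so it would need a separate check.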