2021-03-02 QCFractal User Meeting notes

Date

Mar 2, 2021

Participants

  • @Jeffrey Wagner

  • @David Dotson

  • @Trevor Gokey

  • @Heejune Park

  • @Hyesu Jang

  • @Pavan Behara

  • @Joshua Horton

Discussion topics

Item

Notes

New meeting time?

  • DD – Could we move this to 8 AM Friday?

    • (General) – Yes

    • JW moved the meeting to 8 AM Friday.

  • BP is giving a talk at a conference at the same time.

Queue status

  • DD – Compute appears to be off on all clusters except PRP.

    • JC reported an issue where his jobs were failing soon after he started them. I need to chat with him about this. I now have access to Lilac for another reason, so I may be able to manage that deployment.

    • TG – I have some jobs running on our old cluster, but need to preserve my compute on the new cluster.

    • DD – I want to make a separate compute tag for non-PEPCONF jobs – Right now everything is expecting high memory nodes.

    • TG – Mobley may be able to get the UCI openforcefield user an account on newer resources, so I’m hoping that comes through soon.

User questions/Open discussion

  • HP –

    • Is there a way to delete a dataset from the server? Want to delete a specification and dataset.

      • DD – The client lets you delete a collection, which doesn’t delete the underlying data but does delete the grouping/name (see the first sketch after this discussion item). QCFractal doesn’t expose a direct way to delete data. I’d recommend against trying to delete the molecules/specifications, since there are lots of links that can break or get out of sync.

      • DD – Instead, I might recommend deleting the entire database.

    • I’ve been test-running calculations on my own computer and it’s going well. How would I do this on a cluster? Would I assign managers in the submission script?

      • DD – There are a few options. 1a) You could start up a QCFractal manager on the head node (or somewhere else publicly accessible), and that manager can submit SLURM jobs. This may get you in trouble with the cluster administrators, since they don’t like people running things on the head node. 1b) You could run a manager as a submitted job and have that manager submit other jobs itself (this may require a special cluster – on some clusters, worker nodes can’t submit other jobs). 2) You could submit managers manually, as a bunch of separate cluster jobs. These would run with “adapter=pool”: each starts up, connects to the server, and begins pulling work.

    • If I start a server locally, I understand that managers that connect to it will do work. But where IS the server?

      • DD – You probably have the server running on your local computer/laptop now. This will be a problem when you’re running workers on the cluster, since the cluster workers won’t be allowed (by the network) to talk to your laptop.

      • DD – The server has a queue of work that it wants to get done. But the server won’t run its own jobs. Work only gets done once a manager is started and connects to the server. The manager has to initiate contact with the server, and then the server will give it a job to do. This is why it’s important for the manager to be able to connect to the server.

      • JW – Run ls ~/.qca to start finding where the server files are.

      • TG – If you hit a “certificate” error, connect with verify=False (see the second sketch after this discussion item).

        • HP – I already had this problem and did set verify=False.
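
    A minimal sketch of the collection-deletion point above, assuming a recent qcportal FractalClient that exposes delete_collection; the server address and collection name here are hypothetical:

      import qcportal as ptl

      # Connect to a running QCFractal server; the address below is hypothetical.
      client = ptl.FractalClient("localhost:7777", verify=False)

      # See which collections the server currently knows about.
      print(client.list_collections())

      # Delete only the collection grouping/name; the underlying molecules,
      # specifications, and results remain in the database.
      client.delete_collection("OptimizationDataset", "My Test Dataset")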
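
    A minimal sketch of checking server reachability from another machine (e.g., a cluster node), assuming the hypothetical server address below and the verify=False workaround mentioned above:

      import qcportal as ptl

      # The address is hypothetical; use the host:port where qcfractal-server runs.
      # It must be reachable from the cluster's network for managers to pull work.
      client = ptl.FractalClient("my-workstation.example.edu:7777", verify=False)

      # If this call succeeds, a manager started on this machine should also be
      # able to connect and pull tasks from the server.
      print(client.server_information())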

  • HJ – For the openff theory benchmarking set v1.0, we combined a torsiondrive dataset and a pre-charged molecule dataset. For some functional group combinations, I expected to see 36 completed jobs.

    • DD – Most resources have been working on PEPCONF (in a high memory queue), so this dataset has kinda stalled.

    • HJ – Could we get workers assigned back to this?

    • DD – Yes, I’ll put workers back in the general queue and assign this dataset high priority.

  • TG – Iodine basis set issue?

    • JW – Swope reported an energy difference of 40 hartrees, which is roughly 25,000 kcal/mol. I wonder if there’s…

    • DD – The issues show up when scf_type is set to df. If it’s set to pk, direct, or out_of_core, he sees consistent results.

    • TG – I think the issue is that df isn’t made to handle iodine.

    • PB – There’s a table that’s supposed to answer this, but the cell of interest is empty.

    • PB – I tried a tri-iodide molecule, and the difference between df and the others was around 200 hartree (a rough sketch of this kind of comparison follows at the end of this item).

    • TG – I think that our QM isn’t valid for anything with iodine right now.

    • DD – Agree. I think our normal workflow doesn’t work for iodine.

    • JW – What’s the difference in computational cost for using a non-direct method?

      • TG – 3-4x for direct.

      • PB – pk took about 11 sec, df took about 6 sec, and direct took about 60 sec.

    • PB – There’s another option – df_basis_scf. If I change that to “dzvp”, the energy difference drops to 2 hartree.

      • TG – That’s still a lot.

    • JH – Looking through some of the existing calculations, I’m seeing some weird iodine geometries (shows iodine on a benzene ring that gets optimized to be out-of-plane)

    • DD – Do we want to move toward submission practices where we have conditional logic for which basis/method we use? Will we encounter this problem with other high-numbered elements?

      • TG – Should energy be comparable for direct and density-fitting?

        • PB – They should be

      • DD – Thinking about the quality of the final data – Will we ultimately have a dataset where individual jobs have their own methods?

    • TG – Is there iodine in the theory benchmark set?

      • HJ – Checking into this.

      • HJ – For the valence fitting set, I’ve found a few molecules with iodine. I’ll check the QM and MM geometries. TG also recommended that I add more molecules to the theory benchmarking set.

    • JW – How would switching to pk change computational cost? Would it increase the scaling exponent (e.g., O(N^4) → O(N^5)), or just the constant factor?

      • PB – Unsure

      • TG – Probably a constant factor

    • JH – We could consider methods/bases other than dzvp.

      • TG – Agree that we want a better basis set.

      • JH – Daniel Smith had said that dzvp will be really hard to converge because it’s an all-electron basis.

    • DD – Bill Swope said that density fitting produces results that are internally consistent, but different from those of the other three scf_type settings.

    • JW – Do our jobs with the df setting really get the wrong energies for the geometries they reach? Or do they just reach weird geometries, but get the “correct” energy assigned for that geometry?

      • TG – I don’t think absolute energies are comparable across scf_type settings anyway. If anything can be compared, it’s RELATIVE energies within the same scf_type.

    • TG – Is iodine the biggest element that we use?

      • Yes
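
    A rough sketch of the scf_type comparison described above, assuming Psi4 is available through QCEngine/QCElemental; the triiodide geometry is only an illustrative guess:

      import qcelemental as qcel
      import qcengine as qcng

      # Hypothetical, roughly linear triiodide (I3-) geometry in bohr, for illustration only.
      mol = qcel.models.Molecule(
          symbols=["I", "I", "I"],
          geometry=[[0.0, 0.0, -5.5], [0.0, 0.0, 0.0], [0.0, 0.0, 5.5]],
          molecular_charge=-1,
          molecular_multiplicity=1,
      )

      # Compare single-point energies across Psi4 SCF algorithms; the df result is
      # the one reported above to differ substantially for iodine-containing molecules.
      for scf_type in ["df", "pk", "direct", "out_of_core"]:
          task = qcel.models.AtomicInput(
              molecule=mol,
              driver="energy",
              model={"method": "b3lyp-d3bj", "basis": "dzvp"},
              keywords={"scf_type": scf_type},
          )
          result = qcng.compute(task, "psi4", raise_error=True)
          print(scf_type, result.return_result)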

  • Next steps

    • We’ll try to get Ben Pritchard or Lee-Ping Wang on the QCF submission meeting next Friday.

      • DD – I’ll do this

    • We’ll add this to the FF release agenda this Thursday.

      • TG and HJ will present on this.

Action items

@David Dotson will create additional QM deployment on PRP, targeting only openff compute tag; will pull everything but PEPCONF at the moment
@David Dotson will reduce the existing QM deployment on PRP; targeting openff-himem, openff compute tags (in that order); largely spends its cycles on PEPCONF
@David Dotson will assign high priority to the theory benchmarking torsiondrive set
@David Dotson will reach out to Ben Pritchard, Lee-Ping Wang for participation at 3/12 submission call for iodine discussion; optionally Daniel Smith if willing
@Trevor Gokey and @Hyesu Jang will present problem and findings on FF release call on 3/4; Trevor will pursue findings from reproducing Bill Swope’s workflow if possible

Decisions