Participants

...

Discussion topics

...

Memory leak/zombie CPU bug

Notes

  • ZW –

    Hi, I wonder if there could be cases where bespoke-executor gets stuck and then ramps up its memory use?
    I have a couple of jobs that are at the QC data generation stage.
    Judging from the CPU usage, all the fragments have been generated and no psi4 process is running. However, the openff-bespoke executor run is using 8 cores and 242.6GB of RAM. The RAM usage keeps going up until it hits the RAM limit, then it begins to read heavily from the file system.
    This problem is not reliably reproducible, but it happens often enough to be quite annoying. Thank you.
    The command that I'm using is:

    Code Block
    openff-bespoke executor run --workflow-file no-fragments-workflow.json --directory bespoke-executor-${SLURM_JOB_NAME} --file lig_64.sdf --output lig_64.json --output-force-field lig_64.offxml --n-qc-compute-workers 6 --qc-compute-n-cores 8
    image-20240117-171827.png
    • JH – Maybe try with xTB to see if that isolates it to psi4?

    • JW – We’re updating to QCF 0.50 soon in case that helps
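One way to narrow down a report like this is to log the executor's resident memory over time and see when the growth starts. The following is a minimal sketch, assuming a Linux host where `/proc/<pid>/status` is readable; the function names are hypothetical and are not part of openff-bespokefit. It samples a single PID only, so worker processes spawned by the executor would need to be sampled separately.

```python
import time
import re
from typing import Optional

# Matches the resident-set-size line of /proc/<pid>/status, e.g. "VmRSS:\t  123456 kB".
_RSS_PATTERN = re.compile(r"^VmRSS:\s+(\d+)\s+kB", re.MULTILINE)


def parse_rss_kib(status_text: str) -> Optional[int]:
    """Return VmRSS in KiB from the text of /proc/<pid>/status, or None if absent."""
    match = _RSS_PATTERN.search(status_text)
    return int(match.group(1)) if match else None


def read_rss_kib(pid: int) -> Optional[int]:
    """Read the current resident memory of one process (Linux only, an assumption)."""
    try:
        with open(f"/proc/{pid}/status") as handle:
            return parse_rss_kib(handle.read())
    except OSError:
        # The process has exited or /proc is unavailable on this platform.
        return None


def log_rss(pid: int, interval_s: float = 60.0, samples: int = 10) -> None:
    """Print a timestamped RSS reading every interval_s seconds, samples times."""
    for _ in range(samples):
        rss = read_rss_kib(pid)
        print(time.strftime("%Y-%m-%d %H:%M:%S"), pid, rss)
        time.sleep(interval_s)
```

Calling `log_rss(executor_pid)` from a side process, where `executor_pid` is the PID of the running `openff-bespoke executor run` job, would show whether memory climbs steadily or jumps at a particular stage.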

MACE paper

  • DC – Released MACE paper, our contribution to NN parameters.

  • JH – I have a PR to get this into QCEngine. MACE currently has an academic license, though. AIMNET is also looking good.

  • DC – Right, industry folks would need to contact that group to get access.

  • ZW – One problem we had with ANI was that it had trouble converging in geometry optimizations. Is this a problem with MACE?

    • JH – Not a problem with MACE or AIMNET2.

    • ZW – How do you call MACE? We have a custom in-house one that we could plug into bespokefit. What’s your interface?

    • JH – We have an interface in QCEngine. You can provide a path to a model file.

  • JH – Re: idea for a bespokefit successor that includes confs from high-temp MD. Started with CC’s torsiondrives. Looked at transferring SMIRKS patterns from fragments to other fragments.

    • CC – Did you check that these SMIRKS also cover terminal amino acids in a longer peptide?

      • JH – Not yet, I’m concerned about that case.

      • CC – That’d be a good test to run.

    • JW – I’d love to see a writeup of the details of this SMIRKS scheme so I can provide better feedback.

  • DC – Idea is that, if this works, and there’s a good ML energy function, this might remove the need for expensive fragmentation and QM.

  • ZW – Have you had time to investigate why fragmentation worked well in the paper, but not now? The published parameters were different from what I got when I tried the refit.

    • JH – I think there’s a regression somewhere, so I still need to dig into that.

    • DC – I think, once you get this workflow going, this should be similar to what SB and LW did earlier.

  • PB – Torsion benchmarks: I was comparing single points between AIMNET2, MACE, and xTB, against reference geometries from CCSD. The results seem wonky and the errors are pretty high. (See plots from recording)

    • DC – I’d argue that benchmarks should be done with optimizations done using the same level of theory as the single points. But there I’d expect errors around 1 kcal/mol, not the 10 that you’re seeing.

    • JW – Is this a random set of molecules, or are these the biggest outliers?

    • PB – This is a standard dihedral benchmark set.

    • JH – Did MACE look similar?

    • PB – Yes

    • DC – Where did you get this implementation?

    • PB – (ASE(ASC?), something about Kovacs lab)

    • PB – The references are CCSD single points at the MP2-optimized geometries.

    • DC – So you’re basically doing the MACE calcs at the MP2 opts?

    • JH – Is this dataset on QCA?

    • PB – Yes, will send info

    • SB – Have you tried minimizing with MACE? Does it go to a different conformer?

    • JH – Are they charged? MACE doesn’t handle charge.

    • PB – Some are charged.

    • JH – But AIMNET2 should handle charge, so that doesn’t explain everything.

    • PB – I’ll coordinate with JH on this.
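The error figures discussed above (around 1 kcal/mol expected, around 10 observed) depend on how the torsion profiles are compared. The sketch below shows one common convention, which is an assumption here and not necessarily what PB used: each profile is shifted so its own minimum is zero, converted from Hartree to kcal/mol, and then compared point by point with an RMSE. The function names are hypothetical.

```python
import math
from typing import List, Sequence

# Standard conversion factor from Hartree to kcal/mol.
HARTREE_TO_KCAL_PER_MOL = 627.509474


def relative_profile_kcal(energies_hartree: Sequence[float]) -> List[float]:
    """Shift a profile so its minimum is zero and convert it to kcal/mol."""
    minimum = min(energies_hartree)
    return [(e - minimum) * HARTREE_TO_KCAL_PER_MOL for e in energies_hartree]


def profile_rmse_kcal(
    reference_hartree: Sequence[float], model_hartree: Sequence[float]
) -> float:
    """RMSE in kcal/mol between two torsion profiles sampled on the same grid.

    Assumption: both sequences list energies at the same dihedral angles,
    in the same order, and each is referenced to its own minimum.
    """
    if len(reference_hartree) != len(model_hartree):
        raise ValueError("profiles must have the same number of grid points")
    if not reference_hartree:
        raise ValueError("profiles must not be empty")
    reference = relative_profile_kcal(reference_hartree)
    model = relative_profile_kcal(model_hartree)
    squared = [(r - m) ** 2 for r, m in zip(reference, model)]
    return math.sqrt(sum(squared) / len(squared))
```

Because each profile is referenced to its own minimum, a constant offset between two methods contributes nothing to the RMSE, so only differences in the shape of the profile are measured.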

Action items

  •  

Decisions