2023-10-03 alchemiscale Working Group meeting notes

Participants

  • @David Dotson

  • Jenke Scheen

  • David Swenson

  • @Irfan Alibay

  • @James Eastwood

  • @John Chodera

  • Levi Naden

  • @Mike Henry

  • @Jeffrey Wagner

Recording: https://drive.google.com/file/d/1egYSFSNJqcoB-LZl8S4r9Eg59qaBf7AY/view?usp=sharing

Goals

 

  • alchemiscale.org user group

    • user questions / issues / feature requests

      • DD : hif2a issues resolved?

      • JS : residue protonation issues

      • JS : retrieving final frame of a simulation?

      • JS : auto-restarting certain types of errored edges

    • compute resources status

    • current stack versions:

      • alchemiscale: 0.2.0

      • gufe: 0.9.4

      • openfe: 0.13.0

      • perses: protocol-neqcyc

  • DD : gufe#184 - openmmforcefields next release?

  • IP :

    • Protein-ligand benchmarks working group update

    • Protocols migration to

  • alchemiscale development : current sprint ends 10/9

    • architecture overview : PL Benchmarks on FAH - Architecture v6.drawio

    • coordination board : alchemiscale : Phase 3 - Folding@Home, new features, optimizations, targeted refactors

      • call for volunteers for available issues

    • alchemiscale 0.3.0 milestone:

    • alchemiscale-fah: 0.1.0 milestone

    • updates on In Review, In Progress, and Available cards

  • new discussion items from ASAP roadmap: ROADMAP: Computational Chemistry Core alchemiscale-related roadmap | Notion

Discussion topics

Notes

  • alchemiscale.org user group

    • user questions / issues / feature requests

      • DD : hif2a issues resolved?

        • IA – My understanding is that MO and HB are trying out the monomer form and will see what the outcomes are.

      • DD: JS - residue protonation issues update?

        • JS – Unfortunately I can’t pull down the error due to some technical stuff. But I don’t think this is an alchemiscale issue, I think I just prepped the protein incorrectly.

      • JS : retrieving final frame of a simulation? We have calcs where something is going wrong: predictions vs. experiment are consistently off. We’d like to be able to look at the trajectory to do some checks by eye. We understand that getting the whole traj would be prohibitive, but getting the last frame might help.

        • JC – Maybe we could offer a way to rerun the sim locally, and keep the files for inspection?

        • DD – We already support rerunning locally (see the alchemiscale User Guide). This should also keep the files, or you may need to pass an arg/flag to have it keep them.

          • DS – We’re using this pathway to do our debugging.

        • JS – Would still like a way to get the info without rerunning. Using OpenFE hybridtopologyprotocol.

        • DD – DS, what are OpenFE’s thoughts on this?

          • DS – This would be a protocol-specific limitation. We don’t have time to do this in the near future.

          • IA – This depends on how much we want to stuff into the return value from these calcs. By default the protocol will make a PDB topology. So we’d recommend generating the topology locally by “dry running” the protocol, then the final coordinates could be space-efficiently downloaded as a numpy array. This would be an extra thing we’d need to add to our protocol, so we’d want to spec this out before we start gluing things on.

          • JS – Is this an OpenFE feature request then?

          • IA – For this protocol, yes

          • DD – We’re working on putting extends support (which would make the final-frame-getting possible) in noneqcycling protocol. But that would be a different protocol than you’d asked about.

          • JS – I’ll discuss with ASAP team. There’s value in being able to pull down the final frame of each replica.

          • IA – This may be interesting - final frame pulldown could be useful - but it adds a lot of size to our return values. Could be an extra setting that’s just used for alchemiscale.

          • JS – I’ll open a PR on openFE repo.

          • JW – How much would this affect storage size/requirements?

            • DD – hard to estimate
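As a rough aid to JW's storage question, the sketch below gives a back-of-envelope size for the coordinates alone, following IA's suggestion of returning final coordinates as a plain array with the topology regenerated locally. The function name, the atom count, and the replica count are all hypothetical and were not quoted in the meeting. The figure ignores compression, box vectors, and serialization overhead, so it is only a starting point for estimation.

```python
def final_frame_bytes(n_atoms: int, n_replicas: int, bytes_per_float: int = 4) -> int:
    """Uncompressed size of one final frame per replica (x, y, z per atom).

    Hypothetical helper for estimation only; not part of alchemiscale,
    gufe, or openfe.
    """
    return n_atoms * 3 * bytes_per_float * n_replicas


# Illustrative numbers only: a 50,000-atom system, 11 replicas, float32 coords.
size = final_frame_bytes(n_atoms=50_000, n_replicas=11)
print(f"{size / 1e6:.1f} MB of raw coordinates")  # prints "6.6 MB of raw coordinates"
```

Multiplying by the number of edges and repeats in a network gives a total for the coordinates only; the real cost depends on how the protocol serializes its results.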

      • JS : auto-restarting certain types of errored edges - Basically, I’m seeing edges error stochastically with a simulation NaN error. Whether that’s a quirk of hardware or software is unknown, and rerunning usually resolves it. So I’d like there to be an automated way to have the server retry once or twice.

        • DD – There are a few layers to this - It’d make sense to make an error cycling spec, but this isn’t something we’ve discussed yet. We also have a parameter for execute_DAG to do retries… in gufe/protocols/protocoldag.py there’s an option for n_retries, which I’m leaving as the default (0) but I could crank up.

        • DS – We’ve seen NaN errors but setting n_retries to 3 smooths those over
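The retry behavior discussed above can be illustrated with a minimal, self-contained sketch. This is not gufe's implementation: `SimulationNaNError`, `run_with_retries`, and `make_flaky` are hypothetical names used only to show what an `n_retries`-style setting means, with 0 meaning "fail on the first error" and 3 meaning "allow up to three further attempts".

```python
class SimulationNaNError(RuntimeError):
    """Hypothetical stand-in for a stochastic simulation NaN failure."""


def run_with_retries(unit, n_retries=0):
    """Call `unit()` and retry up to `n_retries` times on SimulationNaNError.

    Returns (result, retries_used). Re-raises once retries are exhausted.
    """
    retries_used = 0
    while True:
        try:
            return unit(), retries_used
        except SimulationNaNError:
            if retries_used >= n_retries:
                raise
            retries_used += 1


def make_flaky(failures):
    """Build a callable that fails `failures` times, then succeeds."""
    state = {"left": failures}

    def unit():
        if state["left"] > 0:
            state["left"] -= 1
            raise SimulationNaNError("NaN in coordinates")
        return "ok"

    return unit


result, retries_used = run_with_retries(make_flaky(2), n_retries=3)
print(result, retries_used)
```

With the default of 0, a single stochastic failure propagates immediately, which matches the errored edges JS describes. A small nonzero value absorbs occasional failures, at the cost of hiding how often they occur unless the retry count is recorded.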

    • compute resources status

      • DD – Plenty of capacity available.

      • IA – I’ve got lots of jobs to submit.

      • MH – I can spin up OSG as well, just let me know when it’s needed and which scope to set it to.

      • DD – I wonder if I can give you a quick script to pull down scope status and automatically kick off jobs.

      • MH – I’d previously tried this but the status for “waiting” had a bunch of root causes and it wasn’t reliable for knowing how many workers to spin up.

        • DD – Alchemiscale 173 could do this - do you have bandwidth to action this?

        • MH – I can’t commit right now.

        • DD – I’ll talk to Josh Horton and Hugo.

        • JS – I’ll have JH start joining these meetings.

        • DD – MH, could you open an issue to help guide how we want to do status reporting/dashboarding?

        • MH – Yes, and I think it’ll be important to describe the API underlying this.
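The "pull down scope status and kick off jobs" idea could look roughly like the sketch below. Everything here is an assumption for illustration: the shape of the status dictionary, the `workers_to_start` helper, and the example numbers. As MH notes, a "waiting" count can have several root causes, so this treats it only as an upper bound on claimable work and caps the result by available capacity.

```python
def workers_to_start(status_counts, running_workers, max_workers):
    """Suggest how many additional workers to launch for a scope.

    status_counts: assumed mapping of task status -> count, e.g. {"waiting": 40}.
    "waiting" is treated as an upper bound on work that could be claimed.
    """
    waiting = status_counts.get("waiting", 0)
    needed = max(0, waiting - running_workers)
    capacity = max(0, max_workers - running_workers)
    return min(needed, capacity)


# Hypothetical status snapshot for one scope.
counts = {"waiting": 40, "running": 5, "complete": 100}
print(workers_to_start(counts, running_workers=5, max_workers=20))  # prints 15
```

A more reliable signal than the raw "waiting" count would need the status reporting discussed in the issue DD asks MH to open.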

    • current stack versions:

      • alchemiscale: 0.2.0

      • gufe: 0.9.4

      • openfe: 0.13.0

      • perses: protocol-neqcyc

  • DD : gufe#184 - openmmforcefields next release?

    • (IP and JC weren’t in attendance at this time, MH will try to look into this)

  • IP :

    • Protein-ligand benchmarks working group update

    • DD – I’m working on migrating non-OE-dependent perses protocols to

    • IA – I’ll reach out to IP to sync on PLB - just getting back to work after conferences/time off

  • alchemiscale development : current sprint ends 10/9

    • architecture overview : PL Benchmarks on FAH - Architecture v6.drawio

    • coordination board : alchemiscale : Phase 3 - Folding@Home, new features, optimizations, targeted refactors

      • call for volunteers for available issues

      • DD – Alchemiscale 156, 173, two other issues - JS is this in line with what you want JHorton working on?

        • JS – Kinda. I can justify having him implement stuff that’s directly useful to us.

        • DD – Yeah, I assume that he can work on stuff and upstream the parts that are of common value

    • alchemiscale 0.3.0 milestone:

      • DD – The milestone is pretty big, we may make the 0.3 release with a subset of the features here and add more in minor releases

    • alchemiscale-fah: 0.1.0 milestone

      • DD – I’ll start updating on this - IZ may be able to help with #6 (Times Square sampling)

        • JW – Since JC often has to drop early, I’d suggest we move this part up to make sure that he’s here when we coordinate work with his lab.

        • DD – I’ll move the development update to after the user group for future meetings

    • updates on In Review, In Progress, and Available cards

  • MH – Alchemiscale is now on conda-forge (C-F)!

    • (General jubilation)

Action items

Decisions