2023-08-22 alchemiscale Working Group meeting notes

Participants

@David Dotson
Levi Naden
@David W.H. Swenson
Hannah Baumann
@Irfan Alibay
@Iván Pulido
@James Eastwood
Jenke Scheen
@John Chodera
Meghan Osato
@Mike Henry
Richard Gowers
@Jeffrey Wagner

Recording: https://drive.google.com/file/d/1XwRke_nSxpAZPbOpWzwuY0PXBW7Gi1b0/view?usp=sharing

Goals

alchemiscale.org user group
- user questions / issues / feature requests
- results to share?
- compute resources status
- call for new users
- current stack versions:
  - alchemiscale: 0.1.4
  - gufe: 0.9.1
  - openfe: 0.11.0
  - perses: protocol-neqcyc
JW : future of openmmforcefields?
DD : update on multi-gpu execution for single ProtocolDAG
- DS, RG, and DD will work on establishing labels on Protocols and ProtocolUnits that communicate their desired capabilities to the executor
- DD and DS can work in parallel on alchemiscale and exorcist to use these labels to optimally allocate compute resources, with the allocation communicated back down to the ProtocolDAG on execution
DD : user guide now up on alchemiscale latest docs on RTD
- User Guide — alchemiscale documentation
IP : Protein-ligand benchmarks working group update
DD: which GitHub org should FAHComputeService be developed under?
alchemiscale development : current sprint complete; next sprint runs 8/23 - 9/4
- current aim is to complete release 0.2.0 by end of next sprint, including first round of documentation
- architecture overview : PL Benchmarks on FAH - Architecture v6.drawio
- coordination board : alchemiscale : Phase 2 - User Feedback and Documentation
- alchemiscale 0.2.0 milestone:
  - review of Complete cards
  - updates on In Review, In Progress, and Available cards
new discussion items from ASAP roadmap: ROADMAP: Computational Chemistry Core alchemiscale-related roadmap | Notion

Discussion topics

Notes

`alchemiscale.org` user group
- user questions / issues / feature requests
  - IP – Message from Sweta: what’s going on with the lilac queues?
  - DD – We can’t run on pre-emptible queues for now. The GPU usage was somehow interacting with how LSF does job pre-emption, and the GPUs weren’t getting pre-empted. So FYI to ASAP folks: you’re not running on lilac pre-emptible right now.
  - JW – Right, OpenFF had to apply for PRP and we used the word “open” in our justification. So if ASAP has a different policy I’d feel weird about putting ASAP jobs in our namespace. But ASAP folks would likely get an allocation if they applied!
  - DD – I’ll ping JW about the application process on behalf of ASAP.
- results to share?
  - IA – I have some, but they’re not pulled up right now. (IA will schedule a meeting with IP to discuss recent results.)
  - DD – This would be useful for me to collect feedback for future improvements.
- compute resources status
  - DD – (Impressive stats, see recording ~15 minutes)
  - JC – Do we have metrics/statistics about how long jobs are taking?
  - DD – We do have GUFE recording start times and end times for ProtocolUnits, so the ProtocolDAGResult can be inspected to get runtimes. I can also produce throughput metrics for different resources (tasks over time).
  - JC – We should keep our eyes open for opportunities to add performance measurement tooling to guide future rounds of optimization …
  - JC – Would it be possible to have a GUFE/alchemiscale object to harvest something like a dict of {GPU:performance_metrics}?
  - DS – I think standardizing across multiple protocols will be difficult and not worth the time. You’re free to add that into your protocol.
  - JC – …
  - IP – How are we with getting mid-run info? Openmmtools has a yaml that you can inspect mid-run to see progress.
  - DS – GUFE 186 should have some of what we want there.
  - JC – There may be a simple stopgap solution: we can add arbitrary dicts into result objects. So we could take the openmmtools yaml and put that into the result dict.
  - DD – Right, that would be the easiest path.
  - IP – And this would actually get pulled into the OpenFE work, since they use openmmtools.
  - JC – We can take this offline, but it’d be helpful to clarify that there’s a “Statistics hole” where we dump performance info that we can access later.
  - DD – IP or JC, could you open an issue in openfe/openfe to collect thoughts on this?
  - JC – Yes, will do.
  - DS – If you’re thinking of collecting info on different steps in the process, then the current steps should work, but the current steps could be made more modular to get more fine-grained info.
- call for new users
- current stack versions:
  - `alchemiscale`: 0.1.4
  - `gufe`: 0.9.1
  - `openfe`: 0.11.0
  - `perses`: `protocol-neqcyc`
  - RG – We did an OpenFE release on Friday.
  - DS – But there are no changes essential to alchemiscale, so no need for an upgrade.
  - DD – Ok, I’ll hold off on upgrades. Are data model changes still coming?
  - IA – We held off on those because we needed a quick release, but we plan for the data model changes in the next release.
  - DD – Ok, I’ve opened alchemiscale #168 to discuss migration machinery.
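The runtime and throughput ideas from the compute resources discussion can be sketched roughly as follows. This is only an illustration under stated assumptions: `UnitRecord` and its `resource` field are hypothetical stand-ins for whatever a result object exposes, not gufe or alchemiscale classes, and the final `stats` dict is one possible shape for the “Statistics hole”, not an agreed format.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class UnitRecord:
    """Hypothetical stand-in for one finished unit; not a gufe class."""
    name: str
    resource: str          # assumed label, e.g. a GPU model or cluster name
    start_time: datetime
    end_time: datetime


def unit_runtimes(records):
    """Map each unit name to its wall-clock runtime in seconds."""
    return {r.name: (r.end_time - r.start_time).total_seconds() for r in records}


def throughput_by_resource(records):
    """Completed units per hour for each resource, measured over the span
    from that resource's first start to its last end."""
    grouped = defaultdict(list)
    for r in records:
        grouped[r.resource].append(r)
    out = {}
    for resource, recs in grouped.items():
        span = max(r.end_time for r in recs) - min(r.start_time for r in recs)
        hours = span.total_seconds() / 3600
        out[resource] = len(recs) / hours if hours > 0 else float("nan")
    return out


# Made-up example data, purely for illustration
t0 = datetime(2023, 8, 22, 12, 0, 0)
records = [
    UnitRecord("unit-a", "gpu-x", t0, t0 + timedelta(minutes=30)),
    UnitRecord("unit-b", "gpu-x", t0 + timedelta(minutes=30), t0 + timedelta(hours=1)),
    UnitRecord("unit-c", "gpu-y", t0, t0 + timedelta(hours=2)),
]

runtimes = unit_runtimes(records)
throughput = throughput_by_resource(records)

# One possible "statistics hole": a plain dict that rides along with results
stats = {"runtimes_s": runtimes, "units_per_hour": throughput}
```

Keeping the stats as a plain dict matches the stopgap JC described, where arbitrary dicts are added to result objects without standardizing across protocols.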
JW : future of `openmmforcefields`?
- ecosystem dependent on `openmmforcefields`
- bit of an unwieldy beast; looking for alternate paths for functionality
- future of `interchange` looks different with `openmmforcefields` existence/nonexistence
- what is the long-term intention for ommffs?
- JC – It was initially scoped to support additional ffxmls on a “best effort” basis, intended to be a shim to extend the life of those ffxmls. So it would be hard to replace its GAFF functionality… The big thing we need is for Interchange to ingest and produce OpenMM system objects.
  - getting rid of `openmmforcefields` means we are forsaking things like CHARMM, GAFF, etc.
  - was always painful, and hard to maintain
  - all of these FFs have different assumptions on atom names, connectivity, etc.
  - Amber community not even sure what GAFF means at this point, too; so questionable how much more resource to expend on this support
- JW – How important to OpenFE is GAFF support?
- JC – Do your funders know which GAFF they want?
- IA – We just need to be able to say what we used previously; we almost don’t care which version it is prospectively.
- JC – If all you need is “standard” import and export for OpenMM, is that sufficient? Anything else you need it to do?
- JW – Interchange will be able to do “standard” OpenMM import for our definition of “standard”. We can revisit/discuss limitations if we encounter them!
- JW – We don’t generally need new features, but do need new versions of Python supported, CI fixes, etc.
- JC – OpenMMForceFields will remain maintained (CI green, Python support) but we won’t plan on major new features/FF additions. Peter Eastman should be responsible for this level of support. Can reiterate this to him.
DD : update on multi-gpu execution for single `ProtocolDAG`
- DS, RG, and DD will work on establishing labels on `Protocol`s and `ProtocolUnit`s that communicate their desired capabilities to the executor
  - DS – Agree. Additionally, this is written only talking about protocolunit and protocolrequest. There should also be communication in the other direction, where a compute manager tells the server how many resources were actually granted. Also, let me know where to put user stories about what this should support.
  - IP – Just to reiterate, protein mutations are costly, so in IZ’s work we ran lambda windows in parallel on multiple GPUs. So that’s what we’d want to do here.
  - DD – We should open an issue for this.
  - DS – This would make sense in GUFE.
  - DD – I’ll open a GUFE issue for this, and will post a broader call for user stories.
- DD and DS can work in parallel on `alchemiscale` and `exorcist` for making use of these labels to optimally allocate resources on compute
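To make the two-way communication DS described more concrete, here is a minimal sketch: a unit carries a resource label, the compute side decides what it actually grants, and the unit then spreads lambda windows over the granted GPUs as in IP's use case. Everything here is hypothetical and for discussion only: `ResourceRequest`, `grant`, and `assign_windows` are illustrative names, not existing gufe, alchemiscale, or exorcist APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceRequest:
    """Hypothetical label a unit could carry; not an existing gufe API."""
    min_gpus: int = 1
    max_gpus: int = 1


def grant(request, free_gpus):
    """Compute-side decision: how many GPUs this unit actually gets.
    Returns 0 when even the minimum cannot be met."""
    if free_gpus < request.min_gpus:
        return 0
    return min(request.max_gpus, free_gpus)


def assign_windows(n_windows, n_gpus):
    """Round-robin lambda windows over the granted GPUs."""
    if n_gpus < 1:
        raise ValueError("need at least one GPU")
    assignment = {gpu: [] for gpu in range(n_gpus)}
    for window in range(n_windows):
        assignment[window % n_gpus].append(window)
    return assignment


# The unit asks for up to 4 GPUs, but only 3 are free on this node
request = ResourceRequest(min_gpus=1, max_gpus=4)
granted = grant(request, free_gpus=3)

# The granted count is passed back down so the unit can plan its work
plan = assign_windows(n_windows=8, n_gpus=granted)
```

The point of separating `grant` from `assign_windows` is that the request is only a wish; the unit has to plan around the number it is told it received.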
DD : user guide now up on `alchemiscale` `latest` docs on RTD
- Should look familiar to existing users: it’s the same as the Confluence-hosted instructions. It’s incomplete, so you’ll find some placeholders for now.
- User Guide (alchemiscale documentation)
- MH – There is a very recent Sphinx bug (https://github.com/sphinx-doc/sphinx/issues/11586), and that is why the menu isn't working (we do have a few versions built).
IP : Protein-ligand benchmarks working group update
- IP – Two weeks ago, we mentioned that we’re adding the star/radial maps that HBaumann preferred, and we’ll be running FE calcs with them. I’d also be uploading KTakaba’s manually-curated networks. KT’s networks are already there, so I just need a review. In the review I’d like to check that we get the same result using the OpenFE protocol.
- IA – That’s on HBaumann’s to-do list.
- IP – Gotcha. I think that’ll be interesting but it’s not a requirement for merge.
- IA – Agree, this should be ready to approve.
- DD – Is this contributing to a 0.3.0 release?
- IP – Yes, this would get us over the line for our minimal release requirements.
- DD – It’d be good to remove the other items from the 0.3.0 milestone/project board.
DD: which GitHub org should `FAHComputeService` be developed under?
- JW – I’ll throw OpenFF’s hat in the ring. We already have alchemiscale, and the F@H interface is mostly for our needs.
- JS – Not sure that ASAP would want this; could be good to see where it goes.
- IP – OpenFF would make the most sense. Alchemiscale is already there.
- DD – I’ll proceed with putting this in OpenFF.
`alchemiscale` development : current sprint complete; next sprint runs 8/23 - 9/4
- current aim is to complete release 0.2.0 by end of next sprint, including first round of documentation
- architecture overview : PL Benchmarks on FAH - Architecture v6.drawio
- coordination board : alchemiscale : Phase 2 - User Feedback and Documentation
- `alchemiscale` 0.2.0 milestone:
  - DD – Most of this is docs improvements, aiming to get this out in early September.
  - review of Complete cards
    - DD – Alchemiscale 28 – User guide – Done, but let me know if folks have more input and we can keep iterating.
  - updates on In Review, In Progress, and Available cards
    - In review
      - IP – PLB 93 – Waiting on comparison studies by HB, mentioned in meeting notes above.
      - IA – Agree, once we have benchmarks this should be good.
      - IP – Perses 1066 – Noneq cycling – We had a meeting earlier. The idea is that we’ll put out the 0.10.3 release, then this will get merged into main, and it will be closed that way.
      - DD – Alchemiscale 30 – Docs stuff – Should be ready for review. MH, could you take a look?
      - DD – Alchemiscale 132 – Upstreamed proposed solution to GUFE 215. DS, no rush on this, just wanted to make sure you were aware.
      - DS – Thanks, yeah, I’m keeping track of this. It’s just missing tests.
    - In progress
      - DD – User story reviews – I’m running through the ASAP ones right now. I know some of the ongoing work in ASAP discovery is working on this, in particular JHorton’s work.
      - DD – Alchemiscale 29 – I’ll be adding a tutorial to the alchemiscale docs.
      - IP – Does it make sense if I share a notebook with noneq cycling?
      - DD – That’d be great. Please send it over and I’ll sculpt it to fit.
      - IP – ticket: Test noneq protocol against repex protocol – Goal is to compare to targets other than TYK2. Still in progress.
      - DD – OpenFF user story review. Will get to this, just doing ASAP first since they’re harder.
      - DD – I’m working on another docs one.
      - DD – And the API docs are weird, might need to reach out.
      - MH – Happy to help if you have issues.
new discussion items from ASAP roadmap: ROADMAP: Computational Chemistry Core alchemiscale-related roadmap | Notion

Action items

@David Dotson will pursue PRP access for ASAP Discovery computation

@David Dotson will create anchor issue on gufe for multi-GPU execution proposal, user stories
