2023-09-12 alchemiscale Working Group meeting notes

Participants

  • @David Dotson

  • @David W.H. Swenson

  • @James Eastwood

  • Jenke Scheen

  • @John Chodera

  • Meghan Osato

  • @Mike Henry

  • @Jeffrey Wagner

  • Levi Naden

  • @Iván Pulido

Recording: https://drive.google.com/file/d/1PLrDVMOHHKAB1UqwvcWDjiIUwXp14J3A/view?usp=drive_link

Goals

  • alchemiscale.org user group

    • user questions / issues / feature requests

    • results to share?

    • compute resources status

    • call for new users

    • current stack versions:

      • alchemiscale: 0.1.4

      • gufe: 0.9.1

      • openfe: 0.11.0

      • perses: protocol-neqcyc

  • DD : gufe#184 - openmmforcefields next release?

  • IP : Protein-ligand benchmarks working group update

  • alchemiscale development : current sprint spans 9/6 - 9/18

    • 0.2.0 release still on target for week of 9/19; now focused on result file storage and retrieval in alchemiscale#104

      • DD : might need to do a working session or two with David Swenson to work through interfacing with gufe#186

    • architecture overview : https://drive.google.com/file/d/1ZA-zuqrhKSlYBEiAIqxwNaHXvgJdlOkT/view?usp=share_link

    • coordination board : alchemiscale : Phase 2 - User Feedback and Documentation

    • alchemiscale 0.2.0 milestone:

    • review Done cards

    • updates on In Review, In Progress, and Available cards

  • new discussion items from ASAP roadmap: https://asapdiscovery.notion.site/Computational-Chemistry-Core-alchemiscale-related-roadmap-9052f215437246f9be4a11abd10f6d71

Discussion topics

Notes

  • alchemiscale.org user group

    • user questions / issues / feature requests

      • DD – HBaumann wrote in the OpenFE core-devs channel that she was finding more results than she had submitted. The likely cause is that we don’t have guardrails for servers/managers to reject runs that they didn’t expect. If a worker misses a few heartbeats but keeps running, its task can be assigned to another worker, and the server then accepts the results returned by both.

      • DS – Would be useful to see how widespread this is - is it just happening once or twice, or more?

      • DD – I’ll contact HB to get the instances where she saw this to start debugging. This may not get to the bottom of it but there’s a good chance that it will. Would we want to keep double-run edges?

      • DS – Would prefer to keep all runs that were actually done, no need to throw out unexpected/double-assigned edges.
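The double-assignment scenario DD describes can be illustrated with a toy model. This is only a sketch under assumptions: the `Task` and `Server` classes, the `heartbeat_timeout` parameter, and the method names are hypothetical and are not alchemiscale's actual API. In line with DS's preference, the sketch keeps every returned result and only flags the ones that came from a worker that no longer holds the claim.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Task:
    """Hypothetical stand-in for a unit of work tracked by the server."""
    key: str
    claimed_by: Optional[str] = None
    last_heartbeat: float = 0.0
    # Each entry is (worker name, whether that worker held the claim on return).
    results: List[Tuple[str, bool]] = field(default_factory=list)


class Server:
    """Toy server: hands out claims and records every result that comes back."""

    def __init__(self, heartbeat_timeout: float) -> None:
        self.heartbeat_timeout = heartbeat_timeout
        self.tasks: Dict[str, Task] = {}

    def add_task(self, key: str) -> None:
        self.tasks[key] = Task(key)

    def claim(self, key: str, worker: str, now: float) -> bool:
        """Grant the claim if the task is free or its heartbeat has expired."""
        task = self.tasks[key]
        expired = now - task.last_heartbeat > self.heartbeat_timeout
        if task.claimed_by is None or expired:
            task.claimed_by = worker
            task.last_heartbeat = now
            return True
        return False

    def submit_result(self, key: str, worker: str) -> bool:
        """Keep the result either way; return False if it was unexpected."""
        task = self.tasks[key]
        expected = task.claimed_by == worker
        task.results.append((worker, expected))
        return expected


server = Server(heartbeat_timeout=30.0)
server.add_task("edge-1")
server.claim("edge-1", "worker-a", now=0.0)
# worker-a stops sending heartbeats but keeps computing, so the claim expires
# and the same task is handed to worker-b.
server.claim("edge-1", "worker-b", now=100.0)
server.submit_result("edge-1", "worker-a")  # unexpected, but still stored
server.submit_result("edge-1", "worker-b")  # expected
```

After this sequence the single submitted task carries two results, which matches the symptom reported above. Flagging rather than rejecting leaves it to the client to decide whether the extra run is used.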

    • results to share?

    • compute resources status

      • DD – About 100 workers going on NRP. 40 workers on pre-emptible queue on lilac. Will scale with workload.

    • call for new users

    • current stack versions:

      • alchemiscale: 0.1.4

      • gufe: 0.9.1

      • openfe: 0.11.0

        • DD – DS or MH, any updates on new version/migration tools?

        • MH – We’re getting ready to stand up a testing instance and will report back. Building a docker-compose file to generalize testing instances.

        • DD – I’d love it if you could share this docker-compose file once it’s done; it would be very handy for other people.

      • perses: protocol-neqcyc

  • DD : gufe#184 - openmmforcefields next release?

    • IP – Working on some other changes for espaloma, but those aren’t blocking releasing the current changes. So I can prioritize the current release.

    • JW – That’d be great. Thanks, Iván!

  • IP : Protein-ligand benchmarks working group update

  • alchemiscale development : current sprint spans 9/6 - 9/18

    • 0.2.0 release still on target for week of 9/19; now focused on result file storage and retrieval in alchemiscale#104

      • DD : might need to do a working session or two with David Swenson to work through interfacing with gufe#186

    • architecture overview : https://drive.google.com/file/d/1ZA-zuqrhKSlYBEiAIqxwNaHXvgJdlOkT/view?usp=share_link

    • coordination board : alchemiscale : Phase 2 - User Feedback and Documentation

    • alchemiscale 0.2.0 milestone:

    • review Done cards

    • updates on In Review, In Progress, and Available cards

      • DD – alchemiscale#178 - Should allow retrieval of results without constraints on the server version.

        • JC – Which compression are you using? I’d previously found big differences between bzip and zlib…

        • DD – I think we’re doing two levels of compression… I don’t remember the details. Happy to benchmark.

        • (DD looks at code, discusses. See recording, ~28 minutes)

        • JC – Could have a compression kwarg which can take a string as a compression type, or True to use a default compression type.

        • DD – Good idea.
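JC's suggestion of a `compression` kwarg that accepts either a string naming the compression type or `True` for a default could look roughly like the sketch below. The function name `serialize`, the choice of `zlib` as the default, and the set of supported names are assumptions made for illustration; the notes do not record which compressors alchemiscale actually uses.

```python
import bz2
import json
import zlib
from typing import Any, Callable, Dict, Union

# Hypothetical registry of supported compressors, using only the standard library.
_COMPRESSORS: Dict[str, Callable[[bytes], bytes]] = {
    "zlib": zlib.compress,
    "bz2": bz2.compress,
}
_DEFAULT_COMPRESSION = "zlib"  # assumed default, not taken from the notes


def serialize(obj: Any, compression: Union[bool, str] = False) -> bytes:
    """Serialize `obj` to JSON bytes, optionally compressing the output.

    compression=False  -> no compression
    compression=True   -> use the default compressor
    compression="bz2"  -> use the named compressor
    """
    raw = json.dumps(obj, sort_keys=True).encode("utf-8")
    if compression is False:
        return raw
    name = _DEFAULT_COMPRESSION if compression is True else compression
    try:
        compress = _COMPRESSORS[name]
    except KeyError:
        raise ValueError(f"unknown compression type: {name!r}") from None
    return compress(raw)


payload = {"estimates": [0.1] * 1000}
sizes = {
    "none": len(serialize(payload)),
    "zlib": len(serialize(payload, compression=True)),
    "bz2": len(serialize(payload, compression="bz2")),
}
```

Comparing the entries of `sizes` on representative result payloads would be one simple way to run the benchmark DD offered to do.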

      • DD – alchemiscale#104 - Result path conversion and object store upload – Working on this, spoke with Swenson late last week. Will need some dummy components for testing, I’ll talk with Swenson.

        • DS – This looks a lot like stuff that will go into GUFE as well. But I’m out Thursday and Friday this week. I could do Wednesday afternoon though.

      • JC – JS, IP, have you been able to test these out?

        • JS – I’ve been using ASAP-alchemy. Main bottleneck is that not everything in alchemiscale has been implemented in ASAP-alchemy. Like right now I can’t pull down errors because we haven’t built that into ASAP-alchemy.

        • IP – I haven’t run networks in a while, may do that in the coming days. I can help improve the ASAP-alchemy client too.

        • asapdiscovery #457

          • IP – I’ll take first checklist item on this PR

            • JS – Thanks, we can coordinate on that

          • DD – Want me to start on this as well?

            • JS – Yes please.

      • IP – Perses 1066 – Discussed a few changes with OpenFE. Basically we’re going to move the protocols to their own repo. This way we can adapt them to use the same base objects as OpenFE, which would help us not depend on bugfix releases. It makes sense to maintain this separately since it would have deps on openmmtools and not perses. This should simplify dependencies a lot. Package name would be something like perses-protocols or openmm-protocols.

        • JC – This will bring in repex protocol as well.

        • DD – Will this be dependent on openeye?

        • IP – No.

        • DD – That’s great news. This will make all sorts of deployment simpler.

  • new discussion items from ASAP roadmap: https://asapdiscovery.notion.site/Computational-Chemistry-Core-alchemiscale-related-roadmap-9052f215437246f9be4a11abd10f6d71

  • JS – We had some meetings last week about how we think about these networks/projects. Usually you iterate weekly with medchemists to come up with incremental improvements, but this loses information from the earlier networks, which is something you could ideally use. So the idea is to make a project-wide “living network”…

  • Notes from previous meeting here: https://www.notion.so/asapdiscovery/Computational-Chemistry-Core-alchemiscale-related-roadmap-9052f215437246f9be4a11abd10f6d71

    • JC – This could help with OpenFF by refining protein-ligand dataset benchmarking/improvement.

    • JW – We don’t have resources to contribute to this but would be happy to advise. I do see the benefit.

    • DD – One thing we’d need in alchemiscale would be something faster than JSON serialization. So there are some technical things we’d need to do for this.

    • MH – Some improved serialization coming in OpenFE stack

      • DS – Specifically in the OpenFE repo in the OpenFE stack.

      • DD – Would this be reusable/useful for alchemiscale?

      • DS – there’s a function in OpenFE that’s “get_all_gufe_obj”…

      • DD – Seems promising. Thanks!

      • DS – We’d consider upstreaming this into GUFE if it is something that people like/would find useful.

      • DD – JS, do you see this becoming necessary in the short term?

      • JS – I don’t think so. Analysis algorithms are pretty comfortable with really large numbers of edges. There will be some trickiness with quantification of lots of results/lomap scores/etc.

      • JC – Could use xray metrics judged by MCS.

      • JS – Would need to look at …

      • JC – I think it would require standardizing the API/metadata for some components. Eg, if you’re connecting to the k closest xray structures, you will need to judge the similarity of xray structures.

      • DD – One thing we’ve been thinking about putting into GUFE/alchemiscale is “queryables”…

      • JC

      • DS – Ligand atom mappings… Things can be stored in RDMol properties.

      • DD – Would be handy to discuss how we want things to be queryable from the alchemiscale client.

    • JW – Does F@H compute still seem likely in 2023?

      • DD – Yes, this still seems to be on track. After 0.2.0 milestone we’ll begin working on 0.3.0 with some major component changes and updates.

 

Action items

Decisions