2024-04-02 alchemiscale : dev group meeting notes

Participants

  • Ian Kenney

  • @Iván Pulido

  • @Matt Thompson

  • @Mike Henry

  • @Richard Gowers

  • @Jeffrey Wagner

  • @David Dotson

Recording: https://drive.google.com/file/d/1u8SPOj-YBRZYikvOv-FGOwct_H7GHrRR/view?usp=sharing

Goals

  • DD : alchemiscale roadmap

    • Q1 : complete “living networks” performance improvements

      • release 0.4.0 imminent; finishing out final PR in review

    • Q1 : Folding@Home compute services deployed in production

      • finish MVP, with integration test suite by 2024.04 (originally 2024.03)

        • this is delayed; need an additional 2 weeks to finish this out

      • perform FAH tests with volunteers during 2024.04

        • public work server up by 2024.04.15 (originally 2024.03.15)

        • confidential work server up by 2024.05.01 (originally 2024.04.01)

    • Q2 : develop Strategy structure, initial implementations

    • Q3 : enable automated Strategy execution by end of Q3, 2024 (2024.10.01)

  • DD : release 0.4.0 today/tomorrow, followed by deployment to alchemiscale.org

    • performance improvements to Task creation, actioning, and claiming by compute services

      • will unblock execution of thousands of actioned Tasks for large AlchemicalNetworks

    • new client methods for getting and setting many network weights at once

    • new client methods for getting task statuses and actioned tasks for many networks at once

    • added concept of network state, allowing users to set networks to inactive, deleted, or invalid when no longer relevant

    • also using neo4j 5.x, and the official neo4j Python driver for database communication

    • deployment artifacts will keep underlying stack the same, including perses for now since feflow not expected to work with gufe 0.9.5 and openfe 0.14.0

  • DD : thanks to Jenke Scheen for design of alchemiscale logo

    • [image: alchemiscale logo (image-20240402-154644.png)]

  • MH : status of openfe + gufe 1.0 testing with alchemiscale main

  • IP : feflow needs

  • alchemiscale development : new sprint spanning 4/3 - 4/15

    • starting on milestone 0.4.1:

    • architecture overview : https://drive.google.com/file/d/1ZA-zuqrhKSlYBEiAIqxwNaHXvgJdlOkT/view?usp=share_link

    • coordination board : alchemiscale : Phase 3 - Folding@Home, new features, optimizations, targeted refactors

Discussion topics

Notes

  • DD : alchemiscale roadmap

    • Q1 : complete “living networks” performance improvements

      • release 0.4.0 imminent; finishing out final PR in review

    • Q1 : Folding@Home compute services deployed in production

      • finish MVP, with integration test suite by 2024.04 (originally 2024.03)

        • DD – Showed example of local execution 2 weeks ago.

        • this is delayed; need an additional 2 weeks to finish this out

      • perform FAH tests with volunteers during 2024.04

        • DD – Will ask for volunteers on F@H to target our tags for testing

        • public work server up by 2024.04.15 (originally 2024.03.15)

        • confidential work server up by 2024.05.01 (originally 2024.04.01)

          • DD – HMO got F@H encryption implemented. Currently being tested with F@H volunteers.

        • JW – Are tasks embarrassingly parallel?

          • DD – Yes, nobody uses extends yet. Note, though, that people will need to submit calculations in a special way (with F@H protocols instead of the regular ones) to get them to run on F@H.

    • Q2 : develop Strategy structure, initial implementations

    • Q3 : enable automated Strategy execution by end of Q3, 2024 (2024.10.01)

  • DD : release 0.4.0 today/tomorrow, followed by deployment to alchemiscale.org

    • alchemiscale PR 257 - performance improvements to Task creation, actioning, and claiming by compute services

      • will unblock execution of thousands of actioned Tasks for large AlchemicalNetworks

    • new client methods for getting and setting many network weights at once

    • new client methods for getting task statuses and actioned tasks for many networks at once

      • DD – Switching from py2neo to a more mainline Python driver should alleviate some of the status lockup issues.

    • added concept of network state, allowing users to set networks to inactive, deleted, or invalid when no longer relevant

    • deployment artifacts will keep underlying stack the same, including perses for now since feflow not expected to work with gufe 0.9.5 and openfe 0.14.0

      • IP – Side note - feflow should work with those versions. But it’s fine to have perses in there for now since feflow will change a lot in the future.

      • DD – Gotcha. I’ll keep it as just perses for now, then we can update the builds to include feflow later once the changes are done.

  • DD : thanks to Jenke Scheen for design of alchemiscale logo

    • [image: alchemiscale logo (image-20240402-154644.png)]

  • MH : status of openfe + gufe 1.0 testing with alchemiscale main

    • All jobs not involving charge changes are done and HB is analyzing them. The ones that DO have charge changes are in the pipe. Dealt with some bugs with storage and other stuff. Completion ETA next week.

    • IP – Which openff version did we decide to use for benchmarks?

      • MH – To be clear - This is a QA check on openfe-1.0.0. This is different from the openff-2.2.0 benchmarks.

      • IP – Do you know which openff version you’re using for this? I’ve been trying to validate the noneqcyclingprotocol with different systems and I’d like to use this as a comparison.

      • MH – I think it was openff-2.1.0? Not positive.

        • RG – Agree, I suspect 2.1.

      • DD – And to recap, the plan is to proceed with the stable GUFE+OpenFE 1.0 release if these benchmarks look good.

      • JW – Note that openff-2.1.0 and 2.1.1 are functionally identical. The latter just adds Xe parameters, so if there are jobs in the pipe with 2.1.1 they’re equivalent to 2.1.0.

  • IP : feflow needs

    • We’ve used noneqcycling for a few toy examples and it seems fine. Ran TYK2 last week and it looked fine. This week I’ll be running other systems.

    • DD – I added “Adding extends support” to our sprint. Is that something you’d like in the next release of feflow?

    • IP – Yes, it’s something we should include in the next release.

    • DD – I recall that this was discussed at ASAP on Thursday, and that this was seen as a good way to save compute when we have pre-emptible resources. Also F@H could take advantage of the same machinery. IK, would you want to take this on?

    • IK – I’m not too familiar with this stuff, so I’ll need to look into it and will let you know.

    • IP – Great, happy to work with IK on this.

    • IP – One thing to highlight about the backend work is that we’re testing a lot of the objects we’re using for hybridtopology and lambdaprotocol that weren’t being tested before, so that’s a huge improvement.

  • alchemiscale development : new sprint spanning 4/3 - 4/15

    • starting on milestone 0.4.1:

    • architecture overview : https://drive.google.com/file/d/1ZA-zuqrhKSlYBEiAIqxwNaHXvgJdlOkT/view?usp=share_link

    • coordination board : alchemiscale : Phase 3 - Folding@Home, new features, optimizations, targeted refactors
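
Illustration for the network state item under the 0.4.0 release notes above: a minimal sketch of how a state label could be used to skip networks that are no longer relevant. This is not the alchemiscale client API. The names NetworkState and relevant_networks are hypothetical, and the "active" default state is an assumption (the notes only name inactive, deleted, and invalid).

```python
from enum import Enum


class NetworkState(str, Enum):
    """Hypothetical state labels; 'active' is an assumed default."""

    ACTIVE = "active"
    INACTIVE = "inactive"
    DELETED = "deleted"
    INVALID = "invalid"


def relevant_networks(states):
    """Return the keys of networks whose state is still active."""
    return [key for key, state in states.items() if state is NetworkState.ACTIVE]


# Made-up network keys, for illustration only.
states = {
    "network-a": NetworkState.ACTIVE,
    "network-b": NetworkState.INACTIVE,
    "network-c": NetworkState.INVALID,
}
print(relevant_networks(states))
```

With a filter like this, bulk queries (for example, task statuses across many networks) would only need to touch networks still marked active.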



Action items

Decisions