2024-08-20 alchemiscale : dev group meeting notes

Participants

  • Ian Kenney

  • @Matt Thompson

  • @John Chodera

  • @Iván Pulido

  • @Mike Henry

  • @David Dotson

 

Meeting recording: https://drive.google.com/file/d/1zdYwV_qdsH5n2qjmsu3QGtBaEgG3XL84/view?usp=sharing

Goals

  • alchemiscale 0.5.0 released

    • in production use on alchemiscale.org

  • alchemiscale-fah 0.1.0 release imminent

    • completed live test with FAH volunteers, performed 1900 FAHNonEquilibriumCyclingProtocol-based work units on FAH

    • awaiting changes in openmm-core to allow minimization to be performed there before performing larger-scale test

  • alchemiscale roadmap 2024

    • Q1 : complete “living networks” performance improvements

    • Q1 : Folding@Home compute services deployed in production

      • finish MVP, with integration test suite by 2024.06 (previously 2024.03)

      • perform FAH tests with volunteers during 2024.08 (previously 2024.04, 2024.06, 2024.07)

        • public work server up by 2024.09.15 (previously 2024.03.15, 2024.06.11, 2024.07.19, 2024.07.31)

        • confidential work server up by 2024.09.30 (previously 2024.04.01, 2024.07.01, 2024.07.31)

    • Q3 (previously Q2) : develop Strategy structure, initial implementations

      • currently in design phase

    • Q3 : enable automated Strategy execution by mid Q4, 2024 (2024.11.15); previously end of Q3, 2024 (2024.10.01)

      • performing design in parallel with Strategy structure above

  • DD : alchemiscale roadmap 2025: possible components

    • additional FAH protocols in alchemiscale-fah

    • parallel execution of ProtocolDAGs on conventional compute, GPU saturation e.g. for feflow.NonEquilibriumCyclingProtocol

    • merging and copying AlchemicalNetworks in alchemiscale server

    • additional Strategy implementations beyond NetBFE

    • compute autoscaling for HPC, Kubernetes clusters

    • support for result file retention and retrieval?

    • others?

  • DD : proposal: reorganize alchemiscale project coordination

    • migrate alchemiscale repo under OpenFreeEnergy GitHub org

      • question: migrate alchemiscale-fah as well?

    • create alchemiscale channel in OpenFE Slack for developer communication

    • create alchemiscale.org channel for users of that production instance, announcements related to instance issues, deployments

    • use GitHub Discussions as a hub for user questions that fall outside of issues, i.e. questions about usage rather than bugs or feature requests

    • host working group under OpenFE org, use infrastructure for tracking meeting notes

      • will share read-only meeting notes link publicly via GitHub Discussions

  • MH : when are good times to do maintenance windows for deployed instances?

    • Scheduled maintenance window, less than 5 minutes typically

    • Unplanned maintenance plan/policy, for example if we need to patch some CVE

  • IP : instances for alchemiscale compute

    • does it make sense to upgrade feflow frequently as releases come out?

    • latest feflow milestone:

  • DD : proposal: alchemiscale.org website

  • DD : proposal: move deployment envs outside of alchemiscale repo, host in new alchemiscale.org-deployments repo

    • decouples development of alchemiscale from specific stack choices for alchemiscale.org

    • would still publish Docker images, but would be built from alchemiscale.org-deployments repo

  • alchemiscale development : new sprint spanning 8/21 - 9/2

    • now focusing effort on milestones for alchemiscale v0.5.1 and v0.6.0 releases

    • coordination board : alchemiscale : Phase 3 - Folding@Home, new features, optimizations, targeted refactors

Discussion topics

Notes

  • Roadmap 2025

    • JC:

      • I think just addressing scalability issues as we scale is critical as well.

      • For example, how do things work at the 10K to 100K transformation scale?

      • Analysis bottlenecks?

      • Automated online estimates?

      • Management at scale?

      • Projected costs?

      • E.g., can we estimate total compute and money cost for planned networks and keep it updated as execution proceeds?

    • DD:

      • Have thought about dashboards; these would allow for different views based on different user needs (i.e. could help with funding)

    • MH: Might need to think about design so that a bunch of people monitoring on dashboards doesn’t slow the network down

      • JC: Yes, want to lazy-load data, avoid blocking calls, and have periodically updating analyses

      • DD: Yes, would have two types of dashboards (one more for users, one more for admins)

    • IP: Have network-wide analyses run (like cinnabar)?

      • DD: Yes, could have periodically-run services that do lazy stuff (database updates, analyses, stuff to plan out where in the network to work next)

      • MH: Skeptical that alchemiscale itself needs to do this. Maybe should be the user’s job, but cool that alchemiscale enables all of this. Lots of tools out there to make dashboards, etc. that the user could build without too much difficulty

    • DD: Current idea is that dashboards wouldn’t be part of the core alchemiscale package, but would be a separate-but-related package. So something like alchemiscale-dash

    • DD: Support for file retrieval?

    • IP: Result debugging?

    • DD: Move alchemiscale into github.com/openfreeenergy org?

      • Pushing to next week (or two weeks after)
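
Illustrative sketch for JC's question above about estimating total compute and money cost for a planned network and keeping it updated as execution proceeds. This is only one possible approach, not something agreed in the meeting and not alchemiscale API: the class name, fields, the flat price per GPU-hour, and the use of a simple running mean are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class CostProjection:
    """Running cost estimate for a planned network (hypothetical helper)."""

    planned_tasks: int                # total tasks planned for the network
    prior_gpu_hours_per_task: float   # initial guess used before any task completes
    usd_per_gpu_hour: float           # assumed flat price, for illustration only
    completed_tasks: int = 0
    observed_gpu_hours: float = 0.0

    def record(self, gpu_hours: float) -> None:
        """Fold one completed task's measured GPU-hours into the estimate."""
        self.completed_tasks += 1
        self.observed_gpu_hours += gpu_hours

    def mean_gpu_hours_per_task(self) -> float:
        """Observed mean if any task has finished, otherwise the prior guess."""
        if self.completed_tasks == 0:
            return self.prior_gpu_hours_per_task
        return self.observed_gpu_hours / self.completed_tasks

    def projected_total_gpu_hours(self) -> float:
        """Compute already spent plus remaining tasks at the current mean."""
        remaining = max(self.planned_tasks - self.completed_tasks, 0)
        return self.observed_gpu_hours + remaining * self.mean_gpu_hours_per_task()

    def projected_total_usd(self) -> float:
        """Projected money cost at the assumed flat rate."""
        return self.projected_total_gpu_hours() * self.usd_per_gpu_hour
```

Usage would look like creating one `CostProjection` per network, calling `record(...)` whenever a task finishes, and reading `projected_total_usd()` from a periodically updating analysis rather than from a blocking call, in line with the dashboard discussion above.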



Action items

Decisions