2023-09-05 alchemiscale Working Group meeting notes

Participants

  • @David W.H. Swenson

  • @Irfan Alibay

  • Ian Kenney

  • @Iván Pulido

  • @Richard Gowers

  • Levi Naden

  • @David Dotson

  • @John Chodera

  • @Mike Henry

  • Meghan Osato

  • @Jeffrey Wagner

Recording: https://drive.google.com/file/d/1W6jHIam4R8QFh_x1da7DlrooxAXvFRSr/view?usp=drive_link

Goals

  • alchemiscale.org user group

    • user questions / issues / feature requests

    • results to share?

    • compute resources status

    • call for new users

    • current stack versions:

      • alchemiscale: 0.1.4

      • gufe: 0.9.1

      • openfe: 0.11.0

      • perses: protocol-neqcyc

  • DD : gufe#184 - openmmforcefields next release?

  • IP : Protein-ligand benchmarks working group update

  • alchemiscale development : current sprint complete; next sprint spans 9/6 - 9/18

    • request push back of 0.2.0 release by 2 weeks; docs-based PRs complete, now focused on result retrieval and file upload+retrieval from Protocols

    • architecture overview : PL Benchmarks on FAH - Architecture v6.drawio

    • coordination board : alchemiscale : Phase 2 - User Feedback and Documentation

    • alchemiscale 0.2.0 milestone:

    • review Done cards

    • updates on In Review, In Progress, and Available cards

  • new discussion items from ASAP roadmap: ROADMAP: Computational Chemistry Core alchemiscale-related roadmap | Notion

Discussion topics

Notes

  • alchemiscale.org user group

    • user questions / issues / feature requests

    • results to share?

      • IA – OpenFE is slowly populating OpenFE-benchmark repo (benchmarking mappers, scorers, and networks). This should be a supplement to what we’re doing with Iván. Largely working on deciding which types of networks we should be using - LOMAP seems better than MSTs and star maps, though the former has more edges so it’s not apples-to-apples.

    • compute resources status

      • DD – We’ve burned through the old task queue, so a fair amount of compute is available. We’ve completed 23,500 calculations; only 21 are running at the moment.

    • call for new users

    • current stack versions:

      • DD – unchanged from last week. I know OpenFE is working with a newer version on a separate instance.

        • IA – We’re merging a few PRs but thinking of a release at the end of this week. We’ll test that on our instance but will want to push that to the main instance soon. Expecting some changes to results.

        • DD – Results are stored as JSON in the object store. When you pull them with the client, they’re not deserialized from JSON until they reach the client, so if there’s new content in the ProtocolDAGResults the server won’t have a problem. But old clients may choke on new results, and new clients may choke on old ones.

        • IA – This would manifest as KeyErrors…

        • (screen sharing code changes - fe_results branch )

        • IA – Just new methods and some dict entries.

        • DD – Great, we’ll test this on the QA instance.

        • IA – There was a thread on the free-energy-benchmarking slack channel with me and Meghan - I’d love to have folks weigh in on the types of analysis they’d like.

        • https://openfreeenergy.slack.com/archives/C02GG1D331P/p1692953290128049?thread_ts=1692912459.878989&cid=C02GG1D331P

      • alchemiscale: 0.1.4

      • gufe: 0.9.1

      • openfe: 0.11.0

      • perses: protocol-neqcyc
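
The version-skew concern raised by DD and IA above (results stay as JSON until they reach the client, so a mismatched client may hit KeyErrors) can be illustrated with a minimal sketch. The field names `estimate` and `uncertainty` and the helper `load_result` are hypothetical, chosen only for illustration; they are not the actual gufe or alchemiscale serialization format or API.

```python
import json
import warnings

# Hypothetical field names, for illustration only; these are NOT the real
# keys of a serialized gufe result.
REQUIRED_KEYS = ("estimate",)
OPTIONAL_DEFAULTS = {"uncertainty": None}


def load_result(raw: str) -> dict:
    """Deserialize a JSON result client-side while tolerating version skew."""
    data = json.loads(raw)

    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        # Fail with a clear message rather than a bare KeyError later on.
        raise ValueError(f"result is missing required keys: {missing}")

    known = set(REQUIRED_KEYS) | set(OPTIONAL_DEFAULTS)
    unknown = sorted(set(data) - known)
    if unknown:
        # Old client reading a newer result: carry on, but report it.
        warnings.warn(f"ignoring unrecognized keys: {unknown}")

    result = {key: data[key] for key in REQUIRED_KEYS}
    for key, default in OPTIONAL_DEFAULTS.items():
        # New client reading an older result: fall back to a default.
        result[key] = data.get(key, default)
    return result


old_style = json.dumps({"estimate": -1.2})
new_style = json.dumps({"estimate": -1.2, "uncertainty": 0.3, "extra": 1})
```

Calling `load_result(old_style)` fills in the missing optional field, while `load_result(new_style)` warns about the unrecognized `extra` key instead of failing.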

  • DD : gufe#184 - openmmforcefields next release? Could we make the 0.11.3 release now and push remaining changes to 0.11.4?

    • IP – Sure, that’s a good idea. I’ll coordinate with MH to get a release going.

    • JC – Some of the delays were due to AmberTools changing the output of GAFF. So we’ll need to downpin to maintain GAFF behavior. Will that be a problem for anyone?

    • JW – I’d prefer to be using the latest version of everything.

    • (Root problem is that AmberTools changed how GAFF parameters are assigned, and ALSO that OpenMMForceFields tries to assign GAFF by an exact version number)

    • Next OpenMMForceFields will downpin AmberTools to resolve this issue.

    • DD – Will this affect OpenFF Toolkit outputs?

    • JW – Maybe, but the whole pipeline is “RDKit makes a random conformer, then AmberTools sqm/antechamber does AM1-Mulliken”, so it’s already a super leaky pipeline that a bunch of external stuff could affect. NAGL should solve this by assigning deterministic partial charges based just on the molecular graph.

    • IA – Definitely want to keep old forms of GAFF available, they’re really important to our partners.

  • IP : Protein-ligand benchmarks working group update

    • (IP shares screen, see recording)

    • IP – Would be helpful to have longer walltimes. Maybe 48 hours.

    • DD – We’re running with a walltime of 7 days on lilac if that’s of interest.

  • IA – Seeing large error bars between repeats for some systems. Though HIMAP salvages this… My assessment is that these plots aren’t converged.

    • DD – So increasing the density of the network wouldn’t help, because the individual edges are too noisy?

    • IA – Too early to say.

    • JC – I wonder about running edges in two directions, and giving our analysis the option of rejecting simulations.

    • IA – So the question is: If we have some redundancy, should we plan to allow the analysis to throw out some edges?

    • IP – One of the conclusions from Ken’s simulations (which don’t have redundancy and were chosen to have the smallest chemical changes) is that he still needed 20 ns for some edges to converge.

    • JC – But we may need to look closer at those conclusions.

    • DD – This sounds like the concept of Alchemiscale “strategies”, where something watches the state of the whole network and chooses to kick off new simulations.

    • IA + RG – We’d love to do this, but need to talk to board about prioritizing this.

    • IA – Don’t think this is a cinnabar thing…

    • JC – We could consider a few paths - replicate edges until there’s some sort of agreement, or run lots of stuff and remove edges at analysis-time if they look bad, or dynamically make new edges during the run.

    • DD – Could trial this on the ASAP side.

    • JC – Could be handy to have a minimum and maximum time for each edge.

    • DD – This could be something like a strategy for alchemiscale.

    • IA – OpenFE will need to talk to its board about initiating this effort. We’re running with triplicates now so we can simulate the effect of this.

    • JC – We should implement extension first.

    • IA – Yeah, more technical work needed.

    • JC – One way we’d thought about this, which might not be the best way, would be to have the hdf5 sitting around for later analysis. But it may be better to

    • DD – Could make this a recurring agenda item - “implements extends support”, since I agree that would be the first step.

    • JW – I wouldn’t want this working group to rush ahead with a fix for this if OpenFE

    • JC – But current results are pretty bad.

    • IA – I’ve agreed to benchmark NAGL. So we want to run larger networks with current PLB. I think we need to brute-force a solution quickly to complete this testing.

    • JC – Seems like it would be good to run more edges now and be prepared to throw some out. Then you could run statistical tests to see whether edges are converged/totally discordant.

    • JW – To be clear, I’d like to make sure that us implementing some sort of extends support doesn’t impose costs/complications on OpenFE’s longer term plans to do something similar.
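
The ideas floated in the discussion above (replicate edges, a minimum and maximum time per edge, extending before rejecting, and checking edges run in two directions) could be sketched roughly as below. This is only an illustration of the decision logic under stated assumptions: the function names, the thresholds (`min_ns`, `max_ns`, `tol`), and the use of the sample standard deviation as the convergence test are all placeholders, not an alchemiscale "strategy" API or an agreed protocol.

```python
from statistics import mean, stdev


def classify_edge(repeats, sampled_ns, min_ns=5.0, max_ns=20.0, tol=0.5):
    """Return 'accept', 'extend', or 'reject' for a single edge.

    repeats    : repeat free energy estimates for the edge (kcal/mol),
                 at least two values
    sampled_ns : simulation time already spent on each repeat
    min_ns, max_ns, tol : placeholder thresholds, chosen arbitrarily
    """
    if sampled_ns < min_ns:
        return "extend"  # always reach the minimum time first
    spread = stdev(repeats)  # sample standard deviation across repeats
    if spread <= tol:
        return "accept"  # repeats agree within tolerance
    if sampled_ns < max_ns:
        return "extend"  # noisy, but time budget remains
    return "reject"  # still noisy at the maximum time


def is_discordant(forward, reverse, tol=0.5):
    """Forward and reverse estimates of one edge should sum to roughly zero."""
    return abs(mean(forward) + mean(reverse)) > tol


tight = [-1.1, -1.0, -1.2]
noisy = [0.2, 1.9, -0.8]
actions = {
    "tight @ 5 ns": classify_edge(tight, sampled_ns=5.0),
    "noisy @ 10 ns": classify_edge(noisy, sampled_ns=10.0),
    "noisy @ 20 ns": classify_edge(noisy, sampled_ns=20.0),
}
```

With these toy numbers the tight edge is accepted, the noisy edge is extended while budget remains, and it is rejected once the maximum time is reached. Triplicate data, as IA mentions, would be enough to simulate the effect of such a rule offline.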

  • alchemiscale development : current sprint complete; next sprint spans 9/6 - 9/18

    • request push back of 0.2.0 release by 2 weeks; docs-based PRs complete, now focused on result retrieval and file upload+retrieval from Protocols

      • (JW + JC + RG approve)

    • architecture overview : PL Benchmarks on FAH - Architecture v6.drawio

    • coordination board : alchemiscale : Phase 2 - User Feedback and Documentation

    • alchemiscale 0.2.0 milestone:

    • review Done cards

    • updates on In Review, In Progress, and Available cards

      • IA – Alchemiscale 174 – I’ll try to get this done tomorrow. Need to make some updates to plotting.

      • IA – PLB 93 - It looks like our previous MSTs were constructed with the scores the wrong way around, so new results are coming in and look good.

      • DD – Alchemiscale 178 - This is looking promising. Some little issues with jupyter, MH could I recruit your help?

        • MH – Sounds good.

      • DD – Alchemiscale 104 - Big PR, will change a lot of storage. Should be at least prototyped in two weeks.

      • DD – May kick alchemiscale 43 into next release.

      • DD – Alchemiscale 134 - Setting configurable upper bounds server-side on tasks. Would be a good first issue.
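
On the PLB 93 item above (MSTs built with the scores the wrong way around): the notes do not say exactly what went wrong, but the general pitfall can be shown with a toy example. The sketch below assumes mapping scores where higher means a better edge, so the tree should keep the highest-scoring edges; sorting in the opposite direction silently keeps the worst ones. The function `spanning_tree`, the ligand names, and the scores are invented for illustration and are not the PLB or OpenFE code.

```python
def spanning_tree(nodes, scored_edges, higher_is_better=True):
    """Kruskal's algorithm over (node_a, node_b, score) tuples."""
    parent = {n: n for n in nodes}

    def find(n):
        # Union-find root lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    # The sort direction is the whole point: best edges must come first.
    ordered = sorted(scored_edges, key=lambda e: e[2], reverse=higher_is_better)
    tree = []
    for a, b, score in ordered:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # edge joins two separate components
            parent[root_a] = root_b
            tree.append((a, b, score))
    return tree


ligands = ["A", "B", "C"]
edges = [("A", "B", 0.9), ("B", "C", 0.8), ("A", "C", 0.1)]

good = spanning_tree(ligands, edges, higher_is_better=True)
bad = spanning_tree(ligands, edges, higher_is_better=False)
```

Here `good` keeps the 0.9 and 0.8 edges, while `bad` starts from the 0.1 edge, which is the kind of inversion that would degrade results without raising any error.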

 

Action items

Decisions