Participants
Goals

- DD : alchemiscale roadmap
  - Q1 : complete “living networks” performance improvements
  - Q1 : Folding@Home compute services deployed in production
    - finish MVP, with integration test suite by 2024.05 (slipped from 2024.03)
    - perform FAH tests with volunteers during 2024.04 - 2024.05
  - Q2 : develop Strategy structure, initial implementations
  - Q3 : enable automated Strategy execution by end of Q3, 2024 (2024.10.01)
- IP : feflow needs
- DD : technical questions on SQM charging, avoiding and restricting core count
- alchemiscale development : new sprint spanning 5/1 - 5/10
- JW – I’d love to have some chart of throughput, to make a statement in our annual talk like “based on the openff-2.2.0 vs. 2.1.0 benchmarking, here’s how many FFs we can benchmark in a month” (though I understand if it’s not that simple)
Discussion topics
Notes
DD : alchemiscale roadmap
- Q1 : complete “living networks” performance improvements
- Q1 : Folding@Home compute services deployed in production
  - finish MVP, with integration test suite by 2024.05 (slipped from 2024.03); this is delayed, and we need an additional 2 weeks to finish it out
  - JW – The demo a few weeks ago, where data went to a work server and was delegated to a local worker: were those test messages or actual simulation data? I spoke to the ad board the other day and said that the major complexity right now is testing.
  - DD – Actual simulation data. Right now we have a way to delegate a work unit to a F@H work host and get it converted into a F@H-native work unit (see recording, ~10 min). We also have a big mocked work server for testing, with realistic lag times and other behaviors, to ensure that our execution works in realistic contexts. There are also some changes being made to alchemiscale itself to enable compatibility with F@H. And IA, could you take a look at feflow 451? (IP just went off on vacation.)
  - IA – Can do.
  - MH – This will just increase the disk space a little? It looks like it’s just dumping the system state to XML?
  - DD – Yeah; we need to, e.g., send over the simulation with velocities (see the sketch after this list).
  - JC – This is used to validate the CPU: we send over this info and then check it against the CPU and GPU to ensure the volunteers aren’t cheating.
  - IA – Glancing at this, the PR looks good and I can merge it soon.
  - DD – The PR is still WIP, but I’ll turn that over ASAP.
  - DD – I’ll be working with IK on debugging communication between services.
  - perform FAH tests with volunteers during 2024.04 - 2024.05
- Q2 : develop Strategy structure, initial implementations
- Q3 : enable automated Strategy execution by end of Q3, 2024 (2024.10.01)
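For context on the state-serialization point above, a minimal sketch assuming OpenMM’s Python API; this is not the actual feflow PR code, and the single-particle system is a hypothetical stand-in to keep the example self-contained:

```python
# Minimal sketch, assuming OpenMM: serialize a simulation State,
# including velocities, to XML. The one-particle "system" is a
# hypothetical stand-in, not anything from feflow.
import openmm
from openmm import unit

system = openmm.System()
system.addParticle(39.9 * unit.amu)  # one argon-like particle

integrator = openmm.LangevinMiddleIntegrator(
    300 * unit.kelvin, 1.0 / unit.picosecond, 2.0 * unit.femtoseconds
)
context = openmm.Context(system, integrator)
context.setPositions(unit.Quantity([openmm.Vec3(0, 0, 0)], unit.nanometers))
context.setVelocitiesToTemperature(300 * unit.kelvin)

# Request velocities as well as positions so they survive serialization;
# this is the extra payload that grows disk usage a bit.
state = context.getState(getPositions=True, getVelocities=True, getEnergy=True)

with open("state.xml", "w") as f:
    f.write(openmm.XmlSerializer.serialize(state))
```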
IP: feflow needs
(Skipped since IP is offline)
- DD – There are 3 feflow PRs on the board. IA, could I...
- IA – My question for IP, if he were here, would be whether feflow 38 is blocking the sims he wants to have run for the annual meeting. DD, what are your needs?
- DD – If possible, for the annual meeting we want to have alchemiscale 0.4.1 (which uses OpenFE and GUFE 1.0), have feflow 0.1, and drop perses.
- IA – Target date for 0.4.1?
- DD – Before I travel, hopefully (before 5/10). It’s also OK if we fall short of this. We’ll also be doing deployment tests of gufe and openfe 1.0 before then, but we’ll need feflow 0.1.
- IA – I need to work out what I’ll do for the openfe demo, and I want to have some calcs submitted that I can pull down for that.
- DD – There’s an open PR for enabling extensions for nonequilibrium cycling protocols (feflow 44). IK, any update?
- IK – Waiting on the OpenFE and GUFE 1.0 releases.
- DD – Understood. This isn’t super critical; I just want to make sure we don’t lose track of it.
- DD – The F@H support CAN exist without a feflow release.
DD : technical questions on SQM charging, avoiding and restricting core count
- DD – The test suite in alchemiscale-fah is being slowed down by charge assignment: if charges aren’t already specified on the small molecule components, they’ll be sent to sqm. For CI we don’t really care about charge quality, so I’m using “formal_charges” charge assignment, but I’m still seeing sqm running.
- IA – If you have partial charges assigned ahead of time, it should skip sqm. I’m looking at the feflow logic now; at a glance I can’t tell whether charges being exactly 0 would skip things. I’d recommend defining the charges ahead of time in the SDF file (see the sketch after this exchange).
- JC – It looks like openmmforcefields might not override OpenFF charge assignment when user charges are present.
- IA – I’m pretty sure we tested this exact thing at OpenFE. (Some rooting around different code paths to check this out.)
- JC – I think adding print statements throughout the call stack would help figure out what’s going on here.
- JW – I recommend trying gasteiger first.
- IA – Could also try using nagl, unless OpenFF objects.
- MH – Then I think we should add espaloma-charge and nagl to the docker image.
- DD – Could someone drop me a tip here?
- MH – I’ll open a PR. There’s some complexity with py311 and DGL missing some upstreams on mac.
- JC – Can we talk about nagl deployment issues at some point? We could export models to other formats to avoid the need for DGL.
- MH – ... (see recording, ~50 min)
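A minimal sketch of the pre-charging approach discussed above, assuming the OpenFF Toolkit API; the SMILES and output file name are hypothetical:

```python
# Sketch, assuming the OpenFF Toolkit: assign cheap charges up front so
# downstream parameterization can skip sqm. Molecule and file name are
# hypothetical placeholders.
from openff.toolkit import Molecule

mol = Molecule.from_smiles("CCO")

# Cheap charges, per JW's gasteiger suggestion; for CI where charge
# quality doesn't matter, methods like "zeros" also exist.
mol.assign_partial_charges("gasteiger")

# Writing to SDF stores the charges as an SD tag, so they travel with
# the ligand, per IA's recommendation to define charges ahead of time.
mol.to_file("ligand_with_charges.sdf", file_format="sdf")
```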
- JW (chat) – Also, I see you wrote in the agenda about restricting core count to make sqm happy. We’ve also found that things get slow if there are 4+ cores available to sqm; this can be controlled by setting the OMP_NUM_THREADS env variable.
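For reference, the Python form of JW’s suggestion (the shell equivalent is `export OMP_NUM_THREADS=1`):

```python
import os

# Cap the threads available to OpenMP-based code (including sqm); per
# JW, sqm slows down with 4+ cores. Setting this in the parent process
# propagates to subprocesses, so do it before anything launches sqm.
os.environ["OMP_NUM_THREADS"] = "1"
```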
- JC – OE charging should be much faster.
- MT – charge_from_molecules
- IA – IMHO the best approach here is to have your SDF files carry the partial charges already.
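A sketch of what MT’s charge_from_molecules suggestion looks like with the OpenFF Toolkit; the force field choice and file name are placeholders:

```python
# Sketch, assuming the OpenFF Toolkit: reuse pre-assigned charges during
# parameterization instead of recomputing them (e.g. via AM1-BCC/sqm).
from openff.toolkit import ForceField, Molecule

# Hypothetical SDF that already carries partial charges
# (see the earlier pre-charging sketch).
ligand = Molecule.from_file("ligand_with_charges.sdf")

ff = ForceField("openff-2.1.0.offxml")
system = ff.create_openmm_system(
    ligand.to_topology(),
    charge_from_molecules=[ligand],  # skip on-the-fly charge generation
)
```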
alchemiscale development : new sprint spanning 5/1 - 5/10
- JW – I’d love to have some chart of throughput, to make a statement in our annual talk like “based on the openff-2.2.0 vs. 2.1.0 benchmarking, here’s how many FFs we can benchmark in a month” (though I understand if it’s not that simple).
- DD – It’s not that simple: not all edges take the same amount of time, for a variety of reasons, and there are other issues.
- MH – I could tell you how many hours it takes on a single GPU.
- IA – One of the other useful things to talk about is that the OpenFE+OpenFF lines have been...
- JW – Could I say “excluding ligands with charge changes, we can get through all the targets in our protein-ligand benchmark set in under two weeks”?
- IA – And we could do the five targets for MO in about 3 days going full tilt.
- DD – And we’ve been able to get 200+ GPUs at a time on NRP.
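To seed the throughput-chart action item below, a hypothetical notebook snippet; the CSV name and column names are invented placeholders for whatever source data DD provides:

```python
# Hypothetical notebook sketch for JW's throughput chart. The file
# "task_completions.csv" and its "completed_at" column are placeholders,
# not real alchemiscale output.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("task_completions.csv", parse_dates=["completed_at"])

# Tasks completed per day across the whole network.
per_day = df.set_index("completed_at").resample("D").size()

fig, ax = plt.subplots()
per_day.plot(ax=ax, drawstyle="steps-post")
ax.set_xlabel("date")
ax.set_ylabel("tasks completed / day")
ax.set_title("alchemiscale throughput (openff-2.2.0 vs. 2.1.0 benchmarking)")
fig.tight_layout()
fig.savefig("throughput.png", dpi=150)
```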
Action items
- David Dotson will get Jeff the source throughput data and a notebook snippet for the plot
Decisions