2023-02-14 Protein-ligand benchmarks meeting notes

Participants

@Irfan Alibay
@Mike Henry
@David W.H. Swenson
@Jeffrey Wagner
@Iván Pulido
@Richard Gowers
@David Dotson

Recording https://drive.google.com/file/d/1xDmsnV0dOva5ajh5p60gYmiX-zPozhLh/view?usp=sharing

Goals

DD : current sprint - ends 2/20
- architecture overview : PL Benchmarks on FAH - Architecture v6.drawio
- alchemiscale 0.1.0 milestone
- coordination board : alchemiscale : Phase 1 - MVP
- updates on In Review, In Progress, and Available cards
RG : ComponentMapping scenarios - current conclusions

Discussion topics

Notes

DD : current sprint - ends 2/20
- architecture overview : PL Benchmarks on FAH - Architecture v6.drawio
- alchemiscale 0.1.0 milestone
- coordination board : alchemiscale : Phase 1 - MVP
- updates on In Review, In Progress, and Available cards
In progress
- IA – PLB 83 – No headway on this. Partly waiting on JC to refactor Perses to work with latest OFFTK.
  - IP – Remind me which fixes are needed in perses? If the maps are provided then we’re ok, but if we need to dynamically generate the maps then we have work to do.
  - RG – We should always be able to feed in maps.
  - IA – While we’ve made a bunch of minimum spanning graphs, they all turned out to be star maps. So we need to resolve that.
  - JC – Is there something we could fix within perses to enable this debugging?
  - IA – Not really
  - JC – This is a high priority …
  - (General) – …
  - JC – I think we can address this.
  - DD – Is there an issue to track this?
  - (Perses 1128)
  - DD – Added to board.
  - JW – OpenFF’s mm_molecule may be able to bridge this gap, but it’s in the private API
  - JC – That’s a worse solution than relaxing the radical check
  - JW – We also won’t relax the radical check; that would invite an avalanche of edge cases.
  - JC – I’ll just fix this in perses
- IP – Perses 1066 – ….
  - …
  - DD – Do you think we can merge by next Tuesday?
  - IP – No. Need a bugfix release before we merge. Though we could start a feature branch where we start merging things before the next big release. What do you think?
  - DD – Depends on perses priorities.
  - JW – I’d say, if you trust your tests a lot, then feel free to move forward with multiple branches. The big moment is that when you merge the dev branch you’ll have a bunch of conflicts, and if you can trust your tests it’s not too hard to resolve conflicts, but if you don’t trust your tests it can be really bad.
  - RG – IP, do you need a gufe release to provide the ligand atom mapping?
    - IP – No, that’s not a blocker.
    - RG – I can just do it anyway. It’s on main, but not in a release.
    - DS – The settings stuff also hasn’t been in a release, and it probably should be in there.
    - MH – I think right now everyone is pip installing.
- DD – Alchemiscale 39 – Waiting on me but I’ll aim to get it in by the end of the sprint. I think I had trouble with openfe-benchmark.
  - IA – That should be fixed, but we just need the gufe release.
  - DD – Do you still use the repo?
  - IA – Yeah, but it’ll need refactoring soon. I’d call it “pseudo public facing”, some industry folks are using it as well. Eventually it’ll just be a wrapper around PLB.
- DD – Alchemiscale 34 – Depends on alchemiscale 85. I may tap MH for some advice on moving parts/reviews. We may also try to make a docker image for openmm workers, and we can use the GitHub registry for that.
  - MH – Yeah, we may want to sit down and spec out the image. Like, tradeoffs between startup time and size.
  - DD – Yeah, we can probably start with our current worker and derive from that.
  - MH – Yeah, it’s large. I usually do image size optimization at the very end.
  - JW – Range of version
  - IP – On lilac we have a similar issue. Not all GPUs support 11.8, but 11.2 is the happy medium.
  - MH – At the end of the day you could build multiple images and use tags to pick.
  - JW – I was going with the strategy of building the env in the worker.
  - MH – That depends on whether the docker container exposes the correct cuda info. But not everywhere will do that. And it will vary by disk speed. So that’s a tradeoff.
  - DD – I may bug JW+IP+MH for advice this week.
  - IP (chat) – You can also not install (or manually uninstall) the cudatoolkit from conda-forge, as long as you have a local install of it on your HPC nodes. That will make OpenMM use the nvcc compiler on the nodes instead of the librt (JIT compiling). This might add some overhead (only once), though.
- MH – Alchemiscale 82 – We can keep this on the sprint if I drop the loftier goal of getting this to run on ARM (something something). Big blocker is ambertools on M1…
  - DD – So the goal is to do a full deployment on an M1 host?
  - MH – Yes, I’ve reviewed all the deps and AmberTools is the only one.
  - DD – I’d limit the scope of this to just getting cloudwatch running.
  - MH – I’ll close this PR and open a new one just with the cloudwatch stuff.
- DD (on behalf of HMO) – Alchemiscale 85 – Hashed out a design for task lifecycle. Put a diagram of state changes on that issue. (See issue text)
- DD – Alchemiscale 83 – Assigned to HMO.
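
The image-tag idea from the Alchemiscale 34 discussion above (build several worker images and use tags to pick one) could look roughly like the sketch below. It is only an illustration: the tag names, the `pick_image_tag` helper, and the assumption that each host reports the highest CUDA version it supports as a "major.minor" string are all hypothetical and not part of alchemiscale.

```python
from typing import Dict, Optional, Tuple

Version = Tuple[int, int]

# Hypothetical tags: one worker image built per CUDA toolkit version.
AVAILABLE_TAGS: Dict[Version, str] = {
    (11, 2): "worker:cuda-11.2",
    (11, 8): "worker:cuda-11.8",
}


def parse_version(text: str) -> Version:
    """Turn a string such as '11.8' into a comparable (major, minor) tuple."""
    major, minor = text.strip().split(".")[:2]
    return int(major), int(minor)


def pick_image_tag(
    host_max_cuda: str, tags: Dict[Version, str] = AVAILABLE_TAGS
) -> Optional[str]:
    """Return the newest image whose CUDA version the host can run.

    Returns None when every available image is too new for the host.
    """
    limit = parse_version(host_max_cuda)
    usable = [version for version in tags if version <= limit]
    if not usable:
        return None
    return tags[max(usable)]
```

A host that supports up to CUDA 11.4 would get the 11.2 image under this scheme, which matches the "11.2 is the happy medium" observation. The alternative raised in the discussion, building the environment inside the worker at startup, avoids maintaining several images but pays for it in startup time.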
Available
- …
- DD – I’m gonna try out what a ComponentMapping would look like with different key designs.
- DD – Alchemiscale 89 – I think HMO beat me to this
- …

RG : ComponentMapping scenarios - current conclusions
- RG – We told you what we currently have. Our current status is that you need to manually label parts of the system using strings, which is fragile and non-transferable. Like, if someone called things “host” and “guest” instead of “protein” and “ligand” it wouldn’t work.
- DD – I think the mapping currently just says which components map to other components.
- RG – Oh, then the string label stuff is different.
- DD – Right, the component mappings should… I’ll try using that with IP’s protocol.
- DS – One thing we identified as a limitation is that, if you used two different small molecule FFs for small molecule components, that would be a problem.
- DD – I thought about that after our call. In the settings object we’d have references to those names. But that’s OK, they should be different on a per-transformation basis. In the same way that mappings refer to… So I think we should be in the clear. A protocol should only refer to info in the edge, it shouldn’t need to know about the nodes.
- JW – Elaborate?
- DS – If you had two small mols in a system and you wanted to treat one with OpenFF and one with GAFF, how would we record that? Right now the hierarchy says that all components of the same type get the same FF, so if both small mols were “ligand”, then there’s no way to distinguish that each should get a different FF.
- DD – …
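
The limitation DS describes can be illustrated with a small sketch. Everything here is hypothetical and is not the gufe or alchemiscale API: the `Component` class, the `kind` and `name` fields, the force field labels, and the override dictionary are assumptions made only to show why a per-type assignment cannot tell two small molecules apart, while a per-component key can.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Component:
    name: str  # hypothetical unique key for this component
    kind: str  # coarse type label, e.g. "protein" or "ligand"


# Per-type assignment: every component with the same kind shares one FF.
FF_BY_KIND: Dict[str, str] = {"protein": "protein-ff", "ligand": "openff"}


def ff_for(component: Component, overrides: Optional[Dict[str, str]] = None) -> str:
    """Look up a force field label, preferring a per-component override."""
    if overrides and component.name in overrides:
        return overrides[component.name]
    return FF_BY_KIND[component.kind]


mol_a = Component(name="mol_a", kind="ligand")
mol_b = Component(name="mol_b", kind="ligand")

# With only the type-level table, both small molecules get the same FF.
same = (ff_for(mol_a), ff_for(mol_b))

# Keying on the component itself lets one of them use GAFF instead.
mixed = (ff_for(mol_a), ff_for(mol_b, overrides={"mol_b": "gaff"}))
```

Here `same` holds one label twice, while `mixed` holds two different labels. Whether such an override belongs in the settings object or on the edge, as DD suggests, is exactly the open design question in the discussion.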
