2023-02-07 Protein-ligand benchmarks meeting notes

Participants

  • @Irfan Alibay

  • @David W.H. Swenson

  • @Iván Pulido

  • Jenke Scheen

  • @John Chodera

  • @Jeffrey Wagner

  • Levi Naden

  • @Mike Henry

Recording: https://drive.google.com/file/d/1uwMTiuSJQHWl68WHza_dstPcstB8z6Kq/view?usp=share_link

Goals

  • DD : sprint retrospective

    • Review Done cards

      • what went well?

      • what didn’t?

      • what do we need to improve our approach?

  • DD : next sprint begins tomorrow, spans 2/8 - 2/20

    • architecture overview : PL Benchmarks on FAH - Architecture v6.drawio

    • alchemiscale 0.1.0 milestone

    • coordination board : alchemiscale : Phase 1 - MVP

    • updates on In Review, In Progress, and Available cards

    • create/nominate new cards for inclusion in this sprint

    • will post on #free-energy-benchmarking when next sprint is finalized

  • RG : ComponentMapping scenarios bake-off

    • Should game through a library of scenarios, including:

      • charge transformations (how are dummy atoms represented in mapping)?

      • protein mutations

      • protein mutations on a dimer, but mutating on one of the monomers

      • transformations on one ligand where multiple ligands present

      • covalent transformation of ligand and protein

      • moving a ligand from one place to another

    • RG : game out on current implementation

    • DD : game out on how this would work with protocols using only mappings, not referring to ChemicalSystem component keys

Discussion topics

Notes


  • DD : sprint retrospective

    • Review Done cards

      • MH – alchemiscale 52 - Done, can deploy using docker compose. Allows us to deploy on EC2, etc.

        • DD – This is great, nice work

        • JC – Thanks for making this design work well into the future.

      • DD – GUFE 111 – Relationships between ProtocolDAGResults. This way ProtocolDAGResults know which protocol they’re extending, which should allow for reconstructing part or all of the tree if desired.

      • IA – PLB 87 – Thanks JC for the review. We now have PDBs in protein-ligand-benchmark that are at least mostly PDB-compliant. Many of these won’t be read by AMBER, but that’s because AMBER isn’t PDB-compliant.

        • what went well?

          • JC –

        • what didn’t?

          • JC – I took too long to review this.

          • IA – Well, this was a really large change, so I don’t expect that it should be easy to review.

          • JW – Unclear ownership of this repo and its dual purpose as a publication artifact and a living repo. I just sort of bull-rushed through the changes because I had admin access, but users may be confused in the future about how to access historical data. Not sure that this is fully resolved, or whether this would be easier in the future.

        • what do we need to improve our approach?

          • JC – Tests that check the setup on these would be helpful. Could run AmberTools and OpenMM(ForceFields) on them. This would also check the software for consistency between releases.

          • DD – Right, CI on this doesn’t do a whole lot yet.

          • JC – I think this needed a programmatic way to list all the files.

          • DD – I’ve made PLB


      • IA – One blocker on my end is PLB 83 – I keep getting star maps instead of minimum spanning graphs and I haven’t found the root cause.

      • DD – alchemiscale ?? (see recording, ~14 minutes) – This went well

      • DD – Alchemiscale 77 – HMO did a good job on this. Large number of changes, but I merged it into another active PR and the conflicts were surprisingly resolvable.

    • What went well? What didn’t? What could we change?

      • IP – Kinda hard when an item has sub-items and they aren’t noted here.

        • DD – Agree. I’m trying to write tasks/issues so that they could be finished in 2 weeks. But I’m still building my intuition here.

      • JW – Are PR reviews being correctly prioritized/being done in a timely way? Are there tangled blockers due to lack of reviews? (we get this a lot at OpenFF)


  • DD : next sprint begins tomorrow, spans 2/8 - 2/20

    • architecture overview : PL Benchmarks on FAH - Architecture v6.drawio

      • JW –

      • JC – 4 ns per window for noneq cycling. Could squeeze down to 60 ps for other types. So could do short times between pre-emption windows. I’d start worrying about startup and shutdown time at that level. So 20 GPU-minutes is the least that would be helpful.

      • JC – Shared file system? This could reduce startup time.

      • DD – PRP is Kubernetes-based, shouldn’t have that issue.

      • JC – Great. So this could be helpful for intermediate-scale tasks. OFE, how are you folks doing replica exchange?

      • IA – Many iterations in one job at the moment. So our runtime is around 13 hours at present, but breaking things/handling preemption is on our to-do list. So our longer term plan is for preemptible resource to use … method (recording, 27 mins).

      • JC – There may be clever structural things you can do to handle more pre-emptible resources

      • DD – A given ProtocolDAG goes to a single worker. I don’t know whether PRP gives multi-GPU instances.

      • JC – …

      • DD – …

      • JW – This is perfect. Thanks. I’ll report back next week what I hear at the meeting.

      • JW – Roughly 1.5k GPUs available: https://grafana.nrp-nautilus.io/d/fHSeM5Lmk/k8s-compute-resources-cluster-gpus?orgId=1&refresh=1m

    • alchemiscale 0.1.0 milestone

    • coordination board : alchemiscale : Phase 1 - MVP

    • updates on In Review, In Progress, and Available cards

      • DS – GUFE 110 – Will be merged today/tomorrow.

        • MH – IP’s notebook worked fine with these changes, so I don’t anticipate that this will break anything there. But there were a few cracks and I opened some new issues, for example on how to handle different classes of FF.

      • IA – PLB 83 – For some reason when we use OpenFE code to get minimum spanning graphs using perses core we get star maps. It’s possible that this is correct operation, but I’m not sure.

        • DS – Note that it’s possible to have more than one MSG, and the star map may be one of the best choices. Another question is whether the outcome is deterministic. Sometimes we’re seeing different minimum spanning graphs when we run the analysis in different sessions.

        • IP – Did you see this in other systems or just TYK2?

        • IA – We saw it in all of them. Am I correct in thinking that Perses isn’t compatible with OFFTK 0.11?

          • JC – Right, I need to handle this. We were trying to use the Molecule object for our interface to everything, but in 0.11 it stopped supporting molecule fragments, so we’ll have to go back to RDMol.

          • JW – This is a hard constraint from OFFTK’s end. I’ll let you know if I can think of another workaround but for now RDMol is probably best.
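DS’s point about multiple minimum spanning graphs can be illustrated with a small sketch. It is not the OpenFE or perses code path; it is a plain Kruskal implementation on a made-up four-ligand network where every pairwise score is identical (an assumption chosen to force ties). With tied weights, the tree returned depends only on the order in which the edges are supplied: one ordering yields a star map, another a chain, and both have the same total weight.

```python
from itertools import combinations


def kruskal(nodes, edges):
    """Return a minimum spanning tree as a list of (u, v, weight) edges.

    Edges are scanned in weight order. Python's sort is stable, so ties
    are broken by the order in which the edges were supplied.
    """
    parent = {n: n for n in nodes}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    tree = []
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        ru, rv = find(u), find(v)
        if ru != rv:  # edge joins two components, so it cannot close a cycle
            parent[ru] = rv
            tree.append((u, v, w))
    return tree


def max_degree(tree):
    """Largest number of tree edges touching a single node."""
    deg = {}
    for u, v, _ in tree:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return max(deg.values())


def total_weight(tree):
    return sum(w for _, _, w in tree)


# Hypothetical network: four ligands, every candidate edge scored equally.
nodes = ["A", "B", "C", "D"]
edges = [(u, v, 1.0) for u, v in combinations(nodes, 2)]

# Same edges, different supply order: A-B, B-C, C-D come first.
reordered = [edges[0], edges[3], edges[5], edges[1], edges[2], edges[4]]

star_tree = kruskal(nodes, edges)       # A-B, A-C, A-D: a star centred on A
chain_tree = kruskal(nodes, reordered)  # A-B, B-C, C-D: a chain
```

If the scores feeding the real network contain many ties or near-ties, an unordered container or a session-dependent iteration order upstream would be enough to produce different, equally valid minimum spanning graphs between runs. Whether that is what is happening in PLB 83 is not established here.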

      • DD – Alchemiscale 56 – HMO gave me a bunch of things for review… (see recording, 38 minutes). openfe-benchmark #7 has some breaks from upstream.

        • IA + DS – Not sure what’s going on here, we had some API breaks recently and it might be that.

        • DS – I think this may just be an import path change/namespace flattening. Try changing “from openfe.X import Y” to “from openfe import Y”.

      • DD – Alchemiscale 80 – Switching from taskqueue to taskhub. This changes how tasks are actioned: they’re no longer ordered by position relative to each other; instead, each has an absolute weight. The intended operation is to have some ties, and the next actioned task is then chosen from the top-ranked ties.
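The weight-based selection described for Alchemiscale 80 can be sketched roughly as follows. This is not the alchemiscale implementation: the `Task` class and `next_task` function are hypothetical names, and two details are assumptions not stated in the notes, namely that a higher weight means "action sooner" and that ties are broken uniformly at random.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    weight: float  # absolute weight; assumed here that higher is actioned first


def next_task(tasks, rng=random):
    """Return one task drawn from those sharing the highest weight.

    Tie-breaking is uniform random (an assumption); returns None when
    there is nothing to action.
    """
    if not tasks:
        return None
    top = max(t.weight for t in tasks)
    tied = [t for t in tasks if t.weight == top]
    return rng.choice(tied)


# Hypothetical hub contents: "b" and "c" tie for the top weight.
tasks = [Task("a", 0.5), Task("b", 1.0), Task("c", 1.0), Task("d", 0.1)]
picked = next_task(tasks, random.Random(0))
```

Because weights are absolute rather than positional, adding or removing a task never requires renumbering the others, and deliberately coarse weights create the ties that spread work across equally ranked tasks.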

    • In progress

      • IP – perses 1066 – In progress; have a session later today to review the tests. Met with OFE folks last week to discuss information flow. Decided that we don’t want to rely on labels; rather, the GUFE transformation object will carry the info. I don’t know whether there’s a PR yet, but I’ll need to see that to adapt this protocol implementation to it.

        • DD – This is related to last agenda item (“Bake off”). That will be a future PR and shouldn’t block this.

      • (see recording 44-47 minutes)

      • IP – Will we have CLI examples?

        • DD – Alchemiscale 81

          • JC – The most important thing to know is job status/running info. The most important things are launching it, and killing it when it goes out of control.

          • DD – Yeah

    • Available

      • DD – Alchemiscale 34 – I’ll work with HMO on this


    • create/nominate new cards for inclusion in this sprint

      • MH – Not planned, but I have a PR where I’m working on getting logs from docker compose deployments on, e.g., EC2. (Alchemiscale 82)


    • will post on #free-energy-benchmarking when next sprint is finalized



  • RG : ComponentMapping scenarios bake-off

    • Should game through a library of scenarios, including:

      • charge transformations (how are dummy atoms represented in mapping)?

      • protein mutations

      • protein mutations on a dimer, but mutating on one of the monomers

      • transformations on one ligand where multiple ligands present

      • covalent transformation of ligand and protein

      • moving a ligand from one place to another

    • RG : game out on current implementation

    • DD : game out on how this would work with protocols using only mappings, not referring to ChemicalSystem component keys

  • DD – Last Friday, a bunch of folks brainstormed about component mappings. The big question was how components map to each other, and how much the map knows.

  • JW – Possibly stupid question, but will mapping be concerned with interpolating parameters? Like, right now we have 4-particle torsions and that seems to cause a lot of headaches. Will it destroy everything if some day we have a 5- or 6-particle force?

    • JC – Alchemical modification of the potential needs to know whether it can handle a certain OpenMM force, so each force in the system is checked for appropriateness. This also isn’t important if we have a FF without singularities (soft-core nonbondeds), so that may be helpful for OpenFF’s planning.

 

Action items

Decisions