2022-11-15 Protein-ligand benchmarks meeting notes

Participants

  • Levi Naden

  • @Richard Gowers

  • Jenke Scheen

  • @John Chodera

  • @Jeffrey Wagner

  • @David W.H. Swenson

  • @Iván Pulido

Goals

  • DD : fah-alchemy - current board status : fah-alchemy : Phase 1 - MVP

    • created 0.1.0 milestone, with outstanding issues required for deployment :

    • at 11/15 deadline - requesting extension

    • @David Dotson development effort now focused on FahAlchemyAPIServer, FahAlchemyClient, and FahAlchemyComputeServer

      • Test suite coverage at 74%.

      • Refactored storage layer to operate in terms of ScopedKeys, greatly reducing complexity, improving performance (no unnecessary serialization/deserialization of large objects to interact with storage layer)

      • Added OAuth2 authentication with JWT tokens, concept of User and Compute identities

      • Still working on required functionality for synchronous ComputeService, which can be run on typical compute resources such as HPC to execute ProtocolDAGs

  • DD : help wanted - deployment issues : https://github.com/openforcefield/fah-alchemy/issues?q=is%3Aopen+is%3Aissue+label%3Adeployment

  • IA : protein-ligand-benchmark : blockers and priorities

      • Adding missing thrombin entries and then good to go

      • Waiting on #81

      • Waiting on one of the networks to return (PR is up)

  • IP : Nonequilibrium Cycling Protocol (perses#1066) update:

  • RG : new user story

Discussion topics

Item

Notes

DD : fah-alchemy - current board status : fah-alchemy : Phase 1 - MVP

  • created 0.1.0 milestone, with outstanding issues required for deployment :

    • DD – Anyone else willing to help me move forward on these?

    • JC – We’ll have a new person, Hugo, coming online in January, who may be able to help. Before then, it’ll probably be best to ask MH when he gets back next week.

    • JW – Can we reduce the scope of any items to reduce the workload?

      • DD – This scope is already designed just to get the MVP.

    • DS – I’m pretty good at CLI stuff, I’ve done stuff with relatively complex CLI infrastructure.

      • DD – I don’t think there’s anything here that needs to be complex. But if you’re happy to help chase down loose ends that would be quite welcome.

      • DS – I’d be happy to meet with you to understand the shape of the question.

  • at 11/15 deadline - requesting extension

    • JW – Nov 29th?

      • JC + RG + JW approve

    • JW – Is that realistic, DD?

      • DD – There are a lot more things popping up than I thought, kinda death by details.

  • @David Dotson development effort now focused on FahAlchemyAPIServer, FahAlchemyClient, and FahAlchemyComputeServer

    • Test suite coverage at 74%.

    • Refactored storage layer to operate in terms of ScopedKeys, greatly reducing complexity, improving performance (no unnecessary serialization/deserialization of large objects to interact with storage layer)

    • Added OAuth2 authentication with JWT tokens, concept of User and Compute identities

    • Still working on required functionality for synchronous ComputeService, which can be run on typical compute resources such as HPC to execute ProtocolDAGs

  • JC – Will first implementation be F@H-directed, or cluster-directed?

    • DD – Cluster-directed, synchronous execution. Then this buys us time to interface with F@H, which is harder, since we have to hit the F@H adaptive sampling API.

  • JC – Best example of loading in tasks would be for JS and me to prepare some inputs and have those go in?

    • DD – Yes (shows some details of input methods)

  • JW – Ballpark estimate for getting F@H implementation going after synchronous/cluster implementation?

    • DD – Depends a lot on where logic lives for submission+prioritization. Also we don’t have protocols that run on F@H, but I’m planning on using the perses noneq cycling protocol as a template, and there are some details that I’ll need to shake out there. I have good confidence that it’s all possible, it’s just hard to come up with an estimate before we start working on it. But it will definitely be easier to work on this once we’re able to run things synchronously and debug the more straightforward problems.
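As a rough illustration of the "cluster-directed, synchronous execution" discussed above, the sketch below runs the units of a small dependency graph one at a time, in dependency order, inside a single process. It is not the fah-alchemy or GUFE API: the function name, the dict-based DAG representation, and the unit names are all hypothetical stand-ins for a ProtocolDAG and its units.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_dag_synchronously(units, dependencies):
    """Run each unit once, in dependency order, in the calling process.

    units: dict mapping a unit name to a callable that takes a dict of
           its dependencies' results and returns its own result.
    dependencies: dict mapping a unit name to the names it depends on.
    (Hypothetical representation, not the real ProtocolDAG object.)
    """
    # Seed every unit so that units without dependencies are still run.
    graph = {name: set(dependencies.get(name, ())) for name in units}
    results = {}
    # static_order() yields each unit only after all of its dependencies.
    for name in TopologicalSorter(graph).static_order():
        inputs = {dep: results[dep] for dep in graph[name]}
        results[name] = units[name](inputs)
    return results


# Hypothetical three-unit DAG: one setup unit feeding two cycle units.
units = {
    "setup": lambda inputs: 1.0,
    "cycle_0": lambda inputs: inputs["setup"] + 0.5,
    "cycle_1": lambda inputs: inputs["setup"] - 0.5,
}
dependencies = {"cycle_0": ["setup"], "cycle_1": ["setup"]}
results = run_dag_synchronously(units, dependencies)
```

Because everything runs in one process and in a fixed order, a failure surfaces as an ordinary exception at the unit that raised it, which is the kind of straightforward debugging DD describes wanting before moving to F@H.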


IA : protein-ligand-benchmark : blockers and priorities

    • Adding missing thrombin entries and then good to go

    • Waiting on #81

    • Waiting on one of the networks to return (PR is up)

IP : Nonequilibrium Cycling Protocol (perses#1066) update:

  • IP – Small update this week, haven’t had a ton of bandwidth to work on this. I met with RG and IA this week, talked about mappings and how I should consume them. There should be enough info for us to do the API in perses. We do need to report back the changes in the mappings, but there isn’t a standard way to do this, and we’ll need to discuss that in the future, since some mapped atoms may be changed given different conditions.

  • IP – I also met with Mike and David to talk about protocol settings. Now have a better understanding of the objects and how to use them. So hopefully I’ll have time to work on them this week.

  • DD – You mentioned uncertainty on how to report back mapping changes based on protocol?

    • IP – Right, we don’t have consensus for how to do this. Just reporting it in the log may be sufficient. Maybe we can come back to this once we start standardizing.

    • RG – There are no restraints on what can be sent back at the end of the job. You could include the mapping that ended up getting used, but there just isn’t a standard for that yet. So there could be a copy of the entire context of what was simulated, which could be useful for debugging.

    • DS – We do want an actual API point here that’ll be standard across many things.

    • RG – But the description of the simulation that was run is engine-specific

    • DS – But GUFE objects are standardized.

    • RG – But then we’d need to back-convert from engine-specific objects to GUFE objects, which isn’t really planned now.

    • DS – It will be useful for data mining in the future to store this afterwards.

    • RG – IP, we may be able to add this as a decorator to an atom map… So you could do the constraint application ahead of time rather than after. So you could lift that and put it into the atom map… there could even be a force field that only applies constraints.

    • IP – That could make sense.

    • DD – IP, you’re in a position where you can make some choices yourself. We can make standards later on, but for now you should go ahead with what you feel is right.

  • DD – Is the protocol in a state where it actually executes?

    • IP – Not yet. I wasn’t sure about the representation of certain other pieces of info.

    • DD – Can we free up your time somehow?

    • IP – I’ll be working on it this week but I’m not sure if I can complete it. Will make it a priority this week.

    • DD – Ok, we’ll need this in order to run anything, so it would be great if this came together.

RG : new user story

  • RG – Was talking with Mary Pitman, who is creating a new network planning algorithm (not atom mapping, but deciding which edges to run). LN knows about this a bit. We were thinking of modernizing this into the GUFE model. MP has done some testing but not on the scale of F@H. So it could be useful for MP to run a fully connected network so there’s data to validate the algorithm. I was thinking that this could be useful for many of us.

    • JW – I’d like to ensure that OpenFF Rosemary development+validation jobs get priority over this. But other than that this sounds great.

    • RG – Sure, we should have plenty of capacity.
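To give a sense of scale for the "fully connected network" mentioned above, the sketch below enumerates every ligand pair as an edge. A complete network over n ligands has n(n-1)/2 edges, so any subset of edges chosen by a planning algorithm could later be checked against data from the full set. The function name and ligand names are hypothetical, and nothing here reflects MP's algorithm or the GUFE network classes.

```python
from itertools import combinations


def fully_connected_edges(ligands):
    """Return every unordered ligand pair, i.e. all edges of the complete graph."""
    return list(combinations(ligands, 2))


ligands = ["lig_a", "lig_b", "lig_c", "lig_d"]  # hypothetical ligand names
edges = fully_connected_edges(ligands)
n = len(ligands)
expected_edge_count = n * (n - 1) // 2  # grows quadratically with n
```

The quadratic growth in edge count is why this kind of run was discussed in terms of F@H-scale capacity.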

 

Action items

Decisions