
Item: “Should the FF go on edges or nodes?”

2022-03-18 JW | DD : FF on nodes vs. edges, pros and cons

Summary: So, when we talk about “where should the FF go in the graph”, we’re really talking about something bigger: “Where does all the logic that prepares a system for a simulation using a specific FF go in the graph?”. The options are:

  • In the nodes (multiple nodes per “starting point”, each making assumptions for a different simulation engine/protocol):

    • Pro

      • System prep assumptions are extremely specific, which makes it easier to compare the effect of switching only the FF and nothing else

    • Con

      • There will need to be additional information stored somewhere to say which nodes represent the same system, like labels on nodes saying “I’m actually one representation of this starting point” or special edges indicating “these nodes represent the same starting point”

  • In the edges (DD and JW advocate this choice):

    • Pro

      • This truly measures the performance of an entire workflow

    • Con

      • This doesn’t allow someone to isolate the effects of choosing a particular FF, since assumptions about structure prep/conversion are also included in the edge definition
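To make the two layouts concrete, here is a small hypothetical sketch using plain dicts in place of a real graph library. All names (ligand identifiers, FF labels, the `representations_of` helper) are illustrative, not part of any agreed data model.

```python
# Option 1: FF on nodes -- multiple nodes per starting point, each carrying
# its own FF/engine assumptions, plus a "starting_point" label tying
# equivalent nodes back together (the extra bookkeeping noted as a con).
nodes_layout = {
    "ligandA/openff-2.0": {"starting_point": "ligandA", "ff": "openff-2.0"},
    "ligandA/gaff-2.11":  {"starting_point": "ligandA", "ff": "gaff-2.11"},
    "ligandB/openff-2.0": {"starting_point": "ligandB", "ff": "openff-2.0"},
}

def representations_of(starting_point):
    """Recover which nodes are representations of the same starting point."""
    return [name for name, attrs in nodes_layout.items()
            if attrs["starting_point"] == starting_point]

# Option 2: FF on edges -- one node per starting point; all prep/FF
# assumptions live in the edge definition (the choice DD and JW advocate).
edges_layout = {
    "nodes": ["ligandA", "ligandB"],
    "edges": [{"from": "ligandA", "to": "ligandB",
               "ff": "openff-2.0", "protocol": "relative-fe"}],
}

reps = representations_of("ligandA")  # both nodes for the same starting point
```

In the first layout the "same system" relation must be reconstructed from node labels; in the second it is implicit in there being one node per starting point.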

Notes:

  • JW – DD and I had long discussion Friday, full notes linked above

    • key question is “where does all the logic that preps a system for a simulation using a specific FF go in the graph?” It’s not just about the FF, but about how the system is coerced into compatibility with a given FF

    • debated whether a node should have many representations of the same chemical system, or just one

    • want to avoid balkanization of different edges and what nodes they can take

    • must use caution in construction of protocols on edges such that these

  • JC – idea is to come up with minimal object models that allow us to share infrastructure for execution and analysis

    • e.g. results consumable by a DiffNet-style analysis are a key output

    • like this direction because we are rapidly converging on a data model that can enable both relative and absolute calculations in the same graph; standard information flows out for edge results, which can be used iteratively for execution and can be used downstream for higher-level adaptivity, operating on the network

    • think we’ve identified where the handoff / separation of concerns should be; can now engage in the hard work of building the definitions

  • JW – will be clearer how edges are structured based on what goes into the nodes; this is the information on which edge protocols operate

    • determines the knobs that can be meaningfully placed on edge protocols

    • I’m largely interested in avoiding the scenario where the “method signature” of the OpenFF workflows and the OpenFE workflows diverge and can’t be reconsolidated

  • IA – are we proposing that system preparation is on the edge?

    • JC – I think what is meant is that the edge does the preparation steps needed on the node information to perform its transformation

    • IA – The sticking point in my head is having the protonation process defined in the edge. I think that computations on different protonation states of a complex should be represented by different NODES, not different EDGES.

    • JC – in the short term that will require that our nodes are flexible enough to specify a specific protonation state of the ligand and protein

      • edge would then encode transformations between those mutants

      • in the future, would want node to represent family of titrateable sites

    • JW – are you thinking different protonation states of the same complex should be different nodes?

    • JC – Yes. Different protonation states of the same complex should be different nodes.

      • a few different ways these are specified in practice for different MD engines, for example

      • so upfront, distinct protomers should be representable with different nodes

      • protonation differences are mutations on proteins, different molecules when looking at protonation differences on ligands

    • DD – How about the following decision: “Until we come up with a different plan, we will plan to put FF information on the edges, and we’ll keep nodes

    • JC – Could do the following:

      • Write out what the object model for data storage and API would look like for the various components that are needed. Assume that nodes will contain molecular structure, and edges will contain info about the requested analysis, which would be sufficient to make the simulation inputs, run them, and perform the final analysis.

      • Approved by JC, RG, JW, DD

    • MH – Agree. It’ll be easier to reason about this once we can see the plan in more detail.

    • DD – Ok, I can do that.
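A minimal sketch of the approved direction, assuming (per the decision above) that nodes hold molecular structure, including an explicit protonation state per the IA/JC discussion, and edges hold everything about the requested transformation. All class and field names here are placeholders, not a committed API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Microstate:
    """A node: one specific chemical state of a system."""
    name: str
    smiles: str                 # stand-in for full structure/coordinates
    protonation_state: str      # distinct protomers are distinct nodes

@dataclass(frozen=True)
class Transformation:
    """An edge: a requested computation between two Microstates."""
    start: Microstate
    end: Microstate
    forcefield: str             # the FF choice lives on the edge
    protocol: str               # e.g. a relative-FE protocol name

    def setup(self):
        # a real version would coerce the node structures into
        # compatibility with self.forcefield and emit simulation inputs
        return f"prepared {self.start.name}->{self.end.name} with {self.forcefield}"

# Different protonation states of the same complex are different nodes;
# the edge encodes the transformation between them.
neutral = Microstate("lig1-neutral", "c1ccccc1O", protonation_state="neutral")
anion = Microstate("lig1-anion", "c1ccccc1[O-]", protonation_state="deprotonated")
edge = Transformation(neutral, anion, forcefield="openff-2.0", protocol="relative-fe")
```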

(Verbal intro to next topic)

DD – Looking at DiffNet, many ideas here look promising, e.g. the ability to swap in different executors for different edges

JC – Is the idea to build in an architecture that allows for adaptivity later on?

DD – Yes, adaptivity in terms of how the edges get evaluated. If the nodes are the “what” and the edges are the “how”, the execution strategy is the “how” and “when”, and the executor itself is the “where”

JC – So, basically, the system would have control over how to allocate computational resources. There would be a modular component that makes these allocation decisions.

DD – Yes

JC – Are you intending for this to JUST be for the F@H project, or would this also be used for local clusters or AWS?

DD – The latter

(In-line discussion of the below topic)

MH – An execution strategy input parameter could even be “dollars”

DD – I’d like to know OpenFE’s plans for this area

  • RG – Are you thinking that this method would take the entire network as input, or just a limited version/summary of it?

    • DD – I was initially thinking “the whole network”, but this could be more limited

    • JC – Yes, making a decision on this point may need to align with OpenFE’s plans/developed code in this area.

  • DD – Basically, when you pass the “execution strategy” into the “executor”, the executor uses the execution strategy to decide which edges to compute next.

  • JC – I’d recommend designing this to allow either synchronous or asynchronous re-prioritization.

  • DD – I know that F@H prioritizes work using “weights”, and this would provide a natural interface to that prioritization system.

  • JC – We could limit the API for the execution strategy, since it probably wouldn’t need to know, for example, all the coordinates of an input in order to make a system. Here, we could make a decision on which info should go into the execution strategy API for planning purposes

    • MH – Agree that we should figure out the API that gets the job done without passing around huge objects.

    • DD – We’ll also want to specify the knobs that the execution strategy is allowed to turn, like: can it request more replicas on an edge, longer simulations on an edge, etc.?

  • RG – What’s the relation between executors and networks?

    • JC – The network is something to be executed, and the executor is the thing that does the computation.

    • RG – So if you wanted to use two compute centers for the same network, you’d have a single executor that talks to both those compute centers?

    • DD – Yeah, one network would be assigned to one executor.

  • JC – We could have a single executor that talks to multiple hosts

    • (some discussion of exactly how this could work)

    • JW – I support this API discussion to the extent that we can spend a little time now to allow various implementations in the future. But I don’t want to spend too much time on this now, or commit to implementing anything more than a F@H executor in the scope of this project.

    • JC – Fireworks or other engines could be slotted in here

    • RG – It’s probably best to spec out the F@H executor separately, and later see how well it extends to cluster executors.

    • RG – My plan had been to have a database of jobs that can be pinged to understand status and get more work. I think this is a fairly canonical “Fireworks-based model”


    • JC – Could have a mix-in interface that has transparency into a database view. Big question is whether that database needs to manage starts and stops, or just provide a view into the data you need.

    • DD – Specifically for F@H, there are special requirements to wrap around that system, which are different from what you’re targeting. Most HPC executors will be quite different from the F@H executor.

  • DD will make an “aspirational API” for both the Executor and ExecutionStrategy components. This must support F@H execution, and may include flexibility to add other executors later.

    • DD + JC + RG + JW approve
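One way the Executor/ExecutionStrategy split discussed above could look, as a hypothetical sketch only (DD's aspirational API may differ). Per the JC/MH point, the strategy sees only a lightweight summary of the network rather than full coordinates, and it returns per-edge weights, echoing the F@H weight-based prioritization DD mentioned. All names, signatures, and the toy weighting rule are assumptions.

```python
class ExecutionStrategy:
    """Decides which edges to compute next, given a network summary."""
    def __init__(self, total_units_of_work):
        self.total_units = total_units_of_work

    def propose(self, edge_summaries):
        """Return {edge_id: weight} for prioritizing edges.

        edge_summaries: {edge_id: {"n_results": int, "uncertainty": float}}
        -- deliberately small, so huge objects aren't passed around.
        """
        # toy rule: weight edges by their current uncertainty
        total = sum(s["uncertainty"] for s in edge_summaries.values()) or 1.0
        return {eid: s["uncertainty"] / total
                for eid, s in edge_summaries.items()}

class Executor:
    """Owns the resources; could front F@H, a cluster, or run in-process."""
    def __init__(self, strategy):
        self.strategy = strategy

    def run_iteration(self, edge_summaries):
        weights = self.strategy.propose(edge_summaries)
        # dispatch highest-weight edges first (stand-in for real dispatch)
        return sorted(weights, key=weights.get, reverse=True)

executor = Executor(ExecutionStrategy(total_units_of_work=100))
order = executor.run_iteration({
    "A->B": {"n_results": 2, "uncertainty": 0.8},
    "B->C": {"n_results": 5, "uncertainty": 0.2},
})
```

The narrow `propose` signature is one possible answer to the "limited API" question: the strategy never sees coordinates, only per-edge statistics.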

Proposed workflow draft, given data model for AlchemicalNetwork

  1. Define AlchemicalNetwork.

  2. Define Microstates as nodes.

  3. Define desired Transformations, with Protocols, between Microstates as edges.

  4. Define ExecutionStrategy with parameters, such as total units of work; the Executor will use this to allocate execution timesteps to each protocol iteratively.

  5. Pass AlchemicalNetwork and ExecutionStrategy to Executor, which could be in-process or via an ExecutorClient to a separate Executor process.

    • Executor will apply ExecutionStrategy to AlchemicalNetwork to determine first set of Transformations to execute on available resources

    • Executor.setup will call Protocol.setup for each Transformation, operating on the Microstates;
      file artifacts for individual simulations corresponding to each Transformation.protocol will be organized in an Executor-specific way
      (e.g. the FAHExecutor will organize these according to the FAH work server's needs); this generates Simulation objects for each Transformation.protocol

      • In other words, Executor has a persistent state store of some form

    • Executor.run will call run method for each Transformation.protocol, which in turn calls run method on each member Simulation; this will execute each in series if run in-process, with single-machine parallelism supported where possible

    • Executor can be run as a separate process in server mode, with interaction via ExecutorClient

      • client submits AlchemicalNetwork and ExecutionStrategy, and Executor dispatches Transformation.protocol.setup and Transformation.protocol.run jobs as appropriate to its available resources

        • for Transformation.protocols that generate many simulations, Simulation.run dispatched as separate jobs

        • need to discuss available Executors from perspective of OpenFE
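The five steps above can be sketched end to end. Class names mirror the draft (AlchemicalNetwork, ExecutionStrategy, Executor), but none of this is a committed API; bodies are stubs standing in for the real setup/run machinery.

```python
class AlchemicalNetwork:
    def __init__(self):                                   # step 1
        self.nodes, self.edges = [], []
    def add_microstate(self, name):                       # step 2
        self.nodes.append(name)
        return name
    def add_transformation(self, start, end, protocol):   # step 3
        self.edges.append((start, end, protocol))

class ExecutionStrategy:                                  # step 4
    def __init__(self, total_units_of_work):
        self.total_units = total_units_of_work

class Executor:                                           # step 5
    def execute(self, network, strategy):
        # a real version would call Protocol.setup per Transformation,
        # organize artifacts, then run the generated Simulations
        return [f"ran {s}->{e} via {p}" for s, e, p in network.edges]

net = AlchemicalNetwork()
a = net.add_microstate("ligandA")
b = net.add_microstate("ligandB")
net.add_transformation(a, b, protocol="relative-fe")
results = Executor().execute(net, ExecutionStrategy(total_units_of_work=100))
```

In server mode, the last line would instead go through an ExecutorClient submitting the network and strategy to a separate Executor process.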

Action items

  •  David Dotson will draft specific model for AlchemicalNetwork, including schema for nodes and edges; provide an API usage example (e.g. Jupyter notebook)
  •  David Dotson will draft API surface for Executor and ExecutionStrategy; provide an API usage example (e.g. Jupyter notebook)

Decisions