Page Comparison

Versions Compared

Old Version 1

Changes made by David Dotson

Saved on Mar 07, 2023

compared with

New Version Current

Changes made by David Dotson

Saved on Mar 07, 2023

Participants

Goals

RG : cinnabar refactor merged 🎉
RG : gufe 1.0 milestone
DD : sprint retrospective
- Review Done cards
  - what went well?
  - what didn’t?
  - what do we need to improve our approach?

DD : next sprint begins tomorrow, spans 3/8 - 3/20
- architecture overview : https://drive.google.com/file/d/1ZA-zuqrhKSlYBEiAIqxwNaHXvgJdlOkT/view?usp=share_link
- alchemiscale 0.1.0 milestone : https://github.com/openforcefield/alchemiscale/milestone/3
- coordination board : alchemiscale : Phase 1 - MVP
- updates on In Review, In Progress, and Available cards
- create/nominate new cards for inclusion in this sprint
- will post on #free-energy-benchmarking when next sprint is finalized

Discussion topics

Item

Presenter

Notes
RG : cinnabar refactor merged 🎉
CLI interface is the same, internals are different.
slides https://docs.google.com/presentation/d/1UGM3ESHUE_9R37NWMWHjpiElH4PoqiqK_nFqJk6gOTM/edit#slide=id.g1b695d24e42_0_0
Can have multiple edges between nodes
JC – Are these on shared ground? Are multiple edges logical?… So this is always a …
RG – Could add different routes for different grounds… This lets you traverse the graph and accumulate the error
JC – Did you see how diffnet was done? I think this handled error accumulation in a well thought-out way
RG – There were benefits to using a multi-edge graph instead of multi-node. Networkx wouldn’t let you represent the same thing as different nodes.
JC – …
DD – This avoids stuffing the nodes full of multiple values.
JC – So we need to transition from the diffnet solution to a different solution….
RG – Yeah, but I think this is better in the long run. There’s still an API point for `to_legacy_graph` which should keep compatibility.
JC – So the substitution should be straightforward… (see 10 minute mark)
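To illustrate the multi-edge idea from this thread, here is a minimal, dependency-free sketch. It is not cinnabar's implementation. The data values, the `combine` and `path_estimate` helpers, the inverse-variance weighting of repeated edges, and the quadrature sum of uncertainties (which assumes independent errors) are all assumptions made for the example. In networkx the analogous container would be `MultiDiGraph`.

```python
import math
from collections import defaultdict

# Hypothetical data: (ligand_a, ligand_b, ddG, uncertainty), both in kcal/mol.
# Two records share the same pair of nodes: that is the "multiple edges" case.
measurements = [
    ("A", "B", 1.0, 0.3),
    ("A", "B", 1.2, 0.4),
    ("B", "C", -0.5, 0.2),
]

# Multi-edge adjacency: (a, b) -> list of (value, uncertainty) records.
# Each node stays a plain label; repeated results live on the edges.
edges = defaultdict(list)
for a, b, ddg, err in measurements:
    edges[(a, b)].append((ddg, err))


def combine(records):
    """Inverse-variance weighted mean of repeated edges (an assumed choice)."""
    weights = [1.0 / err**2 for _, err in records]
    mean = sum(w * v for w, (v, _) in zip(weights, records)) / sum(weights)
    return mean, math.sqrt(1.0 / sum(weights))


def path_estimate(path):
    """Sum ddG along a path, adding uncertainties in quadrature."""
    total, variance = 0.0, 0.0
    for a, b in zip(path, path[1:]):
        value, err = combine(edges[(a, b)])
        total += value
        variance += err**2
    return total, math.sqrt(variance)


estimate, error = path_estimate(["A", "B", "C"])
```

With these made-up numbers the two A-B edges collapse to one estimate before the path A-B-C is traversed, so the nodes never need to hold more than one value each.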
(General) – It’d be good to agree on units for energy. Maybe kT to agree with Chodera lab stuff and to force users to provide a temp to ensure they understand their data.
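The kT suggestion can be sketched as a conversion helper with no default temperature, so callers have to state one. The function name and the kcal/mol input unit are assumptions for illustration, not an agreed API.

```python
# Molar gas constant in kcal/(mol K).
R_KCAL_PER_MOL_K = 0.0019872041


def kcal_per_mol_to_kT(value, *, temperature_kelvin):
    """Convert an energy in kcal/mol to units of kT.

    temperature_kelvin is keyword-only and has no default, so the caller
    must say which temperature the data corresponds to.
    """
    if temperature_kelvin <= 0:
        raise ValueError("temperature must be positive (kelvin)")
    return value / (R_KCAL_PER_MOL_K * temperature_kelvin)


ddg_in_kT = kcal_per_mol_to_kT(1.0, temperature_kelvin=298.15)
```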
Interoperability?
IA – DS had written a proposal about how to go from gufe to cinnabar
DS – Alchemiscale will need something similar (AS-->cinnabar), could probably reuse some components
DD –
JC – Why doesn’t cinnabar assume gufe? Could have a from_gufe method that has gufe as an optional import
…
DS – gufe doesn’t have a results object.
JC – Will cinnabar model be the gufe results model?
RG – That could work. So instead of a cinnabar.from_gufe, you could have a gufe.from_cinnabar.
IA – We should check with BioSimSpace about that - I think it could disrupt their work.
DD – Can probably resolve this separately
DS – …
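The "optional import" pattern JC mentions in this thread can be sketched as follows. `require_optional` and the body of `from_gufe` are hypothetical names for illustration only. Nothing here reflects an actual cinnabar or gufe API, and the conversion itself is left unimplemented because the notes do not define it.

```python
import importlib


def require_optional(module_name, feature):
    """Import an optional dependency only when a feature actually needs it."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{feature} requires the optional dependency '{module_name}'"
        ) from exc


def from_gufe(results):
    """Hypothetical converter: gufe is imported lazily, not at module load."""
    require_optional("gufe", "from_gufe")
    # The mapping from gufe results is not specified in these notes.
    raise NotImplementedError("conversion from gufe results is not defined here")
```

With this shape, importing the host package never fails for users without gufe installed. Only a call to `from_gufe` does, and it says which dependency is missing.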
RG – I’m wondering about the models for recording and propagating errors - There’s different ways that we’ll get error reports from data, mbar, etc.
JC – The only way to record it is to make an error model, which would be a bit heavy. The 80/20 solution would be to make a lightweight way to record/propagate some standard error
DD – RG, do we want to make a `gufe` 1.0 milestone?
RG – I’ve got a PR open where I start updating code coverage. So I’ll try to squash those. I found that our serialization of RDMols might be problematic, so I need to refactor that. Needs to be a long-term storage solution. So yeah, let’s start a milestone
DS – yeah, I want to add some cleanup notes.
(detailed issue sorting, see recording around 26 minutes)

DD : sprint retrospective
Review Done cards
DD – Alchemiscale 88, 83, 84 – HMO implemented Scope name validation, scope listing, submitting chains of tasks that extend each other.
MH – Alchemiscale 93 – Added cloudwatch logging. Can monitor logs without sshing directly onto hosts.
what went well?
what didn’t?
what do we need to improve our approach?
DD : next sprint begins tomorrow, spans 3/8 - 3/20
architecture overview : https://drive.google.com/file/d/1ZA-zuqrhKSlYBEiAIqxwNaHXvgJdlOkT/view?usp=share_link
`alchemiscale` 0.1.0 milestone : https://github.com/openforcefield/alchemiscale/milestone/3
coordination board : alchemiscale : Phase 1 - MVP
updates on In Review, In Progress, and Available cards
DD – Alchemiscale 85+/95 - In review – Design+implement task status lifecycle - Ready for review.
IP – Noneq cycling – Implemented reducing overhead in loops, logging, returning objects in nested dictionary. Coordinate extraction into mdtraj-style selection algebra. I need to improve on that, since there are some things that mdtraj does that change the order of indices. This is a big set of changes; I’ve tried to identify specific spots to focus review.
DD – Cool, do you want glances/reviews from anyone else?
IP – I also asked IZhang.
DD – Great. I’ll use this branch for my tests and will pass back any feedback.
JC – Perses 1128 – I’m looking forward to finishing this soon. This is only scoped to OFF toolkit 0.11 support, which should be quick. Then there’s removing the OE dependence, which will come later.
IA – PLB 83 – No progress, has there been any discussion? I think this is blocked by perses 1128. If that doesn’t seem likely, we can go through lomap, though that would be bad.
RG – Could we use older versions of deps to get things running sooner?
IA – It’d be better to wait for perses 1128.
DD – Alchemiscale 98 – Clean up synchronouscomputeservice. This is intended to be the simple-but-not-super-performant-or-scalable platform, which should be handy for inspection and debugging. Initial implementation assumes no GPU awareness and that it can use all visible resources. More advanced implementation (asynccomputeservice, around milestone 0.3) will have some ability to run multiple jobs on multiple GPUs.
JC – What if a job dies due to a bug? Is there a limit on retries?
DD – It depends on how that manifests. We use the `execute` method in gufe, which will not raise an exception, but the result object will contain an indication of success or failure. But it’s possible that other execution methods will raise exceptions directly, or segfault or something. In the worst case, the lack of a heartbeat from a worker will expire the claim and will give us some indication that the task needs reassignment/other inspection.
DS – Re segfaults and other tasks terminating - I’d be happy to review how qcengine/fractal handle those. It was designed with those in mind. I think we decently solved the stability problems for that. I think the code/plans are decently generalizable.
DD – Would love a chat/review in the future.
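The heartbeat and claim-expiry idea from this exchange can be sketched as below. This is not alchemiscale's implementation. `Claim`, `claim_task`, `heartbeat`, `expire_stale`, the timeout value, and the retry limit are all hypothetical. In particular, the notes do not say whether a retry limit exists, so `MAX_ATTEMPTS` is an assumption added to show one possible answer to JC's question.

```python
from dataclasses import dataclass
from typing import Optional

HEARTBEAT_TIMEOUT = 60.0  # seconds; hypothetical value
MAX_ATTEMPTS = 3          # hypothetical retry limit, not from the notes


@dataclass
class Claim:
    task: str
    worker: Optional[str] = None
    last_heartbeat: float = 0.0
    attempts: int = 0
    status: str = "waiting"  # waiting | running | error


def claim_task(c, worker, now):
    """A worker claims a task; each claim counts as one attempt."""
    c.worker, c.last_heartbeat, c.status = worker, now, "running"
    c.attempts += 1


def heartbeat(c, now):
    """A live worker periodically refreshes its claim."""
    c.last_heartbeat = now


def expire_stale(claims, now):
    """Release claims whose worker has gone silent (crash, segfault, ...)."""
    released = []
    for c in claims:
        if c.status == "running" and now - c.last_heartbeat > HEARTBEAT_TIMEOUT:
            c.worker = None
            # Re-queue until the attempt budget is spent, then flag for inspection.
            c.status = "waiting" if c.attempts < MAX_ATTEMPTS else "error"
            released.append(c.task)
    return released
```

Passing `now` explicitly keeps the sketch deterministic. A real service would read a clock and persist claims in its state store.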
RG – OFE-examples 36 –
DD – I could use this
RG – I’ll try to get to this asap, will self-assign
DD – Alchemiscale milestone 0.2 - I’ve got HMO started on this; a lot of docs and usability improvements.
create/nominate new cards for inclusion in this sprint
will post on #free-energy-benchmarking when next sprint is finalized

Action items

Decisions