Recording: https://us06web.zoom.us/rec/share/G_OzSEl9cOKOn2daLrUgboWT_hDSqhZRP2p20lOD38nBhWf8upXIYNY11_ZaItAK.lcAV__TEtZxUO9jV
Passcode: vkLhi1$9

JW notes re: BespokeFit

  • Timeline for deprecating BespokeFit 1 if we proceed with 2? Paper came out in 2022, what is a good timeline for support?

  • It will be very expensive to support BespokeFit 1 and 2 at the same time - exposes us to a lot of upstreams

    • BespokeFit 1 unique deps: xtb-python (unsupported), ForceBalance, celery, uvicorn, starlette

    • BespokeFit 2 unique deps: at least a MACE engine (though there are licensing issues, so maybe AIMNET2), smee

  • The number of unique upstreams means that we don’t have pre-existing knowledge of many of the inner workings, so a lot of debugging starts by relearning how the components/architecture works.

  • It’d be ideal if a potential BespokeFit 2 was “simple”, like by using only local compute. Fancy work distribution is dangerous and can cause user disappointment:

    • Complex support burden for folks on different HPC setups (e.g., networking issues)

    • Large scale compute makes things hard to reproduce (e.g., “I encountered this problem after the job ran for 50 hours”)

    • Confusing things can happen with caching/batch submissions, like when reuse of congeneric molecules does/doesn’t happen. Have there been real cases where the benefit is worth the costs?

      • MG: The benefit there is to avoid adding noise to a congeneric series

      • DC: It helps if parameters aren’t changing too much

      • MG: Sure, but if you’re changing the chemistry, maybe the parameters should change and we shouldn’t worry about it. It’s hard to know whether something is a series or not; it’s not well defined.

      • JW: Yes, there isn’t a strong metric, and as it is now, a change in how the molecules are presented will change whether they are grouped or not.

      • DC: If the science team is involved there could be human input, but you probably don’t want that.

      • JW: One solution could be that you can’t run with more than one molecule. Especially since we expect that the next step would be to run the free energy calculation and that is the expensive part.

      • JH: This all sounds reasonable. Right now it seems like we are the only ones who can use BespokeFit 1, and that’s not ideal. Move away from torsion drives in general.

      • DM: OpenEye has built support for this in Orion, and Cresset and others are using it, so we do have a larger user base. As to whether we want to deprecate it so soon, we should talk to these stakeholders before making such decisions, because they are currently investing in incorporating it.

      • JH: If we can adapt it to run without caching and on a single molecule, that could solve those problems more simply.

      • JW: This is a cool idea and we want people to do this, but there are product lifecycle issues/questions. Our goal is to enable continued development, publicize metrics of the BespokeFit 2 rollout, and keep our stakeholders apprised. Related to that, we need a plan for the lifecycle of BespokeFit 1. If BespokeFit 1 lived for 5 years, we could budget for it, but there would still be an opportunity cost for our implementation of SMEE.

      • LW: Do we have any obligations for BespokeFit 1?

      • JW: I don’t think so. Danny / Josh, do you have opinions on whether we can give BespokeFit 1 a 5-year expiration date?

      • DC: I don’t think I have a problem with it, contingent on there being no issues with the scientific validity of BespokeFit 2. Continuing the development of BespokeFit 2 is on Finlay’s priority list, but not at the top.

      • JE: So until Finlay picks it up, no one is working on BespokeFit 2?

      • DC: Correct.

      • JW: I’m not saying people can’t work on/be in meetings about it, but if the OpenFF name goes on it, that’s a larger conversation.

      • DM: We should bring this to the Advisory board, some of our stakeholders might push back on that.

      • LW: Are we going to deprecate the distributed compute?

      • DC: Josh has brought up similar thoughts, but no one has been assigned.

      • JE: Is that a reasonable thing to do?

      • FC: I don’t have a lot of time, but I’m here to help.

      • JW: I would be cautious about deprecating and disappointing users. We should avoid that.

      • JE: The methods used in BespokeFit change the compute requirements. Do our stakeholders rely on the distributed compute?

      • JW: If people have it implemented, we don’t want to take it out.

      • JE: The docs have been updated to strongly recommend AIMNET and to state that we don’t support calculations beyond a local run. Maybe the docs can be updated to stress more strongly that we don’t support issues arising from distributed compute.

      • JW: I think we do that, but getting back to the topic of the expiration date, we need to set one now. Once that is done, we can talk about what BespokeFit 2 looks like and work on it.

      • JE: If we are working toward implementing SMEE instead of FB, we can’t keep supporting something that relies on FB, so it sounds like the timeline is dependent on what our end date is for FB… but we don’t have that expiration date yet.

      • JW: Great point. I still think we should start warning them early, not when we are ready.

  • DM – Running high temp MD reminds me of TG’s NEB work. TG, please share that with DC when possible.

...