2023-08-15 alchemiscale Working Group meeting notes

Participants

  • @David Dotson

  • @David W.H. Swenson

  • Ian Kenney

  • @Irfan Alibay

  • Jenke Scheen

  • @Joshua Horton

  • Levi Naden

  • Meghan Osato

  • @Richard Gowers

  • @Mike Henry

  • @Jeffrey Wagner

  • @James Eastwood

Recording: https://drive.google.com/file/d/1z2jOhC0BraA0JIfT6clJKmFC6dMxOaIT/view?usp=sharing

Goals

  • alchemiscale.org user group

    • user questions / issues / feature requests

    • compute resources status

    • call for new users

    • current stack versions:

      • alchemiscale: 0.1.4

      • gufe: 0.9.1

      • openfe: 0.11.0

      • perses: protocol-neqcyc

  • JH : interfacing with alchemiscale for accelerating drug discovery

    • showcase of work being done for ASAP Discovery

  • IP : Protein-ligand benchmarks working group update

  • RG : beta deployment of alchemiscale?

  • JW : future of openmmforcefields?

  • alchemiscale development : current sprint runs 8/9 - 8/21

    • architecture overview : https://drive.google.com/file/d/1ZA-zuqrhKSlYBEiAIqxwNaHXvgJdlOkT/view?usp=share_link

    • coordination board : alchemiscale : Phase 2 - User Feedback and Documentation

    • alchemiscale 0.2.0 milestone:

    • updates on In Review, In Progress, and Available cards

  • new discussion items from ASAP roadmap: https://asapdiscovery.notion.site/Computational-Chemistry-Core-alchemiscale-related-roadmap-9052f215437246f9be4a11abd10f6d71

Discussion topics

Notes

  • alchemiscale.org user group

    • user questions / issues / feature requests

      • DD – MO, did you have CUDA issues?

        • MO – those are getting resolved as I resubmit

        • MO – Also some issue with BACE systems, working with IA to resolve.

    • compute resources status

      • Lots of workers on lilac and PRP, 165 workers currently. Lots of capacity for more runs.

    • call for new users

    • current stack versions:

      • alchemiscale: 0.1.4

      • gufe: 0.9.1

      • openfe: 0.11.0

      • perses: protocol-neqcyc

    • RG – OpenFE planned release for this week - Not sure that it would change anything alchemiscale-adjacent.

      • IA – One of our settings options might go away, since it was doing nothing. It was previously not wired to anything.

      • DD – I think this would break everything previously submitted, if there’s a removed pydantic field.

      • MH + RG – We could make a migration script?

      • DD – We may need to do that for this case. Can we come up with a more rigorous release/break schedule?

      • MH – I don’t think that’s feasible - This is still new infra and it’s going to change rapidly.

      • DD – Ok, this is a good motivator for figuring out migration scripts.

      • MH – Could we have workers with different versions of GUFE?

      • DD – That would get complicated really fast… So we should probably use a migration script this time. Just a reminder for users that this isn’t an archival system and they should download data they like.

      • JW – I think we’ll reach a transition at some point where the data gets stable, but we may wipe the DB on that day. Up until that day I think we’re fine to keep slapping migration scripts on the data.

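A minimal sketch of the kind of migration script discussed above, assuming stored settings can be treated as nested JSON-like dicts. The key name `unused_option`, the helper `drop_removed_field`, and the record layout are all hypothetical; none of this reflects the actual gufe/openfe settings models or the alchemiscale database layout.

```python
import json


def drop_removed_field(record, removed_key):
    """Return a copy of a nested JSON-like record without any `removed_key` entries."""
    if isinstance(record, dict):
        return {
            key: drop_removed_field(value, removed_key)
            for key, value in record.items()
            if key != removed_key
        }
    if isinstance(record, list):
        return [drop_removed_field(item, removed_key) for item in record]
    # Scalars (str, int, float, bool, None) are returned unchanged.
    return record


# Hypothetical stored settings that still carry a field removed in a newer release.
old_settings = {
    "protocol": "example",
    "settings": {
        "timestep_fs": 4.0,
        "unused_option": True,
        "stages": [{"n_steps": 100, "unused_option": False}],
    },
}

migrated = drop_removed_field(old_settings, "unused_option")
print(json.dumps(migrated, indent=2))
```

The function builds a new structure rather than editing in place, so the original record stays available for comparison or rollback while a migration is being validated.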

  • JH : interfacing with alchemiscale for accelerating drug discovery

    • showcase of work being done for ASAP Discovery

    • (see recording ~16/18 minutes)

    • JW – How is system prep experience?

      • JH – That’s not my job - This workflow begins after the protein is already prepared

    • JS (chat) – a side-note here is that this is a ‘bare-bones’ workflow. We’ll still need to hook it up to our automated workflows (to automatically pull in ligand designs for project support, also push results back to the server so medchemists can view the results).

      • JS – Yeah, we have an OE-based prepping workflow in place. Observationally we do have a lot of protein prep problems but that’s outside the scope of this meeting.

    • MH – Result pulldown seemed kinda slow. DD, batch support for pulling down results?

    • DD – Re: Schematization - Are these schemas dynamically defined? Models will change and keys will update.

        • JH – For some of them yes, but for others no. Wrappers around atom mapping engines are hard-coded (I read the docstrings and coded schemas around the API). But for the FE objects I’m reusing the schemas that are already used by the first-party packages.

      • MH – Yeah, working on this.

      • RG – Yeah, we’re looking at some automatic interfaces with signature inspection.

      • DS – There are a few options, OpenFE will sync up internally and discuss this week.

    • DD – What pain points did you find, JH?

      • JH – For reproducibility - I was kinda surprised that the atom mapper settings weren’t already schema-fied. Also some of the pydantic models have broken to_json methods.

      • DD – Yeah, I’d seen the to_json problems. DS, can you say more?

      • DS – I think there’s code out there that can serialize our objects - I’ll share that with you. We’re looking at overriding pydantic’s dict methods but we want to avoid the situation where there’s two json formats and our users have a terrible time.

      • JH – I’d had to kinda do this same thing. Would love to get caught up on best practices.

      • LN – Pydantic v2 does batch dumping and serialization. If you want to customize each model if they’re nested… I’m happy to chat with you about this!

      • DS – We kinda have something here, it’s just not the built in serializer…

      • JH – This was great, it was very easy to use.

    • MH – In your serialization pickle/bundle, do you point to an SDF filename, or do you have the molecule stored internally?

      • JH – It’s the string representation of the molecule from OpenFE.
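
To illustrate the signature-inspection idea raised in the schematization discussion, here is a minimal sketch that derives a settings schema from a callable's signature using only the standard library. The function `toy_mapper` and its parameters are hypothetical stand-ins, not the API of any real atom mapper, and `schema_from_signature` is not an existing OpenFE interface.

```python
import inspect


def schema_from_signature(func):
    """Build a simple {parameter: {type, required, default}} schema from a signature."""
    schema = {}
    for name, param in inspect.signature(func).parameters.items():
        annotation = param.annotation
        if annotation is inspect.Parameter.empty:
            type_name = None
        else:
            # Classes expose __name__; fall back to str() for other annotations.
            type_name = getattr(annotation, "__name__", str(annotation))
        entry = {
            "type": type_name,
            "required": param.default is inspect.Parameter.empty,
        }
        if not entry["required"]:
            entry["default"] = param.default
        schema[name] = entry
    return schema


# Hypothetical mapper entry point, used only to demonstrate the inspection.
def toy_mapper(n_atoms: int, timeout_s: int = 20, use_3d: bool = True):
    return None


schema = schema_from_signature(toy_mapper)
for name, entry in schema.items():
    print(name, entry)
```

Because the schema is computed from the callable itself, it follows the wrapped API when parameters are added or renamed, which is the maintenance problem with hand-written wrappers described above.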

  • IP (absent) : Protein-ligand benchmarks working group update

    • IA – We’re working on this - We’ve evaluated MSDs for everything. The results aren’t amazing - there appears to be a performance degradation. We’re also redoing things with star maps but that appears to be worse. Once we don’t see this degradation we’ll do a 0.3 release. HBaumann is working on different atom mappers/settings. OE atom mapper may work better. Hopefully will have results to show next week.

    • JS – I think OE mapper is LOMAP with maybe some ROCS scoring mixed in. Docs are unclear.

  • RG : beta deployment of alchemiscale? We’re looking at doing a stable release but we have an unstable dep stack (pinning to OMMFFs 0.22, OFFTK 0.14, other things with no API promises) - It’d be nice if there was a less-stable alchemiscale instance that we can use to validate unpinning some of these.

    • DD – With alchemiscale, what I currently do when there’s a new version of alchemiscale/openfe/gufe is clone the prod instance and S3 bucket, spin up a QA instance, and link the cloned info. Then I do a bunch of tests to retrieve old results. It sounds like you need something slightly different?

    • RG – Yeah, we want to make sure that our parameterization hands out… We want some way to make sure the data that comes out is valid.

    • DD – Hmm, we’re running on chodera lab resources for alchemiscale.org, but JC’s not in this meeting, so I’ll talk to him and ask about setting up a second instance.

    • RG – Yeah, I just want to be able to like submit one job to make sure things aren’t broken.

    • DD – MH, could you support this effort?

    • MH – Sure, this would be a good chance to learn how to spin up workers and set things up. I’ll work on getting JC’s blessing to do this.

  • JW : future of openmmforcefields?

    • RG – we are looking to move towards interchange

    • MH – would like to not have to keep maintaining openmmforcefields, speaking from Chodera side

    • RG – are amber protein ffs going to be possible to do with interchange?

    • MH –

    • JW – In summary, in thinking about our plans after the previous meeting, I realized that we may be on track to play “who’s gonna support OpenMMForceFields chicken” when Chodera lab wants to drop support, but other folks still need some way to get GAFF parameters (at least for old GAFF versions).

    • (General) – We’ll discuss when JC is in the meeting.

  • alchemiscale development : current sprint runs 8/9 - 8/21

    • architecture overview : https://drive.google.com/file/d/1ZA-zuqrhKSlYBEiAIqxwNaHXvgJdlOkT/view?usp=share_link

    • coordination board : alchemiscale : Phase 2 - User Feedback and Documentation

    • alchemiscale 0.2.0 milestone:

    • updates on In Review, In Progress, and Available cards

  • new discussion items from ASAP roadmap: https://asapdiscovery.notion.site/Computational-Chemistry-Core-alchemiscale-related-roadmap-9052f215437246f9be4a11abd10f6d71

 

Action items

Decisions