
...

DD: `protein-ligand-benchmark` issue divide-and-conquer for milestone 0.3.0
DD: any value in `Transformation` distinct from `Protocol`? Better to remove an abstraction layer?
RG: `ResultsNetwork` - should individual transformations encode their own `TransformationResult` with some required attributes/fields?
JC: Defining API points for free energy submission/execution
IA: (if time allows - low priority) - towards a defined means of storing historical free energy data
- Post discussions with D. Hahn, being able to store the existing OpenFF benchmark data somewhere in an easily accessible manner would be very useful for all involved.

Discussion topics

Item	Presenter	Notes

`protein-ligand-benchmark` issue divide-and-conquer for milestone 0.3.0
- DD – IA and others have been identifying items to clean up in the protein-ligand-benchmark repo.
- DD – #30 assigned to JC, will delegate to Melissa.
- DD – Three others are delegated to Lorenzo.
- RG – I’m co-assigning myself on #24.
- DD – #31 delegated to JC.
- DD – Units migration delegated to IA.
- DD – LiveComms citation and docs delegated to MH.
- JC – Migrate to GitHub Pages?
- MH – I’m in favor of RTD: it already exists for this repo, simpler permissions, etc.
- DD – Jnk1 charges?
- RG – I think this had funny
- JC – I think we should toss jnk1 since it doesn’t meet our quality standards (“not trustworthy” X-ray data).
- MH – I believe we are planning to delete this in #31.
- RG – May still be worth reaching out to DHahn, if he’s been using this. It may be that we need to update this with what he’s been using (for the record) and then deprecate it.
- JC – I think it fails multiple inclusion criteria: it meets neither the X-ray standards nor the dynamic range standards. So I’d be in favor of directly deprecating. If there are better jnk1 structures we could use those.
- RTD on PR builds
- MH – I can handle this, if you give me permissions.
- JW – I’ve reached out to DHahn to get us permissions (some others, recorded on GH issue tracker).
- JW will disable LGTM.
- DD – Adding citation.cff
- IP – I’ll do this.
- IP – Have you discussed separating the API from the data itself? This repo has BOTH data AND the API to access it. It’s kind of a pain that you have to download the whole dataset just to get the reading code.
- DD – We’d discussed that a bit earlier. Like, scikit-learn does a kind of lazy downloading with its example/test data.
- JW – Who are reviewers and mergers?
- DD – Anyone can review. Only one review is required.
- RG – DH opened #36 last week to recommend multiple reviewers.
- MH – It’d be cool to have policies and pull request templates. Like, the “add a new target” template could have a checklist with the steps required to get it merged.
- RG – Should that be a target for the 0.3 milestone?
- DD – Nah, let’s plan it for the 0.4 milestone.
any value in `Transformation` distinct from `Protocol`? Better to remove an abstraction layer?
- DD – I’m working on GUFE #13, trying to get it ready for review. I’d love to have a working session with someone from OpenFE to move this forward.
- RG – Yeah, I’d be happy to work with you this week.
- RG and DD will work to push this PR forward later this week.
- DD – Possibly related to the “submission/execution API” point below: I’ve been thinking about how to do this. Up until now, I’ve been thinking about transformations as having their own protocol. But that may not be necessary to do a… Is there a useful distinction between a transformation and a protocol?
- RG – The transformation object is the mapping object in this case, right?
- JC – If you’re computing the FE between two species… There are a few considerations. You can do replica exchange, lambda switching, etc. But then there’s ALSO the possibility of different atom mappings. So we should answer what we’re looking at here: just free energies of two species? What if we switch the protein? What if we switch the solvent? Different atom mappings will also lead to different efficiencies. Then there’s also the question of different force fields.
- DD – Along those lines, my question is whether a transformation should HAVE a protocol, or if a transformation should BE a protocol?
- JC – The bigger question is “do you want to separate things that affect the EFFICIENCY of the calculation from things that affect the ACCURACY/RESULT of the calculation?”
- JC – Is the transformation tied to an environment (i.e. is the free energy difference between two mols in JUST a protein, or a protein AND solvent)?
- DD – A transformation connects two chemical systems. A chemical system connects many components: could be a protein component, solvent component, and ligand component.
- JC – If you only want to relate transformations that happen in the same environment, then you… Do you have one transformation per FE, or two?
- DD – I think we’ll have two. We discussed this yesterday in the OpenFE meeting.
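
For illustration only, the "a transformation HAS a protocol" option from the discussion above might be sketched as follows. This is not the GUFE API: every class, field, and function name here (`ChemicalSystem`, `Protocol`, `Transformation`, `relative_binding_legs`) and every value is a hypothetical placeholder. It also shows the "two transformations per FE" idea, with one leg per environment.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ChemicalSystem:
    """Hypothetical: a named set of components (protein, solvent, ligand)."""
    name: str
    components: Dict[str, str] = field(default_factory=dict)


@dataclass
class Protocol:
    """Hypothetical: how the estimate is computed (method plus settings)."""
    method: str
    settings: Dict[str, float] = field(default_factory=dict)


@dataclass
class Transformation:
    """Hypothetical: connects two chemical systems and HAS a protocol."""
    initial: ChemicalSystem
    final: ChemicalSystem
    protocol: Protocol

    def environment(self) -> frozenset:
        """Component kinds present in both end states, excluding the ligand."""
        shared = set(self.initial.components) & set(self.final.components)
        return frozenset(shared - {"ligand"})


def relative_binding_legs(ligand_a: str, ligand_b: str,
                          protocol: Protocol) -> List[Transformation]:
    """Build two legs (complex and solvent) that share one protocol."""
    environments = (
        {"protein": "target", "solvent": "water"},  # complex leg
        {"solvent": "water"},                       # solvent leg
    )
    legs = []
    for env in environments:
        start = ChemicalSystem(ligand_a, {**env, "ligand": ligand_a})
        end = ChemicalSystem(ligand_b, {**env, "ligand": ligand_b})
        legs.append(Transformation(start, end, protocol))
    return legs


shared_protocol = Protocol("replica_exchange", {"n_lambda": 11})
legs = relative_binding_legs("lig_a", "lig_b", shared_protocol)
```

In this sketch the protocol is a separate object that both legs point to. If a transformation instead WERE a protocol, the two legs could not share one set of sampling settings without duplicating them.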
`ResultsNetwork` - should individual transformations encode their own `TransformationResult` with some required attributes/fields?
- RG – From a free energy protocol, we have a results object coming off. We were wondering how unified the API for the results object should be. Like, what’s the maximum set of things that we’d want from a results API? We’re pretty sure there will be a delta G, but what else? Will there be a container that would allow you to choose which DDGs you want to calculate?
- JC – We have some standard statistics we make from the pyMBAR package: energies, uncertainty, other estimate details. Needs to be in units.
- RG – Length is a good addition.
- JC – Can also be other properties you want, such as statistical fluctuation (how difficult the calculation was, how much additional simulation time is needed to reduce variance by 1 unit).
- DD – We were considering the idea that each protocol… We have the RelativeLigandTransform object.
- JC – Is this an abstract base class? If we think this is a real, workable API, we should make an abstract base class. Can we import it?
- DD – Yes, it will probably make its way into GUFE. Right now this has a bunch of settings objects that can be used to define general transformations. What is the minimal/maximal set of things we need to know?
- JC – Minimum is free energy difference and units, uncertainty, and statistical difficulty (a measure of how much additional simulation time is needed to reduce uncertainty).
- IA – What’s a transform here? Can people do enthalpy instead of free energy?
- JC – Could break this into deltaH and TdeltaS components. Standardizing on `kT` as the unit is a good idea. Also, the results object should record how much work (e.g. simulation time) was required to arrive at the result, in order to calculate statistical fluctuation.
- IP (chat) – We use this in openmmtools real-time analysis output, just in case: https://github.com/choderalab/openmmtools/blob/870d81ab5a751f666bbbc6a2b3d6a264c36f0e5f/openmmtools/multistate/multistatesampler.py#L1573-L1577
- JC – I’m particularly interested in hands-on testing and feedback.
- JC will try out the online notebook.
- MH (chat) – https://mybinder.org/v2/gh/OpenFreeEnergy/ExampleNotebooks/april-2022 This is the best link to use right now to play with the notebook.
- IA – Still need to make sure we have a useful results API.
- DD – If they all have dGs then that should satisfy minimal requirements.
- IA – But if you do an RBFE vs ABFE, those have different meanings.
- DD – Some ideas we discussed yesterday were convenience methods/functions that consume a network and know which data they can use and how to interpret them to give useful answers.
- JC – Tricky thing here is that they must have the same atom mapping and same alchemical transformation. Otherwise they can’t be combined/compared.
- DD – I see. So that info needs to be accessible through the API.
- JC – Yes, this is why I think we should separate “accuracy factors” (force field, atom mapping, alchemical transformation) from “efficiency factors” (which just affect runtime).
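
As a rough sketch of the minimal results API discussed above (free energy difference in `kT`, uncertainty, simulation time, and a statistical difficulty measure), an abstract base class could look something like the following. All names are hypothetical and not taken from GUFE or OpenFE. The `statistical_difficulty` formula (variance times simulation time) is an assumption chosen for illustration, since the notes do not define the measure, and the numbers are placeholders rather than real results.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class AccuracyFactors:
    """Hypothetical: settings that change the RESULT, not just the runtime."""
    force_field: str
    atom_mapping: str
    alchemical_transformation: str


class TransformationResult(ABC):
    """Hypothetical minimal interface every result object would satisfy."""
    accuracy: AccuracyFactors

    @abstractmethod
    def estimate(self) -> float:
        """Free energy difference, in units of kT."""

    @abstractmethod
    def uncertainty(self) -> float:
        """Standard error of the estimate, in units of kT."""

    @abstractmethod
    def simulation_time(self) -> float:
        """Work spent to arrive at the estimate, in ns."""

    def statistical_difficulty(self) -> float:
        # Assumption: variance falls off as 1/time, so variance * time is
        # roughly constant and can serve as a per-calculation difficulty.
        return self.uncertainty() ** 2 * self.simulation_time()

    def comparable_with(self, other: "TransformationResult") -> bool:
        # Results can only be combined if their accuracy factors match.
        return self.accuracy == other.accuracy


@dataclass
class ExampleResult(TransformationResult):
    """Hypothetical concrete result holding precomputed numbers."""
    accuracy: AccuracyFactors
    dg_kT: float
    ddg_kT: float
    time_ns: float

    def estimate(self) -> float:
        return self.dg_kT

    def uncertainty(self) -> float:
        return self.ddg_kT

    def simulation_time(self) -> float:
        return self.time_ns


factors = AccuracyFactors("example-ff", "mapping-1", "relative")
other_factors = AccuracyFactors("example-ff", "mapping-2", "relative")
result_a = ExampleResult(factors, dg_kT=-3.0, ddg_kT=0.2, time_ns=10.0)
result_b = ExampleResult(other_factors, dg_kT=-2.5, ddg_kT=0.3, time_ns=10.0)
```

Keeping the accuracy factors in their own object makes the "can these be combined?" check a simple equality test, while efficiency factors such as the sampling method stay out of the comparison.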
Defining API points for free energy submission/execution
- JC – If OpenFE already has APIs laid out here, we should see if we can build to that. (This is referring to the API presented by the FEMethod base class.)
- JC (sarcastically) – Do we like making people’s lives difficult? (We should call this FreeEnergyMethod.)
(if time allows - low priority) - towards a defined means of storing historical free energy data
- IA – Post discussions with D. Hahn, being able to store the existing OpenFF benchmark data somewhere in an easily accessible manner would be very useful for all involved.
- IA – Like, recording software versions for future interpretability.
- IA – Also, do we know where the existing calculated deltaG results live?
- DD – Maybe just on Janssen’s clusters right now.
- JC – We could use an exchange format for this data.
- DD – We could get the data from Janssen and store it on Amazon S3. But I’m not sure how big this would be. What’s the minimum we’d need? Just dGs? Or more?
- JC + RG – The object we discussed above would be great. So deltaG, uncertainty, force field, setup info, version of protein-ligand benchmark.
- IA – I doubt they’re storing raw trajectory data. So the above items would be great.
- DD will follow up with DHahn to see how much of the above we can get from Janssen, and store it on an S3 bucket.
- RG – That would be useful.
- IA (chat) – David Hahn did point me to this about an hour ago, but I don't think everything is there: https://github.com/dfhahn/protein-ligand-benchmark-analysis
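
One possible shape for the exchange format mentioned above is a flat JSON record per result, carrying the fields listed in the discussion (deltaG, uncertainty, force field, setup info, software versions, benchmark version). The sketch below is only an assumption about what such a record could contain: the class name, field names, and all values are hypothetical placeholders, not an agreed schema or real data.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict


@dataclass
class HistoricalResult:
    """Hypothetical exchange record for one stored free energy result."""
    target: str
    ligand_a: str
    ligand_b: str
    delta_g: float
    uncertainty: float
    unit: str
    force_field: str
    benchmark_version: str
    software_versions: Dict[str, str] = field(default_factory=dict)
    setup: Dict[str, str] = field(default_factory=dict)


def to_json(record: HistoricalResult) -> str:
    """Serialize with sorted keys so stored files diff cleanly."""
    return json.dumps(asdict(record), sort_keys=True, indent=2)


def from_json(text: str) -> HistoricalResult:
    """Rebuild a record; unknown or missing keys raise TypeError."""
    return HistoricalResult(**json.loads(text))


# Placeholder values only, not real benchmark results.
record = HistoricalResult(
    target="example_target",
    ligand_a="lig_a",
    ligand_b="lig_b",
    delta_g=-1.0,
    uncertainty=0.1,
    unit="kT",
    force_field="example-ff",
    benchmark_version="x.y.z",
    software_versions={"engine": "x.y.z"},
    setup={"solvent": "water"},
)
restored = from_json(to_json(record))
```

A plain-text format like this would be small enough to keep in an S3 bucket or a repository even if raw trajectories are not available.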
- JW – When will we be able to start determining whether we include/exclude user stories?
- DD – Once we make the next GUFE release, we’ll have tangible, defined objects that we can reason with to determine whether we can support various user stories.

Action items

David Dotson will reach out to David Hahn,
