| |
---|
protein-ligand-benchmark issue divide-and-conquer for milestone 0.3.0
| DD – IA and others have been identifying items to clean up in the protein-ligand-benchmark repo. DD – #30 assigned to JC, will delegate to Melissa. DD – Three others are delegated to Lorenzo. DD – #31 delegated to JC. DD – Units migration delegated to IA. DD – LiveComms citation and docs delegated to MH. JC – Migrate to GitHub Pages? MH – I'm in favor of RTD: it already exists for this repo, has simpler permissions, etc.
DD – Jnk1 charges? RG – I think this had funny JC – I think we should toss jnk1 since it doesn't meet our quality standards - "Not trustworthy" xray data. MH – I believe we are planning to delete this in #31. RG – May still be worth reaching out to DHahn, if he's been using this. It may be that we need to update this with what he's been using (for the record) and then deprecate it. JC – I think it fails multiple inclusion criteria - it doesn't meet the xray standards OR the dynamic range standards. So I'd be in favor of directly deprecating. If there are better jnk1 structures we could use those.
RTD on PR builds. MH – I can handle this, if you give me permissions. JW – I've reached out to DHahn to get us permissions.
(some others, recorded on GH issue tracker) JW will disable LGTM. DD – Adding citation.cff. IP – Have you discussed separating the API from the data itself? This repo has BOTH data AND the API to access it. It's kind of a pain that you have to download the whole dataset just to get the reading code. JW – Who are reviewers and mergers? DD – Anyone can review. Only one review is required. RG – DH opened #36 last week to recommend multiple reviewers. MH – It'd be cool to have policies and pull request templates - like the "add a new target" template could have a checklist with the steps required to get it merged. RG – Should that be a target for the 0.3 milestone? DD – Nah, let's plan it for the 0.4 milestone.
|
Any value in Transformation distinct from Protocol? Better to remove an abstraction layer? | DD – I'm working on GUFE #13 - trying to get it ready for review. I'd love to have a working session with someone from OpenFE to move this forward. RG – Yeah, I'd be happy to work with you this week. RG and DD will work to push this PR forward later this week.
DD – Possibly related to the "submission/execution API" point below - I've been thinking about how to do this. Up until now, I've been thinking about transformations as having their own protocol. But that may not be necessary to do a… Is there a useful distinction between a transformation and a protocol? RG – The transformation object is the mapping object in this case, right? JC – If you're computing the FE between two species… There are a few considerations. You can do replica exchange, lambda switching, etc. But then there's ALSO the possibility of different atom mappings. So we should answer what we're looking at here - just free energies of two species? What if we switch the protein? What if we switch the solvent? Different atom mappings will also lead to different efficiencies. Then there's also the question of different force fields. DD – Along those lines, my question is whether a transformation should HAVE a protocol, or if a transformation should BE a protocol? JC – The bigger question is "do you want to separate things that affect the EFFICIENCY of the calculation from things that affect the ACCURACY/RESULT of the calculation?" JC – Is the transformation tied to an environment (is the free energy difference between two mols in JUST a protein, or a protein AND solvent)? DD – A transformation connects two chemical systems. A chemical system connects many components - could be a protein component, solvent component, and ligand component. JC – If you only want to relate transformations that happen in the same environment, then you… Do you have one transformation per FE, or two? DD – I think we'll have two. We discussed this yesterday in the OpenFE meeting.
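Illustrative sketch of the data model DD describes (a transformation connects two chemical systems, each made of components), using the "transformation HAS a protocol" option. Every class, field and example value below is hypothetical and chosen for illustration only; none of it is the GUFE or OpenFE API. The two legs at the end are one possible reading of "two transformations per FE".

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class Component:
    """One piece of a chemical system (hypothetical, not the GUFE API)."""
    kind: str  # e.g. "protein", "solvent", "ligand"
    name: str


@dataclass(frozen=True)
class ChemicalSystem:
    """A collection of components (hypothetical)."""
    components: Tuple[Component, ...]

    def environment(self) -> FrozenSet[Component]:
        # Everything except the ligand, i.e. the "environment" JC asks about.
        return frozenset(c for c in self.components if c.kind != "ligand")


@dataclass(frozen=True)
class Protocol:
    """How the calculation is run (hypothetical fields)."""
    method: str            # e.g. "replica exchange"
    n_lambda_windows: int


@dataclass(frozen=True)
class Transformation:
    """Connects two chemical systems and HAS a protocol."""
    start: ChemicalSystem
    end: ChemicalSystem
    protocol: Protocol

    def same_environment(self) -> bool:
        # True when only the ligand changes between the two end states.
        return self.start.environment() == self.end.environment()


# Made-up example components; names are placeholders.
protein = Component("protein", "example-protein")
solvent = Component("solvent", "water")
lig_a = Component("ligand", "ligand-A")
lig_b = Component("ligand", "ligand-B")
protocol = Protocol(method="replica exchange", n_lambda_windows=12)

# Two transformations for one relative free energy: one per environment.
complex_leg = Transformation(
    ChemicalSystem((protein, solvent, lig_a)),
    ChemicalSystem((protein, solvent, lig_b)),
    protocol,
)
solvent_leg = Transformation(
    ChemicalSystem((solvent, lig_a)),
    ChemicalSystem((solvent, lig_b)),
    protocol,
)
```

With this layout the same `Protocol` instance can be reused across transformations, which is one argument for keeping the two abstractions separate; folding the protocol into the transformation would remove a layer at the cost of that reuse.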
|
ResultsNetwork - should individual transformations encode their own TransformationResult with some required attributes/fields?
| RG – From a free energy protocol, we have a results object coming off. We were wondering how unified the API for the results object should be. Like, what's the maximum set of things that we'd want from a results API? We're pretty sure there will be a delta G, but what else? Will there be a container that would allow you to calculate whichever DDGs you want? JC – We have some standard statistics we make from the pyMBAR package - energies, uncertainty, other estimate details. Needs to be in units. RG – Length is a good addition. JC – Can also be other properties you want, such as statistical fluctuation (how difficult the calculation was, how much additional simulation time is needed to reduce variance by 1 unit).
DD – We were considering the idea that each protocol… We have the RelativeLigandTransform object. JC – Is this an abstract base class? If we think this is a real, workable API, we should make an abstract base class. Can we import it? DD – Yes, it will probably make its way into GUFE. Right now this has a bunch of settings objects that can be used to define general transformations. What is the minimal/maximal set of things we need to know? JC – Minimum is: free energy difference and units, uncertainty, and statistical difficulty (a measure of how much additional simulation time is needed to reduce uncertainty). IA – What's a transform here? Can people do enthalpy instead of free energy? JC – Could break this into deltaH and TdeltaS components. IP (chat) – We use this in openmmtools real time analysis output: openmmtools/openmmtools/multistate/multistatesampler.py at 870d81ab5a751f666bbbc6a2b3d6a264c36f0e5f · choderalab/openmmtools, just in case. JC – I'm particularly interested in hands-on testing and feedback. IA – Still need to make sure we have a useful results API. DD – If they all have dGs then that should satisfy minimal requirements. IA – But if you do an RBFE vs ABFE, those have different meanings. DD – Some ideas we discussed yesterday were convenience methods/functions that consume a network and know which data they can use and how to interpret them to give useful answers. JC – Tricky thing here is that they must have the same atom mapping and same alchemical transformation. Otherwise they can't be combined/compared. DD – I see. So that info needs to be accessible through the API. JC – Yes, this is why I think we should separate "accuracy factors" (force field, atom mapping, alchemical transformation) from "efficiency factors" (which just affect runtime).
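Illustrative sketch of a results object carrying JC's minimum fields, with "accuracy factors" kept apart from "efficiency factors" so that a convenience function can refuse to combine incompatible results. All names are hypothetical and the numbers are made up. The inverse-variance weighted mean is a standard estimator picked here only to give `combine` something to do; the meeting did not specify how results would be combined.

```python
import math
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class AccuracyFactors:
    """Settings that change the answer (hypothetical container)."""
    force_field: str
    atom_mapping: Tuple[Tuple[int, int], ...]  # (index in A, index in B)
    alchemical_transformation: str


@dataclass(frozen=True)
class TransformationResult:
    """Minimum fields from the discussion (hypothetical class)."""
    delta_g: float
    uncertainty: float
    units: str                      # e.g. "kcal/mol"
    statistical_difficulty: float   # placeholder for JC's measure
    accuracy: AccuracyFactors
    n_iterations: int               # an efficiency factor; ignored below


def comparable(a: TransformationResult, b: TransformationResult) -> bool:
    # Only units and accuracy factors matter; runtime settings do not.
    return a.units == b.units and a.accuracy == b.accuracy


def combine(results: Sequence[TransformationResult]) -> Tuple[float, float]:
    """Inverse-variance weighted mean and its uncertainty."""
    if not results:
        raise ValueError("no results to combine")
    for other in results[1:]:
        if not comparable(results[0], other):
            raise ValueError("results differ in accuracy factors or units")
    weights = [1.0 / r.uncertainty ** 2 for r in results]
    total = sum(weights)
    mean = sum(w * r.delta_g for w, r in zip(weights, results)) / total
    return mean, math.sqrt(1.0 / total)


# Made-up values: two runs with identical accuracy factors, different lengths.
factors = AccuracyFactors("example-ff", ((0, 0), (1, 1)), "example-path")
run_1 = TransformationResult(-1.0, 0.2, "kcal/mol", 1.0, factors, 1000)
run_2 = TransformationResult(-1.4, 0.2, "kcal/mol", 1.0, factors, 5000)
mean, unc = combine([run_1, run_2])

# A run with a different atom mapping must not be mixed in.
other = AccuracyFactors("example-ff", ((0, 1), (1, 0)), "example-path")
run_3 = TransformationResult(-1.1, 0.2, "kcal/mol", 1.0, other, 1000)
```

Calling `combine([run_1, run_3])` raises `ValueError`, which is the behaviour JC's point about matching atom mappings and alchemical transformations suggests.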
|
Defining API points for free energy submission/execution | JC – If OpenFE already has APIs laid out here, we should see if we can build to that. (This is referring to the API presented by the FEMethod base class.) JC (sarcastically) – Do we like making people's lives difficult? (We should call this FreeEnergyMethod.)
|
(if time allows - low priority) - towards a defined means of storing historical free energy data | IA – Following discussions with D. Hahn, being able to store the existing OpenFF benchmark data somewhere in an easily accessible manner would be very useful for all involved. IA – Like, recording software versions for future interpretability. IA – Also, do we know where the existing calculated deltaG results live? DD – Maybe just on Janssen's clusters right now. JC – We could use an exchange format for this data. DD – We could get the data from Janssen and store it on Amazon S3. But I'm not sure how big this would be. What's the minimum we'd need? Just dGs? Or more? JC + RG – The object we discussed above would be great. So deltaG, uncertainty, force field, setup info, version of protein-ligand benchmark. IA – I doubt they're storing raw trajectory data. So the above items would be great.
DD will follow up with DHahn to see how much of the above we can get from Janssen, and store it in an S3 bucket. IA (chat) – David Hahn did point me to this about an hour ago, but I don't think everything is there: GitHub - dfhahn/protein-ligand-benchmark-analysis
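Illustrative sketch of what a minimal exchange record for historical results could look like, covering the items listed above (deltaG, uncertainty, force field, setup info, protein-ligand-benchmark version, software versions). JSON is an assumption made for this sketch, as are all field names; the example values are placeholders, not real benchmark data.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict


@dataclass
class BenchmarkRecord:
    """One stored free energy result (hypothetical schema)."""
    target: str
    ligand_a: str
    ligand_b: str
    delta_g: float
    uncertainty: float
    units: str
    force_field: str
    plb_version: str  # version of protein-ligand-benchmark used
    software_versions: Dict[str, str] = field(default_factory=dict)
    setup_info: Dict[str, str] = field(default_factory=dict)


def to_json(record: BenchmarkRecord) -> str:
    # sort_keys keeps the output stable, which helps when diffing files.
    return json.dumps(asdict(record), sort_keys=True, indent=2)


def from_json(text: str) -> BenchmarkRecord:
    return BenchmarkRecord(**json.loads(text))


# Placeholder values only.
record = BenchmarkRecord(
    target="example-target",
    ligand_a="ligand-A",
    ligand_b="ligand-B",
    delta_g=-1.2,
    uncertainty=0.15,
    units="kcal/mol",
    force_field="example-ff-1.0",
    plb_version="0.0.0-example",
    software_versions={"example-engine": "0.0.0"},
    setup_info={"note": "placeholder"},
)
restored = from_json(to_json(record))
```

Records this small could be written one per file or one per line and uploaded to whatever storage is chosen; no trajectory data is required.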
|
| |