2022-08-09 Protein-ligand benchmarks meeting notes

Participants

Goals

DD : fah-alchemy - current board status
- fah-alchemy : Phase 1 - MVP
- 3 weeks out from 9/1 deadline for biopolymer benchmarking
- gufe#39 merged; turning attention to gufe#36 to unblock perses#1066
  - also working to resolve gufe#42 and gufe#45, which will complete coverage of our GufeSerializable approach across existing gufe objects; thanks to David W.H. Swenson for detailed work on these
- starting up MVP of Executor, Scheduler, and ResultServer for FAH this week
MH : ProtocolSettings taxonomy update
RG : virtual sites and suitability of the PLB dataset
IA : protein-ligand-benchmark - 0.3.0 update
- best route forward for resolution of #52? Drop targets with wrong protonation relative to assay?
- short term alternatives if timescale for resolution longer than 3 weeks?

Discussion topics

Item	Notes
DD: `fah-alchemy` - current board status
- fah-alchemy : Phase 1 - MVP
- 3 weeks out from 9/1 deadline for biopolymer benchmarking
- gufe#39 merged; turning attention to gufe#36 to unblock perses#1066
  - also working to resolve gufe#42 and gufe#45, which will complete coverage of our `GufeSerializable` approach across existing `gufe` objects; thanks to David W.H. Swenson for detailed work on these
  - #36 is about the DAGs created by protocols. Has a few different constraints, for example it doesn’t need to be super stable.
- starting up MVP of Executor, Scheduler, and ResultServer for FAH this week
- JC – To help JS catch up: we want to reuse as many free-energy-on-F@H components as possible between OpenFE/OpenFF/Chodera lab. So this is targeted at doing simulations using openmm on folding@home, with switching about every 1ns.
- JW – Anything I can do to clear your schedule, or anything else I can do to equip you with what you need?
- DD – I’ll probably stop attending QC meetings. Though I’ll still make the error cycling.
- JC – We’d be OK to run the OpenFF benchmarks on our local cluster using perses in the near term if the F@H interface isn’t ready in time.
- IP – I don’t think we’d have the throughput on lilac to do this in a week.
- IA – OpenFE needs to get a benchmark done for the board anyway. Planning on ff14SB + openff-2.0.0.
- JW – Great. That’s basically one of our needs as well. So let us know if we can help unblock this.
- IA – Gotcha. Right now we’re largely compute-limited, but I’ll let you know if there are places that OpenFF can help. It’s likely that we’ll just run a partial benchmark due to compute constraints.
- JW – Great. Let’s stay in touch in future meetings.
- IP – Will these results be published somewhere?
- IA – We weren’t planning on a scholarly publication, but we will post it online and make the data available.
- JC – which version of the FF do you need benchmarked?
  - JW – need a series run with Sage, then a series run with Rosemary
    - may not take the offer to run on Lilac; Chapin is just starting to run the torsion fits, so there may be a few iterations before we have a Rosemary FF to benchmark
    - do you have FFs in mind, IA, for your benchmarks? (see above for answer)
- IP – IA, do you plan to disseminate these results somewhere?
  - IA – this is intended for consumption by the board, but also for ACS
  - will definitely put these in a place that is accessible by others
MH: `ProtocolSettings` taxonomy update
- MH – Still working on it. MT recently got the models/typing machinery split out to another package. I have some feedback for him and will initiate a round of back-and-forth. So this is on track.
- DD – Do you get the sense that this will be ready to go in 3 weeks?
- MH – You’ll have something usable from this effort in 3 weeks. There’s no danger of this turning into a blocker.
- JC – On the “level 1 protocol settings”, which are used to completely specify the FF, information that isn’t always present: there was a recent paper from Junwei and MacKerell criticizing use of GAFF2 with AM1BCC. They argue that there’s missing information and that this causes deficiencies in how FFs are used. So maybe we could use your work as a starting point for a LiveCoMS article on best practices for accuracy/reproducibility in simulations. There will of course be further refinements, but it would be great to capture this progress for the community. Would anyone be interested in participating in this best practices paper if I lead the outlining? It would be particularly useful if folks working on the “level 1 settings” could contribute: IAlibay, MHenry, RGowers, LNaden, MThompson, LWang.
  - LNaden – I’m up for this.
  - MHenry – This seems like a really good idea. It will be great to have a paper for people to cite.
- DD – Knowing that there will be resistance from some groups, do you think we’ll get enough buy-in from other areas of the community to get traction?
- JC – I think this will get enough buy-in from the rest of the community.
- DD – Agree that this will be a good strategic move.
- JC – Yes, I’ll lead the outlining process and outreach, and will hand off the full manuscript writing to a different driver.
RG: virtual sites and suitability of the PLB dataset
- RG – Talking to our board, their interest in virtual sites for the FF is directly related to how well benchmarks perform.
  - Not a whole lot of halogens in the benchmark set; will this require virtual site considerations?
- JW – The first vsites we would add would be for halogens and aromatic nitrogens, selected because we can generate QC data for them; we also have a decent amount of experimental mixture data, so we can fit both to experiment and QM.
- RG – If you’re doing free energy benchmarking, it may not show how well you’re doing for halogens.
- JC – OPLS4 proudly advertises vsites as essential for accuracy, e.g. for sulphur.
  - The new Drude FF also has vsites.
  - So you could look at the above papers for FE benchmarking targets that would have vsites.
- JS – In my time at Cresset, I had some experience with this.
- JC – Re: simulation speed, gromacs people replaced Hs with vsites to preserve runtime.
- IA – there are also
- JW – One thing we’ve had trouble finding is physical property datasets for things with sulphurs, so we don’t have experimental data to train and test on divalent sulphurs.
- JC – Not enough info in thermoml to support this training?
- JW – Yes, once we filter down to room temperature parameters there isn’t much left to work with.
- RG – It’ll be a little harder for us to “sell” the idea of supporting vsites to our governing board if we don’t have benchmarks that will show their utility.
- JW – I can basically say with good confidence that we won’t make a FF release with vsites for at least a year. We can’t accurately track effort that far out, and so if you also can’t, then there’s no need to fret about this.
IA : `protein-ligand-benchmark` - 0.3.0 update
- best route forward for resolution of #52? Drop targets with wrong protonation relative to assay?
- short term alternatives if timescale for resolution longer than 3 weeks?
- MB – There is some difference between the current state of the targets and how they’re generally done in papers. In papers, they generally prepare stuff at pH 7. But if you go into the papers, they have different assay conditions. BACE is a tough one, since most of its assays are run at pH 4-5. The others are run around pH 7-8, so I prepped everything at 7.4.
  - https://docs.google.com/spreadsheets/d/1TJX1c0zC_UCwVHXO8vE3f_jA9B0H39X0HjnjM9R9daM/edit?usp=sharing
- JC – CBayly said that the ligands might have a weird protonation state, since it doesn’t take a ton of energy to transition them.
- MB – When I do docking studies, I enumerate ionization+protonation states. I use Schrodinger EPIK, which carries a score penalty for unusual states in docking/downstream uses. But I don’t know what this project plans to do to correct this. If we include multiple states then we’ll double or triple the size of our datasets.
- JC – I think we should carry around multiple states of each ligand. If we knew which state was the bound state then we could use only that, but what we’ll probably need to do is run all of them and combine the scores into a macrostate.
- MB – I’ve been putting together a command-line version of this prep to encode the settings that are being used.
- MB – One hitch I’ve encountered is that Schrodinger can’t remove excipients, so I’m doing that kinda manually.
- IA – We could provide tools to do this.
- MB – I can make tools for this too, but one thing is that, for example, xtal waters are kept if they make enough hbonds, and there can be a similar thing with excipients, where we keep them if they’re close to the binding site or making important contacts.
- MB – We think the best approach is to:
  - continue with the current set at pH 7.4
  - drop targets with egregiously different assay conditions (e.g. PDE10 is the rat gene, not the human gene; others were run at totally different pH)
  - redo preparation with assay conditions
- DD – That sounds like a good approach. Any objections?
  - Decided to proceed with MB proposal above; she is aiming to include a CLI tool that produces the prepared target and docked ligands.
- JW – Keeping multiple protonation states?
- MB – We’ll only keep one protonation state in this set. EPIK can do clever things if you keep everything in the Schrodinger ecosystem, but it may not be applicable to our work.
- JC – But our work here COULD be expanded to
- JW – Keeping excipients?
- MB – We’ll remove all excipients at this time.
- IA – Also, there’s an open issue on the PLBenchmarks repo to solicit feedback about how to store assay conditions.
- MB – Especially interested in feedback on reducing conditions and other factors. I made a comment about that in the linked issue.
- JC – ReDesign Science had prepped a set for protein-ligand benchmarks. I linked that in the above issue.
- IA – One item on my to-do list is to check on the IC50-energy calculations that they had found to be incorrect, and ensure that things are correct.
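JC’s suggestion to run every protonation state of a ligand and combine the results into a macrostate can be sketched as below. This is only an illustration and was not agreed in the meeting. It assumes each state has its own computed binding free energy plus a relative solution-phase free energy (a "state penalty", e.g. from a pKa tool), both in kcal/mol at 298.15 K. The function name and its inputs are hypothetical and belong to no existing package. The relation used is the standard population-weighted sum of microstate binding constants.

```python
import math

# Assumption: kcal/mol units and T = 298.15 K (not specified in the notes).
KT_KCAL_PER_MOL = 0.0019872041 * 298.15


def _logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a non-empty sequence."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def macrostate_binding_free_energy(dg_bind, dg_state, kT=KT_KCAL_PER_MOL):
    """Combine per-protonation-state binding free energies (hypothetical helper).

    dg_bind  -- binding free energy of each microstate
    dg_state -- free energy of each microstate in solution, relative to any
                common reference (only differences matter)
    Both sequences must have the same length and use the same units as kT.
    """
    if len(dg_bind) != len(dg_state) or not dg_bind:
        raise ValueError("need one state penalty per binding free energy")
    # Bound ensemble: each state weighted by its solution population and affinity.
    log_bound = _logsumexp([-(b + s) / kT for b, s in zip(dg_bind, dg_state)])
    # Unbound ensemble: solution populations only.
    log_free = _logsumexp([-s / kT for s in dg_state])
    return -kT * (log_bound - log_free)


# Usage: a dominant neutral state and a charged state 1 kcal/mol higher in
# solution that binds more tightly (all numbers are made up).
dg_macro = macrostate_binding_free_energy([-8.0, -10.0], [0.0, 1.0])
print(f"macrostate binding free energy: {dg_macro:.2f} kcal/mol")
```

With a single state the function returns that state’s binding free energy unchanged, so a one-protonation-state dataset (as decided above) is a special case of the same bookkeeping.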

Action items

Decisions
