2022-02-22 Protein-ligand benchmarks meeting notes

Participants

  • @David Dotson

  • @Jeffrey Wagner

  • David Swenson

  • Irfan Alibay

  • @Jeffry Setiadi

  • @John Chodera

  • @Richard Gowers

Goals

  • Review individual proposals, creating action items for DD to integrate back into the architecture doc

    • DD will engage on individual stories as needed for more detailed information

  • Give DD enough information to stand up a project board for an MVP

    • initial set of feature issues linking back to relevant stories

Discussion topics


User story #1

  • RG – As we’re developing OpenFE functionality, it’d be helpful to be able to submit a single edge/small packet of simulations.

    • concern as to how much is specifiable via straight XML, vs. whether executable code must be serialized with it

  • JW – OpenMM largely deals in XML, so this depends on what can be serialized into XML (see the sketch after this list)

  • JC – Could add support for computing a specific edge given a specific protocol

    • achieving some consistency between how this is encoded for GROMACS and OpenMM

  • RG – Kinda chicken and egg – I want to use this as a platform to experiment with different ways of defining edges, so I don’t know what the edges will look like ahead of time

  • JC – one constraint is we can’t support arbitrary Python execution on volunteer hosts; execution of the simulation itself must be supported by the openmm-core on Folding@Home; we need to articulate what class of calculations we need to be able to execute

  • RG – XML inputs avoid some of the danger of malicious inputs that .py files would pose

  • JC – We could scope out some basic styles of input, like computing dU/dlambda.

    • expanded ensemble, with some adaptivity,

    • (audio dropped)

    • Would want to compute things in a series of edges, like a batch submission

    • RG – Why would things need to be batched in submission?

    • JC – Some technical/scale details, but the turnaround will be multi-day, so a short single simulation isn’t a good fit on its own

    • RG – want edge-level granularity for submission, to avoid burdening the system (or the human managing it) with having to understand issues

    • JW – Single “vanilla” simulation submission could support debugging - We have issues with our QC compute where everything is so large that we can’t actually reach in to debug/understand rare failures.

  • IA – We’re probably asking about similar things, e.g. controlling the variables that specify the calculation

    • want to be able to specify which lambda states and how many replicas; be able to vary those but keep the method the same?

  • JC – assume you’ll want to evaluate this for more than one edge; be able to submit them in batch to actually have an advantage from F@H

  • JC – I’d be happy to continue discussing this offline, there are valuable use cases here.

  • DD – Agree, an “hours”-level turnaround option for small extraordinary jobs would be useful in many contexts.

  • JC – It’d be helpful to understand the anticipated workflow that OpenFE would want, like the “daily pacing” of this sort of interaction.

  • JW – we’re learning about fun bottlenecks with QCArchive

    • with F@H, is there some notion of what scales of submission work well and what do not?

  • JC – We can’t be constantly doing small submissions to F@H, this is technically/organizationally infeasible. So a lot of this project will be understanding how we can bundle

  • JW – is bundling to reduce the frequency of access operations, or the number of entities being tracked, or something else?

    • JC – I can flesh this out better by understanding and discussing use cases
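
A minimal sketch of the “straight XML” discussion above, assuming only that the OpenMM Python package is installed: OpenMM’s XmlSerializer can already express a System and an Integrator as XML text, which is the kind of declarative, code-free payload that could travel to volunteer hosts without serializing any executable Python. This is illustrative only, not a protocol format this project has defined.

```python
# Sketch: serialize an OpenMM System and Integrator to XML so that the full
# simulation definition can travel as data rather than executable code.
import openmm
from openmm import unit

# Build a toy System: two particles joined by a harmonic bond.
system = openmm.System()
system.addParticle(1.0 * unit.amu)
system.addParticle(1.0 * unit.amu)
bond = openmm.HarmonicBondForce()
bond.addBond(0, 1, 0.1 * unit.nanometer,
             1000.0 * unit.kilojoule_per_mole / unit.nanometer ** 2)
system.addForce(bond)

integrator = openmm.LangevinMiddleIntegrator(
    300.0 * unit.kelvin, 1.0 / unit.picosecond, 2.0 * unit.femtosecond
)

# Both objects round-trip through plain XML; no Python runs on the remote host.
system_xml = openmm.XmlSerializer.serialize(system)
integrator_xml = openmm.XmlSerializer.serialize(integrator)

restored = openmm.XmlSerializer.deserialize(system_xml)
print(restored.getNumParticles())  # 2
```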

User story #2

  • IA – if you’re using OpenFE components, want to have useful outputs from failure

    • need to be able to understand how certain transformations perform

    • JC – being able to access relevant metadata (may not be entirely clear what is most useful, in what form, upfront; will evolve over time)

  • RG – being able to search for morphs by SMIRKS would be incredibly useful for this, otherwise searching could become a pain

    • Looking at the open reaction database to see if there are concepts/infrastructure we can reuse

  • JC – would like to be able to query against all calculations that have been performed, such as carbonyl → alcohol?

    • RG – exactly

    • JC – could be another component that indexes the database, doesn’t necessarily need to be part of the core implementation

    • RG – Agree, this could work as a secondary index

    • JC – want this approach to allow others to consume the API for other applications, e.g. build a secondary index that allows queries by SMIRKS (see the sketch after this list)

  • JW – ran into the issue with QCArchive that only OpenFF molecules had CMILES, allowing this functionality; also the REST API didn’t expose the functionality needed to make this fast

  • JC – I want to make a user story about mutations in a protein/host, which could also benefit from a SMIRKS field. But we’ll want to think about whether a chemical transformation in a protein should appear in a search for a chemical transformation in a ligand.

  • DD – We have two ways in which the data can be exposed: 1) the API, and 2) S3, the object store. So folks can have services that ping the bucket directly and index it. I think this is a good idea based on my experience with QCA, where I can only hit database tables and not the underlying data.

  • JC – Agree, we will really benefit from allowing direct access to underlying data, as long as we have and enforce a clear schema for each entry.
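
A rough sketch of the secondary-index idea discussed above, using RDKit SMARTS substructure matching on ligand end states as a simple stand-in for full SMIRKS transformation queries, and boto3 to walk result objects in the bucket. The bucket name, key prefix, and the "ligand_smiles" field are hypothetical placeholders; nothing here is a schema the project has committed to.

```python
# Sketch of a secondary index built outside the core service: walk result
# objects in the S3 bucket and keep the ones whose ligand matches a SMARTS
# query (e.g. carbonyl-containing ligands). Bucket, prefix, and the
# "ligand_smiles" field are hypothetical placeholders.
import json

import boto3
from rdkit import Chem

s3 = boto3.client("s3")
BUCKET = "fah-pl-benchmarks"   # hypothetical bucket name
PREFIX = "results/"            # hypothetical key layout

query = Chem.MolFromSmarts("[CX3]=[OX1]")  # carbonyl pattern

def matching_results():
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=BUCKET, Prefix=PREFIX):
        for obj in page.get("Contents", []):
            body = s3.get_object(Bucket=BUCKET, Key=obj["Key"])["Body"].read()
            record = json.loads(body)  # assumes one JSON record per object
            mol = Chem.MolFromSmiles(record.get("ligand_smiles", ""))
            if mol is not None and mol.HasSubstructMatch(query):
                yield obj["Key"]

for key in matching_results():
    print(key)
```

Because a component like this only consumes the exposed data layout, it can live entirely outside the core implementation, as suggested above.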


User story #3

  • JC – MS has implemented several FE methods in gromacs.

    • several classes of calculations he would like to perform

      • running lambda states that may not overlap

      • expanded ensemble of multiple states in a single simulation

      • adaptive scheme to execute updates to weights (see the sketch at the end of this story)

  • DD – A lot of what’s discussed in this story would already be implemented in the GROMACS and OpenMM cores, right?

    • JC – The functionality is simple enough that porting it from the GROMACS core into the OpenMM core could be straightforward. This would be valuable for OpenFE as well - allowing similar runs using either core will be quite valuable

 

  • DD – So this is already implemented in the GROMACS core? And we just need to be able to pass inputs through?

    • JC – Correct.


  • JC – The Gromacs core is CPU-only, but will have GPU support soon. OpenMM already does both.

  • DD – So, on the project board for the MVP, I may have some items that will get pushed to the OpenMM/GROMACS core repos’ to-do lists.

  • JC – Right, I don’t think we can give people access to those, due to conditions outside my control

  • IA – Both from OpenFE’s and a wider research perspective, being able to set up protein-ligand benchmarks and run them with every method under the sun would be really useful. For example, SAMPL6’s host-guest studies were particularly useful
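
For reference on the “adaptive scheme to execute updates to weights” mentioned in this story: one common generic approach is a Wang-Landau-style update of the expanded-ensemble weights. The sketch below shows only that textbook rule; it is not the GROMACS or OpenMM core’s actual implementation, and the flatness threshold and halving schedule are illustrative choices.

```python
# Sketch of a Wang-Landau-style weight update for expanded-ensemble sampling.
# Textbook rule only; thresholds are illustrative, not anything the cores prescribe.
import numpy as np

n_states = 11                      # number of lambda states
weights = np.zeros(n_states)       # log-space biasing weights g_k
histogram = np.zeros(n_states)     # visits per state in the current stage
delta = 1.0                        # incrementor, reduced as the histogram flattens

def record_visit(state):
    """Record a visit to `state` and adapt its weight."""
    global delta
    histogram[state] += 1
    weights[state] += delta
    # When every state has been visited and the histogram is roughly flat,
    # halve the incrementor and start a new stage.
    if histogram.min() > 0 and histogram.min() / histogram.mean() > 0.8:
        delta *= 0.5
        histogram[:] = 0
```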

User story #4

  • JS – Looking at doing ~100s to ~1000s of calcs. Could do either APR or alchemical calculations

    • JC – if we include setup of different lambda values with restraint states, then it essentially does attach-pull-release (see the sketch at the end of this story)

    • JC – AMBER would likely be out of scope. Would OpenMM-only be ok?

      • JS – OpenMM-only would be fine.

  • JC – could be a plethora of methodologies that could be run on the same cores; this story could drive development of those in OpenMM

  • JS – would like to be able to read metadata in Taproom

    • JC – It’d be good to understand how the interface with the PropertyCalculator would work.

  • JW – touches on an organizational question; if PropertyCalculator and F@H meet, what does the interaction look like? Can F@H consume PropertyCalculator input schema? Does PE have the ability to submit compute to F@H? Or do they never meet? I need to talk to the protein FF team to understand whether they expect to use F@H for protein observable benchmarking.


    • JC – no reason the second option couldn’t work, where F@H is used as an engine

    • DD – We want to be able to support either, no reason they’re not both possible.

  • JW – Some question about whether we’d want the F@H interface in the FF optimization loop, or if we’d never have it in there

    • JC – one way to do a closed loop is to run free energy calculations

    • DD – So, it could happen in a variety of ways.

  • DD – I’ll follow up on the issue tracker to gather more details of what we’d expect to have happen on the server.
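
A hypothetical illustration of JC’s point in this story that a lambda protocol with restraint states effectively covers attach-pull-release: restraints are switched on first, then electrostatics and sterics are decoupled while the restraints stay on. The field names and values below are placeholders, not a schema this project has defined.

```python
# Hypothetical lambda protocol: attach restraints, then decouple
# electrostatics, then sterics. Values are placeholders for illustration.
lambda_protocol = {
    "lambda_restraints":     [0.0, 0.25, 0.5, 0.75, 1.0, 1.0, 1.0, 1.0, 1.0],
    "lambda_electrostatics": [1.0, 1.0,  1.0, 1.0,  1.0, 0.5, 0.0, 0.0, 0.0],
    "lambda_sterics":        [1.0, 1.0,  1.0, 1.0,  1.0, 1.0, 1.0, 0.5, 0.0],
}

# Every schedule must define the same number of alchemical states.
lengths = {len(values) for values in lambda_protocol.values()}
assert len(lengths) == 1
```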

User story #5

  • JC – This is my first user story, I’m planning to add many more


  • JC – this uses F@H as a way to prioritize compounds based on binding affinity for drug discovery campaigns; doesn’t necessarily need complete accuracy, needs to function as a priority list/ranking

    • JC – I wrote about some possible access patterns and their different values.

  • JC – need a component that gives which co-crystal structure to use

  • JC – I will come through and try to enumerate components that could be combined to support my workflows, and be reusable for other user stories as well.

  • DD – I’m planning to take a first shot at defining discrete items/components, linking to the relevant user stories

  • JW – mentioned a component that replaces missing atoms; is this part of the system, or externalized?

    • JC – many use-cases will deal with pre-curated systems where this is standardized; some may be dealing with more raw crystal structures

    • there may be components from the OpenFE work that help deal with these; identifying what these components are and which stories require them will help determine whether they fall inside or outside the box

    • conclusion may be that this kind of component isn’t a priority for OpenFE near-term based on partner need

  • JC – There’s a possibility of large influxes of money/developers to further support this. So we’d like to be ready to implement a large number of new components.

    • JW – So, if we can choose how much to pay upfront for different degrees of extensibility, this argues that we should pay a lot now to make this “extremely modular”, to avoid friction down the road.

  • DN – Yes, this is helpful context for considering future uncertainty

User story #6

  • JW – OpenFF’s use case for PL benchmarks

    • the need to prioritize is consistency; if we have workflows that involve docking or other judgment calls, there will be drift in the underlying implementation; we need to be able to run consistent benchmarks without the ground shifting underneath

    • want it to be really easy to get a clear result on whether we’re doing better on a set of systems

    • want to be able to pin dependencies as much as possible

  • JC – which is more important: being able to run with old things or being able to run everything with a newer version of software packages

    • JW – with different underlying steps, it’s possible that an inferior force field would appear to be better than a prior one

  • JW – This may touch on whether we want to be able to provide a Docker image or conda YAML to exactly reproduce previous conditions (see the sketch at the end of this story)

  • DD – There’s a finite allowance for the variety of cores that we can support, once we start having too many permutations we’ll get in trouble.

    • JW – So, we could handle the complexity in three places:

      • On volunteer hosts, which will be most difficult

      • On the submission server that DDotson is making

      • If we allow “no questions asked” basic simulation submission, then the user can have arbitrary complexity on their laptops, and still take advantage of F@H compute

 

  • JC – If we allow arbitrary access to data, then we could submit huge amounts of compute (like all of BindingDB) and enable community research of failure cases/software shortcomings.

    • DD – Agree

  • DD – I’ll work on some discrete developments/issues/components for our next meeting, and will ask for further clarification on user stories as needed. Be sure to keep your eyes open on the issue tracker so I can help you in a timely way!
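
A minimal sketch of the dependency-pinning point in this story: record the exact versions of result-affecting packages alongside each submission so that a later run can detect drift, whether or not a full Docker image or conda YAML is also archived. The package list is illustrative, not a pinning policy.

```python
# Sketch: snapshot the versions of result-affecting packages next to a
# benchmark submission so later runs can detect dependency drift.
# The package list is illustrative, not a pinning policy.
import json
from importlib.metadata import PackageNotFoundError, version

PINNED_PACKAGES = ["openmm", "openff-toolkit", "rdkit", "numpy"]

def environment_snapshot():
    snapshot = {}
    for name in PINNED_PACKAGES:
        try:
            snapshot[name] = version(name)
        except PackageNotFoundError:
            snapshot[name] = None  # record absence explicitly
    return snapshot

with open("environment_snapshot.json", "w") as fh:
    json.dump(environment_snapshot(), fh, indent=2)
```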

Action items

@David Dotson will set up project board for MVP
@David Dotson will populate backlog of MVP project board with feature issues linking back to user stories, engage stories for clarification where needed

Decisions