Technical overview
| David Dotson
| Overall architecture – High-level details
DD / JC – Advantages of colocation would be that the Chodera lab could cover hosting expenses for OpenFF (both technical and personnel).
JC – I’d like to share as much as possible with OpenFE: molecule transformations, network mapping, etc.
JC – Can make data available on AWS for redownload, for people to download at their own discretion.
DD – Could save data on AWS S3. Different tiers of speeds/schedules; could be accessed by a public URI.
DD – Interested in getting output spec information from OpenFE.
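As a sketch of what the public-URI idea above could look like: a result object stored on S3 is addressable by a predictable HTTPS URL. The bucket and key names here are assumptions for illustration only; a real deployment would also use boto3 (or similar) for the upload itself.

```python
# Hypothetical sketch: building the public HTTPS URI for a result object
# stored on S3. Bucket and key layout are illustrative assumptions, not
# the project's actual storage scheme.

def public_s3_uri(bucket: str, key: str) -> str:
    """Virtual-hosted-style public URL for an object in an S3 bucket."""
    return f"https://{bucket}.s3.amazonaws.com/{key}"


uri = public_s3_uri("openff-results", "benchmarks/set-1/edge-42.json.gz")
# → "https://openff-results.s3.amazonaws.com/benchmarks/set-1/edge-42.json.gz"
```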
JC – What’s missing here is info about how inputs are specified.
JW – How do we avoid versioning hell on the stuff inside the green box?
JC – First thing is to standardize what goes in and out of the green box. Could freeze an env.
JW – How do you do this when there aren’t any automation details for free energy calculations in the green box?
JC – What you get is a person doing all the work of generating inputs, which manifests as a slow and expensive feedback loop. We want to be able to make iterations fast and cheap; this is scalability. Being able to get information about failures is key for FF development, drug discovery, and advancing the free energy calculation infrastructure.
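A minimal sketch of what a standardized green-box input record might look like, with engine versions pinned explicitly to avoid the versioning hell raised above. All field names here are illustrative assumptions, not an agreed spec.

```python
# Hypothetical sketch of a standardized "green box" input: everything the
# box consumes is declared up front, including pinned engine versions, so
# runs are reproducible. Field names are illustrative assumptions.
from dataclasses import dataclass


@dataclass(frozen=True)
class GreenBoxInput:
    network_file: str     # serialized transformation network
    engine: str           # e.g. "openmm" or "gromacs"
    engine_version: str   # pinned core version
    protocol: str         # free energy protocol identifier


spec = GreenBoxInput("network.json", "openmm", "7.7", "relative-binding")
```

Freezing the whole environment (e.g. via a pinned conda environment file) would complement a record like this, but the record alone already makes the input/output contract explicit.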
JC – would like us to come up with inputs that are a starting point, then we can iterate on them to build out downstream components
Current plans/functionality
JC – Ideally we are in a position where in 3–6 months we have a functional green box that supports one engine.
JW – Would like to see free energy benchmarking results; want things to be reproducible and isolatable (network planner, system generator, etc.). Need to be able to keep everything constant except one component, plus the ability to run locally with only the suspected problem inputs.
JW – Do we get to specify which version of Gromacs / OpenMM we want to run on a client machine?
JC – To some extent, yes; within a few recent releases you can specify core versions.
JC – Public benchmarking data will be the only way for OpenFE to identify failures; we won’t be able to operate on proprietary data.
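The “within a few recent releases” constraint above could be enforced with a simple compatibility check like this sketch; the release list and window size are made-up illustrations, not actual supported versions.

```python
# Hypothetical sketch: accept a requested core version only if it falls
# within the few most recent supported releases. Version strings and the
# window size are illustrative assumptions.

RECENT_OPENMM = ["8.0.0", "7.7.0", "7.6.0", "7.5.1"]  # newest first, illustrative


def is_supported(requested: str, recent: list[str], window: int = 3) -> bool:
    """True if the requested version is within the supported window."""
    return requested in recent[:window]


is_supported("7.7.0", RECENT_OPENMM)   # → True
is_supported("7.5.1", RECENT_OPENMM)   # → False (outside the window of 3)
```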
JW – How do we want visibility into errors? How do we delete or invalidate previous errors? Are we planning on having the ability to delete?
JC – We’ve definitely needed to trash datasets/results before.
JC – It’s become helpful to be able to pause or delete individual runs/edges.
JC – An adaptive supervisory process is possible later on that can add or delete edges in a network graph and propagate those changes into the work server.
JW – Generally agree. This could become complex, so I’d be flexible on this.
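The pause/delete capability discussed above might be modeled minimally as follows; the class name, edge identifiers, and statuses are illustrative assumptions, not fah-alchemy’s actual API.

```python
# Hypothetical sketch of pausing and deleting individual edges in an
# alchemical network, as a supervisory process might. Class name, edge
# ids, and statuses are illustrative assumptions.

class AlchemicalNetwork:
    def __init__(self):
        self.edges = {}                    # edge id -> status

    def add_edge(self, edge_id: str):
        self.edges[edge_id] = "running"

    def pause(self, edge_id: str):
        self.edges[edge_id] = "paused"     # stop scheduling new work units

    def delete(self, edge_id: str):
        self.edges.pop(edge_id, None)      # drop edge; invalidate its results


net = AlchemicalNetwork()
net.add_edge("ligand-A->ligand-B")
net.add_edge("ligand-A->ligand-C")
net.pause("ligand-A->ligand-B")
net.delete("ligand-A->ligand-C")
# net.edges == {"ligand-A->ligand-B": "paused"}
```

An adaptive supervisor would call operations like these, then propagate the resulting graph changes to the work server.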
Collect use cases/user stories
DD – Let’s populate the issue tracker of the following repo: issue tracker
JS – How black-box/agnostic would this be?
JW – I think JS may have multiple individual requests – things like “make the inputs agnostic enough to take host and guest systems” and “allow people to use pAPRika as a workflow component”.
JS – Could this run ForceBalance optimizations internally?
JW – I don’t think so, but this could be a backend for optimizations if we engineer it right.
DD – Yes, if we know more details about the sorts of calcs that ForceBalance will submit, then we can look at whether it’s easy to make that compatible with our input format.
JC – I’d talked to RGowers about shared object models/APIs; we’ll keep in touch about that.
Identify functionality gaps
Use cases need to eventually be written, in the fah-alchemy GH repo issue tracker.
DD will make an issue template: aspirational API, edge cases (must support / should support / shouldn’t support).
|