| |
---|
General updates | |
introduce gufe; shared data model approach
| RG – DD and I got together over the past week and realized that we have a lot of objects in common, namely objects that represent chemistry, so we should be able to share maintenance of those objects. We were thinking of having a standalone repo for data representations.
RG – Looking at a “stable release” in a few weeks. The goal is for this to be a package that DD can work with during early design phases.
RG – Not sure about sharing edges yet, even if nodes are shared between openfe and fah-alchemy; whether edges have shared use will become clearer as edges geared toward fah-alchemy are created.
JW – Is it accurate to call this an interface?
RG – This is a data models package, first and foremost.
JC – I see this as a definition for the inputs to a calculation, and perhaps the outputs, but not the compute components that implement e.g. protocols.
JW – Can this be seen as a SUBSET of the final interface?
JC – Maybe; that will require further discussion. But this could certainly follow a plugin/base class implementation.
DD – I view this as basically the running design of components for the INPUTS. The design of the results objects isn’t being finalized yet.
|
merits of using openff.toolkit.topology.Topology for ChemicalState |
JC – There’s a need for a tagging system that identifies “this is a ligand” or “this protein needs to be mutated”. The OFF Molecule handles names and other metadata in conversions to/from RD and OE mols. Some hesitation about loading from PDB.
DD – The big issues seemed to be hashability and mutability. The OFF Toolkit’s chemical representations may be too heavy for OFE’s needs. But the information content of these two things is fundamentally different.
DD – JC – OFE component hashability = ?
RG – Order, element, formal charge, connectivity, NOT positions.
DS – Equality needs to be “correct”. Needs
JC – OFFMol could support multiple definitions of equality. …
JC – Definition of “Ligand”: could be a small mol, or a peptide, or…
JC – Atom order is important; it will be needed to understand how to interpret coordinates, e.g. crystal waters.
RG – One thing we do differently from the openff Topology is that we don’t have explicit solvent. A protein component could contain crystal waters, not just the amino acid chain.
JC – The biological unit is the more operative term here, which often needs to be generated from a PDB entry, e.g. a dimer.
JW – You’re trying to understand your requirements, so there are a couple of strategies: (1) an interface strategy, where requirements can change and you can swap out underlying implementations; (2) direct use of, say, an openff Molecule.
RG – We have basically subclassed openff Molecule but removed the things we don’t want/need it to do. This is valuable because it simplifies our surface and avoids behaviors we don’t want to support. We don’t have vsites, for example, which are a FF-support feature and out of context for this data structure.
DD – I’m for the composition / interface approach.
JW – I’m fine with the interface approach.
RG – I think we can put an openff Molecule in there; I wasn’t sure if the serialization would change, however, so that made me nervous.
JC – I think there have been more broken RDKit packages, but point well made. If there are things you think you need, can you pass these on to the openff developers? These will help improve the interoperability of the surrounding stack.
RG – We don’t think equality is wrong, just different; it didn’t feel like an arguable case for a change in the openff toolkit.
JW – I think this repo can be a useful point for determining what’s possible for movement in the openff toolkit.
JC – Happy to move forward with this approach if use-cases can be made to work with them.
DD – Proposal: Let’s proceed with this composition approach (as implemented in GUFE) for chemical components.
JC – I think that there will be issues on this route, but those issues will illustrate actual needs.
(All approvers + driver agree)
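Illustrative sketch (not GUFE’s actual code): one way the composition approach and RG’s identity rule could fit together. The class names `AtomRecord` and `SmallMoleculeSketch` are hypothetical, “order” is read here as atom order (an assumption), and the wrapped toolkit molecule is held as an opaque attribute instead of being subclassed.

```python
from dataclasses import dataclass
from typing import Any, Iterable, Optional, Sequence, Tuple


@dataclass(frozen=True)
class AtomRecord:
    """Hypothetical per-atom identity data: element and formal charge only."""
    element: str
    formal_charge: int = 0


class SmallMoleculeSketch:
    """Hypothetical component that wraps a molecule object (composition)."""

    def __init__(
        self,
        atoms: Iterable[AtomRecord],
        bonds: Iterable[Tuple[int, int]],
        positions: Optional[Sequence[Tuple[float, float, float]]] = None,
        wrapped: Any = None,
    ) -> None:
        # Atom order is preserved: it is part of the identity.
        self._atoms = tuple(atoms)
        # Connectivity is stored as sorted index pairs so that (0, 1) == (1, 0).
        self._bonds = tuple(sorted((min(i, j), max(i, j)) for i, j in bonds))
        # Positions and the wrapped object are carried along but never hashed.
        self.positions = positions
        self.wrapped = wrapped

    def _key(self) -> tuple:
        # Identity = atom order, element, formal charge, connectivity. NOT positions.
        return (self._atoms, self._bonds)

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, SmallMoleculeSketch):
            return NotImplemented
        return self._key() == other._key()

    def __hash__(self) -> int:
        return hash(self._key())


water = [AtomRecord("O"), AtomRecord("H"), AtomRecord("H")]
a = SmallMoleculeSketch(water, [(0, 1), (0, 2)],
                        positions=[(0, 0, 0), (1, 0, 0), (0, 1, 0)])
b = SmallMoleculeSketch(water, [(1, 0), (2, 0)],
                        positions=[(5, 5, 5), (6, 5, 5), (5, 6, 5)])
# Same chemistry, different atom order: treated as a different component here.
c = SmallMoleculeSketch([AtomRecord("H"), AtomRecord("O"), AtomRecord("H")],
                        [(1, 0), (1, 2)])
```

Under these assumptions `a` and `b` compare equal and hash identically despite different coordinates, while `c` does not, which matches JC’s point that atom order matters for interpreting coordinates.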
|
consider alternative names for gufe | JC – Interested to know about JW’s decision to refactor openff packages into a common namespace. I’d recommended doing the same thing in GUFE issues/6.
JW – This makes it easier for users to find packages. The namespace approach is cosmetic, but works as implicit documentation: it indicates the bounds of our ecosystem, what’s intended to work well together vs. what isn’t necessarily. If you’re not married to any names now, I recommend doing it; however, it’s not clear to me how many users would be turned away by not having it.
RG – Understand the POV on namespace packages; I do feel, however, that namespace packages are counterintuitive, in particular that if you type import openff nothing comes in, by design.
JC – Have you considered metapackages?
DS – We’ve had this discussion before in OpenFE and considered these ideas. We’re happy to hear if there are new reasons to change our decision, but I did propose the metapackage idea before.
RG – There is a concept of branding here; I do see the value. I don’t want to go for an approach with openfe- because it appears exclusionary to new contributors that are not affiliated with the org. I’d prefer it to look like an undifferentiated zoo so that it is actually more inviting.
JW – Agree with your approach of a more disparate constellation of software intentionally NOT branded together; appreciate that you’ve given it thought.
JC – Is gufe globally unique?
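Illustrative sketch of RG’s point about namespace packages: importing the bare namespace brings in nothing until a subpackage is imported explicitly. The package name `zoo_demo_ns` and its `models` subpackage are made up for this demo and are built in a temporary directory; nothing here touches the real openff packages.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as root:
    # Hypothetical layout: zoo_demo_ns/ has NO __init__.py, which makes it an
    # implicit namespace package; only the subpackage has an __init__.py.
    sub = Path(root) / "zoo_demo_ns" / "models"
    sub.mkdir(parents=True)
    (sub / "__init__.py").write_text("ANSWER = 'loaded'\n")

    sys.path.insert(0, root)
    try:
        importlib.invalidate_caches()

        # Importing the namespace alone succeeds but loads no subpackages.
        ns = importlib.import_module("zoo_demo_ns")
        namespace_file = getattr(ns, "__file__", None)
        has_models_before = hasattr(ns, "models")

        # The subpackage only appears after it is imported explicitly.
        models = importlib.import_module("zoo_demo_ns.models")
        has_models_after = hasattr(ns, "models")
        answer = models.ANSWER
    finally:
        # Clean up so the demo does not leak into the running interpreter.
        sys.path.remove(root)
        sys.modules.pop("zoo_demo_ns.models", None)
        sys.modules.pop("zoo_demo_ns", None)

print(namespace_file, has_models_before, has_models_after, answer)
```

A regular package, by contrast, could re-export selected names from its own `__init__.py`, so the bare import would bring them in.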
|
Questions for JC - FAH project onboarding process | JC – Best way to do this is to have a deep dive with whoever is doi. They’ll need input files in a particular format (there are example scripts for OpenMM; for GROMACS you’ll need to get in touch with the Voelz group).
Have to get onboarded via the Bowman group.
Would spin up a WS, likely in the Chodera Lab environment.
Would need input files for benchmarking and testing; can get in touch with the Voelz group for GROMACS.
Operative concepts are PROJECT, RUN, CLONE, GEN.
DD – Is it possible to remove all human touchpoints from this?
JC – Yes. You’d place files in a particular path and generate an XML file describing the contents of the project. Then there’s a REST API to talk to in controlling the project. For manual projects there’s a testing/rollout phase at small scale to ensure that it doesn’t crash lots of volunteer computers. But once the automation has had some trust/experience in production we can probably skip those.
DD – So creation of a project doesn’t have human touchpoints?
DD – For a star graph with RBFE, how many projects is that?
JC – Two projects: one for the ligand+solvent system, one for the protein-ligand complexes. Though there is a way around that to get it down to one.
DD – So, it’s system size that draws the boundaries between projects/limits project size?
JC – Yes. Basically I imagine there being “buckets” for groups of atoms…
DD – Good, that will be the usual pattern.
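Illustrative sketch of the “two projects for a star graph” pattern JC describes. Only the two-project split comes from the discussion; treating each transformation (edge) as one RUN inside each project is an assumption made for this sketch, and `ProjectSketch` and `star_graph_projects` are hypothetical names. The XML description and REST API are deliberately not modeled.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Sequence, Tuple

Edge = Tuple[str, str]  # (central ligand, other ligand)


@dataclass
class ProjectSketch:
    """Hypothetical stand-in for one FAH PROJECT and the RUNs it would hold."""
    name: str
    runs: List[Edge] = field(default_factory=list)


def star_graph_projects(center: str, ligands: Sequence[str]) -> Dict[str, ProjectSketch]:
    """Group the edges of a star-shaped RBFE network into two projects.

    Assumption: each edge becomes one RUN in the solvent project and one RUN
    in the complex project, since the two legs differ in system size.
    """
    edges = [(center, lig) for lig in ligands if lig != center]
    return {
        "solvent": ProjectSketch("ligand+solvent", list(edges)),
        "complex": ProjectSketch("protein-ligand complex", list(edges)),
    }


projects = star_graph_projects("lig0", ["lig1", "lig2", "lig3"])
for key, project in projects.items():
    print(key, project.name, len(project.runs))
```

However many ligands hang off the central one, the number of projects stays at two in this sketch; only the number of RUNs per project grows.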
|
Plans for advancing to having a backlog? Decision on including/excluding user stories? | |