Docs/PTM roadmapping
Framing discussion - We want the brains in this room to help us plan through things. This includes providing opinions/viewpoints that we aren’t adequately considering, identifying fundamental tradeoffs of approaches (e.g., BespokeFit 2 and reproducibility), and where possible coming to a consensus that one thing is better than another.
Feedback desired - What’s a good way to turn these into more closed-ended questions that are amenable to discussion and roadmapping?
PTM handling
Which parts of the ff14SB PTM workflow do we deprecate, and how, once Rosemary comes out?
Suggestion: 6 month support window for old workflow after Rosemary release
Suggestion: We make no guarantee of support for any of the components of the early workflow once Rosemary comes out, but if there are API points that become widely used those may be upstreamed
How do we communicate the level of support of this workflow? How do we triage technical (“it doesn’t recognize my atom names”) vs. scientific (“these parameters don’t seem accurate”) support?
Suggestion: We say “the canonical AA parameters are 100% ff14SB, and the others are Sage with NAGL charges. They are philosophically compatible but use your judgement and report any issues to us”.
DM – It’d be cool to have a big board of “here’s a bunch of cool ideas where we hope that someone will do them”. It can also help to justify grant applications by showing that there’s an unmet demand. So collecting use cases/inputs that we can’t handle (but maybe someone else can) could be useful.
JE – You might start labeling things out of scope on the issue tracker to help highlight them as open to community work. And grouping
MT – Who is the audience? How are we going to make sure the message reaches them? This is something we historically haven’t done too well on, so we will want to be deliberate about messaging people.
DM – Yes, e.g., RAlford at Janssen and other folks we know are doing PTM work. There are roughly two categories of users:
People who pay us and who we know want this, plus Vincent Voelz
Everyone else who will stumble across it if we advertise it (many of whom will naively try to put in silly things)
JW – A staged approach of reaching out to larger groups gradually would be good and would keep feedback on target
Add to Zenhub
AF – Can I share this broadly now?
DM – Folks who know the context now would be good.
DM – Re: functionality - If I want to use this, I need to generate a SMARTS, hand-tag it, and then match atom names to the contents of the PDB file. What if someone has, e.g., an OpenEye license and wants to use their PDB loader?
JW – We’ve had some contact with OpenEye about this, but they haven’t followed up
DM – What if I don’t have OE and don’t want to type a bunch of atom names?
JM –
JW – Can add
LW – Will molecule.from_pdb_and_smiles still be supported?
JM – Yes.
TB – So if you have CONECT records you can use the old way. With the new way, you need unique atom names, and that’s enough for Pablo?
JM – Exactly. EITHER CONECT records OR unique atom names.
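For reference, a minimal sketch of the two loading paths discussed above, assuming the current OpenFF Toolkit API (the Topology.from_pdb keyword name is an assumption, and the Pablo prototype’s final API may differ):

```python
from openff.toolkit import Molecule, Topology

# Old way: CONECT records in the PDB plus a SMILES to resolve chemistry.
ligand = Molecule.from_pdb_and_smiles("ligand.pdb", "CC(=O)Oc1ccccc1C(=O)O")

# New way: unique atom names are enough; non-canonical components are
# supplied as Molecule objects (keyword name is an assumption here).
topology = Topology.from_pdb("protein.pdb", unique_molecules=[ligand])
```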
TB – The way I’ve approached this is somewhat overlapping with the prototype shown today. You make SMILES monomers with wildcard atoms at connection points. This allows for things other than single bonds at connecting bonds. Then you can enumerate reactions to build up polymers.
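A rough sketch of TB’s monomer idea, assuming RDKit; the monomer SMILES and the reaction SMARTS here are invented for illustration:

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Monomers with wildcard (*) atoms marking their connection points.
mon_a = Chem.MolFromSmiles("*CC(C)*")
mon_b = Chem.MolFromSmiles("*OCC*")

# Reaction: consume one wildcard cap from each fragment and join the
# capped atoms with a single bond. A different bond symbol in the
# product template would give non-single connecting bonds.
rxn = AllChem.ReactionFromSmarts("[*:1][#0].[#0][*:2]>>[*:1]-[*:2]")

dimer = rxn.RunReactants((mon_a, mon_b))[0][0]
Chem.SanitizeMol(dimer)
print(Chem.MolToSmiles(dimer))  # remaining * atoms allow further growth
```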
Should some sort of “react_molecule” method be permanently added to our public API to help stick small mols onto canonical proteins? (See the hypothetical sketch below.) Does this represent a shift towards supporting more system-preppy functions (e.g., Topology.solvate() etc.)? Alternatively, these API points could remain informal and live in jupyter notebooks/utils folders in one-off repos.
How do we want to assess accuracy of PTM parameters? What guarantees do we make and how do we communicate them?
Suggestion: Currently our philosophy is “see what people complain about/report”. Is it worth revisiting this/is there some more granular way to approach this?
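For concreteness, a purely hypothetical sketch of what a “react_molecule” helper might look like; the name, signature, and RDKit-backed implementation are all assumptions, not an existing API:

```python
from openff.toolkit import Molecule
from rdkit import Chem
from rdkit.Chem import AllChem

def react_molecule(target: Molecule, fragment: Molecule,
                   reaction_smarts: str) -> Molecule:
    """Hypothetical helper: run reaction_smarts on (target, fragment).

    A real API would need to handle multiple/ambiguous products and
    preserve metadata (residue names, atom names) across the reaction.
    """
    rxn = AllChem.ReactionFromSmarts(reaction_smarts)
    products = rxn.RunReactants((target.to_rdkit(), fragment.to_rdkit()))
    product = products[0][0]  # naively take the first product
    Chem.SanitizeMol(product)
    return Molecule.from_rdkit(product, allow_undefined_stereo=True)
```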
Docs
Are docs working?
MT – The docs repo is broken and has been broken for as long as JM has been pulled away on other things. I keep hearing that JM is being committed to high-value things that aren’t docs.
JM – In what way are docs broken?
MT – I get emails every night from actions that are broken.
Add to iteration planning
JW – Also, when we commit to doing too many things, things break
How confident are maintainers and users in the automated API docs?
Is everything you expect to be documented included?
Do you notice any errors?
JC – In theory I care, but I quickly resort to looking at source code directly.
MT – Most (~70%) of new/casual users go straight to examples. The most expert ~20% look at source code. The middle ~10% go to API docs.
LW – I do what JC does (read source). But I find examples helpful to point to major API regions.
AF – Agree with LW.
JH – I usually go to the source code, use docstrings. I’ve never gone to API docs.
BM – I don’t generally use API docs. I first do text docs, then source code, then ask Lily.
JH – For other packages I DO go to API docs, but since I’m so close to OpenFF devs I go to source code.
AF – Most of our rotation students and undergrads get directed to docs pages.
AF – In general they work, but there have been a few times I’ve found discrepancies and that’s when I go to the source code.
JH – Seconded. Happy to take another pass and provide feedback
JM – Would love feedback. If we get feedback that docs need improvement/fixing, the best place for those issues would be the project docs themselves or the openff-docs repo.
MT – Two suggestions
We have lots of “this is unstable” warnings plastered all over our packages. Not sure what purpose they serve.
JM – Completely agree. With examples we’ve decided to have production vs. experimental examples, with code
JW – I’m in favor of getting rid of the “this is experimental” cruft in the toolkit
Add to Zenhub
JM – And what if we had an “experimental” decorator that extends the Interchange-style env variable gates?
Add caveats like the packmol thing
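A minimal sketch of such an env-variable-gated decorator, assuming a generic variable name (Interchange’s actual gate may differ):

```python
import functools
import os

class ExperimentalFeatureError(RuntimeError):
    """Raised when an experimental API is called without opting in."""

def experimental(func):
    """Gate a function behind an opt-in environment variable."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if os.environ.get("OPENFF_EXPERIMENTAL", "0") != "1":
            raise ExperimentalFeatureError(
                f"{func.__qualname__} is experimental; set "
                "OPENFF_EXPERIMENTAL=1 to opt in."
            )
        return func(*args, **kwargs)
    return wrapper
```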
We should go through examples and remove usages of private methods, and/or be more conscious of those.
Add to Zenhub
Are maintainers confident they can control which items appear in the API docs?
LW + JW – Unsure which levers to pull to get things to render in our packages
MT – I think I’m pretty able to do things
JM – Interchange and NAGL are pretty straightforward (just need to add to __all__), toolkit is its own beast. Could contribute to sphinx to make this better.
JC – I’ve generally had a smooth time of this with autosummary template
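For reference, the __all__ lever JM mentions, sketched assuming a typical sphinx autosummary/automodule setup (exact behavior depends on configuration):

```python
# mypackage/analysis.py (hypothetical module)
__all__ = ["PublicHelper"]  # names listed here get rendered in API docs

class PublicHelper:
    """Included in the generated API reference."""

class _PrivateHelper:
    """Omitted: not in __all__ and underscore-prefixed."""
```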
Do users feel like they can find the documentation they need?
Are there any recurring frustrations with the API docs?
Are there any other recurring frustrations with documentation?
How do we want to prioritize improvements to the docs?
Additional type annotations
TB – I like type annotations, especially when you have complex return types. Think the docs integrations are really nice.
LW – I’m in favor of type annotations. I like what JM has done.
AF – Agree
JC – Agree
WW – Agree
PB – Agree
JH + BM – Neutral
JW – I find they add a lot of friction to my development, but likely because of cruft that’s my fault
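A small illustration of the point TB raises above: an annotation on a complex return type documents structure that would otherwise need lengthy prose (the function and types here are invented):

```python
def charges_by_residue(
    molecule: "Molecule",  # hypothetical type; string forward reference
) -> dict[str, list[tuple[int, float]]]:
    """Map residue name -> [(atom index, partial charge), ...].

    The return annotation documents the nesting structure that would
    otherwise need a long description in the docstring.
    """
    ...
```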
Docstring updates/corrections
(General) – This doesn’t seem to need JM’s time
New/updated software for automated API docs that might solve issues mentioned above
MT – I’m overall happy with our docs infrastructure. It’d be nice for them to build faster. I’d like the automation to be better, since docs tests can be a good way of knowing that something broke; as long as the automation is robust and doesn’t give false positives, that has a lot of value to me.
Virtual workshops
These are likely cancelled due to time constraints for the 2024-2025 roadmap year. Is this the right prioritization?
Which topics would be good to do workshops on for coming year?
Multi-vignette workshops vs. big single-topic workshops?