Goals
- Get this to a point where OpenFF can pick it back up after TG has left
- Document it so ideas come across, even if implementation isn’t ideal
Ways forward
- Identify what we need out of BeSMARTS, then write an API wrapper, tests, and examples for the subset of BeSMARTS that meets our needs. (This would require TG’s agreement to take a higher-level API, or have us fork it.)
- Document the theory more deeply (i.e., write down the math) and worry less about the current implementation/examples
OpenFF asks (highest to lowest priority) - JM would lead these efforts but we’d want TG’s input
- Writing down the math (we suspect this is in the dissertation; we could use some guidance on what TG sees as the most valuable insights for operating on SMARTS strings as bit vectors)
- Updating an example or two to be more understandable (broken into Jupyter notebooks, with prose)
  - TG – Not a fan of Jupyter (pain to modify in vi), but examples and tests are PR’able
  - TG – Recommend going over the examples material and thinking about it from a user/developer perspective.
- Docs for basic low-level concepts (separate from how they’d be used in real life)
  - JM – My major motivation for documenting the low-level stuff is to test my understanding/take a reductionist approach, not because I think low-level docs are the most valuable thing as a product.
  - TG – Might be better and more time-efficient to work this top-down. Lots of unnecessary code in there.
- Adding some tests (possibly already there)
- Some new API points for our use cases (e.g. rethink the API around graph hierarchies, re-implement binary operations on SMARTS strings to be more flexible, maybe some fitting-specific stuff, etc.)
- Thread safety (very short PR)
- CamelCase class names
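A minimal sketch of the "SMARTS strings as bit vectors" idea from the asks above, assuming a toy encoding in which each bit marks one allowed element of a single atom pattern. The names `ELEMENTS`, `AtomPattern`, `from_elements`, and `to_smarts` are hypothetical and are not BeSMARTS API; the actual BeSMARTS encoding is presumably richer (more primitives, bonds, whole graphs). It only shows why binary operations on patterns reduce to bitwise operations.

```python
from dataclasses import dataclass

# Hypothetical toy encoding: bit i set => element ELEMENTS[i] is allowed.
ELEMENTS = ("H", "C", "N", "O")
ATOMIC_NUMBERS = {"H": 1, "C": 6, "N": 7, "O": 8}


@dataclass(frozen=True)
class AtomPattern:
    """A single-atom pattern stored as a bitmask over ELEMENTS."""

    element_bits: int

    @classmethod
    def from_elements(cls, *symbols):
        bits = 0
        for symbol in symbols:
            bits |= 1 << ELEMENTS.index(symbol)
        return cls(bits)

    def __or__(self, other):
        # Union: matches any atom that either pattern matches.
        return AtomPattern(self.element_bits | other.element_bits)

    def __and__(self, other):
        # Intersection: matches only atoms that both patterns match.
        return AtomPattern(self.element_bits & other.element_bits)

    def __le__(self, other):
        # Subset test: every atom matched by self is also matched by other.
        return (self.element_bits & ~other.element_bits) == 0

    def to_smarts(self):
        allowed = [
            f"#{ATOMIC_NUMBERS[symbol]}"
            for i, symbol in enumerate(ELEMENTS)
            if (self.element_bits >> i) & 1
        ]
        if not allowed:
            raise ValueError("empty pattern matches no atom")
        return "[" + ",".join(allowed) + "]"


carbon = AtomPattern.from_elements("C")
c_or_n = AtomPattern.from_elements("C", "N")
n_or_o = AtomPattern.from_elements("N", "O")

print((c_or_n & n_or_o).to_smarts())  # intersection of [#6,#7] and [#7,#8]
print((c_or_n | n_or_o).to_smarts())  # union of the same two patterns
print(carbon <= c_or_n)               # is [#6] a special case of [#6,#7]?
```

In this toy picture, a "split" in a parameter hierarchy corresponds to finding a pattern that is a strict subset of its parent, which is the kind of statement the written-down math would need to make precise.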
OpenFF wrapper/plugin thoughts
- Use OpenFF for FF reading (+ writing?)
- Graph codec?
  - JM – I don’t think we want this
  - TG – Might want to reconsider: since I don’t use OE, I don’t have access to its aromaticity model; I only use RDKit. So substructure matching behavior might change.
  - JW – I don’t think we’d need OE. We use the MDL aromaticity model, which is implemented in both OE and RDKit.
- Supplying arbitrary charges to a molecule (to enable using NAGL)
  - JM – Could we use clustering’s smiles_assignment to get around this?
  - TG – Yes, or graph_assignment. I might need to provide a way for the user to provide these from arbitrary sources
- Some preparation for possible smee interface?

Useful places to start studying
- Example 8 / splits.py
- Example 11 ff_optimize
To dos
- JM/JW/LW need to discuss priorities with respect to whether we want this to fit into an optimization loop (e.g., just proposing splits, or using a more sophisticated strategy), and at what level.
- JM will run through examples 8 and 11 and focus on understanding and documenting major functions (esp. the usage guide’s FF fitting walkthrough, example 8’s use of splits.py, and example 11’s use of ff_optimize)