Infrastructure Architecture

Infrastructure architecture planning and long-term decision making

Proposals

Contributor	Date	Proposal	Comments / Feedback
Simon Boothroyd	12 Feb 2020	Before I finalize the release of the re-branded OpenFF Evaluator framework and commit to the new API naming conventions, I wanted to suggest that we invest some time in cleaning up the software stack offered by OpenFF. While everything exists under the same GitHub org, there is almost no consistency between our packages. This will only get worse over time, and will only get harder to reverse as the user base expands. Currently we have `from evaluator import ...`, `from openforcefield import ...`, `from cmiles import ...`, ... while it would be much more cohesive to have an overall architecture similar to `from openff.evaluator import ...`, `from openff.toolkit import ...`, `from openff.fractal import ...`, ... In practice this seems obtainable through an implicit namespace file structure like https://packaging.python.org/guides/packaging-namespace-packages/#native-namespace-packages while still maintaining individual repositories. This style of architecture / design would seem to lend itself to creating smaller, more focused repos / packages (more like a set of software 'microservices'). I understand this would initially cause a large amount of disruption and possible confusion among users, but the end result would be a cohesive, elegant stack, with all the software we build being connected and identifiable under the same umbrella. Moreover, I believe it would push us to build software that more rigidly follows a single-responsibility pattern, rather than monolithic packages that 'do everything', which is where the toolkit seems to be heading (especially if it simply absorbs things like fragmenter and the QC submission frameworks). It would be fantastic to start moving away from a style similar to a zip file of disconnected tools, and to start planning for the longer term how we want our software to look and be interacted with. 
Originally posted in Slack	Karmen Condic-Jurkic: Matt Thompson could potentially be assigned to this task. Jeffrey Wagner Simon Boothroyd: We should plan to do this at the May hackathon. Let’s spec out exactly what the namespace will look like and which packages go where.
Jeffrey Wagner Joshua Horton Jaime Rodríguez-Guerra (Deactivated)	14 Feb 2020	We should add an LRU cache to `ToolkitWrapper` (or `ToolkitRegistry`) to record the outputs of common time-consuming operations such as `to_smiles`, `compute_partial_charges`, `find_smarts_matches`, and `assign_partial_bond_orders`. The cache would map all of the inputs to these functions (the molecule graph hash, conformers if applicable, and other kwargs) to a cached result. Jaime Rodríguez-Guerra (Deactivated) suggested a Python library for this: https://cachetools.readthedocs.io/en/stable/ Link to context: https://openforcefieldgroup.slack.com/archives/C8NE3J96U/p1581697458078600
Jeffrey Wagner		The OFFTK’s reuse of Python built-in exceptions and its try/except logic for handling external toolkits are dangerous and cause ambiguity that can obscure the real source of problems. We should create a new file containing a new exception hierarchy that inherits from Python’s `Exception` only at the very root of the tree and uses distinct, toolkit-specific classes everywhere else.
Jeffrey Wagner		OFFTK 1.0 release major refactors: (1) Remove aromaticity setters and wire up consistently enforced aromaticity perceivers. (2) Make all charge-assignment methods sub-entries of the `Electrostatics` tag, and possibly do the same for bespoke vdW parametrization. (3) Resolve `ToolkitWrapper.from_object`'s use of the private `FrozenMolecule._add_atom` method.
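
For the namespace proposal from 12 Feb 2020, the following is a minimal sketch of how native namespace packages could let separate repositories share one `openff` prefix. The distribution directory names (`openff-evaluator`, `openff-toolkit`) and the `NAME` attribute are assumptions made up for this illustration; the only firm requirement of the native approach is that no distribution ships an `openff/__init__.py`.

```python
import importlib
import sys
import tempfile
from pathlib import Path

SUBPACKAGES = ("evaluator", "toolkit")


def make_distribution(root, dist_name, subpackage):
    """Create a throwaway source tree: <root>/<dist_name>/openff/<subpackage>/."""
    source_dir = root / dist_name
    package_dir = source_dir / "openff" / subpackage
    package_dir.mkdir(parents=True)
    # Deliberately no openff/__init__.py: its absence is what makes
    # `openff` a native (implicit) namespace package.
    (package_dir / "__init__.py").write_text(f"NAME = 'openff.{subpackage}'\n")
    return source_dir


def demo():
    """Import two sub-packages that live in two separate source trees."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Hypothetical layout: one directory per repository / distribution.
        paths = [
            str(make_distribution(root, f"openff-{name}", name))
            for name in SUBPACKAGES
        ]
        sys.path[:0] = paths
        try:
            importlib.invalidate_caches()
            return [
                importlib.import_module(f"openff.{name}").NAME
                for name in SUBPACKAGES
            ]
        finally:
            for path in paths:
                sys.path.remove(path)


if __name__ == "__main__":
    print(demo())
```

Each repository would keep its own release cycle while users see a single `openff.*` import prefix. Which packages end up under which name is still to be specified.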
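
For the LRU cache proposal from 14 Feb 2020, the following is a rough sketch of the idea using only the standard library. `LRUResultCache`, `make_key`, and `slow_to_smiles` are hypothetical names for illustration and are not part of the toolkit; the cachetools library linked in the proposal provides a ready-made `LRUCache` that could replace the hand-written class. The sketch assumes the molecule graph hash and all kwargs values are hashable.

```python
from collections import OrderedDict


class LRUResultCache:
    """Keep the most recently used results, evicting the oldest when full."""

    def __init__(self, maxsize=128):
        self.maxsize = maxsize
        self._store = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get_or_compute(self, key, compute):
        if key in self._store:
            self._store.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._store[key]
        self.misses += 1
        value = compute()
        self._store[key] = value
        if len(self._store) > self.maxsize:
            self._store.popitem(last=False)  # drop the least recently used
        return value


def make_key(method_name, molecule_hash, conformers=(), **kwargs):
    """Build a hashable key from everything that influences the result."""
    # Conformers are assumed to be sequences of (x, y, z) coordinates.
    conformer_key = tuple(tuple(map(tuple, conf)) for conf in conformers)
    return (method_name, molecule_hash, conformer_key, tuple(sorted(kwargs.items())))


calls = []


def slow_to_smiles(molecule_hash):
    """Hypothetical stand-in for an expensive toolkit call."""
    calls.append(molecule_hash)
    return f"SMILES-for-{molecule_hash}"


cache = LRUResultCache(maxsize=2)
for graph_hash in ("a", "a", "b", "c", "a"):
    key = make_key("to_smiles", graph_hash, isomeric=True)
    cache.get_or_compute(key, lambda h=graph_hash: slow_to_smiles(h))
```

With `maxsize=2`, the second lookup of `"a"` is served from the cache, while the final lookup recomputes because `"a"` was evicted when `"c"` was added. Including the method name in the key lets one cache serve several wrapper methods.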
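
For the exception hierarchy proposal, a sketch of what a dedicated module might look like is given here. Every class and function name (`OpenFFToolkitException`, `SmilesParsingError`, `ChargeCalculationError`, `call_external_toolkit`, `parse_smiles`) is hypothetical and chosen only for illustration; the actual hierarchy would need to be designed against the toolkit's real failure modes.

```python
class OpenFFToolkitException(Exception):
    """Root of the hierarchy: the only class inheriting from Exception."""


class SmilesParsingError(OpenFFToolkitException):
    """Raised when an input SMILES string cannot be interpreted."""


class ChargeCalculationError(OpenFFToolkitException):
    """Raised when partial charge assignment fails."""


def call_external_toolkit(smiles):
    """Stand-in for a third-party call that raises a built-in exception."""
    if not smiles:
        raise ValueError("empty input")
    return smiles.upper()


def parse_smiles(smiles):
    """Translate the external failure into a toolkit-specific exception."""
    try:
        return call_external_toolkit(smiles)
    except ValueError as error:
        # `from error` keeps the original traceback attached as __cause__,
        # so the real source of the problem stays visible.
        raise SmilesParsingError(f"could not parse {smiles!r}") from error


def describe_failure(smiles):
    """Show that callers can catch the specific class without ambiguity."""
    try:
        parse_smiles(smiles)
    except SmilesParsingError as error:
        return type(error).__name__, type(error.__cause__).__name__
    return None
```

Because none of these classes derive from `ValueError` or other built-ins, an `except ValueError` written around unrelated code will not swallow a toolkit failure by accident.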