MT – Not a huge amount of progress on System. Mostly devops/toolkit fires and cleaning up toolkit PRs. Lots of conda work, which is taking a lot of time, but learning a lot.
DD – Mostly worked on the QCA submission lifecycle. Worked with JH on automation and tracking on GH; will put it into production soon. The first set to be submitted will be the Rowley biaryl set. Updating the psi4 used in builds, since the current prod build is a year old. Focused on helping JS get the pAPRika branch merged.
SB – Made an automated pipeline for ES potentials, parallelized with multiprocessing. It should be compatible with eventual QCF ESP calculations. The interface is pretty general, so it should be able to handle pulling explicit ESPs or e- densities. Recharge can optimize BCCs using a matrix formulation, which runs quite efficiently. This is performant enough that we can plug it into a Bayesian optimizer. So far only small runs have been done using lightweight libs; could eventually use Pyro. Concerned in the long run about PyTorch’s parallelization schemes and their compatibility with our work. PyTorch assumes homogeneous computation at a high level and splits the dataset across processes early on. This may be trouble if our optimizations are highly interdependent, since information would then need to be exchanged between many processes and between distant parts of the dataset.
MT – I’m also concerned about limitations on cross-talk as we think about different/experimental fitting strategies. We plan for a lot of our future scaling to be handled smoothly by PyTorch’s performant underlying libraries, but it seems like the questions that we’re approaching may be fundamentally incompatible with their performance/parallelization schemes.
MT – Does it seem like other ML libraries may handle our problems better?
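For reference, a minimal sketch of the map-style parallelism SB describes for the ESP pipeline. This is not the actual pipeline: the per-molecule work here is a stand-in (a Coulomb potential from point charges, in atomic units), and the names `esp_on_grid` and `esp_for_all` are hypothetical. It only illustrates the case where each molecule can be processed independently.

```python
import math
from multiprocessing import Pool


def esp_on_grid(molecule):
    """Stand-in per-molecule task (an assumption, not the real pipeline).

    `molecule` is a tuple (charges, coords, grid). Returns the point-charge
    electrostatic potential at each grid point, in atomic units
    (charges in e, distances in bohr).
    """
    charges, coords, grid = molecule
    return [
        sum(q / math.dist(r, point) for q, r in zip(charges, coords))
        for point in grid
    ]


def esp_for_all(molecules, processes=2):
    """Map the per-molecule task over a pool of worker processes.

    Each molecule is independent, so no information is exchanged between
    workers. On platforms that spawn workers, call this from under an
    `if __name__ == "__main__":` guard.
    """
    with Pool(processes=processes) as pool:
        return pool.map(esp_on_grid, molecules)
```

This pattern works because the tasks do not talk to each other. The concern raised above is exactly the opposite case: a fit whose terms are interdependent cannot be split this cleanly.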
JW – AMBER FF porting. OE2020 fixes. Speccing CLI tool requests.
Should make it possible for CLI tools to take molecule input from STDIN and write output to STDOUT. This would be triggered by -i - , where - means “STDIN”.
Standard CLI infrastructure? argparse vs. click?
SB – Should study GROMACS tools as well as AmberTools.
Should all CLI tools be wrappers around pre-existing functions?
Pro: Clear and maintainable
Pro: Anything you prototype in the CLI can be implemented (or parallelized!) in Python
Con: Hard to find/modify source
Con: Behavior is strictly tied to OFFTK release
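A rough sketch of what the `-i -` proposal could look like with argparse, with the CLI kept as a thin wrapper around an ordinary function. Everything here is illustrative: `set_title` is a hypothetical stand-in for a toolkit function (it replaces the title line of a molfile block), and the `-o` and `--title` flags are assumptions, not a spec.

```python
import argparse
import sys


def set_title(molblock: str, title: str) -> str:
    """Hypothetical library function: replace the first (title) line."""
    lines = molblock.split("\n")
    lines[0] = title
    return "\n".join(lines)


def read_text(path: str) -> str:
    """Read from STDIN when path is "-", otherwise from the named file."""
    if path == "-":
        return sys.stdin.read()
    with open(path) as handle:
        return handle.read()


def write_text(path: str, text: str) -> None:
    """Write to STDOUT when path is "-", otherwise to the named file."""
    if path == "-":
        sys.stdout.write(text)
        return
    with open(path, "w") as handle:
        handle.write(text)


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Set a molecule title.")
    parser.add_argument("-i", "--input", default="-", help='file, or "-" for STDIN')
    parser.add_argument("-o", "--output", default="-", help='file, or "-" for STDOUT')
    parser.add_argument("--title", default="untitled")
    return parser


def main(argv=None) -> int:
    """CLI layer: parse arguments, do I/O, delegate to the library function."""
    args = build_parser().parse_args(argv)
    write_text(args.output, set_title(read_text(args.input), args.title))
    return 0

# An entry point (or a main guard) would call main(); omitted in this sketch.
```

Because `main` only handles arguments and I/O, the same `set_title` call can be used directly from Python, which is the "anything you prototype in the CLI can be implemented in Python" point above.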
Should CLIs be accessible through a conda/setup.py entry point (copied into $miniconda/env/bin)?
Pro: Easy to access
Con: Hard to find source
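One possible way to soften the "hard to find source" con: installed entry points record which function each command calls, and that can be queried from Python. The sketch below uses the standard-library `importlib.metadata`; the helper names `console_scripts` and `locate` and the example command name in the comment are hypothetical.

```python
from importlib.metadata import entry_points

# In setup.py, a console script is declared roughly like this
# (package and command names are made up for illustration):
#     entry_points={"console_scripts": ["openff-example = openff_example.cli:main"]}


def console_scripts():
    """Return {command name: "module:function"} for installed console scripts."""
    eps = entry_points()
    if hasattr(eps, "select"):  # Python 3.10+
        group = eps.select(group="console_scripts")
    else:  # Python 3.8/3.9 return a dict keyed by group name
        group = eps.get("console_scripts", [])
    return {ep.name: ep.value for ep in group}


def locate(command):
    """Return the "module:function" target behind a command, or None."""
    return console_scripts().get(command)
```

For an installed command, `locate` returns the module path and function name, which points a user to the source file to read or modify.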
Make separate openff-cli module/package? Lets us have a different release pace, but also have this under our “guarantee of correctness” umbrella. Could be where utils/structure.py goes.
If all CLIs live in openff/cli , and this is a Toolkit-centric repo, then where does the evaluator CLI go?
Import loops? Testing? Handling missing/optional dependencies? Make sure we do lots of lazy-loading so that we don’t have hard dependencies. This would let us separate readers/writers from the underlying functions, and not need to test all permutations of inputs for all methods separately. There are dedicated CLI testing libraries that we could include.
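A minimal sketch of the lazy-loading idea, assuming readers are registered per format and import their backend only when called. All names here (`require`, `READERS`, `core_function`, the `hypo` format and its backend module) are hypothetical; the point is that a missing optional package only fails the reader that needs it, and the core function is tested once against parsed data rather than once per input format.

```python
import importlib


class MissingDependencyError(ImportError):
    """Raised when a reader/writer needs a package that is not installed."""


def require(module_name: str, feature: str):
    """Import module_name on first use, with a clear error if it is absent."""
    try:
        return importlib.import_module(module_name)
    except ImportError as error:
        raise MissingDependencyError(
            f"{feature} requires the optional package '{module_name}'."
        ) from error


def read_json(text: str) -> dict:
    json = require("json", "JSON input")  # stdlib, so always available
    return json.loads(text)


def read_hypothetical(text: str) -> dict:
    # Made-up optional backend, used to show the failure path.
    backend = require("hypothetical_backend_not_installed", "the 'hypo' format")
    return backend.loads(text)


# Readers are looked up by format name; nothing is imported until one is called.
READERS = {"json": read_json, "hypo": read_hypothetical}


def core_function(record: dict) -> int:
    """Underlying function: sees parsed data only, never file formats."""
    return len(record.get("atoms", []))


def run(fmt: str, text: str) -> int:
    """CLI-facing glue: pick a reader, then call the core function."""
    return core_function(READERS[fmt](text))
```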