Updates | SB Worked with MT on System object, found some bugs and suggested some PRs Smirnoffee can now do energy evals for entire system object. Can get gradients for the same Synced up with DD on openff-benchmark. JW – I’ll be handlingt he benchmarking refactor in July - Sept. Let me know if there are specific design points to know about. SB – Parallelization, migration to toolkit, more testing
MT Mostly System stuff. SB helped a lot with his review. Met with TH who volunteered to be a guinea pig for System. In 11-12 months he’ll be moving on to nonbonded stuff (right now he’s doing valence terms). He’s very interested in using parameter change propagation in his experiments. System work, not very exciting. Moves openff-utilities and openff-units into openforcefield org, made alpha releases. Openff-utilities is on conda-forge. openff-units is in a PR that requires review from C-F staff. After both of those, I’ll open a C-F PR to make the System package. Openff-units and openff-utils are an effort to reduce code duplication across packages – They have oft-repeated code blocks, in one standard form. Utils is currently pure-python and can have quick turnaround for PRs into packages. OpenFF units is a kind of wrapper around pint. Even if we end up married to simtk units, it will be good to stick with the openff-units package.
IP Focused on perceive_residues – We now have working code to assign residues to SDF input. Handles capped and uncapped residues. Works on the T4 lysozyme in toolkit data. SB – When/under which circumstances will residue perception happen? IP – When a molecule is read from SDF, it won’t have residue info. The user will have to explicitly run perceive_residues on the inputs that they want. SB – OpenMM tries to unconditionally perceive residues, which has been a problem in the past.
Now we’re looking at getting “Standard” residue definition from the CCD (components.cif ). Big hurdle is finding a good MMCIF parser. Made some progress on deterministic RDKit conf gen, by canonically ordering the atoms before conf gen. Found that the 2021 version added some features, and the default behavior changed to do heavy-atom RMSD. So now RDKit generates less conformers than OpenEye. It turns out that the test molecule for the test that should have caught it, in the order that we were inputting it as SMILES, didn’t notice this change. But in a reversed order, it would have caught it. SB – Toolkit has a canonical-atom-ordering method, but it doesn’t return the mapping between the old and new order. It would be good to expand this method to return the canonical ordering. SB – If you canonicalize molecules using OpenEye, make sure that the methods are in sync (their main canonicalize and omega canonicalize can behaave differently)
JW: Beginning work with Andrew Dalke this week. Will initially work on getting a better molecule test set, and documenting assumptions for taking in molecules from RDKit+OpenEye Moving through PRs – Need to do charge rounding + switch_width before next release Flaky? behavior in CI tests Topology work Deterministic conf gen with IP Blog post
PB: LW Working on charges. Built interactive display that lets me iterate over how charges can be applied to fragmented molecules, and how the fragments can be stitched back together JW – How bad are these numbers? LW – On a technical level, this is easy to implement. I’m still looking at how accurate they are.
Working on how to make matching large SMARTS patterns more memory efficient JW – I’m working on other changes to make SMARTS matching more efficient. If we have some options on how to do this, we can try to merge this into the OFF toolkit. SB – Two-stage matching? Do heavy atoms first, and then use those to narrow down the regions that the all-atom search could look at? I’ve done this before in some other projects. SB – Also, it may be worthwhile to bring these cases to the RDKit+OpenEye devs, since their fixes could be used by everyone. TG – How many atoms are in the SMARTS / how much memory does it take?
LW – Made polymeterizer package, currently private. (update in afternoon: Github link macro |
---|
link | https://github.com/lilyminium/polymetrizer |
---|
| )
TG Working on fitting, nothing super interesting on that front. Working on vsites. In the new PR that Matt’s looking at, there’s a new addition that needs to go in. In the chargeincrements, there’s an arbitrary number of charge attributes. The mapping of charges to particles gets mangled by canonical ordering at some point. TG – So, if we have a chargeincrement with indices 0 and 1, then we need to ensure that they get mapped to the JW – Is this about trying to do different things with chemically equivalent atoms? TG – No. Let’s say I have a vsite that tags 5 atoms, but we only have 2 chargeincrements. Then, find_matches puts the atoms in the order of the match. Then we transform the… JW – … is this related to that other thing we did in the design of the vsitehandler …? TG – No TG – I have a proposed solution, but I need to think about it a bit more (General) – This is very complex, TG+SB+MT+JW will meet after sprint planning about this
|
Charge increments on vsites | TG – The problem is that the chargeincrement indices go out of sync with the smarts indices. When I look at it, the only place where the indexing problem is solved is in ChargeIncrementHandler .create_force and VirtualSiteHAndler.create_force In the vectorized parameter PR, we don’t use create_force. So should I duplicate this code? Or could I move itinto something like find_smarts_matches ? When we call label_molecules, it only returns atom indices. But vsites will need to include particle indices. TG – If I have a particle, I need to know how it maps to the charge increment list. So, if I do a virtualsitehandler.find_matches, it gives me specific permutations that I use for orientation. TG – I’d like it to be so that, if I run find_matches, the keys in the returned dictionary correspond to the atom order in the chargeincrement list JW – This sounds reasonable. We could change the behavior of find_matches to change thekeys of this dictionary to be up-to-date before they’re returned. JW – It’s good by me to transform the keys of the matches to be in the correct order for the chargeincrement list (these lines: Github link macro |
---|
link | https://github.com/openforcefield/openff-toolkit/blob/8f97f7431c772e6eb69796a66766633e879a0709/openff/toolkit/typing/engines/smirnoff/parameters.py#L4291-L4295 |
---|
| )
TG – There’s something wrong with this – For chargeincrementhandler, if you have the same SMARTS, but different ordering, is that allowed? TG – ChargeIncrementHandler and VSiteHandler are the only handlers where the number of terms depends on the number of tagged atoms in the SMARTS. MT – Also LibraryCharges. JW – I’d be fine with these three handlers having the results of their find_matches be guaranteed to be in the same order as their indexed attributes. It’d be best for these changes to be in the places where we override find_matches in the specific ParameterHandler classes, instead of in the base ParameterHandler class.
TG – Should this be rolled into the vectorization PR? MT – Should these changes be required for the next release? (General) – What should be the order of PRs? VSite PR (#909) find_matches ordering PR Vectorization PR (#940)
|
Switch-width | JW – I think it should be force.setSwitchingDistance(self.cutoff - self.switch_width) MT – Agree SB – I think we should do BOTH this, and update the SMIRNOFF spec to require the switching function. I think we should NOT merge this PR, since the SMIRNOFF spec doesn’t say enough about how to implement the switching width. So the SMIRNOFF spec needs to include some explicit description of the switching function, and a few other things. MT – Agree. This seems well scoped. JW – We can’t add a new attribute to the 0.3 vdW tag in the SMIRNOFF spec. Adding a new attribute requires a version change. But it would be LESS bad to change the existing definition of the 0.3 spec to give a meaning to the switch_width other than “the switch_width is always ignored and set to 0” MT – It’s not clear that assigning a meaning to it after the fact is worse than always assuming 0. JW – I think we need to change the 0.3 spec to say explciitly how switch_width is interpreted. Given that we need to make this change, I think it’d make the most sense to say “the previous implementation was wrong, the spec meant to say cutoff-switch_width ) MT – Two questions:
|