Expected release 8/24
Final tasks:
- Add an “exclusion_policy” to top-level Virtual Site handler
- naming? Policies up to “parents” are implemented
- Remove parameters (InPlaneAngle in divalent vsite) which do not belong/do anything.
- Flesh out the API for vsites (accessing particles, use of public/protected members)
- _add_*_virtual_site from parameter handler
- this seems like the only acceptable case. We are not keen on exposing an API to add virtual sites/modify the topology, so it will be allowed
- Move the typing logic from forcefield to parameters.py::PH._add_parameters (inside the loop)
- Write test cases
- Particle iterators
- virtualsite add parameter (in base ParameterHandler add_parameter; see tests/test_parameters:test_add_parameters)
- Correct positions
- Correct energy
- Validation of offxml input (no angles in BondCharge, etc.)
- Correct hierarchy in selection (last wins) based on names, smirks, and type
- make sure create_openmm_system uses the molecule api to create oMM particles
- make sure particles show up in the oMM when using the molecule API
- Equality operators
- Figure out the xmltodict error?
- Merge vsite branch into master 8/26
- Docstrings of all new functions added
- Other documentation tasks?
- ‘__ all__’ ← put new objects in here
- docs/typing.rst
- In the release notes, add 5-15 sentences which highlight difficult edge cases (that we correctly handle)
- Documentation by 8/31
...
Meeting notes:
We should allow multiple matches for virtualsites, and multiple of the exact same SMIRKS
When initializing a ForceField, should we deduplicate VirtualSites with same SMIRKS, type, angles, distances, chargeincrements?
We should not deduplicate VirtualSites during VirtualSiteType or VirtualSiteHandler initialization
Does OpenMM have a problem if two virtualsites are really close to each other (eg. do they cause an energy explosion?) Under what conditions does OpenMM automatically deduplicate or apply nonbonded exceptions between virtualsites?
Unresolved -- We should check
(2020/04/10 JW edit): Previous context – Recognized that this could be a problem, but didn’t provide a solution
Github link macro link https://github.com/openforcefield/openforcefield/pull/67#issuecomment-356084010
Should we throw an exception if two virtualsites are defined on top of each other (same reference atoms, angles, distances) during system creation?
Yes, but add kwarg to create_openmm_system for `allow_overlapping_virtual_sites`
...
OpenMM Topology doesn't contain VSites, so we don't need to worry about them in
OFFTop.from_openmm
If isomoprhism/deduplication gets complicated, we could remove VSites from the public OFFMol API, and only let them be added during parameterization
Should we provide a function to "label" an OFFTopology with virtualsites? This seems like a good idea, and it'd be most direct to repurpose
create_openmm_system
for this.create_openmm_system
could return BOTH the resulting OMMSystem AND a modified copy of the input OFFTopology. This resulting OFFTopology would have COPIES of reference molecules and vsites with final charges. I'm not sure what would happen with constraints. Returning the resulting OFFTopology should be triggered by a new kwarg,return_topology=False
, which will runreturn system, topology
at the end ofcreate_openmm_system
isTrue
. Should be trivial to add, since we already make a deep copy of the topology at the beginning ofcreate_openmm_system
.When do we want to deduplicate SMARTS matches (by permuting tuples of match indices)? Is it absolutely necessary to add a unique option to virtualsites?
For BondCharge sites, it's clear that we do NOT want to attempt deduplication -- N2 should have symmetrically-applied vsites.
For DivalentLonePairs, deduplication would SOLVE unavoidable duplication issue with TIP4P, but would make TIP5P require two separate vsite SMIRKS (+N and -N degrees). So it seems that DivalentLonePairs should be deduplicated.
For TrivalentLonePairs, deduplication is definitely wanted, since it will otherwise unavoidably add 6 copies of the virtual site for NH3's lone pair. This deduplication might be performed by taking the match index tuple (a,b,c,d), and deduplicating permutations of b, c, and d. This... SHOULD be safe, though I wonder if there are pathological cases that would cause trouble here.
I'm not sure about deduplicating matches between DIFFERENT parameters. SMIRNOFF's "hierarchy" philosophy would make me think that we could add, for example, a "generic" trivalent nitrogen lone pair using
N([*])([*])([*])
, and then define a more specific lone pair for secondary nitrogen centers usingN([C])([C])([*])
further down in the list, expecting the second to override the first when it applies. However, there are also other situations where I'd expect virtualsites to be additive. So, for now, the best way forward is probably to NOT attempt deduplication between different parameters, and just caution users to craft exclusive SMARTS if they want them to apply exclusively.
Immediate action items:
- Deduplication strategy
As the items indicate, deduplication depends on the vsite definition. Therefore the matching needs to include a parameter of whether to transform the dictionary keys. I have already massaged this to work, so it is close to being done. It appears difference exists whether the vsite is in the mirror plane or not. BondCharge and Monovalent have planes of symmetry between atoms, and so placing vsites on one side must be mirrored. where Trivalent and Divalent do not have symmetry between atoms, and the vsite does not need to be mirrored. Arbitrary vsites, such as a COM, are assumed to have c1 symmetry and don’t need to be mirrored (dedupe = True where all permutations are collapsed).
...
The normal parameters follow a hierarchy, where the most specific pattern matches first. Likely, this approach should be similar. Choose the parameter with the least number of wildcards (as simple as smirks str parsing for *, then breaking ties with ~). However, Jeff is in the mindset that vsites can be stacked, even if they are not unique. This will fail the above requirements for the above uniqueness strategy.
Matching procedure (proposed):
Collect all sites which match the smirks, wildcard or otherwise. For e.g. BondCharge, do NOT deduplicate.
Sort based on who “:1” is (separate the intent), this is two distinct set of vsites for steps 3-5
Sort based on name
Determine which sites are the most specialized (e.g. sort by atomic number of “:1”)
Determine uniqueness (make sure none have const r=0) # Can it be removed?
Apply the surviving sites (maybe the ones at the end of the list)
...
[*]-Br : will * always be *:1?
Misc. Notes:
Want to avoid patterns where we have explicit matching and anti-matching, e.g. C-Br and C-!Br since specializing wildcards in this scheme will be hard to maintain.
...
OpenMM adds an exclusion to the vsite and the [arbitrarily chosen] first atom in the match. This means the forces between the “owning” atom and the vsite are not included. (Does this mean pushing on the vsite pushes the owning atom?)
Use cases:
Virtual Site specialization
...
[Br:1]-[C:2] == [Br]-[C] == [Br:1]-[C] == [Br]-[C:2] and all wildcard incantations matching the corresponding atoms
Dangling items and notes:
5/08
“Index” formalism with respect to virtualsite atom indexing
...