2021-05-10 Parameter vectorization meeting notes

Participants

  • @Matt Thompson

  • @Trevor Gokey

  • @Simon Boothroyd

  • @Jeffrey Wagner

Discussion topics

Notes


  • TG – I iterate over all the ParameterAttributes of a ParameterHandler and ParameterType. Then for each ParameterHandler and its descendant types, I keep a canonical ordering of all the terms. The terms are hierarchically stored such that they can be accessed directly with a new API point that I made.

  • TG – These unique terms become the columns of the matrix.

  • MT – I basically have something really similar to this, except that it’s split out by handler. The difference may be that this implementation has more of a tree.

  • (General) – Development model?

    • JW – I think it’ll be good to put this in the OFF System, and then make System accessible through private API calls in OFFTK.

  • SB – Generally we need some similar stuff

    • API point to look up parameters based on an address – Also standardizing on what this address looks like.

    • Getting a big long list of vectors. A FF is a tree, and TG’s implementation here is a list of paths to leaf nodes. I’d advocate for making this “big long list of leaf addresses” accessible from the FF class, even if it’s just private to begin with.

    • MT stores parameters in something that looks like TG’s lookups. The two are structurally different though, so we should standardize on how these keys are represented. I use a third representation that’s kind of a merge of MT’s and TG’s, which I call a “composite key”.

      • MT – TG has a nice way to look at the parameters as they exist in the FF. I have similar functionality, but it focuses on looking up applied parameters.

      • SB – I’d like to focus more on an explicit parameter “address” rather than relying on order in a particular list.

      • MT – There are some cases where this can be difficult, e.g. parameter interpolation. The query would make sense on a parameterized molecule, but not on a FF.

      • SB – Agree. So not all queries that would work on a System would work on a ForceField.

      • MT – So, would these things be merged? Like, if we query a FF for an address, and then we query a System for the same address, should the same data model be returned?

        • SB – Not necessarily. This wouldn’t be a requirement.

    • …(another point that JW interrupted)

  • SB – What should the addresses look like?

    • SB – Currently mine are a tuple of (handler name, SMIRKS, attribute), where the attribute is e.g. k, length, or angle

    • TG – This works until we get to VSites

    • SB – MT’s current implementation is (handler name, SMIRKS, term multiplicity, attribute). This is great because we don’t need to worry about the presence of k1, k2, etc.

    • MT – Thanks

    • SB – We’re still going to have trouble with vsites. TG included vsite (type, name, and smirks) – e.g. (BondChargeVsite, “alice”, '[C:1][C:2]') and (BondChargeVsite, “bob”, '[C:1][C:2]')

    • TG – This has required changing the behavior of SMARTS matching specifically for vsites

    • SB – Handler name is still a bit tricky – Redundant with vsite type? I think so.

    • SB – There’s also a conflict with term multiplicity and vsite name

    • TG – So, you’re prioritizing having the addresses have the same shape?

    • SB – Yes, this will help ensure that we’re forward-compatible with database applications

    • JW – Maybe mult should be a tuple? Could be more generally futureproof

    • MT – Yes, this was my plan all along. Multiplicity was a misnomer and should be renamed. I’d be happy to accept some sort of flexible spec for this.

    • SB – Maybe term_id (rather than mult)? Could open a PR to track this. If we’re all happy with the idea of composite keys then we can continue on this.

    • TG – Was going to wait until the vsite PR gets merged.

    • JW – I’m fine with adding this as long as it’s private for now. MT can be reviewer+approver for these PRs.

    • MT – I can do that.

  • TG – If ID was mandatory and unique for every term, then that would work the same way.

    • JW – I’d be fine with this, it’d just require spec changes.

  • TG – How should we proceed with these PRs?

    • MT – Is this interdependent on the vsite PR?

      • TG – No

    • Decisions:

      • JW will review vsite PR.

      • MT will review parameter address PR.

      • TG will immediately move to open parameter address PR.

      • All parameter address stuff will be private initially.

  • Removing Molecule.add_X_vsite from public API.

    • (General) – No objections

    • JW will add a deprecation warning to all Molecule.add_X_vsite methods, and aim to remove from public API in ~2-3 months

  • TG – What is the intent for support of custom vsites?

    • TG – Specifically “weights”. These set the “origin” of the vsite. They make the most sense for LocalCoordinatesSites.

    • JW – What would it look like to make a custom vsite now?

    • TG – Would need to implement a new Molecule, which returns the weights and offsets in the property objects. You also need to implement the type in the vsite handler.

    • (Looked at current implementation) (General) – The current implementation of VSites and subclassing looks good, and mostly publicly accessible. When the Danny Cole lab / Josh Horton needs a custom subclass, TG can make the writeup for how to do this (and the consumer of the info can specify how they need it written)
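
The address discussion above (a fixed-shape composite key, a flat list of leaf addresses, and those addresses becoming the columns of a matrix) can be sketched roughly as follows. This is only an illustration, not the implementation from any of the PRs: the `ParameterAddress` class, its field names, the tuple-valued `term_id`, the nested-dict toy force field, and all numeric values are assumptions invented for the example.

```python
from typing import List, NamedTuple, Tuple


class ParameterAddress(NamedTuple):
    """Hypothetical composite key; every address has the same four-field shape."""

    handler: str      # e.g. "Bonds", "ProperTorsions"
    smirks: str       # the SMIRKS pattern of the parameter
    term_id: Tuple    # flexible tuple: () for single-term parameters,
                      # (1,) for a torsion term, ("BondCharge", "alice") for a vsite
    attribute: str    # e.g. "k", "length", "distance"


# Toy stand-in for a force field tree (NOT the OFFTK data model):
# handler -> SMIRKS -> term id -> attribute -> value. Values are placeholders.
toy_force_field = {
    "Bonds": {
        "[#6:1]-[#6:2]": {(): {"k": 620.0, "length": 1.526}},
    },
    "ProperTorsions": {
        "[*:1]-[#6:2]-[#6:3]-[*:4]": {
            (1,): {"k": 0.156},
            (2,): {"k": 0.25},
        },
    },
    "VirtualSites": {
        "[C:1][C:2]": {
            ("BondCharge", "alice"): {"distance": 0.3},
            ("BondCharge", "bob"): {"distance": 0.5},
        },
    },
}


def leaf_addresses(force_field) -> List[ParameterAddress]:
    """Flatten the tree into the 'big long list of leaf addresses'."""
    return [
        ParameterAddress(handler, smirks, term_id, attribute)
        for handler, by_smirks in force_field.items()
        for smirks, by_term in by_smirks.items()
        for term_id, attributes in by_term.items()
        for attribute in attributes
    ]


def lookup(force_field, address: ParameterAddress) -> float:
    """Resolve an explicit address to its value, independent of list order."""
    return force_field[address.handler][address.smirks][address.term_id][address.attribute]


addresses = leaf_addresses(toy_force_field)
# Each unique address becomes one column of the parameter vector/matrix.
column_index = {address: i for i, address in enumerate(addresses)}
vector = [lookup(toy_force_field, address) for address in addresses]
```

Because the key is hashable and always has four fields, the same address can index a dictionary, a matrix column, or a database row, which is the forward-compatibility argument SB raised. Whether the handler name is redundant with the vsite type is left open here, as it was in the meeting.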


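For the planned deprecation of the Molecule.add_X_vsite methods, one common pattern is a decorator that emits a `DeprecationWarning` before delegating to the original method. The sketch below uses only the standard library; `deprecated_vsite_method`, `ToyMolecule`, and the method body are hypothetical stand-ins and do not reflect how the toolkit will actually implement the warning.

```python
import functools
import warnings


def deprecated_vsite_method(func):
    """Wrap a method so each call warns that it will leave the public API."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        warnings.warn(
            f"{func.__name__} is deprecated and is planned for removal "
            "from the public API.",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller's line
        )
        return func(*args, **kwargs)

    return wrapper


class ToyMolecule:
    """Hypothetical stand-in for Molecule, for illustration only."""

    def __init__(self):
        self.virtual_sites = []

    @deprecated_vsite_method
    def add_bond_charge_virtual_site(self, atoms, distance):
        # Record the vsite and return its index in the list.
        self.virtual_sites.append(("BondCharge", tuple(atoms), distance))
        return len(self.virtual_sites) - 1
```

The method keeps working during the deprecation window; callers only see the warning, which gives downstream users the ~2-3 months mentioned above to migrate.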

Action items

Decisions