Future NAGL section in the SMIRNOFF spec | @Jeffrey Wagner | (Preemptive notes) LW – We need to ensure that the NAGL model and the NAGL software tie together. Should this be a spec constraint or a model constraint? Currently NAGL models are “versioned” akin to the ForceField parameter type versions (or ForceFields?), but they are less easy to upconvert.
JC – One challenge from my OpenMM experience is that the interface/API itself needs versioning. For example: are element info and connectivity passed in, what are the variable names and orders, and what later features might we want to include (e.g. total spin)?
LW – Good point, interested in your thoughts on the current situation. The current model files include the featurization, which is implemented by the NAGL software. The current input is OpenFF Molecule objects, which the openff-nagl package processes into the correct featurization.
JC – How are molecules turned into correct tensors? …
LW – We currently do this in a fixed way: the model file contains the tensors with shapes and names, a list of features that can be ingested by the NAGL software, and some metadata. The NAGL software interprets the list of features and uses it to featurize the molecule.
JC – Do they go into tensors named after the features?
LW – Yes.
JC – Then we want a spec saying which features are available. This sounds implicit currently, so we’ll want to make that explicit.
LW – I understand what you’re
Straw proposal: Middle ground spec: <MLCharges version="0.3" model_file="openff-gnn-am1bcc-0.1.0-rc.3.pt" ></MLCharges>
Spec will document that allowable features in the model file will be [element, connectivity, resonance-averaged formal charge, ] (there are more in the current model, and more possible in the software; those will be explored).
Spec will document the meaning of the different dimensions of each feature’s shape, and their units.
LW – Somewhat concerned that this means we’ll need to update the spec version every time we add a new element.
JC – This might mean that we should think of an encoding that’s invariant to the number of allowed elements, like a whole-periodic-table encoding.
DM – And NAGL (software) can fail when something unimplemented appears.
The spec will document how resonance averaging and other operations are done.
Minimal spec: <NAGLCharges version="0.3" model_file="openff-gnn-am1bcc-0.1.0-rc.3.pt" (maybe NAGL software version) ></NAGLCharges>
|
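Illustration of the whole-periodic-table encoding idea from the notes above (JC's suggestion plus DM's point that the software can fail on unimplemented input). This is a minimal sketch only, not how openff-nagl featurizes elements today. The names `one_hot_element`, `featurize_elements`, `UnsupportedElementError`, and the contents of `SUPPORTED_ATOMIC_NUMBERS` are all hypothetical. The point is that the feature's shape (118 slots) never changes when a new element is supported, so the spec version would not need a bump; only the software-side supported set grows.

```python
# Hypothetical sketch: none of these names come from openff-nagl or the SMIRNOFF spec.

N_ELEMENTS = 118  # one slot per element, so the feature shape is fixed by the spec

# Assumed example set of elements that a given software/model pair can handle.
SUPPORTED_ATOMIC_NUMBERS = frozenset({1, 6, 7, 8, 9, 15, 16, 17, 35, 53})


class UnsupportedElementError(ValueError):
    """Raised when an element is valid but not implemented for this model."""


def one_hot_element(atomic_number):
    """Return a length-118 one-hot vector for the given atomic number."""
    if not 1 <= atomic_number <= N_ELEMENTS:
        raise ValueError(f"atomic number {atomic_number} is outside 1..{N_ELEMENTS}")
    if atomic_number not in SUPPORTED_ATOMIC_NUMBERS:
        # The encoding could represent this element, but the software refuses it.
        raise UnsupportedElementError(
            f"element with atomic number {atomic_number} is not supported"
        )
    vector = [0.0] * N_ELEMENTS
    vector[atomic_number - 1] = 1.0  # slot index is atomic number minus one
    return vector


def featurize_elements(atomic_numbers):
    """Encode every atom of a molecule; fails on the first unsupported element."""
    return [one_hot_element(z) for z in atomic_numbers]


# Usage: methane (one carbon, four hydrogens) gives five vectors of length 118.
methane_features = featurize_elements([6, 1, 1, 1, 1])
```

With this layout, adding an element such as silicon would mean extending the supported set in the software (and retraining the model), while the documented feature shape stays the same.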