2021-05-10 Core Developers meeting notes

Participants

@Jeffrey Wagner
@David Dotson
@Iván Pulido
@Trevor Gokey
@Pavan Behara
@Simon Boothroyd
@Matt Thompson

Discussion topics

Item	Notes


Updates

SB

Fit different variations on Sage beta. Some made performance worse (eg changing valence terms), but now we have a candidate that seems to be a general improvement
Been working with JH to use bespokefit to do whole-FF fitting. Big new features have included better QCA access methods
Got some conformer minimizations working using AT+GeomeTRIC. This lets us do eg. sqm-powered AM1 minimizations, and detect proton transfer
Have prototype of using pytorch to get FF energies and some derivatives (like coordinate derivatives, ie dE/dx/forces, but still can’t get derivatives with respect to the original FF parameters)
- SB – Can get dE/dFF_param, but there’s an unknown mapping between FF params and assigned parameters. Would need an assignment matrix, but this would be cumbersome since it’s memory-intensive. Could also use vector-Jacobian products, but this doesn’t work super well with pytorch – There would be problems with backpropagating over the assignment step.
- SB – So, can do energy WRT some delta.
- TG – What’s the third dimension in assignment matrix?
  - SB – It’s semi-artificial. In a FF fit, there will be a flat 1D vector of things that are being optimized. But in a real context we’ll need to do something like an n_params x (k, length) matrix for input.
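To make the assignment-matrix idea concrete, here is a minimal pure-Python sketch (not the pytorch prototype itself). It assumes a toy FF with two harmonic bond parameter types, each with (k, length), and one molecule with three bonds. All names and numbers are hypothetical. It builds a dense one-hot assignment matrix A and applies the chain rule dE/dFF_param = A^T . dE/d(assigned param).

```python
# Hypothetical toy force field: two bond parameter types, each with (k, length).
# Flat 1D vector being optimized: [k_0, length_0, k_1, length_1].
ff_vector = [700.0, 1.09, 600.0, 1.53]
N_ATTRS = 2  # k and length

# Assignment for one molecule: bond i uses FF parameter type assignment[i].
assignment = [0, 0, 1]  # three bonds; the first two share parameter type 0
bond_lengths = [1.10, 1.08, 1.50]


def assignment_matrix(assignment, n_params):
    """Dense one-hot matrix A with shape (n_bonds, n_params)."""
    return [[1.0 if p == j else 0.0 for j in range(n_params)] for p in assignment]


def energy_and_assigned_grads(ff_vector, assignment, bond_lengths):
    """Harmonic bond energy and its gradient w.r.t. each bond's assigned (k, length)."""
    energy, grads = 0.0, []
    for p, r in zip(assignment, bond_lengths):
        k, r0 = ff_vector[N_ATTRS * p], ff_vector[N_ATTRS * p + 1]
        energy += 0.5 * k * (r - r0) ** 2
        grads.append((0.5 * (r - r0) ** 2, -k * (r - r0)))  # (dE/dk, dE/dlength)
    return energy, grads


def ff_param_grads(ff_vector, assignment, bond_lengths):
    """Chain rule: accumulate A^T . dE/d(assigned) into the flat FF vector layout."""
    n_params = len(ff_vector) // N_ATTRS
    A = assignment_matrix(assignment, n_params)
    _, grads = energy_and_assigned_grads(ff_vector, assignment, bond_lengths)
    flat = [0.0] * len(ff_vector)
    for i, row in enumerate(A):
        for j, a in enumerate(row):
            for attr in range(N_ATTRS):
                flat[N_ATTRS * j + attr] += a * grads[i][attr]
    return flat
```

The dense A has n_bonds x n_params entries per molecule, almost all of them zero, which is one way to see the memory concern raised above. Storing only the index list (`assignment` here) carries the same information.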
PB – Is the performance improvement for Sage all related to vibrational frequencies?
- SB – If you don’t fit vibfreq, nothing gets worse… In 1.2.0 we fitted to lots of things, including vibfreq. The lim-hahn benchmark got BETTER in some respects without vibfreq fitting, and didn’t get WORSE in any ways.
- SB – I’ve done PL energies against TIG2
- JW – possibly due to difference in precise vs. isotope-abundance-average mass between QCA vs OFF?
- SB – Seems unlikely, but possible. Also possible that there are bugs/corner cases in off-fb or fb.
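As a rough sanity check on the mass question, the sketch below estimates how much a harmonic stretch frequency would move if one code used the mass of the most abundant isotope and the other used the abundance-averaged atomic weight. It assumes a diatomic C-H oscillator with a fixed force constant, so the frequency scales as 1/sqrt(reduced mass). Which convention QCA or OFF actually uses is not established here, and the 3000 cm^-1 stretch is a nominal value chosen for illustration.

```python
import math

# Masses in unified atomic mass units (u).
MONOISOTOPIC = {"H": 1.00782503, "C": 12.0}     # most abundant isotope
ABUNDANCE_AVERAGED = {"H": 1.008, "C": 12.011}  # standard atomic weights


def reduced_mass(m1, m2):
    """Reduced mass of a two-body oscillator."""
    return m1 * m2 / (m1 + m2)


def frequency_ratio(a, b):
    """nu(abundance-averaged) / nu(monoisotopic) for a diatomic a-b at fixed k."""
    mu_iso = reduced_mass(MONOISOTOPIC[a], MONOISOTOPIC[b])
    mu_avg = reduced_mass(ABUNDANCE_AVERAGED[a], ABUNDANCE_AVERAGED[b])
    return math.sqrt(mu_iso / mu_avg)


ratio = frequency_ratio("C", "H")
# Shift in cm^-1 for a nominal 3000 cm^-1 C-H stretch (illustrative value).
shift_at_3000 = 3000.0 * (ratio - 1.0)
```

Under these assumptions the shift is well under 1 cm^-1, which is consistent with SB's view that a mass convention mismatch is possible but unlikely to explain much.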

MT

Basically stuck working on nonbonded in system obj, other engines, OFF toolkit.
I’ll start making releases of the System object, no clear path to moving this into beta/rc.
Will start updating Toolkit to be able to produce System using private methods. May live in a branch for some time.
VU collaboration check in – We’re mostly decoupled now, so for the most part we’re not blocking each other.
Ported conformer energy example from OFF Toolkit to System. Basically just using this to compare in energy examples. In the long run I’d like to start making System-based equivalent functionality.
SB looked over repo, reported some problems with deps+bugs, working on fixing these.
- SB – Thanks for being so responsive on those – I’m happy to keep this up. Would you be up to have a brief working session on eg. pydantic use?
- MT – That sounds great to me.
Would like to have OFF people start experimenting with System alpha. It’s probably not going to be a great tool for stable, bug-free work, so the effort into this may be a little wasted.
- JW – IP and I are working on residue bookkeeping, would like to make sure that it meets needs for your exporters
- IP – Also some questions about TypedAtoms – Basically, are both mass and element required? Or just one/the other?

IP

Worked with LW and JW on topology refactor architecture. Had to decide whether hierarchy information should live on Molecule, Topology, TopologyMolecules, or some combination of them. Decided to have them live on Molecule, and have methods for accessing them from Topology.
- JW – Basically, this kills TopologyMolecules. Now a Topology will have explicit copies of each molecule in it.
Worked on tests of new API points, based on this decision.
Worked on design of “non-cheminformatics” objects like TypedMolecule, TypedAtom, etc.
Found an issue with ownership, where TopologyMolecule can exist out-of-sync with an owning Topology, Atom can exist out-of-sync with an owning Molecule.

DD

Spent a lot of time on new QCPortal client for QCA – Missed last core devs
Partner benchmark
- Have results from 3/10 partners. Two more coming in soon (Bayer+Janssen)
openff-gopt
- This is intended to be a local executor for optimizations and torsiondrives. Most of this logic was built as part of openff-benchmark, but it’s useful enough that we’re making it an independent tool.
- Should help with bespokefit, since it needs to be able to run these jobs, but it’s kinda overkill to use a whole QCF server locally.
- There’s disagreement on the name.
QCArchive submissions
- Reviewed Jang’s gen3 torsiondrives. It looks good to go. Tagged SB for approval.
- Still need to do standards v3 implementation
- Refactored QCSubmit’s submission machinery – Reduced number of different methods that handle different sorts of datasets.
PLBenchmarks
- Not moving too quickly right now. Working on scoping design currently. First I’ll work on reproducing DH’s outputs (likely including Lorenzo D’Amore)
- Once this is scoped+reproduced, I’ll work on getting this aligned with F@H infrastructure and needs.
PB – IIRC, there was a meeting with BP about missing torsion optimization records. What was the result of that?
- DD – There are about 30 torsiondrives that have this problem. The problem is that these torsiondrives are missing some optimizations. BP has a way to manually fix these, and he’ll go through these 30 and manually fix them.
- Notes: 2021-05-06 QCArchive - Missing Optimizations / Duplicate TorsionDrive troubleshooting meeting notes

JW

Worked with IP and LW on topology refactor design, architecture, and tests
Worked with Jang to debug a problem in OFFTK and QCSubmit. Turned out to be due to atom maps.
Learned about PME. It’s confusing and the SMIRNOFF spec needs to be changed.
Swope found problem where loading a molecule using from_mapped_smiles or from_smiles would lead to different partial charges. SB found the root cause.
- SB – Should do a post-minimization geometry check. If the connectivity changes we should EITHER throw an error OR do AM1BCC without geometry optimization.
- JW – Agree. Will take a look at the code you mentioned earlier.
  - (General) – SQM code can output a final PDB, then use QCElemental to guess connectivity and detect changes.
- Should do two things to improve consistency
  - Canonically order atoms before conf gen so that there’s at least consistency for the same mol species
  - Figure out how to add cartesian restraints to AM1 optimization using AT
    - SB – IC restraints if possible. Could use geomeTRIC
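The post-minimization connectivity check discussed above could look roughly like the following. This is a minimal sketch, not the toolkit's or QCElemental's implementation: it guesses bonds from interatomic distances using a hypothetical table of approximate covalent radii and a 1.2 scaling factor, then compares them with the bonds of the input molecule. The function names are made up for illustration.

```python
import math
from itertools import combinations

# Approximate covalent radii in angstrom; illustrative values only.
COVALENT_RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66}


def guess_bonds(symbols, coords, tolerance=1.2):
    """Atom-index pairs closer than tolerance * (sum of covalent radii)."""
    bonds = set()
    for i, j in combinations(range(len(symbols)), 2):
        cutoff = tolerance * (COVALENT_RADII[symbols[i]] + COVALENT_RADII[symbols[j]])
        if math.dist(coords[i], coords[j]) < cutoff:
            bonds.add((i, j))
    return bonds


def connectivity_changed(symbols, expected_bonds, minimized_coords):
    """Compare the input bond graph with one guessed from the minimized geometry.

    Returns (changed, broken_bonds, formed_bonds).
    """
    expected = {tuple(sorted(b)) for b in expected_bonds}
    found = guess_bonds(symbols, minimized_coords)
    return expected != found, expected - found, found - expected


# Toy proton transfer along a line: O0-H1 ... O2 becomes O0 ... H1-O2.
symbols = ["O", "H", "O"]
after = [(0.0, 0.0, 0.0), (1.63, 0.0, 0.0), (2.6, 0.0, 0.0)]
changed, broken, formed = connectivity_changed(symbols, [(0, 1)], after)
```

A caller could then either raise an error when `changed` is true or fall back to charging the unoptimized conformer, matching the two options SB listed.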
SB – We could use connectivity rearrangement checks in openff-benchmark to check for proton transfers
- JW – Most of this data is already generated, so we can’t change it at this point
- SB – This is an implementation detail, not necessarily a property of the FF. A summary of this problem should at least be included in the paper.

PB

Worked on QM theory benchmark. Another round of lit review and seeing datasets that were used for previous decisions
Talked to VL to hear about previous efforts
Studied methods that are appropriate for systems with charges
Worked on sulfonamide issue, trying to figure out why FF parameters were changing the way they were. Particularly interested in cyclic and acyclic molecules to see how they affect the fits.
- The OSO angle is very high, at like 120 degrees. The rest of the angles involving sulfur are around 100

TG

Fitting stuff: Found an interesting result where I split a parameter (eg torsion) and do an optimization. Then I look to delete a parameter and it decides to delete the new torsion, and the objective function drops below what it was originally.
- SB – Sounds like a bumpy optimization landscape, where having the second torsion lets the first get over a barrier.
- TG: Rephrased – Let’s say we have a FF, and the SMARTS optimizer determines that the score can get better by making a parameter split. Then, after one round of numerical optimization, the same SMARTS optimizer looks and decides to delete one of the parameters.
FF vectorization code: Takes a FF object and separates out the modifiable parameter attributes (so YES k, length, but NO smirks). Then it makes a mapping from the applied parameters for each molecule to the original FF parameter attributes.
- Would like to meet up with MT to see if this has already been done in an equivalent way
- TG, SB, MT, and JW will have a sync-up in 30 minutes on vectorization methods+needs
  - JW – Related to canonical ordering of FF values?
  - TG – Kinda, this lets those be reordered, but it would be helpful to have a default ordering. Also some additional flexibility needed to only record “active” terms.
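The vectorization described above can be sketched as follows. This is not TG's code: the force field is a hypothetical plain dict keyed by smirks, sorted-smirks order stands in for a "default ordering", and the optional `active` set stands in for recording only active terms.

```python
# Hypothetical, simplified stand-in for a force field: smirks -> attribute dict.
force_field = {
    "[#6:1]-[#6:2]": {"k": 600.0, "length": 1.53},
    "[#6:1]-[#1:2]": {"k": 700.0, "length": 1.09},
    "[#6:1]-[#8:2]": {"k": 650.0, "length": 1.43},
}
MODIFIABLE = ("k", "length")  # smirks is the key and never enters the vector


def vectorize(force_field, active=None):
    """Flatten modifiable attributes into (labels, values), in sorted-smirks order."""
    labels, values = [], []
    for smirks in sorted(force_field):
        if active is not None and smirks not in active:
            continue  # skip terms that are not being fitted
        for attr in MODIFIABLE:
            labels.append((smirks, attr))
            values.append(force_field[smirks][attr])
    return labels, values


def devectorize(force_field, labels, values):
    """Return a copy of the force field with the vector's values written back."""
    updated = {smirks: dict(attrs) for smirks, attrs in force_field.items()}
    for (smirks, attr), value in zip(labels, values):
        updated[smirks][attr] = value
    return updated


def applied_to_vector_index(applied_smirks, labels):
    """Map each applied parameter of a molecule to its indices in the flat vector."""
    index = {label: i for i, label in enumerate(labels)}
    return [[index[(s, a)] for a in MODIFIABLE if (s, a) in index] for s in applied_smirks]


labels, values = vectorize(force_field)
# Two applied bond parameters of a toy molecule, looked up in the flat vector.
mapping = applied_to_vector_index(["[#6:1]-[#6:2]", "[#6:1]-[#1:2]"], labels)
```

Keeping `labels` alongside `values` means the vector can be reordered or restricted to active terms without losing the link back to the original parameters.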

OpenFF-Gopt name

openff-geometry-optimize?
SB will open an issue on openff-gopt to discuss a new name.

Sprint planning at 5 past the hour
