2022-01-24 Core Developers meeting notes

Participants

@Jeffrey Wagner
@David Dotson
@Matt Thompson
@Diego Nolasco (Deactivated)
@Chapin Cavender
@Pavan Behara

Discussion topics

Item	Notes

General updates

JW – We’ll be getting back to doing sprint planning, though maybe not this week

Individual updates

CC
- Followed up on the protein FF meeting on Jan 13, where there was a concern about the CONstraints used in QM-optimized dipeptide sidechains. JC mentioned that these may lead to artificially high forces, and that REstraints may resolve that. I looked into this, and it turns out that the forces only increase by about 25%. The places where we see high energies/forces on the constrained atoms are high-energy regions anyway, which would be downweighted in the fit.
- Constraints on non-driven atoms in TorsionDrives
- Got the last remaining sections of the LiveCoMS review and am doing some work to put them together; will send it back to coauthors this week and ask for feedback. Another helpful aspect of this is that it contains a list of biopolymer benchmarks, which we can narrow down to the most useful ones, and then I’ll work with SB to put those into Evaluator.
  - JW – Was thinking about timing and effort contributions for this - Large uncertainty. I’d love to stay in touch about implementation so we can stay up to date on requirements.
  - CC – I think we’ll use well-trodden paths like ShiftX, which should make implementation easier and will cause less political friction when comparing with previous work.
MT
- Interchange 0.1.4 release, targets 0.10.x series with small bugfixes, etc.
- Interchange 0.2.0-alpha1 release, targets 0.11.x toolkit
  - Large portions of functionality broken
- Iterated/finalized requirements for Interchange adoption:
  Project Plan: Adoption of Interchange backend as export machinery for ForceField.create_openmm_system
  - Developing infrastructure for regression testing Interchange’s OpenMM export against the Toolkit’s
    GitHub - openforcefield/interchange-regression-testing: Regression testing Interchange's OpenMM export against the OpenFF Toolkit
  - Reference data for v0.10.2 created
  - Interchange code path hung (2+ days) at 9090/9104 processed
  - Worked with JW & VU on TypedMolecule classes to get Interchange’s Foyer layer to use v0.11.x
    Interchange v0.1.x uses MDTraj’s Topology for everything; v0.2.x will use OpenFF’s Topology class
  - Working through some Sage fitting scripts / Evaluator tutorials to familiarize myself with fitting process. Nothing new to report.
DD
- QCArchive
  - 2 weeks of storage left as of today, according to Ben; the timeline accelerated because of my mistake on the PubChem sets 2-5 (reported last week), and usage is still climbing at about 30 GB a day due to our other sets as well
  - Working with Ben on evasive action to buy us some time as he pursues additional storage options
  - JW – Anything we can do from the position of OpenFF/OMSF?
    - DD – Stand by, they may ask us to help cover costs proportional to additional storage they need to buy
  - JW – To clarify: Dataset deletion isn’t possible?
    - DD – It is possible, but only from BP’s access point. In the next version, deletion will be simpler
  - DD – JC mentioned that QCA could set up a dataset retention policy, that specifies how long datasets will be maintained/what policies are in place for how long they’ll be kept available.
  - JW – Should we shut down SPICE sets?
    - DD – What’s status of SPICE?
    - PB – About 15k optimizations completed over the last week. I had more managers running at UCI last week, but now I’m down to 2.
    - DD – PEastman is also running some on Stanford resources, I think we should let him continue running.
    - DD – I’ll write to PEastman on the qca-dataset-submission PR to tell him the problem, the status, and that we’re shutting down non-Stanford compute on the set to buy time.
- PLBenchmarks
  - made a lot of progress on architecture doc, in particular thinking about data entities
    - drawing from both protein-ligand-benchmarks and fah-xchem data entities for inspiration
  - finishing draft out today; would like Jeff's help in giving the project page a similar treatment to the interchange adoption project
  - next step would be establishing working group, with Wagner, Gowers, myself, and Chodera initially (others welcome, will put out a call for stakeholders)
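As a rough illustration of the QCArchive storage numbers in DD's update, the runway can be estimated with simple arithmetic. This is only a sketch: it assumes the growth rate stays constant at about 30 GB a day, and the ~420 GB of headroom it derives is implied by "2 weeks left", not a figure stated in the meeting. The function names are hypothetical.

```python
def implied_free_gb(days_left: float, growth_gb_per_day: float) -> float:
    """Headroom implied by a runway estimate, assuming constant growth."""
    return days_left * growth_gb_per_day


def days_of_storage_left(free_gb: float, growth_gb_per_day: float) -> float:
    """Days until storage is full, assuming constant growth."""
    if growth_gb_per_day <= 0:
        raise ValueError("growth rate must be positive")
    return free_gb / growth_gb_per_day


# Figures from the notes: about 2 weeks left at about 30 GB/day.
free_gb = implied_free_gb(days_left=14, growth_gb_per_day=30.0)

# Hypothetical scenario: pausing some compute halves the growth rate.
runway_if_halved = days_of_storage_left(free_gb, growth_gb_per_day=15.0)

print(f"Implied headroom: {free_gb:.0f} GB")
print(f"Runway at 15 GB/day: {runway_if_halved:.0f} days")
```

Under these assumptions, any reduction in the daily growth rate extends the runway proportionally, which is the reasoning behind shutting down some compute to buy time.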
PB
- Most time went into fitting impropers and analyzing the results; overall, benchmarks show no degradation relative to the current FF. While looking at individual benchmarks, I found that the number of impropers enumerated by geomeTRIC depends on the geometry, which may skew the benchmark. This only affects the improper RMSD metric; other metrics like ddE, RMSD, and TFD won't be affected. For example, an optimized geometry from forcefield1 may have x1 impropers while the one from forcefield2 has x2, and x1 and x2 may not be equal.
  - LPW confirmed that the geometry will affect the number of impropers measured:
    https://openforcefieldgroup.slack.com/archives/CE42QMGSW/p1642818436020800?thread_ts=1642811891.019600&cid=CE42QMGSW
- Regarding the theory benchmark work: when we last met, I mentioned that our current default is slightly worse for charged molecules. Since this was from single-point energies on dihedral-constrained geometries, I wanted to check the general performance of the better-performing functional/basis set combos. I picked two of them, pw6b95-d3/dzvp (~1.1x the cost per geometry optimization relative to the default) and wb97m-d3bj (~1.5x the cost per geometry optimization relative to the default), and tested them on MPCONF196; our current default still performed better. I checked whether a tighter DFT grid would change this, and the results remain the same. So as of now I don't see any smoking gun for changing the theory level. I'll solicit feedback from Lee-Ping, Chris, and others on whether any additional analysis is needed. Will try to push it this week.
- CC – Does MPCONF contain charged molecules?
  - PB – It doesn’t, I’m looking to extend it.
  - CC – So reason to use MPCONF is because it has a large number of conformers?
  - PB – Yes, 15 confs per molecule, and the molecules are a bit larger.
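The improper RMSD skew described in PB's update can be illustrated with a small sketch. It assumes each geometry's impropers are available as a mapping from a 4-tuple of atom indices to an angle in degrees, and it restricts the comparison to impropers found in both sets. That restriction is one possible guard against mismatched counts, not the approach used in the benchmark, and the function name and data are hypothetical.

```python
import math


def improper_rmsd_common(ref: dict, other: dict) -> tuple:
    """RMSD (degrees) over impropers present in both mappings.

    Returns (rmsd, number_of_common_impropers) so that a changing
    count is visible alongside the metric.
    """
    common = sorted(set(ref) & set(other))
    if not common:
        raise ValueError("no impropers in common")
    total = 0.0
    for key in common:
        # Wrap the difference into [-180, 180) so 179 vs -179 counts as 2 degrees.
        diff = (other[key] - ref[key] + 180.0) % 360.0 - 180.0
        total += diff * diff
    return math.sqrt(total / len(common)), len(common)


# Hypothetical data: the second force field's geometry yields one fewer improper.
qm = {(0, 1, 2, 3): 2.0, (4, 5, 6, 7): -1.0}
ff1 = {(0, 1, 2, 3): 5.0, (4, 5, 6, 7): 3.0}
ff2 = {(0, 1, 2, 3): 5.0}

rmsd1, n1 = improper_rmsd_common(qm, ff1)
rmsd2, n2 = improper_rmsd_common(qm, ff2)
print(f"ff1: rmsd={rmsd1:.3f} over {n1} impropers")
print(f"ff2: rmsd={rmsd2:.3f} over {n2} impropers")
```

In this toy case both force fields agree on the shared improper, yet their RMSD values differ because they are averaged over different sets. Reporting the count next to the metric makes that kind of mismatch easy to spot.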
JW
- Interchange-to-openmm export adoption planning and speccing meeting.
- TypedMolecule crutch
- Some user support for bespokefit, learning about mendeleev package
DN
- It seems like some folks are working on things unrelated to development, and I don’t think that should be happening. For example, DD shouldn’t be solving QCA’s storage problems - That’s a manager’s job. JW is taking a lot of stuff on his docket, and maybe that’s not efficient. So I’d like to find a new way of working that facilitates getting things done. I’ve been watching for the three weeks since I started, and I’m thinking about how to make people more productive and happy.
- JW – I’m also interested in a future where it takes us less time to take action - Where we can disburse funds and solve problems quickly without reiterating to several different groups of stakeholders.
- DD – One issue with QCA is that there’s nobody else to manage our interests in it. We are dependent on it, and if it doesn’t deliver what we need then the whole OFF org suffers. So we do throw a lot of human effort in to keep it running - They only have one person at MolSSI working on it.
- JW – DD is doing an incredibly hard job of interfacing between two organizations, and is contracting part-time for both which helps balance the load.
- DN – I see that DD has multiple tasks, and that makes it really easy to get overloaded and stressed. I see one of my jobs as being protecting the team from getting overloaded or unhappy. So when you see something on your to-do list that doesn’t seem important, please say so, and I’ll help push that message upwards and arrive at a solution.
- DD – We kinda have N people and M projects, where M>N. So everyone’s involved in multiple projects. I’m involved in 7 projects between 4 clients. So I agree that we’re spread thin, and that’s the nature of the work. On the QCA side, we’re working on hiring a developer to take on implementation of the next generation of infrastructure, but the hires have stalled. I think the key to changing the current situation is to make N larger, and that’s going very slowly/not at all.
- JW – I think N is hard to increase in practice, reducing M may be the better thing to do
- DN – I’d love to chat with folks from the project team to understand different perspectives.
- DN will reach out to meeting participants to schedule brief one-on-ones and get feedback.

Sprint planning

Action items

Decisions