2022-01-10 Core Developers meeting notes

Participants

@Jeffrey Wagner
@Pavan Behara
@David Dotson
@Chapin Cavender
@Jeffry Setiadi
@Simon Boothroyd
@Matt Thompson

Discussion topics

Item	Notes

General updates

JW – Diego (Project manager) should be online now, would be working Brazil time. Likely focusing on OpenFE first.

Individual updates

SB
- Mainly working on debugging issues with FE calcs using a different functional form. The custom nonbonded force we’re working on has default softcore functionality, so you shouldn’t need to modify it to do FE calcs. But it turns out that the soft core inside the potential behaves more like a hard core in terms of the problems that come up. So I’m working with JHorton to figure out workarounds.
  - JW – What’s the distinction between the X-core func forms?
  - SB – I use it to mean “when you’re turning off interactions, and molecules need to semi-overlap, the LJ terms need to go to a very small number smoothly”. If you don’t satisfy this well, you end up getting poor sampling of certain states. It’s possible that modifying some global parameters could help here too, so we’re still looking into it.
  - JS – Double exponential form: Is that similar to the Buckingham potential?
  - SB – I think so. In Buckingham you replace the r^12 term with an exponential. So here we’re experimenting with replacing both r^12 and r^6. By replacing r^6 we get nicer behavior at short range.
- PB – When you have time, could you check the PR on plotmol?
  - SB – Will do, I’ll have a look.
- Building training sets for refitting charge models: I’m putting together a set of molecules that we can compute electrostatic potentials for. Talked about how to put this together last week; decided that we’ll generate conformers and optimize offline, then submit to QCF to generate and store the wavefunctions online.
  - AM1BCC is interesting since it has a lot of odd parameters, covers exotic chemistry. So I’m making some tailor-made molecules and adding them to this set to exercise/train these parameters
- I uploaded a tool called splore to conda for investigating molecule datasets
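The functional-form exchange in SB's update (Lennard-Jones vs. Buckingham vs. double exponential) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the reduced parameterization (well depth `EPSILON` at separation `R_MIN`), the steepness values `ALPHA` and `BETA`, and the function names are not taken from the notes or from the force under development. The point is only that the inverse-power terms diverge as r approaches 0, while a form built from two exponentials stays finite there.

```python
import math

# Illustrative values only; none of these come from the meeting notes.
EPSILON = 1.0  # well depth (arbitrary energy units)
R_MIN = 1.0    # separation at the energy minimum (arbitrary length units)
ALPHA = 16.0   # steepness of the repulsive exponential
BETA = 4.0     # decay rate of the attractive exponential


def lennard_jones(r):
    """12-6 form: both repulsion and attraction are inverse powers of r."""
    x = R_MIN / r
    return EPSILON * (x**12 - 2.0 * x**6)


def buckingham(r, alpha=ALPHA):
    """exp-6 form: r^-12 is replaced by an exponential, r^-6 is kept."""
    repulsion = 6.0 / (alpha - 6.0) * math.exp(alpha * (1.0 - r / R_MIN))
    attraction = alpha / (alpha - 6.0) * (R_MIN / r) ** 6
    return EPSILON * (repulsion - attraction)


def double_exponential(r, alpha=ALPHA, beta=BETA):
    """Both r^-12 and r^-6 are replaced by exponentials."""
    repulsion = beta / (alpha - beta) * math.exp(alpha * (1.0 - r / R_MIN))
    attraction = alpha / (alpha - beta) * math.exp(beta * (1.0 - r / R_MIN))
    return EPSILON * (repulsion - attraction)


# Compare the three forms from short range out past the minimum.
for r in (0.05, 0.5, 1.0, 1.5):
    print(r, lennard_jones(r), buckingham(r), double_exponential(r))
```

With this parameterization all three forms have their minimum of depth `EPSILON` at `R_MIN`. At very short range the exp-6 form turns over and becomes strongly negative because the r^-6 term wins, whereas the double exponential remains finite and repulsive all the way to r = 0.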
CC
- Made a new version of the dipeptide torsiondrive set. Goal was to expand number of sidechains. There’s one torsiondrive running for all 20 amino acids, and 6 protonation states. Saw one comment by PB about renaming one file from that PR; other than that it should be good to go.
  - Previous version of this set is complete (6 AAs). Will share results on biopolymer call on Thursday
- Will start playing around with fits in forcebalance. Will plan to mimic the terms in amber FF14SB initially.
  - SB – More details? As in functional forms, or parameter smirks?
  - CC – The fitting model - We want to fit terms for phi and psi, but we don’t know whether those should be shared between similar amino acids.
  - SB – That makes sense. So this would start with the null model and see whether groupings provide an advantage?
  - CC – Yes.
- Starting to work on deeper review of paper, provide first round of feedback.
- Talked a few weeks ago about deriving charges for ELF10 - OE COOH issue. Is this still in progress?
  - JW – I assigned this to JMitchell but he’s been offline for a few weeks.
  - CC – I’d use this for deriving AM1BCC charges for making the protein librarycharges. This will become blocking in a ~month.
  - SB – OE may have fixed this internally in a recent release? Alternatively, CBayly should have code for a workaround for this somewhere in the issue tracker.
MT
- General update: NEP29 recommended moving projects to Python 3.8+ on Dec 26 2021. I’ve updated my packages; if anyone needs 3.7 support I may be able to provide it, but you’ll need to tell me.
- Was getting back into the swing of things last week. Tried to fix an entry points/plugin thing that JChodera pointed out, once that’s resolved we can release OFFTK 0.10.2. New interchange release will come with OFFTK 0.10.2
- Been working on improving speed for export to large files. Won’t aim to be as fast as ParmEd initially, but working on reducing the time to write a gro file down from 10 minutes to something reasonable.
- Started identifying which packages will need to be updated for 0.11.0 release w/ topology refactor. We’ll be opening a bunch of PRs once the release candidate toolkit package is available – Largely due to topology refactor, but openff-units will also require changes. I took responsibility for evaluator and forcebalance, JW has responsibility for others.
DD
- QCArchive
  - Pavan has the conn for qca-dataset-submission; he is the captain for submission lifecycle, priority, compute tag routing
  - we had some headaches from SPICE calculations requiring high memory, high disk
    - needed to deploy a small number of really large nodes on PRP; might need to operate in large node, low-parallelism for a while
    - We identified that these aren’t as high priority as internal datasets, so we can deprioritize these in favor of dipeptides, etc.
    - PB – I actually started 40 workers for this dataset over the weekend - Assigned lots of processors, RAM, scratch space. Saw some failures for gen2 optimization set with enumerated protomers.
    - DD – Great - I’ll let you know if I notice anything particular for the behavior of these sets/compute workers.
    - PB – I also shared some info with JChodera about the progress and prioritization of the jobs, posted on GitHub.
    - JW – To clarify: Is this set erroring out, or not being assigned to compute since we have higher priority compute in the queue?
    - DD – The latter
    - JW – Great. I agree that this should be lower priority than our internal sets.
  - getting dipeptide set from Chapin submitted today; will spin up substantial resources on PRP for this
    - CC – I’m running two managers on TSCC for this at the moment.
  - identified with Ben a reasonable approach for fixing QCFractal manager signal handling; aiming to get PRs ready for review this week
  - working with Ben this week to test out production-level calculations on managers against refactored QCFractal server
- PLBenchmarks
  - engaging Jeff and Richard Gowers as stakeholders and feedback loop for this effort; this will help hold me accountable and accelerate progress
- PB – If I wanted to move a dataset to “complete” on the QCA dataset repo, but it has consistent errors, how would I do this?
  - DD – For example, on the dataset with 9 calculations that are errored out, you could give it a tag “openff-defunct” to ensure it never gets taken up by general compute. You could also move the card out of “error cycling” on the project board. So I’d move it to either “end of life” or “complete/archived”. (We may want to revisit our workflow/stages now that we have more experience around this.)
  - PB – That makes sense. Thanks!
JS
- Working on forcebalance fitting to GB parameters - Getting good results. Will be doing more testing, will present at FF release call in the future. Suggestions for protein ligand benchmark sets?
- JS – Thinking of making an openmm plugin, does anyone here have experience?
  - JW – No, but we have monthly meetings with PEastman, next one will be Friday at 11 AM pacific
PB
- Theory benchmark work: did some analysis on the subsets, and it looks like our current default is not as good with charged molecules. Will be testing out a potential replacement this week. Although out of scope, I tried to test the DeepMind21 functional. It didn't work right out of the box, so I need to get familiar with pyscf first (initial guess, convergence criteria, etc.) to be on par with our current psi4 calculations.
- Did some fits with impropers for Jessica's work.
- Spent some time on compiling and testing xtb's fix for GFN-FF but it didn't work for me. I posted my test but the developer closed it without any response, assuming it works now.
  - JW – It looks like the dev thinks that the best way to test the fix is to put it in the next release, since it’s so hard to test from a local branch.
- Some QCA work following up on discussions with JC.
JW
- I’ll be out for jury duty tomorrow
- Continuing to work on topology refactor:
  - getting rdkit specific code replicated by OE and into appropriate classes
  - ensuring that residue data goes to/from openmm/rdkit/OE in a sane, documented way
  - Turning from_pdb into a "one stop shop" that can handle a mix of small molecules and biopolymers
  - Making errors more informative to help users with debugging
- Updating roadmap items to each have at least a primary driver and primary stakeholder/approver, and for items that are already in progress, some form of a spec for what the minimum product is and what stages are anticipated in its development
- The preferred python plugin/entry point system recently had a small change, but adjusting to the new system required the redesign of our tests for parameterhandler plugins (OFFTK #1163). This is the last blocker to the 0.10.2 release
- Discussing requirements for 0.11.0 and how to incorporate Interchange. Big discussion on thread in #developers.
  - DD – Does the toolkit use interchange or does interchange use the toolkit?
  - JW – It’s mostly that the toolkit uses interchange
  - MT – Agree, this only gets a little complex when interchange uses toolkit’s parameterhandler’s find_matches method.
  - DD – Ok, so it kinda goes both ways.
  - MT – Yeah, it’d be super hard to fully make the dependency unidirectional.
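The entry point change mentioned in JW's update can be sketched as follows. This is a minimal sketch and not the toolkit's code: it assumes the change in question is the standard-library `importlib.metadata.entry_points()` interface, which returns a selectable object on newer Python versions and a dict keyed by group name on Python 3.8/3.9. The group name `example.plugins.handlers` and the helper names `find_entry_points` and `load_plugin_classes` are hypothetical.

```python
from importlib.metadata import entry_points

# Hypothetical group name, used only for illustration.
PLUGIN_GROUP = "example.plugins.handlers"


def find_entry_points(group):
    """Return the entry points registered under `group` as a list."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Newer interface (Python 3.10+): filter with select().
        return list(eps.select(group=group))
    # Older interface (Python 3.8/3.9): dict mapping group name to entries.
    return list(eps.get(group, []))


def load_plugin_classes(group=PLUGIN_GROUP):
    """Import and return the object each registered entry point refers to."""
    return [ep.load() for ep in find_entry_points(group)]


# With no package registering the hypothetical group, nothing is loaded.
print(load_plugin_classes())
```

Tests for plugin discovery written against one of the two interfaces would need to change when the other becomes the default, which is consistent with the test redesign described above, though the notes do not say which interface was involved.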
