2022-03-14 Core Developers meeting notes

Participants

@Matt Thompson
@David Dotson
@Diego Nolasco (Deactivated)
@Chapin Cavender
@Jeffrey Wagner
@Simon Boothroyd

Discussion topics

Item	Notes

General Updates

QCA storage running out?
- DN – Is this worth discussing with the governing board? I know last month’s gboard meeting ended up with MShirts looking at applying for an equipment grant
- DD – BPritchard directly deleted a bunch of unneeded info from the database. This was pretty dangerous but seems to have worked. This brought our storage utilization down from >90% to ~50%.
- DN – How much more time does this buy us?
- JW – If we wanted to fill it up, it could be done in ~1 month. But we’re going to be careful about what we allow people to submit. So we should have enough storage for maybe 6-12 months.
- DD – Yes, and MolSSI is having an internal discussion about getting more compute resources using their own money, independent of a grant supplement. This is still early in the planning process.
DN – Other gov board updates?
- Update on Toolkit, bespokefit, Interchange?
  - MT – Nothing worth announcing with Interchange
- DN – Cancel gov board meeting?
  - JW – I think it’d be fine to have a short meeting.

Individual updates

SB
- Coordination around bespokefit. We want to get it into at least one industry partner’s hands this week. Once they crash test it a bit, we’ll announce broader availability. So JH and I are doing a lot of polishing/cleanup. We were waiting on a PR to QCEngine that allows torsiondrives to be parallelized over multiple processes. This is handy for single-core methods like ANI and XTB. It looks like DD has the PR merged and will help push a release, so we’re going to add this update before we pass bespokefit to industry partners.
- Bespokefit now has a full suite of docs, examples, and a CLI. I’d recommend you all check it out.
- I’m getting openff-recharge stable + user-ready for vsite fitting. There are now more user docs, and OE is no longer a hard dependency. Version 0.1.0 is now released, which can be used for things like RESP fitting and vsite fitting.
- QM sets are proceeding nicely. Getting data for vsite and GCN fitting.
  - JW – Very excited to get this integrated into toolkit, would let us do some really cool things with Rosemary.
  - SB – I’m optimistic that I can leave this in a really good place before I head out!
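As a rough illustration of why running over multiple processes helps with single-core methods (the QCEngine item in SB's update above), here is a minimal sketch using Python's standard library. It is not the QCEngine or bespokefit implementation: `toy_single_core_energy` and `evaluate_grid` are hypothetical names, and the energy function is a toy cosine profile standing in for a real single-core calculation.

```python
import math
from concurrent.futures import Executor, ProcessPoolExecutor, ThreadPoolExecutor
from typing import Callable, Dict, Sequence, Type


def toy_single_core_energy(angle_deg: float) -> float:
    """Hypothetical stand-in for one single-core calculation at a fixed dihedral.

    Returns a threefold cosine profile; a real job would call a QM/ML program.
    """
    return 1.5 * (1.0 + math.cos(math.radians(3.0 * angle_deg)))


def evaluate_grid(
    angles_deg: Sequence[float],
    energy_fn: Callable[[float], float] = toy_single_core_energy,
    n_workers: int = 4,
    executor_cls: Type[Executor] = ProcessPoolExecutor,
) -> Dict[float, float]:
    """Evaluate independent grid points concurrently, one task per angle.

    Each task uses a single core, so n_workers tasks can keep n_workers
    cores busy. executor_cls can be swapped for ThreadPoolExecutor when
    debugging without spawning processes.
    """
    with executor_cls(max_workers=n_workers) as pool:
        energies = list(pool.map(energy_fn, angles_deg))
    return dict(zip(angles_deg, energies))
```

When using the default process pool from a script, call `evaluate_grid` under an `if __name__ == "__main__":` guard so worker processes can import the module safely. A real torsiondrive also has dependencies between grid points, which this sketch ignores.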
DN
- (Planning for gov board meeting)
- Showed KCJ my preparation for strategic planning. KCJ recommended that we have separate strategic plans for OFF and OFE. I thought about it and I disagree - They should be seen as strategic initiatives of the same organization. I’ll speak with KCJ again this week and update you about the outcome.
- SB – Could be good to write down some sort of process guide/definitions of key terms in OpenFF operations. This could be handy for when new people come and go, and when PIs return from busy periods and need to quickly gain an understanding of what’s happened and what the plans are.
- DN – That makes sense. But at the same time, we’re an organization that’s trying to deliver science+technology products. We should be able to combine these to create a sense of momentum. But when PIs are involved, they’re not used to thinking about delivery, they’re more frequently thinking about achieving perfection.
- SB – I get the sense that the science side and the PIs are getting disconnected from the infrastructure, like there are questions about what’s available and whether different packages are ready for use and what they’re suited for.
- JW – I’d love to have an up-to-date WBS for most projects that we can update every month for the all-hands meeting. I think there’s also a tricky situation where we’re dealing with highly detailed concepts that require a lot of background/assumptions using limited language, and it’s fraught with misinterpretations and requires a lot of processing power to have conversations.
- MT – I think that some fraction of our work will always be on the “bleeding edge”, and I’d like to make sure that we have a place where people can come for updates and to gain an understanding. Because it could become very expensive to continually spend human time to discuss exactly what’s happening on the bleeding edge.
- SB – Yeah, we do have kind of a high cost to this communication. I think that having better docs/reference materials for our capabilities would save a lot of time and effort.
- DN – Could we emphasize some part of our documentation to accomplish this?
- MT – I’m not sure that having a document that answers questions like “can I do X?” would actually intercept these problems. I’d be willing to start such a document and spend ~30 minutes per month updating it.
  - SB – We could mention this to Josh Mitchell. He may have an idea for how we could make a single page that answers these common questions
- CC – Could try to enforce attendance for all-hands meetings, since that does seem to be the central place for org-wide updates.
  - JW – I think we do well with the meeting recordings and notes. It’d be great if they all attended but I don’t think that’s realistic. But we should push to keep the all-hands meetings high quality and have their content be broad project updates, and DN’s WBSes could be a passive source of project info.
  - SB –
CC
- CC – Some ambiguity about biopolymer FF project. I think we could use some more structure. The Interchange and F@H project seem to handle this well, so I’ll be working with JW and DN this week to better define decisionmaking procedures.
  - JW – I’m interested in helping get this started
  - DN – This group will be able to explicitly reject proposed experiments/additional work. So this should be a good way to contain the scope of the work for this.
  - CC – I’ll attend the DN/JW check in on wednesday to get this started.
- CC – Some of the biopolymer jobs on QCA have been bursty, and are largely waiting on error cycling. Many of the structures that are having problems have internal hbonds. I wonder if this has something to do with SCF convergence.
  - SB – QCA is kinda awkward on this front. When a job dies without an error message, it’s unclear whether the worker died due to a memory limit, time limit, random shutdown, worker pre-emption… DD has helped a lot when I get near the end of a dataset by setting error cycling frequency higher for my nearly-done sets.
  - DD – I can crank up the error cycling on this dataset to 1/hour
  - CC – That would be great. Thanks!
- CC – Thinking more about which benchmark to do for protein FF. Currently looks like we’ll use F@H. Wondering how to do that because benchmarks in literature are run in a small number of very long trajectories, whereas F@H will give us a lot of short trajectories
  - DD – F@H can do long trajectories, it’ll just do them split up (“end to end”), so it may take a while to get the long traj.
- CC – Also working on total cost estimate
- SB – Do we already have a green light from F@H to get this compute time?
  - CC – We opened an issue on the fah-alchemy repo
  - DD – I think SB was asking about “are we allowed to run on F@H?”. I’m not sure what the answer is here.
  - SB – That’s what I thought. It’d be good to get agreement in advance about whether we CAN run those calcs. As a fallback it’d be good to know how we could get this done on conventional GPUs.
  - DD – I’m not sure
  - JW – I’m not sure
  - DD – Let’s bring this up at tomorrow’s F@H meeting. Basically, do we know what our limits for submission will be, and how much competition will we encounter? What’s the process for onboarding a new project? What bureaucratic hurdles do we anticipate? I’d expect that GBowman or VVoelz would be able to give us a binding answer to this from the F@H side.
  - JW + DD – We should make part of the project plan include submission/decision authority - Basically, we need to have a mechanism to actually get our work into the F@H work queue.
    - DD will add this as a topic for tomorrow’s agenda
MT
- ForceBalance v1.9.3 release
- Interchange now supports charge_from_molecules and partial_bond_orders_from_molecules
- Interchange.from_smirnoff now accepts a list of molecules
  - Internally Topology.from_molecules is called, simply saves users an intermediate step
- Josh Mitchell starting to revamp Interchange examples
- Experimented with a non-harmonic angle potential, may be a stepping stone to a plugin interface
  - Not a focus now - more of a diversion that I should have pushed into the future
- SB – An odd feature request - When we’re doing fitting, we’re often not interested in knowing which parameters got applied to “molecule 1”, instead I’m interested in knowing “which parameters were applied to this training set, in which order…”
  - MT – That sounds reasonable. It could be specced a little more. You said “vectorized view of molecules” - I generally do one vectorized view for each “handler” - So like one vectorized view in the current structure is for LJ, and a different one for bonds. Would you be interested in having them in the same view?
  - SB – Maybe, I’m still thinking through this. I’ll continue this discussion when I have more of a clear idea. Just wanted to give a heads up that it’s on my mind.
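To make SB's feature request a little more concrete, here is a minimal sketch of a "training-set view" of applied parameters, in contrast to a per-molecule view. It is purely illustrative and is not Interchange's API: the data layout, the parameter ids, and the function names `training_set_view` and `combined_view` are all hypothetical. It shows both options MT raised, one ordered view per handler and a single combined view.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: molecule name -> handler name -> parameter ids,
# listed in the order they were applied to that molecule.
Assignments = Dict[str, Dict[str, List[str]]]


def training_set_view(assignments: Assignments) -> Dict[str, List[str]]:
    """Collapse per-molecule assignments into one ordered list per handler.

    Each parameter id appears once, in order of first use across the set.
    """
    view: Dict[str, List[str]] = {}
    for per_handler in assignments.values():
        for handler, parameter_ids in per_handler.items():
            ordered = view.setdefault(handler, [])
            for parameter_id in parameter_ids:
                if parameter_id not in ordered:
                    ordered.append(parameter_id)
    return view


def combined_view(view: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Flatten the per-handler lists into one (handler, parameter id) ordering."""
    return [(handler, pid) for handler in sorted(view) for pid in view[handler]]


# Illustrative training set with made-up parameter ids.
example: Assignments = {
    "ethanol": {"Bonds": ["b1", "b2"], "vdW": ["n1", "n2"]},
    "methanol": {"Bonds": ["b2", "b3"], "vdW": ["n2"]},
}
per_handler = training_set_view(example)
flat = combined_view(per_handler)
```

Keeping one list per handler mirrors the "one vectorized view for each handler" structure MT described, while `combined_view` shows what a single shared ordering might look like if that turns out to be more useful for fitting.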
DD
- Protein-Ligand Benchmarks
  - working group meeting this week covered additional user stories; we will finish out coverage of user stories at this week's meeting
  - working on data model proposal; presenting this initial work tomorrow
- QCArchive
  - ESP set nearly complete; starting up rapid error cycling to get last 27 through
  - Protein capped set is hitting a ton of errors; should discuss tomorrow in user group meeting
  - burning quickly through remaining SPICE sets
  - new QCEngine release coming today; includes PR from Josh, Simon in support of openff-bespokefit
- Partner Benchmark
  - reviewed manuscript draft; delivered feedback to Lorenzo
JW
- Lots of PR reviews, issue chatter
- Next, will finish vsite tests and push for biopolymer release
