2025-01-30 JAC/LW Check-in

Participants

  • @Jennifer A Clark

  • @Lily Wang

Goals

  •  

Discussion topics

Item

Presenter

Notes

NRP Issues

 

  • JAC: Ben has captured the error and isolated it to a connection error that occurs when credentials are refreshed. He has added a patch that simply retries (and succeeds) after a remote connection error. It is available on GitHub and will be in a QCFractal release very soon (tomorrow?).
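
For reference, the retry-on-connection-error idea can be sketched generically as below. This is not Ben's patch or QCFractal code; the helper name `retry_on_connection_error` and its parameters are hypothetical, and it assumes the failure surfaces as Python's built-in `ConnectionError`.

```python
import time


def retry_on_connection_error(func, retries=3, delay=1.0,
                              exceptions=(ConnectionError,)):
    """Call func(), retrying if one of `exceptions` is raised.

    Hypothetical helper for illustration only. The exception is
    re-raised once `retries` attempts have all failed.
    """
    for attempt in range(1, retries + 1):
        try:
            return func()
        except exceptions:
            if attempt == retries:
                raise  # out of attempts, surface the original error
            time.sleep(delay)  # brief pause before trying again
```

Usage would look like `retry_on_connection_error(lambda: do_request(), retries=3)`, where `do_request` stands in for whatever call hits the server.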


Dataset Updates



JAC Update

  • PR421: Lexie’s small molecule dataset has finished, except for 5 entries containing iodine, which isn’t supported for MBIS charges.

  • Running PR422, Lexie’s larger MW dataset. Two didn’t finish before, I thought because of the connectivity errors.

  • Lipid MAPS is running with a low number of pods to see if any complete. So far no restarts, but no completions either.

  • There is still a Lipid Opts Benchmark to submit. Submit now, or wait for the QCF update? Are all new datasets my responsibility?

    • LW: Yes, but maybe if we are going to update QCF it doesn’t make sense to

    • JAC: I’ll submit today, and if no progress is made by tomorrow I’ll shut down deployments

Kubernetes Python API

 

JAC: I’ve written some nice scripts in notebooks for Kubernetes API. If I have many deployments I’d like to take some time to clean it up into a package to monitor them more easily. How can I work this into Zenhub?

LW: That sounds worthwhile, go ahead and make an epic with some tickets under the project “Improved Training Methods and Data”.
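
A minimal sketch of what one piece of such a monitoring package might look like, assuming the official `kubernetes` Python client and a local kubeconfig. The function names `fetch_deployments` and `summarize` are hypothetical and not taken from JAC's notebooks.

```python
def fetch_deployments(namespace):
    """Return the Deployment objects in `namespace` (needs cluster access)."""
    # Imported lazily so summarize() can be used without the client installed.
    from kubernetes import client, config

    config.load_kube_config()  # assumes credentials in the local kubeconfig
    return client.AppsV1Api().list_namespaced_deployment(namespace).items


def summarize(deployments):
    """Return (name, ready_replicas, desired_replicas) for each deployment."""
    rows = []
    for dep in deployments:
        desired = dep.spec.replicas or 0
        ready = dep.status.ready_replicas or 0  # None when no pods are ready
        rows.append((dep.metadata.name, ready, desired))
    return rows
```

Something like `summarize(fetch_deployments("my-namespace"))` would then give a quick overview of which deployments have pods ready; the namespace name here is a placeholder.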

DS3-CSD Update:

 

From CSD there are: 230550 structures
xyz2mol_tm successfully converts: 217776
There are 51754 that match constraints for our primary dataset

  • 2031 are radicals

  • 3779 failed to form OpenFF molecules

  • 45156 failed to make inchi-keys

Leaving 788 structures
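
The counts above can be checked with a few lines of arithmetic. This sketch assumes the three exclusion categories do not overlap, which is what the reported total implies; the variable names are illustrative.

```python
csd_total = 230550   # structures from CSD
converted = 217776   # successfully converted by xyz2mol_tm
matching = 51754     # match constraints for the primary dataset

# Exclusions applied to the matching set (assumed to be disjoint).
excluded = {
    "radicals": 2031,
    "failed to form OpenFF molecules": 3779,
    "failed to make InChIKeys": 45156,
}

remaining = matching - sum(excluded.values())
print(remaining)  # 788

# Share of the matching set lost at each step.
for reason, count in excluded.items():
    print(f"{reason}: {count / matching:.1%}")
```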

This InChIKey issue is a problem, so I’m putting together a simple script to send to Magnus (xyz2mol_tm author) about it.

Do we need inchi_keys?

  • LW: We can prob skip those

There are no bi- or trivalent metal complexes; this might be a motivation to get our own access to CSD.

Looking at the “other” elements that aren’t included in our project scope, Co and Ru have a lot of structures.

  • LW: I also see I (iodine) in there; that should prob be in our primary dataset

  • JAC: It wasn’t in the strategic doc so I didn’t include it. Also MBIS charges don’t work for atomic numbers over 36.

  • LW: I didn’t realize, you might post on the bespoke channel and ask Danny about that.

  • JAC: Since the Chodera lab requested the multipole moments from MBIS charges, I don’t think we should add that in

  • LW: That makes sense, let’s bring it up in our next meetings and make sure that’s not a priority for Genentech, or that the Chodera lab would be willing to relax their constraints

  • LW: What are the constraints of the xyz2mol_tm SMILES? Do we need CSD?

  • JAC: They exclude group 1 and group 2 and we need Mg

  • LW: It sounds like we do need it.
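
The atomic-number limit JAC mentions for MBIS charges could be screened for with something like the sketch below. The cutoff of 36 comes from the discussion above; the helper name `mbis_supported` and the small element table are illustrative only, and a real script would use a full periodic table.

```python
# Atomic numbers for a handful of elements (illustrative subset only).
ATOMIC_NUMBER = {
    "H": 1, "C": 6, "N": 7, "O": 8, "Mg": 12,
    "Co": 27, "Br": 35, "Ru": 44, "I": 53,
}

MBIS_MAX_Z = 36  # limit stated in the discussion above


def mbis_supported(symbols):
    """True if every element symbol is at or below the MBIS cutoff."""
    return all(ATOMIC_NUMBER[s] <= MBIS_MAX_Z for s in symbols)


print(mbis_supported(["C", "H", "Br"]))  # True
print(mbis_supported(["C", "H", "I"]))   # False
```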

OpenFF Onsite

 

 

LW: 10 min talk for everyone. Your talk will be right before a deep dive, we look forward to getting stakeholder opinions on this.

Action items

Decisions