| | |
|---|---|
Update Dataset Tracking | Project Board; Slides. Running PR 440: Chodera tmQM PR 449: In scientific review. JW agrees with everything JC says.
|
QDS handling of non-QCSubmit dataset. | Scaffold Submission PR is completed and approved |
Dataset archival project | Complete! |
MolSSI Info / Align Priorities on MolSSI Asks | Notes: July 8th meeting. New from the last QCAUM meeting: new release, QCF v6.2! Ben offered VTech resources (96 cores + 768 GB); he has lots of CPU time in his allocation that he doesn't use. Issue with large dataset views: Ben suggested saving without trajectories:
ij = ds.create_view(
description=f"Full {ds_name} {ds_type} dataset",
provenance={},
include=['**'],
exclude=["wavefunction", "trajectory"],
include_children=True
)
However, include_children=True overrides this option, so it turns out that setting include_children=False greatly reduces the file size (217 GB to 900 MB for the Industry Benchmarking Dataset v1.2). Requests: |
Old Issue of the Week | Conformer generation should fall back to RDKit ETKDG on Omega failures (Closed!). John suggests that if Omega fails to generate initial conformers, RDKit should be the fallback. Should this be a QCSubmit ticket?
Bonus: Missing chemistry to (potentially) cover post-release-1 (not addressed this week):
[#8]~[#35]: O-Br single bonds are present in GAFF2 but absent from our current datasets. We could port in a placeholder value from GAFF2, but no molecules in our current datasets contain this chemistry.
[#7X3]~[#7X3](~[#8])~[#8]: Nitroamines
[#6:1]~[#6:2]=[#15:3]~[#6:4]: C=P double bond (potentially with adjacent singles)
|
AI summary | QDS/QCSubmit Meeting Summary (July 15, 2025)
Current Projects and Progress: Jennifer is working on converting QCElemental molecules to RDKit molecules to enable sorting based on fingerprints and connectivity. This capability will help with better train-test splits of data and is of interest to Chris as well. For molecules failing to assess in tmQM, Jennifer plans to use MoleAssembler to build complexes of interest.
Data Set Issues: Many calculations showed SCF convergence failures after 500 iterations. 14 out of 30 errored structures had incorrect charges reported in CCD files. Jennifer is developing a method to predict oxidation states, which she'll present to Richard on Thursday. For the remaining 16 structures with issues, Jennifer plans to implement a tiered optimization approach with loose tolerances initially.
Completed and Ongoing Tasks: Scaffold submission PR is completed and approved. Jennifer needs to submit a PR with the new QCFractal version. Dataset archival is complete but waiting for James and Lily to return before closing. Using include_children=False reduced dataset size from 217 GB to 900 MB.
QCFractal and Testing: Ben contacted Jennifer about testing OpenFF code against a development branch of QCPortal. Jeffrey demonstrated how to set up CI testing with a different version of QCFractal. For an old issue regarding conformer generation, they discovered RDKit is already implemented as a fallback for Omega failures.
|
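
The view-size finding from the MolSSI row can be sketched as a small helper. This is a minimal sketch, not the script actually used: `build_view_kwargs` and `request_view` are hypothetical names, the keyword arguments are copied from the `create_view` call quoted in the notes, and `ds` is assumed to be a QCPortal dataset object obtained elsewhere. The only change from the quoted call is `include_children=False`, which the notes report as the setting that shrank the file.

```python
from typing import Any, Dict


def build_view_kwargs(ds_name: str, ds_type: str,
                      include_children: bool = False) -> Dict[str, Any]:
    """Collect the create_view arguments quoted in the notes (hypothetical helper).

    include_children defaults to False because the notes report that True
    overrode the exclude list and produced a much larger file.
    """
    return {
        "description": f"Full {ds_name} {ds_type} dataset",
        "provenance": {},
        "include": ["**"],
        "exclude": ["wavefunction", "trajectory"],
        "include_children": include_children,
    }


def request_view(ds: Any, ds_name: str, ds_type: str) -> Any:
    """Request a view; `ds` only needs to expose create_view(**kwargs)."""
    return ds.create_view(**build_view_kwargs(ds_name, ds_type))
```

Keeping the arguments in one function makes it easy to build the same view for several datasets and to compare file sizes with `include_children` switched on and off.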
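
The fallback behavior discussed under Old Issue of the Week can be illustrated with a generic try-then-fall-back wrapper. This is a sketch of the pattern only, not the QCSubmit or OpenFF implementation: `generate_with_fallback` and `rdkit_etkdg_conformers` are hypothetical names, the primary generator (standing in for Omega) is passed in as a callable, and the RDKit backend assumes RDKit is installed and uses ETKDGv3 parameters as one possible ETKDG variant.

```python
from typing import Any, Callable, List, Sequence

Generator = Callable[[str, int], List[Any]]


def generate_with_fallback(smiles: str, n_confs: int,
                           generators: Sequence[Generator]) -> List[Any]:
    """Try each conformer generator in order; return the first non-empty result."""
    errors = []
    for generate in generators:
        name = getattr(generate, "__name__", repr(generate))
        try:
            conformers = generate(smiles, n_confs)
        except Exception as exc:  # a failing backend should not abort the run
            errors.append(f"{name}: {exc}")
            continue
        if conformers:
            return conformers
        errors.append(f"{name}: returned no conformers")
    raise RuntimeError("all conformer generators failed: " + "; ".join(errors))


def rdkit_etkdg_conformers(smiles: str, n_confs: int) -> List[Any]:
    """Fallback backend (assumes RDKit is installed): embed with ETKDGv3."""
    from rdkit import Chem
    from rdkit.Chem import AllChem

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles}")
    mol = Chem.AddHs(mol)
    conf_ids = AllChem.EmbedMultipleConfs(mol, n_confs, AllChem.ETKDGv3())
    # One coordinate array per embedded conformer.
    return [mol.GetConformer(cid).GetPositions() for cid in conf_ids]
```

A call would look like `generate_with_fallback(smiles, 10, [omega_conformers, rdkit_etkdg_conformers])`, where `omega_conformers` is a hypothetical wrapper around the primary generator.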