...
Excerpt |
---|
Initial approach as suggested by stakeholder consensus. |
👀 Overview
Summary | Create a QM dataset from existing chemical structure databases by running optimization, torsion-drive, and *new* electronic property calculation types. To achieve this, the standard OpenFF QCA dataset submission pipeline must be adapted in multiple ways.
| |||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
GitHub link | ||||||||||||||||||||||
Status | Not started
|
...
Stage | Milestone/Benchmark | Contributors | Deadline | Status | ||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
Add ability for conformers to be imported into qc-submit | Assess ability for conformers to be added into qc-submit |
| ||||||||||
Resolve qc-submit CMILES incompatibility with organometallic complexes | Determine if RDKit functionality will perform adequately |
| ||||||||||
If RDKit cannot handle CMILES, skip it for the CIF-to-QCA interaction |
| |||||||||||
If RDKit can handle CMILES, assess a workaround for OpenEye, or implement error handling |
| |||||||||||
Curate opt training dataset | Work out best level of theory for the training dataset | November 10, 2024 | Filter PDB Chemical Component Dictionary (CCD) and submit DS1-CCD-1 and DS1-CCD-2 at BP86 / def2-TZVP (split metal centers of primary and secondary interest) | Jennifer A Clark, Brent Westbrook | Jan. 15, 2025 |
| |||||||
Compute training dataset | December 31, 2024 | Submit DS1-CCD-1 and DS1-CCD-2 at alternative model chemistries for assessment |
| Curate testing dataset | Compile QM dataset | ||||||||
November 30, 2024 | Choose model chemistry based on DS1-CCD-1 and DS1-CCD-2 |
| |||||||||||
Compute QM dataset | January 31, 2025 | Filter Crystallography Open Database (COD) and submit OPT DS2-COD-1 and DS2-COD-2 at GFN2-xTB |
| ||||||||||
Compile simulation test set (FreeSolv, maybe non-hydration solvation free energy sets that are harder to reproduce) | April 15, 2025 | Filter Cambridge Structural Database (CSD) and submit OPT DS3-CSD-1 and DS3-CSD-2 at GFN2-xTB with structures neglected by tmQM |
| Determine best NN architecture | |||||||||
Implement attention-based GNN | December 31, 2024 | Filter MPtrj (Materials Project Trajectory Dataset) and submit OPT DS4-MPT-1 and DS4-MPT-2 at GFN2-xTB |
| |||||||||
Submit DS2-COD-1 OPT at target model chemistry |
| |||||||||||
Implement bond features in GraphSAGE (?) | December 31, 2024 | Submit DS3-CSD-1 OPT at target model chemistry |
| Determine best architecture | |||||||||
January 31, 2025 | Submit DS4-MPT-1 OPT at target model chemistry |
| |||||||||||
First pass at NN training | Train using just ESPs, dipoles, quadrupoles | Feb 28, 2025 | Curate electronic properties training dataset | Define primary and secondary properties of interest |
| ||||||||
Determine output protocol of primary properties of interest and implement |
| |||||||||||
Determine output protocol of secondary properties of interest and implement |
| Benchmark 1: QM | Neural network charge model with low testing error on QM data (ESPs, dipoles) | March 15, 2025 | Re-train VDW terms | March 30, 2025|||||||
Submit DS1-CCD-1 Electronic Property calculation at target model chemistry |
| |||||||||||
Submit DS2-COD-1 Electronic Property calculation at target model chemistry |
| Re-train valence terms | ||||||||||
April 15, 2025 | Submit DS3-CSD-1 Electronic Property calculation at target model chemistry |
| Benchmark 2: Simulation | Neural network charge model with performance equivalent to or better than NAGL in simulations | April 30, 2025||||||||
Submit DS4-MPT-1 Electronic Property calculation at target model chemistry |
|
📊 Progress and findings
Curated data (or similar title)
...