Transition Metal Forcefield Approach 1: Initial plan
Initial approach as suggested by stakeholder consensus.
Overview
Summary | Create a QM dataset from existing chemical structure databases by running optimization and single-point calculations for structures of primary interest. Only datasets of primary interest will be produced, containing:
The final model chemistry (level of theory + basis set) must be determined for production. These datasets will be run with the following workflow:
To achieve this, the standard OpenFF QCA dataset submission pipeline must be adapted to address the incompatibility of existing OpenFF infrastructure with organometallic CMILES |
---|---|
GitHub link | |
Status | NOT STARTED / IN PROGRESS / COMPLETED / WON'T PROGRESS |
Milestones and metrics
Datasets will be labeled as DS#-XXX-1, e.g. DS1-CCD-1, which denotes dataset 1, taken from the CCD database, with the chemical space of primary interest defined above. The first number and the middle three-letter code are always paired to avoid confusion between similar database abbreviations, e.g. CCD vs COD vs CSD. The last number denotes inclusion of metal centers of the primary chemical space and one of the secondary chemical space expansions that are outside the scope of this approach:
Stage | Milestone/Benchmark | Contributors | Deadline | Status |
---|---|---|---|---|
Add ability for conformers to be imported into qc-submit | Assess ability for conformers to be added into qc-submit | @Jennifer A Clark | | Not started |
Resolve qc-submit CMILES incompatibility with organometallic complexes | Determine if RDKit functionality will perform adequately | @Jennifer A Clark | | Not started |
| If RDKit will not handle CMILES, skip CMILES for the CIF-to-QCA interaction | @Jennifer A Clark | | Not started |
| If RDKit will handle CMILES, assess a workaround for OpenEye, or implement error handling | @Jennifer A Clark | | Not started |
Curate opt training dataset | Filter PDB Chemical Component Dictionary (CCD) and submit DS1-CCD-1 and DS1-CCD-2 at BP86 / def2-TZVP | @Jennifer A Clark, Brent Westbrook | Jan. 15, 2025 | In progress |
| Submit DS1-CCD-1 and DS1-CCD-2 at alternative model chemistries for assessment | @Jennifer A Clark | | Not started |
| Choose model chemistry based on DS1-CCD-1 and DS1-CCD-2 | @Jennifer A Clark, @Lily Wang | | Not started |
| Filter Crystallography Open Database (COD) and submit OPT DS2-COD-1 and DS2-COD-2 at GFN2-xTB | @Jennifer A Clark | | Not started |
| Filter Cambridge Structural Database (CSD) and submit OPT DS3-CSD-1 and DS3-CSD-2 at GFN2-xTB with structures neglected by tmQM | @Jennifer A Clark | | Not started |
| Filter Materials Project Trajectory Dataset (MPtrj) and submit OPT DS4-MPT-1 and DS4-MPT-2 at GFN2-xTB | @Jennifer A Clark | | Not started |
| Submit DS2-COD-1 OPT at target model chemistry | @Jennifer A Clark | | Not started |
| Submit DS3-CSD-1 OPT at target model chemistry | @Jennifer A Clark | | Not started |
| Submit DS4-MPT-1 OPT at target model chemistry | @Jennifer A Clark | | Not started |
Curate electronic properties training dataset | Define primary and secondary properties of interest | @Jennifer A Clark, Chris Iacovella | | Completed |
| Determine output protocol of primary properties of interest and implement | @Jennifer A Clark | | In progress |
| Determine output protocol of secondary properties of interest and implement | @Jennifer A Clark | | In progress |
| Submit DS1-CCD-1 Electronic Property calculation at target model chemistry | @Jennifer A Clark | | Not started |
| Submit DS2-COD-1 Electronic Property calculation at target model chemistry | @Jennifer A Clark | | Not started |
| Submit DS3-CSD-1 Electronic Property calculation at target model chemistry | @Jennifer A Clark | | Not started |
| Submit DS4-MPT-1 Electronic Property calculation at target model chemistry | @Jennifer A Clark | | Not started |
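The DS#-XXX-# labeling convention used in the table above lends itself to automated checking. The following is a minimal sketch, not part of any existing OpenFF tooling: the names `parse_label`, `DatasetLabel`, and `DATABASE_CODES` are hypothetical, and the number-to-code pairings are taken only from the dataset names listed in this plan (DS1-CCD, DS2-COD, DS3-CSD, DS4-MPT).

```python
import re
from typing import NamedTuple

# Assumed pairings, read off the dataset names in this plan.
# Unknown dataset numbers are allowed through without a pairing check.
DATABASE_CODES = {1: "CCD", 2: "COD", 3: "CSD", 4: "MPT"}

# DS<number>-<three-letter database code>-<chemical space number>
LABEL_PATTERN = re.compile(r"^DS(\d+)-([A-Z]{3})-(\d+)$")


class DatasetLabel(NamedTuple):
    """Hypothetical container for the three parts of a dataset label."""
    index: int           # dataset number, paired with the database code
    database: str        # three-letter database code, e.g. "CCD"
    chemical_space: int  # trailing number identifying the chemical space


def parse_label(label: str) -> DatasetLabel:
    """Split a label such as 'DS1-CCD-1' and check the number/code pairing."""
    match = LABEL_PATTERN.match(label)
    if match is None:
        raise ValueError(f"Label {label!r} does not follow DS#-XXX-#")
    index, database, space = int(match[1]), match[2], int(match[3])
    expected = DATABASE_CODES.get(index)
    if expected is not None and expected != database:
        raise ValueError(
            f"Dataset {index} is paired with {expected}, not {database}"
        )
    return DatasetLabel(index, database, space)
```

For example, `parse_label("DS1-CCD-1")` yields the dataset number, database code, and chemical space number as separate fields, while a mismatched label such as `DS1-COD-1` is rejected, which guards against the CCD/COD/CSD confusion the convention is meant to prevent.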
Progress and findings
Curated data (or similar title)