Use Gemmi to handle dataset
- PDB Chemical Component Dictionary (CCD)
  - Download and filter (Brent, started)
  - Submit dataset
  - Figure out what's going on with our PDB dataset; all issues are SCF convergence
- tmQM [paper, dataset] (which sources from CSD) - 86K transition metal complexes
  - Utilize this existing dataset to begin testing out NNP strategy regarding encoding spin state
  - All spin states = 0
  - Currently being worked on by the Chodera lab; should get into QCArchive at some level
- Crystallography Open Database (COD) – CC0 licensed
- CSD (Cambridge Structural Database) filtered for stable small molecules, size, elements
  - Might need to discuss release of subset as open data
  - Look for structures of interest neglected by tmQM?
- MPtrj: Materials Project Trajectory Dataset
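
A rough sketch of the "download and filter" step for the CCD using Gemmi's CIF reader. The allowed-element set, the atom-count cutoff, and the helper names are placeholders (the notes only say "size, elements"), and the file path is whatever local copy of the CCD is on hand. It assumes CCD element symbols may be written in upper case (e.g. `CL`), so they are normalized before comparison.

```python
from typing import Iterable, Iterator, List, Tuple

# Placeholder filter settings, NOT agreed values from the notes.
ALLOWED_ELEMENTS = {"H", "C", "N", "O", "F", "P", "S", "Cl", "Br", "I"}
MAX_ATOMS = 100


def normalize_symbol(symbol: str) -> str:
    """Convert an element symbol such as 'CL' or ' br ' to 'Cl' / 'Br'."""
    return symbol.strip().capitalize()


def passes_filter(
    symbols: Iterable[str],
    allowed: Iterable[str] = ALLOWED_ELEMENTS,
    max_atoms: int = MAX_ATOMS,
) -> bool:
    """True if the component is non-empty, small enough, and uses only allowed elements."""
    elements = [normalize_symbol(s) for s in symbols]
    return 0 < len(elements) <= max_atoms and set(elements) <= set(allowed)


def iter_ccd_entries(path: str) -> Iterator[Tuple[str, List[str]]]:
    """Yield (component id, element symbols) for each block of a CCD CIF file."""
    import gemmi  # imported lazily so the filter logic runs without Gemmi installed

    doc = gemmi.cif.read(path)  # also reads gzipped files
    for block in doc:
        column = block.find_values("_chem_comp_atom.type_symbol")
        yield block.name, [gemmi.cif.as_string(v) for v in column]


def filter_ccd(path: str) -> List[str]:
    """Return the ids of components that pass the placeholder filter."""
    return [name for name, symbols in iter_ccd_entries(path) if passes_filter(symbols)]
```

Usage would be something like `filter_ccd("components.cif.gz")`. The same `passes_filter` check could be reused for the CSD subset once the actual size and element criteria are decided.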