"Include a small subset (~100 molecules) of small molecules relevant to OpenFF (for distinguishing levels of theory etc below)"

Has this been done? [Ask Chodera Lab]

Search through existing databases

  • PDB Chemical Component Dictionary (CCD)
    [Brent archived]
  • tmQM [paper, dataset] (which sources from CSD) - 86K transition metal complexes
    Use this existing dataset to begin testing the NNP strategy for encoding spin state.
    Does the Chodera Lab already have it?
  • Crystallography Open Database (COD) – CC0 licensed
  • CSD (Cambridge Structural Database), filtered for stable small molecules, size, and elements
    May need to discuss releasing a subset as open data
    [Repeat of tmQM?]
  • MPtrj: Materials Project Trajectory Dataset

Simple augmentations [start working on in the background? Or clean existing and "batch" produce]

  • Swap any transition metal for any other transition metal, let it run, and see what happens (this will give something relevant most of the time).
    Switching within a column (group) is almost always okay.
    [Will need an in-house copy of tmQM for this]
  • Substitutions: H -> F, Ph -> p-MeOPh.
    Consider RDKit's ReplaceSubstructs function for this
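
The within-column metal swap above could be sketched as follows. This is a minimal illustration, not an agreed pipeline: the names GROUPS and same_group_swaps are hypothetical, the table only covers the 3d/4d/5d metals of groups 3 to 12, and La is assumed as the 5d member of group 3. Coordinates are not touched, so every variant would still need to be re-optimized and its charge and spin state re-checked.

```python
# Hypothetical sketch: enumerate same-group (same-column) metal substitutions.
# Only element symbols are changed; geometry, charge, and spin are NOT handled here.

GROUPS = [
    ("Sc", "Y", "La"),   # group 3 (La as the 5d member is an assumption)
    ("Ti", "Zr", "Hf"),  # group 4
    ("V", "Nb", "Ta"),   # group 5
    ("Cr", "Mo", "W"),   # group 6
    ("Mn", "Tc", "Re"),  # group 7
    ("Fe", "Ru", "Os"),  # group 8
    ("Co", "Rh", "Ir"),  # group 9
    ("Ni", "Pd", "Pt"),  # group 10
    ("Cu", "Ag", "Au"),  # group 11
    ("Zn", "Cd", "Hg"),  # group 12
]

# Map each metal symbol to the tuple of metals in its column.
GROUP_OF = {metal: group for group in GROUPS for metal in group}


def same_group_swaps(symbols):
    """Return one new symbol list per single same-group metal replacement."""
    variants = []
    for index, symbol in enumerate(symbols):
        for other in GROUP_OF.get(symbol, ()):
            if other != symbol:
                swapped = list(symbols)
                swapped[index] = other
                variants.append(swapped)
    return variants


if __name__ == "__main__":
    # Toy symbol list, not a real tmQM entry.
    for variant in same_group_swaps(["Fe", "C", "O"]):
        print(variant)
```

Each returned list replaces exactly one metal atom, so a complex with two metal centers yields single-site swaps only; swapping several sites at once would need an extra product over the sites.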

Conformer search:

  • take from CIF files (CIF is the new PDB)

CMILES Issue: Organometallics are difficult, not supported

  • Jeff thinks CIF -> RDKit -> QCA
    - Not sure that's necessary, because QCArchive has a hash to represent the molecules.
    - We don't have to use CMILES as the name (i.e., the index); it can be arbitrary.
    - QCArchive doesn't need CMILES. Prepare to pair program with JClark on a CIF -> QCA pipeline that bypasses the OpenFF molecule structures that depend on CMILES and directly compares CIF files to QCArchive.
  • RDKit had an organometallic class for assessing implicit hydrogens; reverse-engineer an expression from it?
  • From 09-05 notes, CI: this is something we’ll need to consider for QCArchive too. It’s one record per conformer, so we need to be able to associate a metadata record with each record. [QCArchive has a metadata file field to add such information]
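
To illustrate the point that the entry name can be arbitrary and that metadata can be tied to each conformer, here is a minimal sketch using only the standard library. The functions geometry_key and register_conformers, the entry-name format, and the file name are all hypothetical. The hash below is a stand-in for illustration only and is NOT the hash QCArchive computes internally.

```python
import hashlib
import json


def geometry_key(symbols, coords, decimals=4):
    """Illustrative stand-in for a molecule hash (not QCArchive's algorithm)."""
    # Round coordinates so tiny numerical noise does not change the key;
    # adding 0.0 turns -0.0 into 0.0 so both serialize identically.
    rounded = [[round(x, decimals) + 0.0 for x in xyz] for xyz in coords]
    payload = json.dumps({"symbols": list(symbols), "geometry": rounded},
                         sort_keys=True)
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()


def register_conformers(source_cif, symbols, conformers):
    """Build one entry per conformer, named without any CMILES string."""
    entries = {}
    for number, coords in enumerate(conformers):
        name = f"{source_cif}-{number}"  # arbitrary index, assumed format
        entries[name] = {
            "source_cif": source_cif,
            "conformer": number,
            "key": geometry_key(symbols, coords),
        }
    return entries


if __name__ == "__main__":
    # Toy two-atom geometries, not real data.
    symbols = ["Fe", "C"]
    conformers = [
        [[0.0, 0.0, 0.0], [0.0, 0.0, 1.8]],
        [[0.0, 0.0, 0.0], [0.0, 0.0, 1.9]],
    ]
    for name, meta in register_conformers("example.cif", symbols, conformers).items():
        print(name, meta["key"][:10])
```

The per-entry dictionary is where provenance such as the source CIF would travel alongside each record; how that maps onto QCArchive's own metadata field is left open here.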

Computed/stored properties

  • energies

  • forces

  • other properties

    • atomic spin density

    • partial charges (multiple methods)

    • Dipole moment / polarizability

    • orbital energies (±5 molecular orbitals around the highest occupied molecular orbital, HOMO)

    • These would allow us to see the electronic structure of complexes

      • relative contributions of each atom to each orbital

      • orbital coefficients? Probably too large to store!

  • This level of theory was used to compute the following properties: electronic and dispersion energies, HOMO and LUMO energies, HOMO/LUMO gap, dipole moment, and metal center charge, which was derived from NBO

  • Make a new dataset type for these properties
  • Determine the Psi4 keywords needed to obtain these; do we need to change the provided output? (Note: tmQM used Gaussian NBO analysis for these, where the output is trivial.)
    Use OptimizationResultCollection.create_basic_dataset to pull the final geometry from each optimization, then create input for a QCA single-point (SP) calculation with these properties.
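
The ±5-orbital window around the HOMO listed above could be trimmed from a full set of orbital energies along these lines. This is a sketch under stated assumptions: the function names are hypothetical, the energies are assumed to come from one spin channel (for open-shell complexes it would be called once for alpha and once for beta), and the toy numbers are not from any calculation.

```python
def frontier_window(orbital_energies, n_occupied, width=5):
    """Return {offset: energy} for orbitals within `width` of the HOMO.

    Offset 0 is the HOMO, +1 the LUMO, -1 the HOMO-1, and so on.
    The window is truncated if the molecule has too few orbitals.
    """
    energies = sorted(orbital_energies)
    if not 0 < n_occupied <= len(energies):
        raise ValueError("n_occupied must be between 1 and the number of orbitals")
    homo = n_occupied - 1
    first = max(0, homo - width)
    last = min(len(energies) - 1, homo + width)
    return {i - homo: energies[i] for i in range(first, last + 1)}


def homo_lumo_gap(window):
    """Return the HOMO/LUMO gap, or None if no virtual orbital is in the window."""
    if 1 not in window:
        return None
    return window[1] - window[0]


if __name__ == "__main__":
    # Toy orbital energies in arbitrary units.
    window = frontier_window([-1.0, -0.5, -0.3, 0.1, 0.4], n_occupied=3, width=1)
    print(window)
    print(homo_lumo_gap(window))
```

Storing at most 2 * width + 1 energies per spin channel keeps the record size fixed regardless of basis-set size, which is the main reason to store a window rather than the full orbital coefficients.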