...
Download/load in crystal structures for COD
Some code about parsing the COD using rdkit: https://pschmidtke.github.io/blog/rdkit/crystallography/small%20molecule%20xray/xray/database/2021/01/25/cod-and-torsion-angles.html .
Convert crystals into formats that OpenFF can use:
To get CIF into OpenFF, see PyCIFRW (https://github.com/openforcefield/openff-toolkit/pull/982/files )
Open Eye toolkit can also load CIF files: https://docs.eyesopen.com/toolkits/python/oechemtk/OEChemConstants/OEIFlavor.html?highlight=cif#OEChem::OEIFlavor::CIF
Use openBabel to convert CIF to pdb file and generate openFF Molecule object with pdb + SMILES string
Use PyMol to generate an appropriate unit cell and supercell based on crystal symmetry information in cif/pdb files. Use supercell to generate openMM topology.
Select structures from COF:
For now, just take structures that are organic crystals and not weird. Eventually, we will want to figure these out.
Can start by just selecting a few (5-10) COF files to get the process working.
COF has SMILES database, but some stereochemistries are not explicitly defined, which will cause errors when generating openFF molecule.
Minimize the structures using different force fields:
Start with OpenFF (different versions)
Example here:
could be used to start with (energy minimize rather than simulate)Github link macro link https://github.com/openforcefield/openff-toolkit/tree/latest/examples/SMIRNOFF_simulation Issue: we may need to do lattice minimization (minimize the box vectors as well as the atoms), which will require implementing some code.
Will need to minimize unit cells as well. Many programs implement this, we should figure out the best algorithms:
http://tutorials.crystalsolutions.eu/tutorial.html?td=optgeom&tf=opt_tut#geopin
https://www.scm.com/doc/Tutorials/ExternalPrograms/QEGeometryOptimization.html
Basically, one needs to minimize in 3*natoms + 6 variables (box vectors).
Use any optimization routine (in scipy?) and calculate the derivatives as follows:
For coordianate derivitives, calculate the force from OpenMM
for box vector derivatives, pick a spacing “esp”, and increase/decrease the box vectors by “esp” by calling the OpenMM energies, then calculate the finite difference derivative. Eps should probably be around 0.001 relative of the variable.
Observe how much the RMSD changes from the experimental structure.
Note: there are other more direct comparisons to experiment (see below), but let’s get the other things working first
Look at RMSD20 and RMSD15
Goals: make it a workflow that can run relatively easily and be plugged into benchmarking
Will synchronize with Simon Boothroyd once the basic version is working.
...