Item

Notes

Organizing project pages

  • Problem statement and permanent links should be on the top page

  • Meeting notes should be on child pages

  • Research data + code should go in a dedicated repo, with major figures/conclusions as release assets

Review of reproducing case

  • JW – I don’t see why the SHAKE and maxcyc=0 AM1 optimizations should have nitrogen charges that are so drastically different. (The initial case had them at -0.75 for protonated and -1.0 for deprotonated; the maxcyc=0 and SHAKE cases had them around 0.0?)

    • CD – It could be that the initial hbond lengths are really bad (as indicated by a high gradient), and that this is disrupting the electron distribution.

    • CD – We could try running with maxcyc=20 or something like that

    • CD – We could also try using MM terms to hold the bonds together

    • JW – When you do the openff optimization, try using the “unconstrained” version of the FF (openff_unconstrained-1.3.0.offxml)

    • JW – Could also try running it for a few iterations (maxcyc=5?) WITHOUT SHAKE, and then turning SHAKE on.
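To make the maxcyc comparison concrete, here is a minimal sketch of generating sqm inputs for the values discussed (0, 5, 20). The helper name `sqm_input`, the water geometry, and the exact namelist layout are assumptions for illustration; the real inputs should be checked against the AmberTools sqm documentation and CD's scripts. SHAKE handling is not shown.

```python
def sqm_input(atoms, maxcyc, qmcharge=0, qm_theory="AM1"):
    """Build the text of an sqm input file (layout assumed, see note above).

    atoms: iterable of (atomic_number, atom_name, x, y, z), coordinates in angstroms.
    """
    lines = [
        "Run semi-empirical minimization",
        " &qmmm",
        f"   qm_theory='{qm_theory}', qmcharge={qmcharge}, maxcyc={maxcyc},",
        " /",
    ]
    for atomic_number, name, x, y, z in atoms:
        lines.append(f"{atomic_number:4d} {name:<4s} {x:12.6f} {y:12.6f} {z:12.6f}")
    return "\n".join(lines) + "\n"


# Placeholder geometry (water), only to exercise the helper.
atoms = [
    (8, "O1", 0.000, 0.000, 0.117),
    (1, "H1", 0.000, 0.757, -0.469),
    (1, "H2", 0.000, -0.757, -0.469),
]

# One input per maxcyc value raised in the discussion.
inputs = {n: sqm_input(atoms, maxcyc=n) for n in (0, 5, 20)}
```

Comparing the nitrogen charges from each run would show whether a handful of optimization steps is enough to move them away from 0.0.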

Proton transfer dataset generation

  • CD sent SB his scripts, SB will run those on a large dataset on Lilac and report back failures.

  • CD’s automation indicated some non-isomorphisms in the tripeptides-with-oppositely-charged-AAs sets, but he hasn’t looked closely at them.
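For reference, "non-isomorphism" here means the input and output molecular graphs differ in connectivity, for example a proton ending up on a different heavy atom. The sketch below is a connectivity-only check in plain Python; it is not CD's automation, the function name is hypothetical, and it uses simple backtracking that is only suitable for small molecules.

```python
def is_isomorphic(symbols_a, bonds_a, symbols_b, bonds_b):
    """Connectivity-only isomorphism: elements must match, bond orders are ignored."""
    n = len(symbols_a)
    if n != len(symbols_b) or sorted(symbols_a) != sorted(symbols_b):
        return False

    def adjacency(n_atoms, bonds):
        adj = [set() for _ in range(n_atoms)]
        for i, j in bonds:
            adj[i].add(j)
            adj[j].add(i)
        return adj

    adj_a, adj_b = adjacency(n, bonds_a), adjacency(n, bonds_b)
    # Equal edge counts plus an edge-preserving bijection imply isomorphism.
    if sum(len(s) for s in adj_a) != sum(len(s) for s in adj_b):
        return False

    mapping = {}  # atom index in A -> atom index in B
    used = set()  # atom indices in B already assigned

    def extend(a):
        if a == n:
            return True
        for b in range(n):
            if b in used:
                continue
            if symbols_a[a] != symbols_b[b] or len(adj_a[a]) != len(adj_b[b]):
                continue
            # Every already-mapped neighbour of a must map to a neighbour of b.
            if any(x in mapping and mapping[x] not in adj_b[b] for x in adj_a[a]):
                continue
            mapping[a] = b
            used.add(b)
            if extend(a + 1):
                return True
            del mapping[a]
            used.discard(b)
        return False

    return extend(0)


# Same molecule with atoms listed in a different order.
same = is_isomorphic(["O", "H", "H"], [(0, 1), (0, 2)],
                     ["H", "O", "H"], [(0, 1), (1, 2)])
# Same atoms, but the hydrogen sits on N in one graph and on O in the other.
moved = is_isomorphic(["N", "O", "H"], [(0, 1), (0, 2)],
                      ["N", "O", "H"], [(0, 1), (1, 2)])
```

In this sketch `same` evaluates to True and `moved` to False.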

PDB connectivity guesses are really bad

  • CD – Previously, JW had mentioned using QCElemental as an alternative to OpenEye

  • General – The initial molecule dataset (Minidrugbank.sdf) was such garbage that we can’t really use those results. So CD will round-trip all of those molecules through OpenEye and back to SDF so that we at least have a valid molecule for each.

  • Today, we’ll work on reading molecules from PDB using RDKit, guessing their bonds using QCElemental, and running the isomorphism checks from that.

  • (We made a code snippet to do PDB connectivity comparison that doesn’t use OpenEye, just RDKit)

    • CD – This is already removing a lot of the error cases that were spuriously coming up

Code Block
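from rdkit import Chem
from openff.toolkit.topology import Molecule

# Read the PDB with RDKit, keeping explicit hydrogens. No OpenEye call is
# needed; RDKit's PDB reader assigns the connectivity.
rdmol = Chem.MolFromPDBFile('sqm_original.pdb', removeHs=False)
mol_from_pdb = Molecule.from_rdkit(rdmol,
                                   allow_undefined_stereo=True,
                                   hydrogens_are_explicit=True)

input_mol = Molecule.from_file('input_original.sdf')

# Compare connectivity only: bond orders, formal charges, and aromaticity
# are not reliably recoverable from a PDB file, so they are not matched.
is_same = mol_from_pdb.is_isomorphic_with(input_mol,
                                          bond_order_matching=False,
                                          formal_charge_matching=False,
                                          aromatic_matching=False,
                                          )
print(is_same)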
