This is probably because there aren’t any structures in the CIF file that have just one terminus charged. The only one with an N-terminal charge for PRO is the zwitterion, which also requires a C-terminal charge to match.
In the zwitterion form of proline, for some reason one of the Hs on the N terminus is marked as a leaving atom. This isn’t true of the alanine zwitterion.
C-terminal residues aren’t having their oxyanion labeled
Kinda a similar problem to the above: there aren’t any C-terminal-charged substructures that don’t also have an N-terminal charge
Caps aren’t being identified
Will need to monkey-patch those in, they’re not in the CIF file at all.
In general, how should we get N- and C-terminal AA substructures? It doesn’t seem like there are enough permutations of structures in the CIF file, so we’ll almost certainly need a dictionary of backbone reactions that take a mainchain substructure and turn it into the different termini. This may be tricky with things like proline, where a charged N terminus is NH2+.
Could play “calvinball” and come up with rules like “delete all permutations of leaving atoms starting from the outermost points of the graph, and if they’re C terminal then leave an oxyanion after deleting the H, but if they’re N terminal then don’t modify the charge of the N…” but this seems really hacky. It may be better to have explicit reactions that we apply and names for the atom(s) they add.
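A minimal sketch of what the "explicit reactions" option could look like, in plain Python with no toolkit dependency. Everything here is hypothetical: the `Substructure` and `TerminusReaction` classes, the reaction keys, and the added atom names (`H2`, `H3`, `OXT`) are illustrative assumptions, not anything taken from the CIF file or an existing API. The toy residues carry only backbone atoms.

```python
from dataclasses import dataclass


@dataclass
class Substructure:
    """Toy residue: atom name -> [element, formal charge], plus bonds by name."""
    atoms: dict
    bonds: set


@dataclass(frozen=True)
class TerminusReaction:
    """Hypothetical record of what one terminus adds to a mainchain residue."""
    add_atoms: tuple = ()  # (new atom name, element, name of atom it bonds to)
    charges: tuple = ()    # (atom name, formal charge to set)


def bond(a, b):
    return frozenset((a, b))


# Assumed reaction names and atom names, for illustration only.
BACKBONE_REACTIONS = {
    "N_TERM_CHARGED": TerminusReaction(
        add_atoms=(("H2", "H", "N"), ("H3", "H", "N")), charges=(("N", 1),)),
    "C_TERM_CHARGED": TerminusReaction(
        add_atoms=(("OXT", "O", "C"),), charges=(("OXT", -1),)),
}


def apply_reaction(sub, reaction):
    """Return a new Substructure with the reaction applied; the input is untouched."""
    atoms = {name: list(value) for name, value in sub.atoms.items()}
    bonds = set(sub.bonds)
    for name, element, partner in reaction.add_atoms:
        if name in atoms:
            raise ValueError(f"atom {name} already present")
        if partner not in atoms:
            raise KeyError(f"no atom {partner} to bond {name} to")
        atoms[name] = [element, 0]
        bonds.add(bond(name, partner))
    for name, charge in reaction.charges:
        atoms[name][1] = charge
    return Substructure(atoms, bonds)


def hydrogens_on(sub, atom):
    """Count hydrogens bonded to the named atom."""
    neighbors = [n for b in sub.bonds if atom in b for n in b if n != atom]
    return sum(sub.atoms[n][0] == "H" for n in neighbors)


# Backbone-only mainchain forms (side chains and HA omitted for brevity).
ALA_MAINCHAIN = Substructure(
    atoms={"N": ["N", 0], "H": ["H", 0], "CA": ["C", 0], "C": ["C", 0], "O": ["O", 0]},
    bonds={bond("N", "H"), bond("N", "CA"), bond("CA", "C"), bond("C", "O")},
)
PRO_MAINCHAIN = Substructure(
    atoms={"N": ["N", 0], "CA": ["C", 0], "CD": ["C", 0], "C": ["C", 0], "O": ["O", 0]},
    bonds={bond("N", "CA"), bond("N", "CD"), bond("CA", "C"), bond("C", "O")},
)

ala_nterm = apply_reaction(ALA_MAINCHAIN, BACKBONE_REACTIONS["N_TERM_CHARGED"])
pro_nterm = apply_reaction(PRO_MAINCHAIN, BACKBONE_REACTIONS["N_TERM_CHARGED"])
ala_cterm = apply_reaction(ALA_MAINCHAIN, BACKBONE_REACTIONS["C_TERM_CHARGED"])
```

In this toy model the same N-terminal reaction gives NH3+ on alanine and NH2+ on proline, simply because proline's mainchain N starts with no H. Whether the added atom names would line up with what real PDB files use for proline is a separate question, and residue-specific entries in the dictionary might still be needed for that.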
Loading from PDB
LW – Things that don’t work:
Molecule.from_pdb_and_smiles fails on stereo
openmm.PDBFile and Molecule.from_openmm fail; I’ll fill in the reason later. Could have been assumptions about bond orders
Setting all bonds to single and all formal charges to 0 may have worked, but it’s hacky