2020-08-11 QCFractal User Meeting notes

Date

11 Aug 2020

Participants

Discussion topics

Item	Notes
Updates from MolSSI	BP is teaching at MolSSI this week and is unable to join
Queue/Manager status	JH just submitted fragmented ligands from benchmarking set (JACS dataset, fragmented). Mix of torsiondrives and MM scans. Exciting new errors (some involving psi4 version queries). Also seems to be something up with automation. DD – Still working on incorrectly-registered incompletes.
User questions	Fragmenter status and plans
CMILES for disaccharide set	JW – Need CMILES to make sure we’re not misinterpreting the molecules we submit to QCA.
	JH – I was able to make CMILES from PDB. All bond orders were 1.
	DC – Original structures from GMML. They don’t recommend writing topologies. Writes PDB and OFF formats. Takes string of saccharide identifiers as input. Generally catches errors/bad inputs.
	DC – Can output PDB, OFF, and topology (.top or .prmtop)
	Options:
	- Assume all bond orders are 1, use PDB + OE to interpret
	  - Downside is that this would only work for current set, we’d be back to this discussion again if we tried to do bond order >1, charged atoms, or mols with S or P
	  - During this meeting: OE can infer bond orders
	  - Downside – Hard requirement on OE.
	- Make antechamber+tleap write mol2 and use format converters to reach sdf
	  - Upside is that they must be doing bond order perception during small molecule parameterization
	  - Upside – Fully open source pathway (ambertools, openbabel, rdkit)
	  - During this meeting: This pathway successfully generates bond orders
	- Get unordered SMILES for mols and match up with PDB to get SDF/CMILES
	Path 1 (OE reading PDBs): python -c "from openforcefield.topology import Molecule; import sys; molecule = Molecule.from_file(sys.argv[1]); print(molecule.to_string()) "
	Path 2:
	antechamber -fi pdb -i glu.pdb -fo mol2 -o test.mol2
	antechamber -i 1_ac.mol2 -fi mol2 -o 1_ac_sy.mol2 -fo mol2 -at sybyl -dr no
	obabel -imol2 1_ac_sy.mol2 -osdf -O new.sdf
	python -c "from openforcefield.topology import Molecule; import sys; molecule = Molecule.from_file(sys.argv[1]); print(molecule.to_smiles()) " new.sdf
	On structure with carbonyl and sulfate, OE interprets PDB correctly, antechamber loses track of bond orders + charge on sulfate
	(off-dev) jeffreywagner@JW-MBP$ diff out_ac out_pdb
	1c1
	< [H][C@]1([C@@]([C@](O[C@@]([C@]1([H])OS([O])([O])[O])([H])O[C@@]2([C@]([C@@]([C@](O[C@]2([H])C([H])([H])O[H])([H])OC([H])([H])[H])([H])OC(=O)C([H])([H])[H])([H])O[H])[H])([H])C([H])([H])O[H])([H])O[H])O[H]
	---
	> [H][C@]1([C@@]([C@](O[C@@]([C@]1([H])OS(=O)(=O)[O-])([H])O[C@@]2([C@]([C@@]([C@](O[C@]2([H])C([H])([H])O[H])([H])OC([H])([H])[H])([H])OC(=O)C([H])([H])[H])([H])O[H])[H])([H])C([H])([H])O[H])([H])O[H])O[H]
Protein dataset CMILES	JH – After the call, I found that the CMILES in the initial protein dataset optimizations are incorrect because the input mol2 files had all of the bond orders set to 1. The optimizations themselves are still correct, however, because the net charge of the molecule was still 0. The v2.0 dataset will fix all CMILES strings and fully complete the dataset.
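
Both CMILES problems above trace back to input files in which every bond order was 1. A minimal sketch of a pre-submission check that counts the bond types in the @<TRIPOS>BOND section of a mol2 file, using only the Python standard library. The function names and the example fragment are hypothetical, not part of any submission tooling. A file with only single bonds is not necessarily wrong (a saturated molecule looks the same), so the result is a flag for manual review rather than an error.

```python
from collections import Counter


def mol2_bond_types(mol2_text):
    """Count the bond types listed in the @<TRIPOS>BOND section of a mol2 string."""
    counts = Counter()
    in_bonds = False
    for line in mol2_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("@<TRIPOS>"):
            # Any new record type ends the bond section.
            in_bonds = stripped == "@<TRIPOS>BOND"
            continue
        if in_bonds and stripped:
            fields = stripped.split()
            # Bond record fields: bond_id, origin_atom_id, target_atom_id, bond_type
            if len(fields) >= 4:
                counts[fields[3]] += 1
    return counts


def all_single_bonds(mol2_text):
    """Return True if the file has bonds and every one of them has type '1'."""
    counts = mol2_bond_types(mol2_text)
    return bool(counts) and set(counts) == {"1"}


# Hypothetical fragment: a carbonyl written with a single C-O bond.
EXAMPLE_MOL2 = """\
@<TRIPOS>MOLECULE
example
 3 2 1
SMALL
NO_CHARGES

@<TRIPOS>ATOM
 1 C1 0.000 0.000 0.000 C.2 1 LIG 0.0
 2 O1 1.200 0.000 0.000 O.2 1 LIG 0.0
 3 H1 -0.600 0.900 0.000 H 1 LIG 0.0
@<TRIPOS>BOND
 1 1 2 1
 2 1 3 1
"""

if __name__ == "__main__":
    print(mol2_bond_types(EXAMPLE_MOL2))
    if all_single_bonds(EXAMPLE_MOL2):
        print("All bond orders are 1: check before generating CMILES")
```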

Action items

David Cerutti will convert current JSON to Bohr
David Cerutti will tar+gz up final saccharide submission files when he updates github submission branch, then notify Horton and Dotson
Once above is complete, Joshua Horton will take PDBs from disaccharide submission and use OE to make corresponding CMILES and SDF before submission
Once above is complete, David Dotson will submit first batch of disaccharide set.
Joshua Horton will add notebooks to pull down protein / saccharide results into respective submission directories, and point Cerutti to them
David Cerutti will make sure that protein optimizations have completed correctly, using notebooks posted by Horton. If so, he’ll notify Horton+Dotson.
If above is acceptable, Joshua Horton and David Dotson will submit the rest of the protein optimization as a “version 2” of the dataset, as well as under the DZVP basis set
We’ll wait on submitting ESP calcs until we get a green light from Hyesu that they’re being computed correctly.
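
The unit conversion in the first action item can be sketched as follows. This assumes the current JSON stores coordinates in angstrom as a flat list under a "geometry" key; that layout, the key name, and the function names are hypothetical, since the notes do not describe the file. The conversion factor is the CODATA 2018 Bohr radius.

```python
import json

# CODATA 2018 Bohr radius expressed in angstrom.
BOHR_IN_ANGSTROM = 0.529177210903


def angstrom_to_bohr(coords):
    """Convert a flat list of coordinates from angstrom to Bohr."""
    return [x / BOHR_IN_ANGSTROM for x in coords]


def convert_record(record):
    """Return a shallow copy of a record with its 'geometry' converted to Bohr.

    The 'geometry' key is an assumed (hypothetical) layout.
    """
    converted = dict(record)
    converted["geometry"] = angstrom_to_bohr(record["geometry"])
    return converted


if __name__ == "__main__":
    # Hypothetical record: two hydrogens 0.74 angstrom apart along z.
    record = json.loads(
        '{"symbols": ["H", "H"], "geometry": [0.0, 0.0, 0.0, 0.0, 0.0, 0.74]}'
    )
    print(json.dumps(convert_record(record)))
```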
