7 parties interested in running calculations on public QCArchive
1300-1400 compounds each, selected according to individual criteria (as discussed)
structures generally patented
GT: should it be a 3D SDF with single conformation, or multiple?
DH: 3D SDFs better because SMILES don’t have chirality information
XL: may choose things similar to what’s in PubChem, elsewhere
JH: if we don’t specify everything, we’ll be relying on RDKit to generate conformations, etc.
DH: we’re planning for conformer generation in the workflow
JH: could fill in the gap if not provided
GT: default is that folks provide at least a single conformation; we generate/fill in up to 10.
GT: what about charged compounds?
JH: if it is charged, definitely want charge specified; all the initial fits weren’t done on charged molecules
GT: think you will be able to handle them; question remains on the basis set
JH: if the charges are defined in the file, that would help
GT: need to determine which fields in the SD file we use
GT: a thousand neutral molecules in 3D with hydrogens, then as part of the workflow charge them with RDKit
DD: the public QCArchive submission can be used as a test approach with high visibility; can decide today on a reasonable, perhaps minimal input spec and see where problems arise
GT: neutral 3D input, take a week to look for a reasonable open-source ionization predictor
Conclusion: neutral 3D input, with hydrogens specified; deferred decision on protomer enumeration
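A minimal sketch of how the concluded input spec could be checked, assuming the SD files carry V2000 mol blocks. The function name `check_molblock` and the water record are hypothetical. "Neutral" is taken here to mean a net formal charge of zero, and the all-zero-z test is only a heuristic for spotting 2D input.

```python
from typing import List

# Atom-block charge codes from the V2000 CTfile format (code 4 is a radical, not a charge).
ATOM_BLOCK_CHARGE = {1: 3, 2: 2, 3: 1, 5: -1, 6: -2, 7: -3}


def check_molblock(molblock: str) -> List[str]:
    """Return a list of problems; an empty list means the record passes."""
    lines = molblock.splitlines()
    if len(lines) < 4 or "V2000" not in lines[3]:
        return ["not a V2000 mol block"]
    n_atoms = int(lines[3][0:3])
    atom_lines = lines[4:4 + n_atoms]
    if len(atom_lines) < n_atoms:
        return ["truncated atom block"]

    symbols, z_coords, block_charge = [], [], 0
    for line in atom_lines:
        z_coords.append(float(line[20:30]))      # fixed-width z coordinate
        symbols.append(line[31:34].strip())      # element symbol
        field = line[36:39].strip()              # atom-block charge code
        block_charge += ATOM_BLOCK_CHARGE.get(int(field) if field else 0, 0)

    # "M  CHG" property lines, when present, take precedence over atom-block codes.
    has_chg, chg_total = False, 0
    for line in lines[4 + n_atoms:]:
        if line.startswith("M  END"):
            break
        if line.startswith("M  CHG"):
            has_chg = True
            pairs = line.split()[3:]             # alternating atom index, charge
            chg_total += sum(int(v) for v in pairs[1::2])
    net_charge = chg_total if has_chg else block_charge

    problems = []
    if all(z == 0.0 for z in z_coords):
        problems.append("all z coordinates are zero (input looks 2D)")
    if "H" not in symbols:
        problems.append("no explicit hydrogens")
    if net_charge != 0:
        problems.append(f"net formal charge is {net_charge}")
    return problems


# Hypothetical example record: water, 3D, explicit hydrogens, neutral.
WATER = "\n".join([
    "water",
    "  example",
    "",
    "  3  2  0  0  0  0  0  0  0  0999 V2000",
    "    0.0000    0.0000    0.1173 O   0  0  0  0  0  0  0  0  0  0  0  0",
    "    0.0000    0.7572   -0.4692 H   0  0  0  0  0  0  0  0  0  0  0  0",
    "    0.0000   -0.7572   -0.4692 H   0  0  0  0  0  0  0  0  0  0  0  0",
    "  1  2  1  0",
    "  1  3  1  0",
    "M  END",
])

print(check_molblock(WATER))
```

A record that fails any check could be returned to the submitting party rather than silently repaired. Which SD data fields, if any, should also carry charge information is still open, per GT's point above.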