User issues | Hyesu | Problems loading an existing dataset to combine with new molecules: when I tried to load dataset.json.bz2, I got a validation error saying that all dataset entries are missing the fixed-hydrogen InChI. JH: We recently added InChIKey validation, which means old datasets don’t have it. TG: Could an older version be used? [conclusion] Hyesu needs the atom map functionality, so it is not possible to use the old dataset.json directly.
HJ: Is it expected to have tautomers enumerated in optimization datasets, or can we skip them? TG: It’s up to you whether you want to have them or not. HJ: Do we still have problems with order dependence between enumerating protomers and tautomers? JH: That’s only an issue for torsiondrive datasets, depending on whether you tag dihedrals first or not. JH: For the optimization dataset, can you send me the SMILES list you’re using?
TG: If we add and remove things from the data structure, we’ll have issues like this one. JH: We will always be able to go back to a previous version of qcsubmit; otherwise it’s really hard to support full backwards compatibility. DD: What if we take the approach of treating QCSubmit as a strict structure that you must abide by if you want to use its workflow components? Can we always pull apart old dataset.json files directly as pure Python objects? TG: Then we have the problem of requiring expert knowledge of the data structures from users. JH: I would prefer to make the new fields optional so they don’t trigger pydantic validation; once a field has existed for some time and is very stable, we can make it required. DD: A bit like a reverse deprecation; at some future release the validation will be required. JH: Yes, I like that.
TG: Another hot take: is it possible to make the code use the InChIKey if it’s there, and not use it if it’s not? [decision] Make InChI keys optional from pydantic’s perspective.
|
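A minimal sketch of the "optional now, required later" idea from the decision above. It is illustrative only: `DatasetEntry`, `load_entry`, and the field names `name` and `inchi_key` are hypothetical stand-ins, not the actual QCSubmit models, and the key value is a placeholder.

```python
import warnings
from typing import Optional

from pydantic import BaseModel


class DatasetEntry(BaseModel):
    """Hypothetical stand-in for a dataset entry; not the real QCSubmit model."""

    name: str
    # New field kept optional so older files that lack it still validate.
    inchi_key: Optional[str] = None


def load_entry(raw: dict) -> DatasetEntry:
    """Validate one entry and warn, rather than fail, if the new field is absent."""
    entry = DatasetEntry(**raw)
    if entry.inchi_key is None:
        # "Reverse deprecation": announce that the field may become required.
        warnings.warn(
            f"entry {entry.name!r} has no inchi_key; "
            "a future release may require it",
            FutureWarning,
        )
    return entry


# An old-style entry without the key still loads (with a warning).
old_entry = load_entry({"name": "mol-1"})
# A new-style entry carries the key; the value here is a placeholder.
new_entry = load_entry({"name": "mol-2", "inchi_key": "PLACEHOLDER-INCHIKEY"})
```

Code that consumes entries can then branch on `entry.inchi_key is not None`, which matches the suggestion to use the key when present and skip it otherwise.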