Ash Sage updates
Slide 5
JW – What do single-conformer optimizations fit to if no relative energies are available?
CB – When selecting multiple mols per parameter, is there good diversity?
LW – I combed through QCA to find eligible mols, then filtered down to prioritize diversity whenever there were more than my threshold number. That wasn’t possible in all cases.
CB – I’d expect multi-conformer fits to do better on relative energies but not on geometries.
LW – A lot of noise comes from data sparsity and the characteristics of the data in QCA, since it’s a hodgepodge of mols from datasets submitted with different goals (e.g. some were looking at slight differences in ring geometries, while others have different sorts of conformers and such).
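A minimal sketch of the kind of diversity-prioritized filtering LW describes, assuming a greedy max-min pick over Tanimoto distances. This is not LW's actual procedure; `tanimoto_distance`, `select_diverse`, the toy feature sets, and the molecule names are all hypothetical, and `threshold` is assumed to be at least 1.

```python
def tanimoto_distance(a, b):
    """Return 1 - |a & b| / |a | b| for two sets of feature identifiers."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def select_diverse(fingerprints, threshold):
    """Greedy max-min selection (hypothetical helper).

    Keep every molecule if the count is at or under the threshold; otherwise
    repeatedly add the candidate that is farthest from those already chosen.
    """
    names = sorted(fingerprints)
    if len(names) <= threshold:
        return names
    chosen = [names[0]]  # arbitrary but deterministic starting point
    while len(chosen) < threshold:
        remaining = [n for n in names if n not in chosen]
        best = max(
            remaining,
            key=lambda n: min(
                tanimoto_distance(fingerprints[n], fingerprints[c]) for c in chosen
            ),
        )
        chosen.append(best)
    return chosen


# Toy feature sets standing in for molecular fingerprints (not real QCA data).
fingerprints = {
    "mol_a": {1, 2, 3},
    "mol_b": {1, 2, 4},
    "mol_c": {7, 8, 9},
    "mol_d": {1, 2, 3, 4},
}
picked = select_diverse(fingerprints, threshold=2)
print(picked)
```

When a parameter has fewer eligible mols than the threshold, the function returns all of them unchanged, which mirrors the cases where filtering for diversity was not possible.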
Slide 6
JW + CB + DM – Wonder if there’s something in the data that’s been cut off by the truncated portion of the y axis.
CB – I wonder if this is telling us that we have all the data we need to achieve convergence. Did all the fits end up with similar numerical parameters?
LW – Roughly, yes, the parameters appear to be about the same.
CB – It’d be interesting to start dropping portions of the dataset and see when the fits start giving different/degraded results.
LW – That’s a good idea. Also, we start from MSM values in these fits, which might mean that they leave little to be improved upon. So we’d need to think about how to do this while accounting for methodological differences.
DM – Very interesting idea.
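CB's dataset-dropping idea could be prototyped along these lines. This is only a sketch under stated assumptions: `fit_parameters` is a hypothetical stand-in for a real fit (here the least-squares optimum of a constant model, i.e. the mean), the targets are synthetic, and the loop ignores the MSM starting-point issue LW raises.

```python
import random
import statistics


def fit_parameters(targets):
    """Hypothetical stand-in for a force field fit: returns one 'parameter'.

    For a constant model, the least-squares optimum is simply the mean.
    """
    return statistics.fmean(targets)


def ablation_curve(targets, fractions, n_repeats=20, seed=0):
    """For each fraction, refit on random subsets and record the mean absolute
    deviation of the fitted parameter from the full-data fit."""
    rng = random.Random(seed)
    reference = fit_parameters(targets)
    curve = {}
    for frac in fractions:
        n_keep = max(1, round(frac * len(targets)))
        deviations = []
        for _ in range(n_repeats):
            subset = rng.sample(targets, n_keep)
            deviations.append(abs(fit_parameters(subset) - reference))
        curve[frac] = statistics.fmean(deviations)
    return curve


# Synthetic targets (not real QCA data), purely to exercise the loop.
data_rng = random.Random(1)
targets = [data_rng.gauss(1.5, 0.1) for _ in range(200)]
curve = ablation_curve(targets, fractions=[0.1, 0.25, 0.5, 1.0])
for frac, dev in curve.items():
    print(f"{frac:.2f} of data: mean |delta parameter| = {dev:.4f}")
```

The fraction at which the deviation starts to grow would indicate how much of the dataset is actually needed for converged parameters.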
Slide 7
JW – Possible that the MSM values were as good as they were gonna get to begin with, and single looked better than multi because those fits took fewer steps.
CB – MMackey found that torsions were way off for certain chemical species because the parameter SMIRKS were defined in a particular way that clashed with some mols. That might be happening here, and outliers that aren’t visible on the truncated axis could be dominating the fit.
LW – Outliers do range out quite far; I’ll include them in future talks.
CB – Do we still have a strike team to work on this?
LW – The strike team is basically me. Because of reduced personnel, we wait for people to report things instead of seeking them out.
CB – How affected is our optimizer by outliers?
LW – Hard to answer without a lot of thought.
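As a rough illustration of CB's outlier question, the sketch below assumes a plain sum-of-squares objective, which may not match the objective actually used in these fits. The residuals are synthetic and `squared_loss_share` is a hypothetical helper.

```python
import statistics


def squared_loss_share(residuals):
    """Fraction of a sum-of-squares objective contributed by each residual."""
    total = sum(r * r for r in residuals)
    return [r * r / total for r in residuals]


# Synthetic residuals (illustrative only): nine well-behaved points, one outlier.
residuals = [0.1, -0.2, 0.15, -0.1, 0.05, 0.2, -0.15, 0.1, -0.05, 5.0]
shares = squared_loss_share(residuals)
print(f"outlier share of objective: {max(shares):.3f}")

# A least-squares constant fit (the mean) is pulled toward the outlier,
# while the median barely moves.
mean_fit = statistics.fmean(residuals)
median_fit = statistics.median(residuals)
print(f"mean: {mean_fit:.3f}, median: {median_fit:.3f}")
```

Under this assumption, a single far-out point can account for nearly the whole objective, which is the scenario CB describes for outliers hidden by the truncated axis.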
Slide 11