2024-02-07 FF Fitting Meeting

Participants

@Alexandra McIsaac
@David Mobley
Bill Swope
@Trevor Gokey
@Pavan Behara
@Michael Gilson
@Lily Wang
@Brent Westbrook (Unlicensed)

Goals

Discussion topics

Item	Presenter	Notes

Outliers in the benchmarking dataset	LM	Recording
DLM: looks like the 7-membered ring might be due to a proton rearrangement [?]; we saw similar problems before. Best solution could be to filter it out of the benchmark dataset
PB: we thought filtering H-bonds might get rid of the problematic conformers
LMI: I looked into that, but benchmarks seemed to indicate worse performance
LW: could it be because the benchmark set includes the problematic conformers?
LMI: maybe, need to look into it
PB: re a32: we saw similar problems with this hypervalent S before, but we decided it was rare chemistry and didn’t address it
DM: do we need to go digging around for more data for these problems?
LMI: haven’t scoped out the problem yet
DM: I would spend maybe 1-2 hours to see if eMolecules or ChEMBL has enough molecules to expand the dataset
LMI/BW: the SMARTS pattern might be the hard part
TG: Besmarts could handle this, but chemper might be better suited: you could take the SMARTS pattern from each of your molecules and query the database for those. Besmarts would give you the union instead. It would take all your chemical environments and find a pattern that matches all of those environments. If you use that pattern to query, it will select anything that matches
LMI: hard to distinguish between the different angles being either 180 or 90
DM: I think in MM world you would want a multiple MM solution, since angles are indistinguishable
TG: not quite true, all the CH3s are 90, but CH2 is different
LMI: but if CH2 was CH3 it would probably still be the same geometry. The substituent is specific to this mol
TG: this problem may need specific SMIRKS to handle these cases
DM: ideas for dealing with this problem systematically?
LW: are QM energies noticeably higher for outlier conformers?
LMI: yes, ~0.2 hartree higher
PB: we used to do more granular benchmarks where we looked at everything more than X% away from the mean, e.g. a bond length more than X angstrom away
LW: for ddEs we could apply Bill Swope’s modification to only consider conformers within 0.4 Å and look at outliers for geometry targets
DM: I think we want to be more aggressive in pruning the benchmark set than the fitting set
Science team will investigate systematic ways to fix the benchmark set
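Illustrative sketch (not from the meeting): one way the pruning ideas above could be prototyped. It flags conformers whose QM energy sits far above the lowest-energy conformer of the same molecule, and keeps only conformers within a 0.4 Å cutoff for ddE comparisons. The `Conformer` class, the function names, the 0.1 hartree threshold, and the reading of "within 0.4 Å" as an MM-vs-QM RMSD are all assumptions made for illustration; the example values are made up and are not benchmark data.

```python
from dataclasses import dataclass


@dataclass
class Conformer:
    """Hypothetical record for one conformer of a single molecule."""
    name: str
    qm_energy_hartree: float
    rmsd_angstrom: float  # assumed: RMSD between MM-optimized and QM geometries


def flag_energy_outliers(conformers, max_rel_energy_hartree=0.1):
    """Names of conformers whose QM energy is more than the threshold
    above the lowest-energy conformer in the list.

    The 0.1 hartree default is an arbitrary placeholder, chosen only so
    that a conformer ~0.2 hartree above the minimum would be flagged.
    """
    e_min = min(c.qm_energy_hartree for c in conformers)
    return [
        c.name
        for c in conformers
        if c.qm_energy_hartree - e_min > max_rel_energy_hartree
    ]


def keep_for_dde(conformers, max_rmsd_angstrom=0.4):
    """Names of conformers within the RMSD cutoff, for ddE comparisons."""
    return [c.name for c in conformers if c.rmsd_angstrom <= max_rmsd_angstrom]


def pruned_for_dde(conformers, max_rel_energy_hartree=0.1, max_rmsd_angstrom=0.4):
    """Apply both filters: drop energy outliers, then apply the RMSD cutoff."""
    outliers = set(flag_energy_outliers(conformers, max_rel_energy_hartree))
    kept = [c for c in conformers if c.name not in outliers]
    return keep_for_dde(kept, max_rmsd_angstrom)


# Made-up example values for three conformers of one molecule.
example = [
    Conformer("conf-a", -100.000, 0.1),
    Conformer("conf-b", -99.998, 0.6),
    Conformer("conf-c", -99.800, 0.3),  # ~0.2 hartree above the minimum
]

if __name__ == "__main__":
    print("energy outliers:", flag_energy_outliers(example))
    print("kept for ddE:", pruned_for_dde(example))
```

Keeping the two filters as separate functions would let the benchmark set be pruned more aggressively than the fitting set by passing different thresholds to each.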
