MT – Tried running the workflow on ~1600 molecules. The graphs look really wonky: RMSDs up to 140 and ddEs in the hundreds.
LM – The big ddEs look like what I'd seen before: I once saw ddEs in the thousands because the atom indexing got scrambled.
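(Aside: a minimal sketch of what an indexing fix could look like, assuming RDKit is available; the helper name remap_to_reference is hypothetical. The idea is to renumber one molecule's atoms via a graph match so index i refers to the same atom in both structures before any per-atom comparison.)

```python
from rdkit import Chem

def remap_to_reference(ref: Chem.Mol, probe: Chem.Mol) -> Chem.Mol:
    """Renumber probe's atoms so index i means the same atom as in ref."""
    if ref.GetNumAtoms() != probe.GetNumAtoms():
        raise ValueError("Expected the same molecule with scrambled indices")
    # Graph match of ref onto probe: match[i] is the probe-atom index
    # corresponding to ref atom i.
    match = probe.GetSubstructMatch(ref)
    if not match:
        raise ValueError("No atom mapping found; graphs differ")
    # RenumberAtoms puts probe atom match[i] at position i.
    return Chem.RenumberAtoms(probe, list(match))
```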
JW – Agree that the ddE issue is probably atom indexing. I think the large RMSDs come from a lack of superposition.
MT + LW – The OE RMSD call didn't appear to be aligning. Also, all of the values seemed to be 3 or higher, and we haven't figured out why.
JW – If OE is acting funny, it could be worth switching to the RMSD code we used in the industry benchmarking project.
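(Aside: if the call in question is oechem.OERMSD, one plausible culprit is that its overlay argument defaults to False, so no superposition happens before the RMSD is measured. Whether that matches the workflow's actual call is an assumption; a hedged sketch of both options:)

```python
from openeye import oechem
from rdkit.Chem import rdMolAlign

def oe_rmsd_with_overlay(ref: oechem.OEMol, fit: oechem.OEMol) -> float:
    # Positional args: automorph=True (try symmetry-equivalent atom
    # mappings), heavyOnly=True, overlay=True (superpose fit onto ref
    # before measuring, instead of the False default).
    return oechem.OERMSD(ref, fit, True, True, True)

def rdkit_best_rmsd(ref, probe) -> float:
    # GetBestRMS aligns probe onto ref and accounts for symmetry;
    # one candidate replacement if we drop the OE call.
    return rdMolAlign.GetBestRMS(probe, ref)
```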
LW – Will this be compared to the old benchmarking script?
MT – I think it'll be really hard to get an exact match: PB was probably using an old JSON file to load QCA datasets, so I'd need to adjust the inputs to work with that. But a rough statistical check that the results are "close enough" could be the right approach here.
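(Aside: a rough sketch of what "close enough" could mean in practice, assuming each workflow emits an array of per-molecule metrics; numpy/scipy assumed, names and tolerances hypothetical.)

```python
import numpy as np
from scipy import stats

def roughly_matches(old: np.ndarray, new: np.ndarray,
                    rel_tol: float = 0.05, ks_alpha: float = 0.01) -> bool:
    """Loose check that two metric distributions agree, without
    requiring a molecule-by-molecule match between workflows."""
    # Medians within rel_tol of each other...
    medians_close = np.isclose(np.median(old), np.median(new), rtol=rel_tol)
    # ...and a two-sample KS test that doesn't reject "same distribution".
    _, p_value = stats.ks_2samp(old, new)
    return bool(medians_close) and p_value > ks_alpha
```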
BW – I was wondering what the old benchmarking workflow was like. How do things compare?
MT – It's not that the old code didn't work; rather, it was disorganized and I was largely unaware of what it did, so this exercise helps me learn what the fitting team actually does.
JW – In the future we may be able to make our fork of ForceBalance more stable and performant, but it'll need to be a multi-month effort.
LW – Until we have FB under our umbrella, it’ll be hard to push changes through.
BW – Yeah, I’d really like better error reporting. That’s a big problem for me.
LW – Agree
MT + JW – In openff-forcebalance, lots of unnecessary code has been removed. It's currently giving different numbers, but MT estimates that should be fixable. Not a priority right now.