JH – Prototype is nearly complete. There’s a bit of jank, where a process has to babysit the QCA snowflake and error cycle.
JH – Did some binning to try to get good or bad torsion profiles. But the logic had trouble separating the good from bad.
DM – Probably better to move quickly rather than worrying about miscategorizing a few cases. It seems inevitable that a really good binning system will require a lot of human review.
JH – I’ll send out a summary of what I’m finding now, and which molecules/parameters are good/bad
JH – Not sure how FB internals work. Is there preference toward matching minima? Is there an energy cutoff?
DM – Yes, but not sure about the specific details.
DM – Basically, matching minima is most important, followed by getting low barrier heights right, and making sure impassably large barriers remain impassable.
JA – So barrier heights near kT are important, anything much higher is less important.
JA – Looking at these plots, the RMSD may be covering up some details, so it may be better to check out the maximum deviation for each molecule.
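(Editor's sketch) To illustrate JA's point about RMSD hiding detail, the snippet below computes both the RMSD and the maximum absolute deviation between a QM and an MM torsion profile. The function name `profile_deviation`, the choice to shift each profile to its own minimum, and the example energies are assumptions made for illustration, not the prototype's actual code.

```python
import numpy as np


def profile_deviation(qm_energies, mm_energies):
    """Return (rmsd, max_abs_deviation) between two torsion profiles.

    Both profiles are shifted so their own minimum is zero before
    comparison. That alignment choice is an assumption of this sketch.
    """
    qm = np.asarray(qm_energies, dtype=float)
    mm = np.asarray(mm_energies, dtype=float)
    if qm.shape != mm.shape:
        raise ValueError("profiles must be sampled on the same grid")
    diff = (mm - mm.min()) - (qm - qm.min())
    rmsd = float(np.sqrt(np.mean(diff ** 2)))
    max_dev = float(np.max(np.abs(diff)))
    return rmsd, max_dev


# Hypothetical 24-point scan: MM agrees everywhere except one grid point.
qm = np.zeros(24)
mm = np.zeros(24)
mm[5] = 4.0
rmsd, max_dev = profile_deviation(qm, mm)
print(f"RMSD = {rmsd:.3f}, max deviation = {max_dev:.3f}")
```

In this made-up case a single badly wrong point gives an RMSD under 1 while the maximum deviation is 4, which is the kind of detail a per-molecule RMSD would smooth over.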
DM – When I look by eye, I largely want to see if the “shape is the same”. So if there’s a different number of peaks or minima, then I suspect that sterics are causing trouble. In those cases, we would want to avoid including those high-steric-energy structures in the torsion fitting.
JA – Some measure of curvature may be the best way to go here, and deconvoluting the energy contributions.
JH – Yes, I think that signed curvature is a useful metric here.
DM – What logic is applied to the curvature?
JH – We do an allclose comparison between the two arrays.
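(Editor's sketch) One way the signed-curvature check could look, assuming a periodic scan on an evenly spaced grid. The curvature is a central finite difference, and the sign pattern of the QM and MM curvature arrays is compared with `np.allclose`. Comparing signs rather than raw values, the 15 degree spacing, and the `flat_tol` dead band are all assumptions here; the notes do not say which arrays or tolerances the prototype uses.

```python
import numpy as np


def signed_curvature(energies, spacing_deg=15.0):
    """Central-difference second derivative on a periodic torsion grid."""
    e = np.asarray(energies, dtype=float)
    return (np.roll(e, -1) - 2.0 * e + np.roll(e, 1)) / spacing_deg ** 2


def same_shape(qm_energies, mm_energies, spacing_deg=15.0, flat_tol=1e-4):
    """True if both profiles have the same curvature sign at every point.

    Curvatures with magnitude below flat_tol are treated as zero so that
    numerical noise at inflection points does not flip the sign.
    """
    def signs(energies):
        c = signed_curvature(energies, spacing_deg)
        return np.where(np.abs(c) < flat_tol, 0.0, np.sign(c))

    return bool(np.allclose(signs(qm_energies), signs(mm_energies)))


# Hypothetical profiles: threefold QM, a scaled threefold MM, a twofold MM.
theta = np.deg2rad(np.arange(-180, 180, 15))
qm = 1.0 + np.cos(3 * theta)
mm_scaled = 0.8 * (1.0 + np.cos(3 * theta))
mm_twofold = 1.0 + np.cos(2 * theta)

matches_scaled = same_shape(qm, mm_scaled)
matches_twofold = same_shape(qm, mm_twofold)
```

The scaled profile has the same peaks and minima, so it passes; the twofold profile has a different number of minima and fails, matching DM's "shape is the same" criterion.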
DM – It may be good to downweight points based on extremely nonsmooth points in the MM energy landscape
JH – Concerned about waste and miscategorization close to the cutoff – Lots of good stuff could be thrown out near the barrier
JA – It’d be good to see the steric energies on the plots.
JW – We could list all possible directions and try to sort to find the ones with the highest returns first
DM – Point filter could be a generic “black box” function that we implement minimally now, and future refinements could just work on improving this box.
(General) – In the bespoke workflow, could use a similar black box function to look at the QM torsion scan and see if some points should be removed.
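(Editor's sketch) A minimal version of the "black box" point filter: any callable that takes per-point values and returns a boolean keep-mask, so the rule inside can be swapped out later without touching the callers. The names `PointFilter`, `steric_cutoff_filter`, and `apply_point_filter`, the 10 kcal/mol cutoff, and the example numbers are hypothetical.

```python
from typing import Callable

import numpy as np

# A point filter maps per-point values to a boolean mask (True = keep).
PointFilter = Callable[[np.ndarray], np.ndarray]


def steric_cutoff_filter(cutoff: float = 10.0) -> PointFilter:
    """Keep points whose steric energy is within cutoff of the lowest one."""
    def _filter(steric_energies: np.ndarray) -> np.ndarray:
        steric = np.asarray(steric_energies, dtype=float)
        return (steric - steric.min()) <= cutoff

    return _filter


def apply_point_filter(angles, energies, steric_energies, point_filter):
    """Return the angles and energies that the filter keeps."""
    keep = point_filter(np.asarray(steric_energies, dtype=float))
    return np.asarray(angles)[keep], np.asarray(energies)[keep]


# Hypothetical scan with one sterically strained grid point at 180 degrees.
angles = [0, 90, 180, 270]
energies = [0.0, 1.0, 2.0, 3.0]
steric = [0.0, 2.0, 50.0, 3.0]
kept_angles, kept_energies = apply_point_filter(
    angles, energies, steric, steric_cutoff_filter(cutoff=10.0)
)
```

A hard cutoff like this is exactly where JH's concern about miscategorization near the threshold applies, so a later refinement could replace it with something softer behind the same interface.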
Bespoke workflow architecture
4 processes:
bespoke.py
ForceBalance
Archive manager
Archive worker manager
Archive worker(s)
ANI jobs take 30 seconds each
Conda environment should be consistent – Could put package up on omnia/label/rc
Chemper utility/future?
JW – Would we imagine using bespoke fits to all torsion drives followed by Chemper collapsing to make a FF?
DM – Two shortcomings of that would be 1) we don’t have complete coverage of chemical space and 2) we’d need to develop a metric of when two parameters are “close enough” to be merged.
JH – How should I evaluate whether bespoke fits are improving the models?
JH – Previously we did conformer energy ranking, coupled with high temperature MD.
DM – Could also optimize geometries that didn’t appear in scan and see if new minima are correctly ranked
DM – Binding free energies are the ultimate goal, but it’ll be hard to use that as a benchmark, and may be misleading. Could talk to JC or HBM to try putting bespoke fits in place of where they’ve used ANI or other FFs in their benchmarking.