Contributors: David Mobley, Lee-Ping Wang, Hyesu Jang, Jeff Wagner, Chris Bayly, Josh Horton, Chaya Stern, Jessica Maat
Background: The Open Force Field Initiative is developing training data sets for force field optimization using a fingerprint and clustering method. The goal of this project is to select chemically diverse molecules from a range of data sets so that the training data surveys a large chemical space for our May release force field.
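As a rough illustration of what a fingerprint and clustering selection can look like, the sketch below clusters toy fingerprints by Tanimoto similarity and keeps one representative per cluster. The notes do not say which fingerprint type, similarity metric, clustering algorithm, or threshold the project uses, so everything here is an assumption: the fingerprints are made-up sets of on-bits, and `tanimoto`, `leader_cluster`, and the 0.5 threshold are hypothetical names and values chosen only for the example.

```python
def tanimoto(a, b):
    """Tanimoto similarity between two fingerprints stored as sets of on-bits."""
    if not a and not b:
        return 1.0  # two empty fingerprints are treated as identical
    return len(a & b) / len(a | b)


def leader_cluster(fingerprints, threshold=0.5):
    """Greedy leader clustering (an assumed stand-in for the real method).

    Each fingerprint joins the first cluster whose representative (the
    first member) is at least `threshold` similar; otherwise it starts
    a new cluster. Returns a list of clusters, each a list of indices.
    """
    clusters = []
    for i, fp in enumerate(fingerprints):
        for members in clusters:
            if tanimoto(fp, fingerprints[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            # No existing representative was similar enough.
            clusters.append([i])
    return clusters


# Toy fingerprints (hypothetical on-bit sets, not real molecules).
fps = [{1, 2, 3, 4}, {1, 2, 3, 5}, {10, 11, 12}, {10, 11, 13}, {20}]

clusters = leader_cluster(fps, threshold=0.5)

# One representative per cluster gives a small, diverse subset.
representatives = [members[0] for members in clusters]
```

Picking the first member as the representative keeps the example short; a real workflow might instead pick the cluster centroid or apply further filters before selection.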
Aim: The aim of this sub-experiment is to limit the number of conformers in a patented data set from Bayer.
Problem: The Bayer set contains large, flexible drug molecules with 12 to 30 heavy atoms. Current fingerprint and clustering methods result in
Approach:
Conclusion: