Date
Participants
Goals
Determine a new procedure for selecting QM datasets for fitting (potentially for the May meeting release, if ready in time)
Divide up work to accomplish that
Notes
Coverage of current QM dataset (Su)
...
Talk with Jeff: Class structure? Script? Where does it live?
Action items
- Hyesu Jang to create a Confluence page listing all available datasets (with Jessica Maat, enlisting David Mobley as needed)
- Jessica Maat to develop a prototype notebook that takes a force field (FF), a set of molecules, and a target parameter (ID), and picks the five most diverse molecules using that parameter. It should also take an optional argument listing molecules to exclude, so that molecules already used in other sets can be skipped.
- Jessica Maat and Hyesu Jang to reach out to Jeffrey Wagner to discuss the architecture of the tools to be constructed, the plan for sustainability, and where the tools should live. [Scheduled this meeting for Wednesday, March 4, 10 am. -JM]
- Hyesu Jang to determine how to enumerate protonation states and tautomers without doing semiempirical calculations (to speed up set preparation), talking to Chaya Stern if needed or, if it can’t be done via that route, getting back to David Mobley for help with ideas. [Update from Chaya: “@Hyesu Jang, the states module in fragmenter generates reasonable protonation / tautomer states. It uses quacpac and does not need AM1 calculations so is fast. https://github.com/openforcefield/fragmenter/blob/master/fragmenter/states.py”]
- Jessica Maat and Hyesu Jang to come up with their goal timeline
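A rough sketch of what the prototype selection notebook might do is given here as a starting point for discussion. The notes do not define "diverse", so this assumes a greedy MaxMin pick over Tanimoto distances between per-molecule feature sets. The names `tanimoto_distance`, `pick_diverse`, `features`, and `parameter_ids` are hypothetical, as are the toy feature strings and parameter IDs. In a real notebook the feature sets would come from a cheminformatics toolkit and the parameter assignments from labeling each molecule with the FF. The first pick is simply the alphabetically first candidate, which is an arbitrary choice.

```python
from typing import Dict, FrozenSet, Iterable, List, Optional, Set


def tanimoto_distance(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Return 1 - |a & b| / |a | b|; two empty feature sets count as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def pick_diverse(
    features: Dict[str, FrozenSet[str]],
    parameter_ids: Dict[str, Set[str]],
    target_parameter: str,
    n_pick: int = 5,
    exclude: Optional[Iterable[str]] = None,
) -> List[str]:
    """Greedy MaxMin selection of up to n_pick molecules using target_parameter.

    features:       molecule name -> feature set (assumed stand-in for a fingerprint)
    parameter_ids:  molecule name -> IDs of FF parameters the molecule uses (assumed input)
    exclude:        molecule names to skip, e.g. ones already used in other sets
    """
    excluded = set(exclude or ())
    # Keep only molecules that use the target parameter and are not excluded.
    candidates = sorted(
        name
        for name in features
        if target_parameter in parameter_ids.get(name, set()) and name not in excluded
    )
    if not candidates:
        return []

    # Arbitrary seed: the alphabetically first candidate.
    picked = [candidates[0]]
    remaining = candidates[1:]
    while remaining and len(picked) < n_pick:
        # MaxMin step: take the molecule whose nearest already-picked
        # neighbour is farthest away.
        best = max(
            remaining,
            key=lambda m: min(
                tanimoto_distance(features[m], features[p]) for p in picked
            ),
        )
        picked.append(best)
        remaining.remove(best)
    return picked


# Toy inputs, invented purely for illustration.
features = {
    "ethanol": frozenset({"C", "O", "C-C", "C-O"}),
    "propanol": frozenset({"C", "O", "C-C", "C-O", "C-C-C"}),
    "methylamine": frozenset({"C", "N", "C-N"}),
    "benzene": frozenset({"c", "c:c"}),
}
parameter_ids = {
    "ethanol": {"t1", "b1"},
    "propanol": {"t1", "b1"},
    "methylamine": {"t1", "b2"},
    "benzene": {"b3"},
}

print(pick_diverse(features, parameter_ids, "t1", n_pick=2))
print(pick_diverse(features, parameter_ids, "t1", n_pick=2, exclude=["ethanol"]))
```

Passing the molecules already chosen for earlier sets as `exclude` lets the same function be rerun parameter by parameter without reselecting them. If fewer than `n_pick` molecules use the target parameter, all of them are returned.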
Decisions
- Decided to develop a systematic approach for selecting molecules for QM data generation and fitting, given a target dataset; this will be applied dataset by dataset to select new molecules for use in fitting
- Will attempt to select/redesign a new QM dataset for fitting rather than simply extending our prior QM dataset
- Decided on a tentative algorithm for the molecule selection approach