2020-03-05 Force Field Release meeting notes
Date
Mar 5, 2020
Participants
@Hyesu Jang
@David Mobley
@Lee-Ping Wang
@Jessica Maat (Deactivated)
@Christopher Bayly
@Simon Boothroyd
@Owen Madin
@Daniel Smith (Deactivated)
Discussion topics
Time | Item | Notes |
---|---|---|
10 min | QM training set generation strategy | |
Meeting Summary
1. Discussion about clustering methods
The tree fingerprint, a 2D molecular similarity measure, has been used for the current validation set;
CIB suggested LINGO, a method that calculates intermolecular similarity directly from SMILES strings;
CIB: One concern with graph-based methods is that they can be too localized. Different scoring methods may be needed.
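For reference, the LINGO idea mentioned above can be sketched in a few lines of Python. This is a simplified illustration, not the implementation used by any toolkit: the function names `lingos` and `lingo_similarity` are hypothetical, the substring length `q=4` follows the commonly cited default, and the normalization (replacing every digit with `0`) is an assumption that also alters digits in charges and isotope labels.

```python
import re
from collections import Counter


def lingos(smiles: str, q: int = 4) -> Counter:
    """Count the overlapping length-q substrings (LINGOs) of a SMILES string."""
    # Assumed normalization: map every digit to "0" so that ring-closure
    # numbering does not affect the comparison. This is a simplification.
    normalized = re.sub(r"\d", "0", smiles)
    return Counter(normalized[i:i + q] for i in range(len(normalized) - q + 1))


def lingo_similarity(smiles_a: str, smiles_b: str, q: int = 4) -> float:
    """Tanimoto-like similarity averaged over all LINGOs found in either string."""
    a, b = lingos(smiles_a, q), lingos(smiles_b, q)
    keys = set(a) | set(b)
    if not keys:
        # Both strings are shorter than q, so there is nothing to compare.
        return 0.0
    # Each LINGO contributes 1 when its counts match and 0 when it is
    # present in only one of the two molecules.
    total = sum(1 - abs(a[k] - b[k]) / (a[k] + b[k]) for k in keys)
    return total / len(keys)


if __name__ == "__main__":
    print(lingo_similarity("CCCCO", "CCCCN"))
```

Because the score depends only on the SMILES text, the two molecules should be written in the same canonical SMILES form before comparison.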
2. Training set and validation set
A diverse training set will inform the generality of the input typing, and a diverse validation set will let us validate how general our parameter set is;
While focusing on training set generation, we should also consider how to generate the validation set;
Moving troublesome molecules from the validation set into the training set for the next iteration is one strategy we may consider.