QM Data Selection

We’re planning a new round of QM dataset selection for Parsley minor releases and potentially subsequent force fields. In particular, rather than simply fitting to the first or most appealing QM data we have on hand, we want our datasets to meet three goals, in order of priority:

  1. All parameters are used at least once (“any” coverage)

  2. All parameters are used at least five times (“reasonable” coverage)

  3. Parameters are used in diverse chemical environments
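
The first two goals can be checked with a simple tally. The sketch below is illustrative only: the molecule-to-parameter mapping, the parameter IDs, and the function name coverage_report are hypothetical, and it assumes "used" means "assigned to at least one molecule in the dataset" (counting molecules, not individual occurrences within a molecule).

```python
from collections import Counter

# Hypothetical input: molecule identifier -> set of force field parameter IDs
# assigned to that molecule. In practice this would come from parameterizing
# each candidate molecule; the IDs here are made up for illustration.
assigned = {
    "CCO": {"b1", "b2", "a1", "t1"},
    "CCN": {"b1", "b3", "a1", "t2"},
    "c1ccccc1": {"b4", "a2", "t3"},
}

# Hypothetical full list of parameters in the force field.
all_parameters = {"b1", "b2", "b3", "b4", "b5", "a1", "a2", "t1", "t2", "t3"}


def coverage_report(assigned, all_parameters, reasonable=5):
    """Sort parameters into 'none', 'any', and 'reasonable' coverage.

    A parameter's count is the number of molecules that use it
    (an assumption; counting every occurrence is another option).
    """
    counts = Counter()
    for params in assigned.values():
        counts.update(params)

    report = {"none": [], "any": [], "reasonable": []}
    for pid in sorted(all_parameters):
        n = counts[pid]
        if n == 0:
            report["none"].append(pid)
        elif n < reasonable:
            report["any"].append(pid)
        else:
            report["reasonable"].append(pid)
    return report


report = coverage_report(assigned, all_parameters)
print(report)
```

Parameters listed under "none" fail goal 1, and those under "any" meet goal 1 but not goal 2. Goal 3 (diverse chemical environments) is not captured by counts alone and would need a separate measure.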

This means designing a systematic procedure for selecting molecules for QM optimization/scanning from the datasets available to us (and, in some cases, finding unusual chemistries that are not in the datasets we have on hand).
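
One possible shape for such a procedure is a greedy selection: repeatedly pick the candidate molecule that contributes to the most parameters still below the target count. This is a minimal sketch under stated assumptions, not the procedure decided on at the meeting; the function name select_molecules, the candidate names, and the parameter IDs are all hypothetical.

```python
def select_molecules(candidates, all_parameters, target=5):
    """Greedily pick molecules until every parameter is used `target` times.

    candidates: dict mapping a molecule identifier to the set of parameter
        IDs it exercises (hypothetical input format).
    Returns the selected molecules in pick order, plus the parameters that
    could not reach the target from the available candidates.
    """
    remaining = {pid: target for pid in all_parameters}  # uses still needed
    pool = dict(candidates)
    selected = []

    def gain(name):
        # Number of still-undercovered parameters this molecule would help.
        return sum(1 for pid in pool[name] if remaining.get(pid, 0) > 0)

    while pool:
        # Iterate in sorted order so ties are broken deterministically.
        best = max(sorted(pool), key=gain)
        if gain(best) == 0:
            break  # no remaining candidate improves coverage
        selected.append(best)
        for pid in pool.pop(best):
            if remaining.get(pid, 0) > 0:
                remaining[pid] -= 1

    uncovered = sorted(pid for pid, need in remaining.items() if need > 0)
    return selected, uncovered


# Toy example with made-up molecules and parameters, using target=1.
toy = {"A": {"p1", "p2"}, "B": {"p2"}, "C": {"p3"}}
picked, missing = select_molecules(toy, {"p1", "p2", "p3", "p4"}, target=1)
print(picked, missing)
```

Parameters returned as uncovered are the ones for which chemistries outside the available datasets would have to be found. The greedy criterion addresses only the coverage counts; it does not by itself favor diverse chemical environments.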

We met to plan this on Feb. 25, 2020; key decisions and notes are here: https://openforcefield.atlassian.net/l/c/wHiQgmWR

Detailed plans will be made in QM Datasets (archived; Data space).

Key personnel:

  • @Jessica Maat (Deactivated)

  • @Hyesu Jang

Supervising/assisting:

  • @Lee-Ping Wang

  • @David Mobley

 

Relevant meeting notes:
