Time

Item

Presenter

Notes

Dataset filtering

  • Maat and Jang need to write code that generates the training data for the next generation of fitting

  • HJ – Will be based on taking a large dataset of compounds, clustering them, and picking the molecules that provide the most coverage

  • JM – Open to a lot of different options for how to do this

  • JH – Filtering could become a component in the submission workflow

  • JM – This dataset would be used for the Sage fit

  • JW – Are we approaching a limit on how much data we can feed into ForceBalance?

    • JW + HJ – Let’s assume no upper limit for now

  • HJ – We will use the Roche set, the coverage set, and the new set
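The "pick molecules with a lot of coverage" step above could be sketched as a greedy selection. This is only an illustration under stated assumptions: it skips the clustering step, and the `coverage` mapping (molecule name to the set of torsion parameter IDs it exercises), the function name, and the example data are all hypothetical, not the code Maat and Jang will write.

```python
from typing import Dict, List, Set


def pick_by_coverage(coverage: Dict[str, Set[str]], max_molecules: int) -> List[str]:
    """Greedily pick molecules that add the most not-yet-covered parameters."""
    covered: Set[str] = set()
    picked: List[str] = []
    remaining = dict(coverage)
    while remaining and len(picked) < max_molecules:
        # Iterate over sorted names so ties are broken deterministically.
        best = max(sorted(remaining), key=lambda name: len(remaining[name] - covered))
        if not remaining[best] - covered:
            break  # no remaining molecule adds new coverage
        picked.append(best)
        covered |= remaining.pop(best)
    return picked


# Hypothetical molecules and torsion parameter IDs, for illustration only.
example = {
    "mol_a": {"t1", "t2", "t3"},
    "mol_b": {"t2", "t3"},
    "mol_c": {"t4"},
}
selected = pick_by_coverage(example, max_molecules=2)
print(selected)
```

Greedy selection is not guaranteed to find the smallest covering set, but it is simple and would slot naturally into a filtering step in the submission workflow.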

What will the dataset look like?

  • We have 200 torsion terms in our FF and we’d want 5 scans for each torsion, so 1000 torsiondrives in total
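A minimal sketch of how the 5-scans-per-torsion target could be tracked while assembling the dataset. The `(scan_id, torsion_parameter_id)` pair format, the function names, and the example IDs are assumptions made for illustration; only the numbers 200 and 5 come from the notes.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

N_TORSION_TERMS = 200  # from the notes
SCANS_PER_TERM = 5     # from the notes


def select_scans(candidates: Iterable[Tuple[str, str]],
                 per_term: int = SCANS_PER_TERM) -> Dict[str, List[str]]:
    """Keep at most `per_term` scans per torsion term, in the order given."""
    chosen: Dict[str, List[str]] = defaultdict(list)
    for scan_id, term in candidates:
        if len(chosen[term]) < per_term:
            chosen[term].append(scan_id)
    return dict(chosen)


def shortfall(chosen: Dict[str, List[str]], all_terms: Iterable[str],
              per_term: int = SCANS_PER_TERM) -> Dict[str, int]:
    """Report how many scans each under-covered torsion term still needs."""
    missing = {}
    for term in all_terms:
        have = len(chosen.get(term, []))
        if have < per_term:
            missing[term] = per_term - have
    return missing


# Hypothetical candidates: six scans for t1, two for t2, none for t3.
candidates = [(f"scan_{i}", "t1") for i in range(6)] + [("scan_6", "t2"), ("scan_7", "t2")]
chosen = select_scans(candidates)
gaps = shortfall(chosen, ["t1", "t2", "t3"])
total_torsiondrives = N_TORSION_TERMS * SCANS_PER_TERM
print(gaps, total_torsiondrives)
```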


QCArchive submission



Timeline

HJ – Running an optimization takes ~1 day, and it sometimes needs to be run 5 times because of problems with the data.

JW – Assume 2 days per torsion, times 1000 torsions.
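The estimate above can be turned into a rough wall-clock figure. The sketch below assumes that torsions run independently and that each worker handles one torsion at a time; the worker counts are hypothetical and were not discussed in the meeting.

```python
import math


def wall_clock_days(n_torsions: int, days_per_torsion: float, n_parallel: int) -> float:
    """Wall-clock days if n_parallel torsions run at once, one per worker."""
    if n_parallel < 1:
        raise ValueError("n_parallel must be at least 1")
    batches = math.ceil(n_torsions / n_parallel)
    return batches * days_per_torsion


# 1000 torsions at 2 days each, as in the notes.
serial_days = wall_clock_days(1000, 2, n_parallel=1)
# Hypothetical: 100 torsions running concurrently.
parallel_days = wall_clock_days(1000, 2, n_parallel=100)
print(serial_days, parallel_days)
```

The serial figure is the total compute (2000 torsion-days). The actual turnaround depends on how much bandwidth is available, which is why competing submissions matter.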

Action items

  •  Jeffrey Wagner will tell John Chodera NOT to submit the 50k dataset, or to submit it at LOW PRIORITY. We will need the bandwidth for this submission
  •  Jessica Maat and Hyesu Jang will work together on this in the coming weeks. The target date for submission is March 20th
  •  Joshua Horton will make a checklist for pre-submission (bond orders requested, CMILES attached)
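
One possible shape for an automated version of the pre-submission checklist, covering only the two items named above. The entry format (a plain dict with `cmiles` and `bond_orders_requested` keys) is hypothetical and is not the QCArchive schema; a real check would read these fields from the actual submission objects.

```python
from typing import Dict, List


def presubmission_problems(entry: Dict[str, object]) -> List[str]:
    """Return the checklist items that a single entry fails."""
    problems: List[str] = []
    if not entry.get("cmiles"):
        problems.append("CMILES not attached")
    if not entry.get("bond_orders_requested", False):
        problems.append("bond orders not requested")
    return problems


# Hypothetical entries, for illustration only.
good_entry = {"name": "mol_a", "cmiles": "[H:1][C:2]...", "bond_orders_requested": True}
bad_entry = {"name": "mol_b", "cmiles": "", "bond_orders_requested": False}

good_report = presubmission_problems(good_entry)
bad_report = presubmission_problems(bad_entry)
print(good_report, bad_report)
```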
