Date

Participants

Goals

Discussion topics

Dataset filtering

  • Maat and Jang need to write code that generates the training data for the next generation of fits

  • HJ – The approach will be to take a large dataset of compounds, cluster it, and pick the molecules that give the most coverage

  • JM – Open to a lot of different options for how to do this

  • JH – Filtering could become a component in the submission workflow

  • JM – This dataset would be used for the Sage fit

  • JW – Are we approaching a limit on how much data we can feed into ForceBalance?

    • JW + HJ – Let’s assume no upper limit for now

  • HJ – We will use the Roche set, the coverage set, and a new set

What will dataset look like?

  • We have 200 torsion terms in our FF and we’d want 5 scans for each torsion, so 200 × 5 = 1000 torsiondrives
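
One way to turn the "5 scans per torsion term" target into a molecule pick is a greedy coverage selection. The sketch below is only an illustration under stated assumptions, not the code Maat and Jang will write: the `coverage` mapping (molecule label to the set of torsion-term IDs it exercises), the function name `pick_molecules`, and the term labels are all hypothetical, and the clustering step is assumed to have happened upstream.

```python
from collections import Counter


def pick_molecules(coverage, scans_per_term=5):
    """Greedily pick molecules until each torsion term has enough scans.

    coverage: dict mapping a molecule label to the set of torsion-term IDs
              it exercises (hypothetical input; the notes do not say how
              this is computed).
    Returns (chosen, counts), where counts[term] is the number of chosen
    molecules that cover that term. Each molecule counts once per term.
    """
    counts = Counter()
    chosen = []
    remaining = dict(coverage)
    all_terms = set().union(*coverage.values()) if coverage else set()

    while remaining:
        # Terms that still have fewer scans than the target.
        needed = {t for t in all_terms if counts[t] < scans_per_term}
        if not needed:
            break
        # Pick the molecule covering the most still-needed terms.
        # Sorting the labels makes tie-breaking deterministic.
        best = max(sorted(remaining), key=lambda m: len(remaining[m] & needed))
        if not remaining[best] & needed:
            break  # no remaining molecule adds useful coverage
        chosen.append(best)
        for term in remaining[best]:
            counts[term] += 1
        del remaining[best]

    return chosen, counts
```

Terms that end up with fewer than `scans_per_term` entries in `counts` are the ones the candidate pool cannot cover, which is a hint that the "new set" needs more molecules for them.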


QCArchive submission



Timeline

HJ – Running an optimization takes ~1 day, and it sometimes has to be run 5 times because of trouble with the data.

JW – Assume 2 days per torsion, times 1000 torsions.
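
The estimate above works out to 2 × 1000 = 2000 torsion-days of compute. How that maps to calendar time depends on how many torsiondrives run at once, which the notes do not state. The sketch below makes that dependence explicit; `n_parallel` and the function name are assumptions introduced only for illustration.

```python
import math


def estimate_wall_days(n_torsions=1000, days_per_torsion=2.0, n_parallel=1):
    """Return (total compute in torsion-days, wall-clock days).

    n_parallel is a hypothetical number of torsiondrives running at the
    same time; the meeting notes do not give this figure.
    """
    if n_parallel < 1:
        raise ValueError("n_parallel must be at least 1")
    total = n_torsions * days_per_torsion
    # Torsions run in batches of n_parallel; each batch takes days_per_torsion.
    batches = math.ceil(n_torsions / n_parallel)
    return total, batches * days_per_torsion
```

For example, `estimate_wall_days(n_parallel=100)` gives 10 batches of 2 days each, so the March 20th target is only realistic with substantial parallel bandwidth.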

Action items

  •  Jeffrey Wagner will tell John Chodera NOT to submit the 50k dataset, or to submit it at LOW PRIORITY. We will need the bandwidth for this submission.
  •  Jessica Maat and Hyesu Jang will coordinate and work together on this in the coming weeks. The target date for submission is March 20th.
  •  Joshua Horton will make a checklist for pre-submission (bond orders requested, CMILES attached)

Submission checklist

  •  Ensure all submissions have CMILES attached; the most important entry is the mapped hydrogen SMILES.
  •  Ensure the WBO (Wiberg bond order) is requested for all submissions. This is done by including the flag wiberg_lowdin_indices in the scf properties list.
  •  If any calculations from another collection are to be redone, re-use the old input (coordinates, atom ordering, etc.). This avoids running the calculation again: the database just creates new references to the old results, which should help keep the cost of the calculations down.
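
The first two checklist items can be checked mechanically before submitting. The sketch below is a minimal illustration, not the actual pre-submission tooling: it assumes each entry is a plain dict with a `cmiles` dict and an `scf_properties` list (a hypothetical layout), and it assumes the mapped hydrogen SMILES is stored under the key `canonical_isomeric_explicit_hydrogen_mapped_smiles`.

```python
# Assumed key for the mapped hydrogen SMILES inside the CMILES dict.
MAPPED_SMILES_KEY = "canonical_isomeric_explicit_hydrogen_mapped_smiles"
# Flag named in the checklist for requesting Wiberg bond orders.
WBO_FLAG = "wiberg_lowdin_indices"


def check_entry(entry):
    """Return a list of checklist problems for one submission entry.

    `entry` is a hypothetical plain dict with:
      entry["cmiles"]         -> dict of CMILES identifiers
      entry["scf_properties"] -> list of requested SCF properties
    An empty list means the entry passes both checks.
    """
    problems = []
    cmiles = entry.get("cmiles") or {}
    if not cmiles:
        problems.append("no CMILES attached")
    elif not cmiles.get(MAPPED_SMILES_KEY):
        problems.append("CMILES lacks the mapped hydrogen SMILES")
    if WBO_FLAG not in (entry.get("scf_properties") or []):
        problems.append("WBO not requested (missing wiberg_lowdin_indices)")
    return problems
```

Running `check_entry` over every entry and refusing to submit while any list is non-empty would cover the first two items. The third item (re-using old inputs) depends on how the earlier collection is stored and is not covered here.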

Decisions

  • Jeff approves this being a pile of spaghetti code given time constraints