Date

Participants

Goals

Discussion topics

Time

Item

Presenter

Notes

Dataset filtering

  • Maat and Jang need to write code to generate training data for the next generation of fits

  • HJ – The approach will be to take a large dataset of compounds, cluster it, and pick molecules that provide a lot of coverage

  • JM – Open to a lot of different options for how to do this

  • JH – Filtering could become a component in the submission workflow

  • JM – This dataset would be used for the Sage fit

  • JW – Are we approaching a limit on how much data we can feed into ForceBalance?

    • JW + HJ – Let’s assume no upper limit for now

  • HJ – We will use the Roche set, the coverage set, and a new set

What will the dataset look like?

  • We have 200 torsion terms in our FF and would want 5 scans for each torsion, so 1000 torsiondrives in total
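A minimal sketch of how the "5 scans per torsion term" target could drive molecule selection. It uses a greedy coverage pick, not the clustering step mentioned above, and everything in it is an assumption for illustration: the function name `pick_scans`, the candidate labels, and the idea that each candidate scan is already annotated with the set of torsion terms it exercises.

```python
from collections import Counter

# Numbers taken from the notes: 200 torsion terms, 5 scans each.
TORSION_TERMS = 200
SCANS_PER_TERM = 5
TARGET_TORSIONDRIVES = TORSION_TERMS * SCANS_PER_TERM  # 1000


def pick_scans(candidates, scans_per_term=SCANS_PER_TERM):
    """Greedily pick candidate scans until every torsion term is covered
    `scans_per_term` times, or until no remaining candidate adds coverage.

    candidates: dict mapping a (hypothetical) scan label to the set of
                torsion term IDs that scan exercises.
    Returns (chosen labels in pick order, dict of under-covered terms).
    """
    counts = Counter()
    remaining = dict(candidates)
    all_terms = set().union(*candidates.values()) if candidates else set()
    chosen = []

    def gain(label):
        # Number of still under-covered terms this candidate would help.
        return sum(1 for t in remaining[label] if counts[t] < scans_per_term)

    while remaining:
        # sorted() makes tie-breaking deterministic (alphabetical).
        best = max(sorted(remaining), key=gain)
        if gain(best) == 0:
            break
        chosen.append(best)
        counts.update(remaining.pop(best))

    short = {t: counts[t] for t in all_terms if counts[t] < scans_per_term}
    return chosen, short
```

The second return value lists terms that could not reach the target from the available candidates, which is where a new set of molecules would be needed.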


QCArchive submission



Timeline

HJ – Running an optimization takes ~1 day; sometimes it needs to be run 5 times because of problems with the data.

JW – Assume 2 days per torsion, times 1000 torsions.
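A rough sketch of the arithmetic behind this estimate. The 2 days per torsion and 1000 torsions come from the notes; `parallel_slots` (how many torsiondrives run at once) is a hypothetical parameter, since no compute capacity was recorded here.

```python
import math


def wall_clock_days(n_torsiondrives=1000, days_per_torsion=2.0, parallel_slots=1):
    """Return (total compute-days, idealized wall-clock days).

    Assumes every torsiondrive takes the same time and that
    `parallel_slots` of them can run simultaneously with no queueing.
    """
    total = n_torsiondrives * days_per_torsion
    # Torsiondrives run in batches of `parallel_slots`.
    batches = math.ceil(n_torsiondrives / parallel_slots)
    return total, batches * days_per_torsion


# Run serially, the estimate is 2000 days; with 100 slots it is 20 days.
serial = wall_clock_days()
with_100_slots = wall_clock_days(parallel_slots=100)
```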

Action items

Submission checklist

  • Ensure all submissions have CMILES identifiers; the most important is the mapped SMILES with explicit hydrogens
  • Ensure the WBO is requested for all submissions by including the flag wiberg_lowdin_indices in the scf properties list
  • If any calculations from another collection are to be redone, re-use the old input (coordinates, atom ordering, etc.). This avoids running the calculation again: the submission only creates new references in the database to the old results, which should help keep the cost of the calculations down.
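The first two checklist items could be checked automatically before submitting. The sketch below is not the QCArchive API; it works on plain dictionaries with a hypothetical layout (`entries[name]["attributes"]`), and the mapped SMILES key name is an assumption about how the CMILES identifiers are stored.

```python
# Assumed key for the mapped, explicit-hydrogen SMILES in the CMILES identifiers.
MAPPED_SMILES_KEY = "canonical_isomeric_explicit_hydrogen_mapped_smiles"
# Flag named in the checklist for requesting Wiberg bond orders.
WBO_FLAG = "wiberg_lowdin_indices"


def check_submission(entries, scf_properties):
    """Return a list of human-readable problems; an empty list means OK.

    entries: dict mapping entry name -> {"attributes": {...}} (hypothetical layout).
    scf_properties: list of requested SCF property flags.
    """
    problems = []
    if WBO_FLAG not in scf_properties:
        problems.append(f"scf properties list is missing '{WBO_FLAG}'")
    for name, entry in entries.items():
        attributes = entry.get("attributes", {})
        if not attributes.get(MAPPED_SMILES_KEY):
            problems.append(f"{name}: missing {MAPPED_SMILES_KEY}")
    return problems
```

Filtering, if it becomes a component of the submission workflow, could run a check like this as its last step.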

Decisions

  • Jeff approves this being a pile of spaghetti code given time constraints