
Date

Participants

Discussion topics

Item

Notes

Fitting mols

  • Convert all 1.2.0 release tarball mol2s to SMI, then those to PDF

    • HJ – We wouldn’t have used ALL of the torsions in the structures in our torsiondrives.

      • Could highlight driven torsions, e.g. the mapped atoms in

          "entry_label": "c1cc[c:1](cc1)[CH2:2][N:3]2[CH2:4]CCCC2=O",

        in fb-fit/targets/td_OpenFF_Gen_2_Torsion_Set_1_Roche_2_020_C12H15NO/metadata.json

    • Will all vibration frequencies be present in optimizations?

      • Unlikely

  • Separate by data/QC job type?

    • Make two pdf and smi files – One with all unique molecules (period), and another with molecules separated by data type, and highlighting driven torsions.

      • Let’s take the low-hanging fruit first (just list all unique SMILES and 2D structures), and get more elaborate if people ask. Scrolling through ~20 pages of highlighted torsions isn’t feasible anyway.

  • Identifier in PDF, which is also attached to SMILES

    • Name? SMILES itself?

    • Try doing SMILES in really small font, and only have three or four rows, so that people can Ctrl-F for their molecule of interest in the PDF.

  • What’s the best way to make benchmarking 3D structures available?

    • The set is hard to wrangle (due to overlap with fitting data), so let’s not worry about 3D structures initially
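A minimal sketch of the two listings discussed above: all unique SMILES, and SMILES grouped by data type, plus pulling the driven torsion out of an entry_label. It assumes the SMILES have already been canonicalized by a cheminformatics toolkit (plain string comparison is not real molecule identity), that atoms mapped :1 to :4 in entry_label mark the driven torsion, and that labels only use bracket atoms and the organic subset. The record layout, data type names, and function names are hypothetical, not taken from the release tarball.

```python
import re
from collections import defaultdict

# One SMILES atom: a bracket atom, a two-letter halogen, or an organic-subset
# atom. Bonds, ring-closure digits and parentheses are skipped, so the order
# of matches is the order of atoms in the string.
ATOM = re.compile(r"\[[^\]]*\]|Br|Cl|[BCNOPSFI]|[bcnops]")
MAP_NUM = re.compile(r":(\d+)\]$")


def driven_torsion_atoms(entry_label):
    """Return 0-based atom indices of the mapped atoms, ordered by map number."""
    by_map = {}
    for index, match in enumerate(ATOM.finditer(entry_label)):
        found = MAP_NUM.search(match.group())
        if found:
            by_map[int(found.group(1))] = index
    return tuple(by_map[n] for n in sorted(by_map))


def unique_smiles(records):
    """All distinct SMILES strings, in first-seen order."""
    return list(dict.fromkeys(r["smiles"] for r in records))


def by_data_type(records):
    """Map each data type to its distinct SMILES strings, in first-seen order."""
    groups = defaultdict(dict)
    for r in records:
        groups[r["data_type"]][r["smiles"]] = None
    return {kind: list(smiles) for kind, smiles in groups.items()}


# Illustrative records only; real ones would come from the converted mol2 files.
records = [
    {"smiles": "CCO", "data_type": "optimization"},
    {"smiles": "CCO", "data_type": "torsiondrive"},
    {"smiles": "c1ccccc1", "data_type": "optimization"},
]

label = "c1cc[c:1](cc1)[CH2:2][N:3]2[CH2:4]CCCC2=O"
print(driven_torsion_atoms(label))  # -> (3, 6, 7, 8)
print(unique_smiles(records))
print(by_data_type(records))
```

The regex tokenizer is only there to keep the sketch dependency-free; with a toolkit loaded, reading the atom map numbers from the parsed molecule would be more robust.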

Benchmarking mols

  • Just repost Hahn’s molecules

    • Are these the same as in benchmarkff?

Provenance?

  • Record process for converting 1.2.0 release package to SMI/PDF

    • Where? How?

      • Versions: at least a conda env export

      • List steps in the website PR? Paste script?

      • Upload script/smiles/pdf of this set to release assets?

        • No. Those should be immutable.

      • Attach to a new release of one of our repos? New repo?

      • How do we handle stuff like this? It’s always done ad hoc

      • We NEED some simple, consistent dataset guidelines!!! Uncertainty here adds a ton of effort and builds institutional debt!!!

  • Version Hahn’s molecules in case we update dataset?
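One possible way to capture the provenance discussed above, as a sketch rather than an agreed process: store the conversion script's SHA-256 next to the output of conda env export in a small JSON file that can travel with the SMI/PDF. The function names, the JSON keys, and the file names in the usage note are hypothetical; where the file ends up (website PR, new release, new repo) is still the open question from the notes.

```python
import hashlib
import json
import subprocess
from pathlib import Path


def collect_provenance(script_path, env_cmd=("conda", "env", "export")):
    """Bundle a script's SHA-256 with the text output of an environment export."""
    script = Path(script_path)
    try:
        result = subprocess.run(
            env_cmd, capture_output=True, text=True, check=True
        )
        environment = result.stdout
    except (FileNotFoundError, subprocess.CalledProcessError):
        # Command missing or export failed: record that explicitly as None.
        environment = None
    return {
        "script": script.name,
        "script_sha256": hashlib.sha256(script.read_bytes()).hexdigest(),
        "environment": environment,
    }


def write_provenance(record, out_path):
    """Write the provenance record as indented JSON."""
    Path(out_path).write_text(json.dumps(record, indent=2) + "\n")
```

Usage would look like write_provenance(collect_provenance("convert_mols.py"), "provenance.json"), run from the same conda environment that produced the SMI and PDF files (both file names are placeholders).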

Future plans

  • Make this part of the release checklist in the future?

  • Replacing mol2 with SDF in future FB fits?

    • HJ will look into this

Final table for upload (with links)

Dataset | PDF | SMILES | Structures

Training | Add to 1.2.0 release assets and link | Add to 1.2.0 release assets and link | Point to release tarball

1.2.0 release benchmarking | Need HJ to confirm, or could run the same script on the release-1-benchmarking tarball | Need HJ to confirm, or could run the same script on the release-1-benchmarking tarball | release-1-benchmarking tarball https://github.com/openforcefield/release-1-benchmarking/releases/tag/v1.0.0

This round of benchmarking | Hahn’s pdf (is this the same as PR pdf? Could link to that) | Hahn’s smi (is this the same as PR smi? Could link to that) | The SDF from this PR? Are molecules collatable, i.e. is it possible to reconstruct relationships/relative energies? molecules/set_v02_six_ffs/trim2_full_qcarchive.sdf
