2024-07-17 BW/LW Check-in

Participants

  • @Brent Westbrook

  • @Lily Wang

Goals

  •  

Discussion topics

Item

Presenter

Notes

General updates and discussion on projects

 

  • Torsion fitting

    • fit finished – hpc3 is down today, hopefully will have benchmarking tomorrow

  • smee

    • Got TDs loaded, starting step 2

    • Probably expecting better results compared to just training to opt dataset, but worse than SPICE.

    • Around 13k (just 5k with opt)

  • lipids

    • ready to put through QCA submission/s – 72 molecules

      • TD – use 15° for now

    • Have the SMILES, just finishing up TD

    • One of the alkane parameters has 1.x M in ChEMBL

    • LIPID MAPS produced ~47.5k molecules, ~450k fragments

      • Has 48k molecules

      • XFF 20% dataset had ~10k molecules, 128k conformers

      • Try to cluster 10-20% down to ~50k fragments?

      • Also do an entire molecule dataset

    • Happy to do training

  • benchmarking infrastructure repo?

    • “combinations” of QCRecord IDs

  • infrastructure work

    • going well, working on QCSubmit

    • Caching PortalClient is not thread-safe

    • Moving the fragment database code into its own repo; it is now technically installable and has a few tests

  • Next week: review of roadmap plan and where we are

    • Coming month

      • Focus on finishing up torsion multiplicity

        • Know we’ll be done if benchmarks aren’t worse

          • Benchmarking to OpenFF Industry 1.1 dataset

            • Dataset name: 'OpenFF Industry Benchmark Season 1 v1.1'

            • JSON file, probably the full dataset

          • Collect record IDs where connectivity changes, and QM is just wrong

            • Pass record IDs in a text file

            • “Bad” record IDs need to be removed from the SQLite database to fix up the ddE calculations

          • Benchmark additional torsion optimization dataset as well
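
A minimal sketch of the "bad record ID" cleanup discussed above, assuming the IDs arrive as one integer per line in a text file and that the benchmark results sit in a SQLite table keyed by record ID. The table name `conformers`, the column name `record_id`, and both function names are hypothetical placeholders, not the actual schema or API of the benchmarking code.

```python
import sqlite3
from pathlib import Path


def read_record_ids(path):
    """Read one integer record ID per line, skipping blanks and '#' comments."""
    ids = []
    for line in Path(path).read_text().splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if line:
            ids.append(int(line))
    return ids


def remove_records(db_path, record_ids, table="conformers", column="record_id"):
    """Delete rows whose ID is in record_ids; return the number of rows removed.

    The default table and column names are hypothetical. Identifiers cannot be
    bound as SQL parameters, so they are validated before being interpolated.
    """
    if not (table.isidentifier() and column.isidentifier()):
        raise ValueError("table and column must be plain identifiers")
    conn = sqlite3.connect(db_path)
    try:
        before = conn.total_changes
        with conn:  # commits on success, rolls back on error
            conn.executemany(
                f"DELETE FROM {table} WHERE {column} = ?",
                [(rid,) for rid in record_ids],
            )
        return conn.total_changes - before
    finally:
        conn.close()
```

Usage would look something like `remove_records("benchmark.sqlite", read_record_ids("bad_ids.txt"))`, with both file names made up for illustration. The ddE values would then need to be recomputed from the remaining rows, since a removed conformer may have been the reference for its molecule.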

 

 

 

Action items

Decisions