2023-10-20 Meeting notes

 Date

Oct 9, 2023

 Participants

  • @Lily Wang

  • @Marcus Wieder

  • @John Chodera

 Goals


 Discussion topics

Item

Notes


Datasets

  • SPICE 2.0 at OpenFF level of theory

    • Peter Eastman’s work

    • other datasets useful to OpenFF and others

      • e.g. PDB Chemical Component Dictionary – includes coverage of compounds like nucleic acids, but not in a targeted way

      • Enamine REAL Space: Enamine building blocks + representative robust coupling chemistries

    • philosophy: much of the data is generated at the OpenFF level of theory

    • PE is hoping to combine the 2.0 dataset with the 1.0 dataset

  • nucleic acids dataset

  • virtual site fitting (e.g. sulfur, sigma holes, nitrogens)

  • electrostatic potentials and electric fields (possibly at a different level of theory?); Danny Cole / Josh Horton are looking into this as well

  • general chemical diversity dataset (from ChEMBL)

    • OpenFF is currently fragmenting molecules and clustering the fragments by fingerprint similarity; this was the quickest and easiest solution at the time


  • Question: What kind of dataset is best to generate? OptimizationDatasets? TorsionDriveDatasets? MD-generated snapshots? MD-generated snapshots with a few steps of minimization?

    • OpenFF is currently running experiments to determine the optimal dataset types, with results expected in the next couple of weeks

  • OpenFF is currently lacking expertise in areas that pharma is interested in

    • Currently just looking at ChEMBL

    • JC – everyone is interested in Enamine REAL Space

      • We have the SMARTS and SMIRKS strings for those

      • Patent spaces

      • We have scripts for scraping patent spaces

      • JC will share these
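Illustrative sketch for the "clustering based on fingerprint similarity" item above. The notes do not record how OpenFF actually does this, so everything here is an assumption: fingerprints are represented as plain sets of on-bit indices, similarity is Tanimoto, and clustering is a simple greedy leader scheme with a hypothetical threshold. A real pipeline would more likely compute fingerprints with a cheminformatics toolkit.

```python
from typing import FrozenSet, List, Sequence


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    if not a and not b:
        return 1.0  # convention: two empty fingerprints are identical
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def leader_cluster(
    fingerprints: Sequence[FrozenSet[int]], threshold: float = 0.5
) -> List[List[int]]:
    """Greedy leader clustering (an assumed scheme, not OpenFF's method).

    Each fingerprint joins the first cluster whose leader (first member) is at
    least `threshold` similar to it; otherwise it starts a new cluster.
    Returns clusters as lists of indices into `fingerprints`.
    """
    clusters: List[List[int]] = []
    for index, fp in enumerate(fingerprints):
        for cluster in clusters:
            leader = fingerprints[cluster[0]]
            if tanimoto(fp, leader) >= threshold:
                cluster.append(index)
                break
        else:
            # No existing leader was similar enough: start a new cluster.
            clusters.append([index])
    return clusters


if __name__ == "__main__":
    # Hand-written toy fingerprints standing in for fragment fingerprints.
    toy_fps = [
        frozenset({1, 2, 3, 4}),
        frozenset({1, 2, 3, 5}),
        frozenset({10, 11, 12}),
        frozenset({10, 11, 13}),
        frozenset({1, 2, 3, 4, 5}),
    ]
    print(leader_cluster(toy_fps, threshold=0.5))
```

One representative per cluster (for example the leader) could then be selected for dataset generation. The result depends on input order and on the threshold, which is one reason this kind of scheme counts as a quick solution.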


 Action items

 Decisions