2024-06-25 BW/LW Check-in

Participants

  • @Brent Westbrook

  • @Lily Wang

Goals

  •  

Discussion topics

Item

Presenter

Notes

General updates and discussion on projects

 

  • Torsion splitting in 2.2

    • qca-dataset-submission: LW will look at it after the meeting

    • TD dataset – 1 error left, unlikely to resolve, stop re-running?

    • Opt dataset – still going

  • Fragment dataset curation code

    • Still part of previous repo

  • Infrastructure work

    • Going well – working on caching

  • smee

    • Modifying Simon’s code to load QCArchive data

    • LW: I’m not too familiar with Hugging Face datasets, but with PyArrow you can:

      • write multiple PyArrow Tables to parquet files, each holding a chunk of the data

      • load the whole set as a single PyArrow Dataset by passing the directory as the input path

  • eMolecules – probably CC-BY, coming soon

  • Next week’s check-in is cancelled

 

 

 

Action items

Decisions
