2024-06-12 All-hands meeting notes

Participants

  • @Jeffrey Wagner

  • @James Eastwood

  • @David W.H. Swenson

  • Ethan Holz

  • @Brent Westbrook

  • @Daniel Cole

  • @David Mobley

  • @Chapin Cavender

  • @Matt Thompson

Recording: https://drive.google.com/file/d/16MUBLoFoFhnsrHBzaSUOSY43QBvSnctc/view?usp=sharing

Discussion topics

Item

Presenter

Notes

Infrastructure Updates

JW

Science Updates

LW (video)

  • https://drive.google.com/file/d/1y2T04nmYhvJeain4sxb7TIYmtoJBaObw/view

  • JE – BW, at the ad board, they asked for a way to help them tell us whether their project molecules are covered by our FFs. I recall you had a tool that does something like this.

    • BW – My tool kinda does this, but not exactly what they need.

    • DM – Would be good to do a kinda fingerprinting thing to compare their mols to our training/benchmark sets.

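A minimal sketch of the fingerprint comparison DM describes, assuming fingerprints have already been generated by a cheminformatics toolkit and are available as sets of "on" bit indices. The function names, the 0.7 threshold, and the example molecule names are all hypothetical; this is not BW's existing tool.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between two fingerprints given as sets of on-bit indices."""
    union = fp_a | fp_b
    if not union:
        return 0.0  # two empty fingerprints: treat as no similarity
    return len(fp_a & fp_b) / len(union)


def coverage_report(query_fps, reference_fps, threshold=0.7):
    """For each query molecule, find its closest reference molecule.

    query_fps / reference_fps map a molecule name to a set of on-bit indices.
    Returns {query_name: (best_similarity, is_covered)}, where is_covered means
    the best similarity reaches the (assumed) threshold.
    """
    report = {}
    for name, fp in query_fps.items():
        best = max(
            (tanimoto(fp, ref) for ref in reference_fps.values()),
            default=0.0,
        )
        report[name] = (best, best >= threshold)
    return report


# Hypothetical example data: bit indices are made up for illustration.
training_set = {"ref_mol_1": {1, 2, 3, 4}, "ref_mol_2": {10, 11, 12}}
partner_mols = {"project_mol_a": {1, 2, 3, 4}, "project_mol_b": {20, 21}}
report = coverage_report(partner_mols, training_set)
```

A partner could run something like this locally against a published list of training/benchmark fingerprints, so they would not need to share their project molecules.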

Eco-Infra Updates

David Swenson

  • https://docs.google.com/presentation/d/1SkuFVWkowpXX2vosQUfb_5cMXktXSbmoWlH9rHv8IPU/edit?usp=sharing

  • MT – Re: prioritizing -

    • MolSSI cookiecutter is already quite good and just needs a little TLC; I’m not sure a total rewrite is needed.

    • Conda-forge best practices are fairly well documented, but there are big peculiarities (like “don’t mark c-f packages as broken”, even though that’s standard on PyPI).

    • My messaging/notifications situation is chaos, would love something that ties this all into fewer streams.

  • JE – I like the idea of streamlining notifications. OpenFE has some cool ideas for how to do this.

  • DM – GH usage analytics are really important for all sorts of grant-getting activities. Especially some sort of way to figure out which of our conda-forge downloads are real.

  • JW – User story for large object, long-term storage

    • DS – Sounds like one element is storage and ownership. We don’t want to own things like that.

    • Another element is indexing. Maybe we could create an index?

    • Sounds like a lot of the same problems the Nomad project has tried to solve. Maybe we can try to interface with that project. https://nomad-lab.eu/nomad-lab/

    • JW – From my understanding, a ForceBalance run doesn’t fit in Nomad, though the QCArchive data might.

    • What we need may be more of a playbook than a piece of infrastructure.

  • DM – JC had a dream that you’d type in the DOI of a paper, and some tool would refer you to supporting data packages, download the software, and reproduce the study. So there could be some tool that we could point at a Zenodo entry for a Sage release, and it goes and grabs the data and reproduces (at least part of) the fit. The tool could, at a high level, be considered a validator for our data artifacts as well.

    • JW – A tool that interacts with a Zenodo entry and creates a folder to download and unzip into could be nice.

    • CC – I like DM’s idea - maybe could incorporate GH actions, and the initial part could be a minimal test of each of the steps so that people don’t have to run a big heavy calculation.

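A rough sketch of the Zenodo download-and-unzip idea from JW's comment, using only the Python standard library. It assumes the public Zenodo records API lives at the URL pattern below and that each file entry in a record carries "key", "links" → "self", and "checksum" ("algorithm:hexdigest") fields; those field names and all function names here are assumptions to check against the Zenodo API documentation, not an existing tool.

```python
import hashlib
import json
import shutil
import urllib.request
from pathlib import Path

# Assumed URL layout of the public Zenodo records API.
ZENODO_RECORD_URL = "https://zenodo.org/api/records/{record_id}"


def fetch_record(record_id):
    """Download the JSON metadata for one Zenodo record (needs network access)."""
    url = ZENODO_RECORD_URL.format(record_id=record_id)
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def list_files(record):
    """Return (filename, download_url, checksum) for each file in a record dict."""
    return [
        (entry["key"], entry["links"]["self"], entry.get("checksum", ""))
        for entry in record.get("files", [])
    ]


def checksum_ok(data, checksum):
    """Check bytes against an 'algorithm:hexdigest' string such as 'md5:...'."""
    algorithm, _, expected = checksum.partition(":")
    return hashlib.new(algorithm, data).hexdigest() == expected


def download_record(record, target_dir):
    """Download every file of a record into target_dir and unpack known archives."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    for name, url, checksum in list_files(record):
        with urllib.request.urlopen(url) as resp:
            data = resp.read()
        if checksum and not checksum_ok(data, checksum):
            raise ValueError(f"checksum mismatch for {name}")
        path = target / Path(name).name  # drop any directory part of the key
        path.write_bytes(data)
        try:
            shutil.unpack_archive(str(path), str(target / path.stem))
        except (shutil.ReadError, ValueError):
            pass  # not a recognised archive; keep the file as downloaded
    return target
```

Splitting the metadata parsing (list_files, checksum_ok) from the network calls means the cheap parts can be tested on their own, in the spirit of CC's suggestion of a minimal test for each step before any heavy calculation runs.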

Shout-outs

JE

  • Josh Horton has been short-listed for a Newcastle Open Research Award

  • JM has completed the final recording and upload of the virtual workshop series videos and materials

Q&A

 

 

Action items

Decisions