
Item

Presenter

Notes

Project updates

BW

  • Past week

    • Refiltering industry dataset

      • LW: Is this meant to be the required way people do it (re: GitHub PR)?

        • BW: No, just a recommendation/example

      • LW: When would we want to re-filter it?

        • BW: If packages update or if we want to add new datasets

      • LW: Should we over-engineer it and put dates/numbers on it now? Might be good to keep some standards up to date

      • (Everyone): How to handle outliers?

        • Unclear, kind of intersects with other discussion of bad QCA records. Table until tomorrow’s meeting?

        • Also, would we want these versions to be a “release” dataset?

    • qcsubmit and bespokefit PRs

    • lipidmaps dataset

    • test organometallics dataset

      • BW: filtering for >10 atoms, charge < 4, periods in SMILES

  • Next week

    • Run benchmarks

    • lipidmaps dataset

    • organometallics dataset
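The organometallics filtering criteria noted above (>10 atoms, charge < 4, periods in the SMILES) could be sketched as a simple record filter. This is a minimal sketch, not the actual qcsubmit filter: the `Record` shape and `keep` helper are hypothetical, and reading “q < 4” as the absolute net charge is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Record:
    # Hypothetical record shape; real dataset records differ.
    smiles: str
    n_atoms: int
    charge: int  # net formal charge


def keep(rec: Record) -> bool:
    # Criteria from the notes: >10 atoms, |charge| < 4,
    # and no "." in the SMILES (i.e. single-component entries only).
    return rec.n_atoms > 10 and abs(rec.charge) < 4 and "." not in rec.smiles
```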

Project updates

AMI

  • Past week

    • Sage 2.2.1 + S data + TM data/splits result

      • P angle not improved

      • (Conclusion): re-fit with P angle frozen, lower priority for now, could investigate splitting angle in future

    • DDX error analysis

      • Got it down to 5%

      • Need to delete records from dataset and resubmit them with a new guess

      • Probably can’t do it through QCA-DS

    • AIMNet lit search

  • Next week

    • More on DDX dataset: re-submit stragglers with a new guess, start optimizations

    • Look into NAGL architecture

    • Look into testing dataset

LW

  • Past week:

    • CSIRO talk

    • Evaluator on NRP!

    • Interchange packing and simulation (+Evaluator)

  • Next week:

    • Protein stuff?

    • Evaluator (virtual sites)

    • Interchange follow-ups

    • Wrapping up PRs (QCSubmit fix, etc.)

    • Resubmitted hessian dataset

    • Monitoring QCA workers

      • MLPepper ran in 2 days

      • Lipidmaps is going slowly as it’s not split by size, so resource requirements vary. Also, scaling the deployment beyond 60 workers makes everything crash

    • yammbs PR

    • Benchmarks

      • PR openers upload:

        • a yaml file with config

        • Optionally, a FF file

      • Things we could review:

        • CSV files that get committed to repo

          • Human input: check they got uploaded

          • LW: can the script generate plots for you?

          • BW: that’s possible

        • Zenodo submission (which is now on production OpenFF)

          • Note: YDS doesn’t publish the entry. To review the Zenodo submission, you need to be able to log into Zenodo.

          • Someone needs to review and hit publish so that the DOI exists and can be included in the PR.

          • BW: picturing DOI as a line in a README

      • AMI: is the idea to review before the run starts or after it ends?

        • BW: afterwards – there’s not a whole lot to review before the run. Just the yaml file and maybe an FF.

        • AMI: there’s not that much an external person could review afterwards either, other than checking the CSVs are there. Don’t really see downside to requiring a review

        • BW: just worried about best practices. Also need write permissions to trigger bot.

  • Next week

    • More benchmarks

    • QCA workers

    • Split up second lipidmaps dataset

      • Separate PRs into the same directory

    • Organometallics dataset

    • yammbs PR(s)

    • YDS

      • Update scripts with plots

      • Publish Zenodo entry and merge PR

      • Which FFs?

        • 1.3.1

        • 2.0

        • 2.1

        • 2.2.1

        • experimental FFs
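For the PR-based benchmark workflow discussed above (a yaml config plus an optional FF file), the uploaded config might look something like the sketch below. Every key name here is illustrative, since these notes don’t specify the actual schema:

```yaml
# Hypothetical benchmark config a PR opener might upload.
# Key names are made up for illustration, not the real schema.
forcefield: openff-2.2.1.offxml   # or a path to an FF file included in the PR
dataset: industry-benchmark       # placeholder dataset name
outputs:
  csv_dir: results/               # CSVs committed back to the repo for review
  zenodo: true                    # archive results as a Zenodo submission
```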

Project updates

AMI

  • Past week

    • mamba issues

    • DDX errors

      • looked into functional group breakdown

      • Also resubmitted the same dataset without diffuse functions. No errors.

    • NAGL2 optimization dataset planning

      • Decided to do geom opts at usual level of theory

      • BP: can do a 2.5-5 TB dataset (“let’s see how well we can handle it”); up to 1 TB is “no problem”.

      • AMI: 5 confs/mol would be ~2.5 TB; 10 confs/mol would be ~5 TB; 2 confs/mol would be ~1 TB.

      • LW: aim for 5 conf/mol

    • Follow up on standards document

  • Next week

    • NAGL2 optimization dataset

    • Hessian DSs

    • (Freeze phosphate angle refit?)

    • (NAGL2 testing dataset: examine existing coverage and maybe set up opts)

    • Finish standards
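The dataset-size estimates discussed above imply roughly linear scaling of about 0.5 TB per conformer-per-molecule. A quick back-of-envelope check (the figures come from the discussion; the per-conformer constant is inferred, not stated):

```python
# Inferred from the notes: 5 confs/mol ~ 2.5 TB, so ~0.5 TB per conf/mol.
TB_PER_CONF_PER_MOL = 2.5 / 5


def estimated_tb(confs_per_mol: int) -> float:
    """Rough total dataset size in TB for a given conformers-per-molecule count."""
    return TB_PER_CONF_PER_MOL * confs_per_mol


# Matches the figures discussed: 2 -> 1 TB, 5 -> 2.5 TB, 10 -> 5 TB.
```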

LW

  • Past week:

    • Working with various Kubernetes issues

    • Troubleshooting packmol

    • Collaborations and admin

  • Next week:

Action items


Decisions