
Item

Presenter

Notes

Project updates

BW

  • Past week

    • Refiltering industry dataset

      • LW: Is this meant to be the required way people do it (re: GitHub PR)?

        • BW: No, just a recommendation/example

      • LW: When would we want to re-filter it?

        • BW: If packages update or if we want to add new datasets

      • LW: Should we over-engineer it and put dates/numbers on it now? Might be good to keep some standards up to date

      • (Everyone): How to handle outliers?

        • Unclear, kind of intersects with other discussion of bad QCA records. Table until tomorrow’s meeting?

        • Also, would we want these versions to be a “release” dataset?

    • qcsubmit and bespokefit PRs

    • lipidmaps dataset

    • test organometallics dataset

      • BW: filtering for >10 atoms, charge < 4, periods in SMILES

  • Next week

    • Run benchmarks

    • lipidmaps dataset

    • organometallics dataset
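The organometallics filtering criteria noted above (>10 atoms, charge < 4, periods in the SMILES) could be sketched as a simple record filter. This is a minimal sketch, not the actual qcsubmit filter: the `Record` shape and `keep` helper are hypothetical, and reading “q < 4” as the absolute net charge is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Record:
    # Hypothetical record shape; real dataset records differ.
    smiles: str
    n_atoms: int
    charge: int  # net formal charge


def keep(rec: Record) -> bool:
    # Criteria from the notes: >10 atoms, |charge| < 4,
    # and no "." in the SMILES (i.e. single-component entries only).
    return rec.n_atoms > 10 and abs(rec.charge) < 4 and "." not in rec.smiles
```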

Project updates

AMI

  • Past week

    • Sage 2.2.1 + S data + TM data/splits result

      • P angle not improved

      • (Conclusion): re-fit with P angle frozen, lower priority for now, could investigate splitting angle in future

    • DDX error analysis

      • Got it down to 5%

      • Need to delete records from dataset and resubmit them with a new guess

      • Probably can’t do it through QCA-DS

    • AIMNet lit search

  • Next week

    • More on DDX dataset: re-submit stragglers with a new guess, start optimizations

    • Look into NAGL architecture

    • Look into testing dataset

LW

  • Past week:

    • CSIRO talk

    • Evaluator on NRP!

    • Interchange packing and simulation (+Evaluator)

  • Next week:

    • Protein stuff?

    • Evaluator (virtual sites)

    • Interchange follow-ups

    • Wrapping up PRs (QCSubmit fix, etc.)

    • Resubmitted hessian dataset

    • Monitoring QCA workers

      • MLPepper ran in 2 days

      • Lipidmaps is going slowly as it’s not split by size, so resource requirements vary. Also, scaling the deployment beyond 60 workers makes everything crash

    • yammbs PR

    • Benchmarks

      • PR openers upload:

        • a yaml file with config

        • Optionally, a FF file

      • Things we could review:

        • CSV files that get committed to repo

          • Human input: check they got uploaded

          • LW: can the script generate plots for you?

          • BW: that’s possible

        • Zenodo submission (which is now on production OpenFF)

          • Note: YDS doesn’t publish the entry. To review the Zenodo submission, you need to be able to log into Zenodo.

          • Someone needs to review and hit publish so that the DOI exists and can be included in the PR.

          • BW: picturing DOI as a line in a README

      • AMI: is the idea to review before the run starts or after it ends?

        • BW: afterwards – there’s not a whole lot to review before the run. Just the yaml file and maybe an FF.

        • AMI: there’s not that much an external person could review afterwards either, other than checking the CSVs are there. Don’t really see downside to requiring a review

        • BW: just worried about best practices. Also need write permissions to trigger bot.

  • Next week

    • More benchmarks

    • QCA workers

    • Split up second lipidmaps dataset

      • Separate PRs into the same directory

    • Organometallics dataset

    • yammbs PR(s)

    • YDS

      • Update scripts with plots

      • Publish Zenodo entry and merge PR

      • Which FFs?

        • 1.3.1

        • 2.0

        • 2.1

        • 2.2.1

        • experimental FFs
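For the PR-based benchmark workflow discussed above (a yaml config plus an optional FF file), the uploaded config might look something like the sketch below. Every key name here is illustrative, since these notes don’t specify the actual schema:

```yaml
# Hypothetical benchmark config a PR opener might upload.
# Key names are made up for illustration, not the real schema.
forcefield: openff-2.2.1.offxml   # or a path to an FF file included in the PR
dataset: industry-benchmark       # placeholder dataset name
outputs:
  csv_dir: results/               # CSVs committed back to the repo for review
  zenodo: true                    # archive results as a Zenodo submission
```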

Project updates

AMI

  • Past week

    • mamba issues

    • DDX errors

      • looked into functional group breakdown

      • Also resubmitted the same dataset without diffuse functions. No errors.

    • NAGL2 optimization dataset planning

      • Decided to do geom opts at usual level of theory

      • BP: can do a 2.5-5 TB dataset (“let’s see how well we can handle it”); up to 1 TB is “no problem”.

      • AMI: 5 confs/mol would be ~2.5 TB; 10 confs/mol would be ~5 TB; 2 confs/mol would be ~1 TB.

      • LW: aim for 5 conf/mol

    • Follow up on standards document

  • Next week

    • NAGL2 optimization dataset

    • Hessian DSs

    • (Freeze phosphate angle refit?)

    • (NAGL2 testing dataset: examine existing coverage and maybe set up opts)

    • Finish standards
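The dataset-size estimates discussed above imply roughly linear scaling of about 0.5 TB per conformer-per-molecule. A quick back-of-envelope check (the figures come from the discussion; the per-conformer constant is inferred, not stated):

```python
# Inferred from the notes: 5 confs/mol ~ 2.5 TB, so ~0.5 TB per conf/mol.
TB_PER_CONF_PER_MOL = 2.5 / 5


def estimated_tb(confs_per_mol: int) -> float:
    """Rough total dataset size in TB for a given conformers-per-molecule count."""
    return TB_PER_CONF_PER_MOL * confs_per_mol


# Matches the figures discussed: 2 -> 1 TB, 5 -> 2.5 TB, 10 -> 5 TB.
```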

LW

  • Past week:

    • Working with various Kubernetes issues

    • Troubleshooting packmol

    • Collaborations and admin

  • Next week:

Action items


Decisions