| |
|---|
Needed permissions | |
Necessary steps | git clone git@github.com:Yoshanuikabundi/proteinbenchmark-nrp.git
In proteinbenchmark_jm_template.yaml, replace all instances of jm with cc, except those in jm-bucket and jm-rclone-config.
In the Python script, change jm to cc.
JW needed to run mkdir -p results/gb3-null-0.0.3-pair-opc3/replica-1 to get a single replica working.
python run-umbrella-windows.py
(After runs complete, shown by the “Completed” status, the pods are kept around so that their logs remain available.) kubectl delete pod <pod name>
|
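The jm-to-cc replacement in the steps above can be sketched in Python as follows. This is only an illustration, not the script from the repo: the function name `swap_initials` and the sample YAML keys are hypothetical, and it assumes that `jm` appears as a standalone token (for example between hyphens) rather than inside a longer word.

```python
import re

# Names that must keep their "jm" prefix, per the notes above.
PROTECTED = ("jm-bucket", "jm-rclone-config")

# Suffixes that follow "jm" in the protected names, e.g. "-bucket".
_suffixes = "|".join(re.escape(name[len("jm"):]) + r"\b" for name in PROTECTED)

# Match "jm" only as a standalone token (not inside a longer alphanumeric
# word) and only when it does not start one of the protected names.
_PATTERN = re.compile(
    r"(?<![A-Za-z0-9])jm(?![A-Za-z0-9])(?!" + _suffixes + r")"
)


def swap_initials(text: str, new: str = "cc") -> str:
    """Return text with standalone 'jm' tokens replaced by `new`."""
    return _PATTERN.sub(new, text)


if __name__ == "__main__":
    # Hypothetical YAML fragment, not copied from the real template.
    sample = (
        "name: pb-jm-worker\n"
        "secretName: jm-bucket\n"
        "config: jm-rclone-config\n"
    )
    print(swap_initials(sample))
```

Reviewing the result with a diff before submitting jobs is still worthwhile, since the real template may use `jm` in ways this pattern does not anticipate.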
Questions | Which things should we get set up for ourselves?
JM – Probably fine to use my secrets; it would be redundant for everyone to make their own secrets that are visible to everyone else. Inclined to reuse my secrets until something goes wrong.
JM – Current shared secret is just the S3 key. That’s fine.
JW – How long do persistent volumes last?
JW – Does returning results to our computers delete them from S3? We’ll meet in this time slot again next Wednesday and look at storage usage.
CC – Looks like one replica (31 windows of 500 ns each, starting from different points) is 14 GB, so 3 replicas is 42 GB.
How can we run into problems as fast as possible?
Who is responsible for running?
JW – Could have CC be principally responsible and reach out to us when needed, or have CC send JM the inputs and have JM be principally responsible for running.
CC – Prefer running myself and reaching out for help when needed. Can overlap with JM in evenings PT.
JM – That makes sense.
Should we set up a Dockerfile pipeline? Should we move the repo to the openforcefield org?
JW – Let’s move this repo to our org, use the Dockerfile in it as authoritative, and have a workflow that is only ever manually triggered to update the Docker image.
LW – Should we continue using ghcr? Or move to the NRP container registry?
JM – The NRP container registry isn’t always faster. Sometimes things are stored physically far from where they’re run.
JW – I think ghcr is fine to continue using; it doesn’t seem to be charging us for throughput.
|
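The storage figure CC quotes above (14 GB per replica of 31 windows) can be turned into a quick estimator for planning further replicas. This is a rough sketch: the function names are hypothetical, and it assumes storage scales linearly with the number of replicas and is spread evenly across windows.

```python
GB_PER_REPLICA = 14       # from the notes: one replica is about 14 GB
WINDOWS_PER_REPLICA = 31  # from the notes: 31 umbrella windows per replica


def storage_gb(replicas: int, gb_per_replica: float = GB_PER_REPLICA) -> float:
    """Estimated total storage in GB, assuming linear scaling."""
    return replicas * gb_per_replica


def gb_per_window(
    gb_per_replica: float = GB_PER_REPLICA,
    windows: int = WINDOWS_PER_REPLICA,
) -> float:
    """Average storage per window in GB, assuming an even split."""
    return gb_per_replica / windows


if __name__ == "__main__":
    print(f"3 replicas: {storage_gb(3)} GB")
    print(f"per window: {gb_per_window():.2f} GB")
```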
Getting new umbrella starting points schlepped to NRP | |
To do items | JM will move the repo to the OFF org
JM will have both the proteinbenchmark repo (and NRP-running scripts?) always be pulled from GitHub (done)
CC will update scripts in repo for butane runs
CC will start butane validation runs on 10 GPUs
JM will change Docker image builds to happen manually and pull from the Dockerfile in the repo
Everyone will monitor GPU utilization and post in the DM thread if something’s up
Next Wednesday at 4 PM Pacific we’ll discuss storage usage
|