Did something change to prevent QCSubmit from being installed? Did NAGL have a recent release?
LW – NAGL did have a recent release.
LW – I have a NAGL notebook that runs fine on Linux but fails on Mac CI. It seems to get stuck on a cell that trains a neural network. It’s in my examples CI run.
Last week had major-ish interchange 0.4 release. Also minor openff-units release that accidentally broke a lot of stuff due to interaction with lazy loading in OFFTK. Also minor OFFTK, QCSubmit, and BespokeFit releases last week.
– Happy to give this a shot. Interesting, that’s not a notebook timeout. Maybe something internal to Python or the PyTorch library.
JM
Will retroactively take time off for time spent on paperwork last week.
JW – Sounds good.
Also would like to take week of Dec 9 off to move
JW – That sounds good, approved. Please request 2 weeks in advance if possible. Also expect a holiday shutdown Dec 23-Jan 1, formal notice coming soon
JW
It’s likely you’ll get tapped in to help with protein FF work, likely figuring out how to run sims on NRP and get results back. LW will be contact point.
LW – Would love to have you involved, seems like this would align well with what you’ve already done on NRP. CC is currently limited by GPU compute at UCSD. We’re basically looking to get more compute throughput using NRP. CC is currently debugging his workflow, but we’re hoping to have this ready to go in 2-ish weeks. Workers probably won’t need to talk to each other.
JM – Using evaluator?
LW – No, more direct, just using OpenMM. Hopefully just a matter of porting workflow to Kubernetes.
JW – CC may eventually take over compute management, but I’d like JM to start and get the workflow down and validated.
JM – To put it on the NRP container registry you’ll need to ask on Matrix chat.
LW will assign tasks to JM on Trello as appropriate, giving them High/Highest priority.
You may also get tapped in on BeSMARTS docs work, will depend on some details on TGokey’s timing
LW: looking like maybe early next year?
JW – And don’t worry if the science workload pushes workshops back, that’s a good trade for the org.
JM – And this could go into a workshop.
Last week had major-ish interchange 0.4 release. Also minor openff-units release (0.2.3) that accidentally broke a lot of stuff due to interaction with lazy loading in OFFTK. Also minor OFFTK, QCSubmit, and BespokeFit releases last week.
…
JW – So I think that everything should be hunky dory for openff-docs to test with the latest versions of everything.
We’ve been tapped to go ahead with PTM workflow - This raises priority of substructure loading. So very interested to see how MDA works for PDB loading - It could be the silver bullet we need. Likely need to have something rolled out in January. If MDA works then I’ll be very happy.
JM – MDA takes seconds to minutes. It has some issues that we could work around in our wrapper. Some issues with small things like calling zinc Zn vs ZN. Errors are a bit mysterious. And I found a few PDBs with errors that MDA complained about (e.g., a serine sidechain with HO equidistant to another O, where MDA tried to give the H 2 bonds). At the end of the day, the MDA loader is guessing bonds from distance, so it will fail in cases where a residue/atom name approach might work.
JW – Are there PTMs in the dataset?
LW – MDA is about to release 2.8.0, with a new guessing structure that can be extended to be more careful about cases like this serine thing. I think we wanted cases like this to validate/test. We tried to get a grant to do this but failed, so timelines are uncertain. A “PDB” biomolecular context is a high priority if we get resources, though.
JM – Maybe MDA could be a good place for my loader - it looks at residue names and downloads reference from CCD. It’s capable of adding missing atoms. However it’s hard to handle protonation states. But if all atoms are already explicit then it’s in good shape.
LW – Unsure about whether MDA would want to take over ownership of this but thematically this could be fitting.
JM – I know OpenFF doesn’t want to take over ownership of something this big, but there’s a lot we can do on this front - Eg, using my atom name matching approach to identify as much as possible, then guessing by distance for the rest.
LW –
JW – Maybe we roll out BOTH solutions for loading (Josh’s tool AND MDA) and we tell users to try them out and give us feedback. And having the census as a fallback will keep us from getting overwhelmed by any specific user’s feedback.
JM – What other tools are used to prepare proteins?
JW – Schrödinger, Chimera Dock Prep. Could see about getting inputs from those.
JM – I’ve uploaded the results of the PDB preparation to Google Drive.
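Note – A minimal sketch of the loading strategy JM describes above: match atom names against a residue reference first, then guess bonds by distance only for atoms the reference does not cover. Everything here is an assumption for illustration: the radii, the tolerance, the tiny serine template (a hand-written stand-in for a CCD entry), and all function names. It is not MDA's guesser and not JM's loader. The toy geometry reproduces the serine case, with HG placed equidistant between OG and another O.

```python
from itertools import combinations
from math import dist

# Illustrative covalent radii in angstroms (assumed values for this sketch).
COVALENT_RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66, "Zn": 1.22}

# Hypothetical stand-in for a CCD reference entry: serine bonds by atom name.
SER_TEMPLATE_BONDS = {
    ("N", "CA"), ("CA", "C"), ("C", "O"),
    ("CA", "CB"), ("CB", "OG"), ("OG", "HG"),
}


def normalize_element(symbol):
    """Map PDB-style upper-case symbols ("ZN") to standard case ("Zn")."""
    return symbol.strip().capitalize()


def guess_bonds_by_distance(atoms, tolerance=0.45):
    """Bond any pair closer than the sum of their radii plus a tolerance.

    atoms is a list of (name, element, (x, y, z)); returns index pairs (i, j).
    """
    bonds = set()
    for (i, a), (j, b) in combinations(enumerate(atoms), 2):
        r_a = COVALENT_RADII[normalize_element(a[1])]
        r_b = COVALENT_RADII[normalize_element(b[1])]
        if dist(a[2], b[2]) <= r_a + r_b + tolerance:
            bonds.add((i, j))
    return bonds


def bonds_from_names_then_distance(atoms, template_bonds):
    """Use template bonds where atom names match; guess only for the rest."""
    name_to_index = {name: i for i, (name, _, _) in enumerate(atoms)}
    bonds = set()
    for n1, n2 in template_bonds:
        if n1 in name_to_index and n2 in name_to_index:
            bonds.add(tuple(sorted((name_to_index[n1], name_to_index[n2]))))
    # Atoms whose names the template does not know fall back to distance.
    known_names = {name for pair in template_bonds for name in pair}
    unmatched = {i for i, (name, _, _) in enumerate(atoms)
                 if name not in known_names}
    for i, j in guess_bonds_by_distance(atoms):
        if i in unmatched or j in unmatched:
            bonds.add((i, j))
    return bonds


def bond_count(bonds, index):
    """Number of bonds that involve the atom at the given index."""
    return sum(index in pair for pair in bonds)


# Toy fragment: HG (index 3) sits exactly 1.0 angstrom from both O and OG.
# OXT is deliberately absent from the template to exercise the fallback.
ATOMS = [
    ("C", "C", (0.0, 0.0, 0.0)),
    ("O", "O", (1.23, 0.0, 0.0)),
    ("OXT", "O", (-0.6, 1.1, 0.0)),
    ("HG", "H", (2.23, 0.0, 0.0)),
    ("OG", "O", (3.23, 0.0, 0.0)),
    ("CB", "C", (4.0, 1.2, 0.0)),
]

distance_only = guess_bonds_by_distance(ATOMS)
hybrid = bonds_from_names_then_distance(ATOMS, SER_TEMPLATE_BONDS)
```

In this toy geometry the distance-only guess gives HG two bonds, while the name-first version gives it one and still bonds the untemplated OXT to C by distance. A real implementation would need per-residue handling and a policy for bonds between residues, which this sketch leaves out.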