MT – I tried doing partial charges for a host system - I gave up on the ToolkitAM1BCCHandler and instead used the ChargeIncrementModelHandler with formal charges. So I’m curious about the plans for loading NAGL models.
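For reference, MT's workaround corresponds roughly to a SMIRNOFF fragment like the one below. This is an illustrative sketch, not MT's actual force field section; the attribute values are assumptions beyond what he described:

```xml
<!-- Hypothetical sketch: a ChargeIncrementModel section whose base charge
     method is "formal_charge", with no increments applied on top. -->
<ChargeIncrementModel version="0.3" partial_charge_method="formal_charge">
  <!-- No <ChargeIncrement> entries: atoms keep their formal charges. -->
</ChargeIncrementModel>
```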
LW – Re: NAGL #26 - I was writing a big response, but I’m still working on that.
LW – You should be able to use the models quite soon. Right now they’re stored via pickling, and they’re about 10 MB a pop. Out of curiosity, why are the FFs in a different repo?
JW – It’s just to keep the release cycles from getting mixed up. We could have worked around this but that’s the choice we made at the time.
MT – I think microrepos are fundamentally superior. It’s not something that could just “be worked around” - they’re better in principle.
JW – Not relevant here, but I’d kinda be in favor of a giant monorepo. It seems like we’d get rid of all the pinning complexity and could trust that green tests really are green. And Google does a monorepo.
MT – LW, what are your thoughts about models with vs. without the code?
LW – …
MT – The choice of pickle is interesting, though it does seem more compact than YAML or something similar.
LW – I’m doing pickle because that’s the PyTorch default. JM was using load_from_checkpoint in his docs work - that seemed slightly better because it contains more info, but it’s 32 MB. Or I was thinking about making our own standard of YAML + model weights.
MT – Is the checkpoint file somewhat standardized in the PyTorch world?
MT – This reminds me of GROMACS/MD engine checkpoint files, which aren’t necessarily enough to reproduce.
LW – I’d think about putting the hyperparameters in YAML and the weights in pickle.
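A minimal sketch of that split-file idea, using only the standard library (JSON stands in for YAML to avoid a dependency; the file names, keys, and `.nagl` extension are all hypothetical, not the real NAGL format):

```python
import json
import pickle
import zipfile

def save_model(path, hyperparameters, weights):
    """Bundle human-readable hyperparameters and pickled weights in one archive."""
    with zipfile.ZipFile(path, "w") as zf:
        zf.writestr("hyperparameters.json", json.dumps(hyperparameters))
        zf.writestr("weights.pkl", pickle.dumps(weights))

def load_model(path):
    """Read both parts back; hyperparameters are inspectable without unpickling."""
    with zipfile.ZipFile(path) as zf:
        hparams = json.loads(zf.read("hyperparameters.json"))
        weights = pickle.loads(zf.read("weights.pkl"))
    return hparams, weights

if __name__ == "__main__":
    save_model(
        "model.nagl",
        {"hidden_size": 128, "n_layers": 3},      # illustrative hyperparameters
        {"layer0.weight": [0.1, 0.2, 0.3]},       # stand-in for real tensors
    )
    hparams, weights = load_model("model.nagl")
    print(hparams["hidden_size"])  # 128
```

One upside of this layout is that the hyperparameters stay diffable and inspectable in plain text even though the weights are binary.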
MT – It could make sense to make a .nagl format, though then we’d have to define and maintain our own format.
LW – With pickling, they’d need to use NAGL to read the weights anyway…
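The point LW is making can be shown with plain `pickle`: a pickle stores a *reference* to the defining class (module plus qualified name), not the class itself, so consumers need the package that defines the model installed to load it. `GNNModel` below is a stand-in name, not a real NAGL class:

```python
import pickle

class GNNModel:
    """Stand-in for a NAGL model class (hypothetical)."""
    def __init__(self, weights):
        self.weights = weights

blob = pickle.dumps(GNNModel([0.1, 0.2]))

# The pickle embeds the class *name*, not its code: loading this blob in
# another process only works if that process can import GNNModel.
print(b"GNNModel" in blob)  # True

restored = pickle.loads(blob)  # works here because GNNModel is importable
print(restored.weights)        # [0.1, 0.2]
```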
MT – How would the size of the model files compare to PyTorch, MKL, etc.? Seems like they’d be a lot smaller than the deps themselves.
LW – Agreed. We can reduce the size of the deps, but it’s hard to get rid of PyTorch.
JW – Does DGL have a serialization format?
LW – 1) No; 2) I wanted to get rid of DGL as a dependency for inference; and 3) ONNX is explicitly not compatible with DGL.
MT – Some perspective from how FFs are loaded: it’s bad that the FF-loading machinery recursively searches directories until it finds the first file with the right name. Also a question - if the model repo gets big, will old models start getting dropped?
LW – I don’t plan on dropping old models.
MT – Where can I find an example model file?
LW – NAGL PR #26 - this is very pre-production, but it’s there.
MT – Numerical issues, not error issues, right?
LW – Yes, it works pretty well for normal small molecules + proteins.
MT – Great. I think I can move forward with this then!
MT – Will nagl ever do things other than charges?