| CC Working on following up with stragglers on the LiveCoMS review. Tried to submit the dipeptide dataset on QCFractal on Fri but botched my conda environment; I think I’ve got it fixed, so I’ll try again this afternoon. Working on getting library charges for protonation states of amino acids. Having trouble with OpenEye; it’s probably from confs that have hbonds on carboxylic acid groups of side chains. JW – Ran into the same thing last week with bespokefit. I’d like to have a way to make it fall back to a nicer behavior if all the carboxylic acid confs are bad. CC – Yeah, I’m trying to think of a manual fix or some other way to handle this. LW – I used RDKit when I did this and didn’t have a problem. I basically used Simon’s ELF code in the toolkit for this. JW – I’ll keep thinking about a general solution for this and will contact you if I have a fix.
MT General PR wrangling. Non-biopolymer, non-ParmEd example in the toolkit. Found a really odd Amber error that causes a memory access error when the number of residues in a prmtop is mis-set. Docs improvements – User’s guide, changelog, other fixes. Upstream beta/RC tests: testing against OpenEye and OpenMM RCs. Already caught an upcoming breaking change in OpenMM. Bringing the units package to parity with Evaluator units. Some edge cases, e.g. OpenMM and SI disagree on what a “dalton” is. Polished and made a new release.
Feedback from protein-ligand example. Mostly requests for improvements in system building experience. Think we’ll need Topology.from_pdb to load multi-protein and water files. MolSSI MMIC proposal. MolSSI recommended that we take the importers and exporters out of interchange and make them MMIC components. I declined this offer. CC – Could you provide details? MT – The project isn’t well adopted and doesn’t have a clear path forward. Getting this tied in with an uncertain effort would be a huge liability for us. JW – Agree. QCFractal appears to be moving forward despite MolSSI, not because of it. I’m worried that MMIC would go the same way. DD – Agree that it’s a big liability to tie ourselves in with something that’s externally developed. If they gain more adoption in the future we can always revisit this.
Started moving stack to M1-compatible builds. Released mdtraj with M1 fixes. We’re still waiting on AmberTools and its dependency chain (parmed is done, packmol seems unmaintained, netcdf-fortran is messy). None of this is super blocking – Rosetta can emulate support for existing packages.
DD LW Worked through nonbonded issues and debugged my dataset with Simon; Simon has updated nonbonded. Spent the entire week trying to work directly with Evaluator and ForceBalance, as pinning openmm=7.5.1 is not friendly to updated packages. Evaluator seems to hang after a few (~3?) hours (no update of progress logs; no increase in file sizes; top shows Python processes with <20% or <10% CPU usage; ps aux shows some generic multiprocessing processes). It stopped at iteration 0 of FB; if properties have been computed, I haven’t found where the output is saved. Tried flipping a few switches, which did not seem to really help: using DaskLocalBackend (which worked fine for training a force field to 1 property for 15 iterations) with varying numbers of G/CPUs requested; using DaskSlurmBackend with varying numbers of processes requested. I tend to quit jobs after ~24 hours of no apparent progress, so I’m burning a lot of Mobley $.
SB – I’d check which Dask version you’re using – they’ve had a lot of breaking changes. It’d be good to compare this to the version used for Sage. One known issue is “work stealing”, where jobs don’t get distributed for no clear reason. There are also cases where a worker has died and the server doesn’t know to redistribute the jobs. You can check this by looking at the Dask dashboard. If you can get this to the point where the jobs are hanging, we can have a quick call and I can show you my process for debugging this.
PsiRESP 0.2 is pip installable, although RDKit and Psi4 are not. It’s been refactored to use the MolSSI QC stack. Waiting to see if/when QCFractal will patch the issues I’ve raised so I can release 0.2.1 without my own patches – will procrastinate on the conda recipe until then. It calculates RESP and RESP2 charges. Previously: the CZI meeting was very interesting, with lots of tips on managing a research project and on being good science citizens (mostly focused on community building). GSoC 2022 will be very different from previous years: it is no longer limited to students but open to all new contributors, with a longer time frame for more flexible hours.
PB Submitted the OpenMM datasets; it took longer than expected. JW – Thanks for doing this, I know it’s not really your job but it could provide a lot of value. PB – Thanks. JHorton has been doing more work than me. SB – Do we have plans for the experiments that we could do with this data? Like alternate targets? PB – We had talked about this but didn’t have firm plans. I can plan to do fitting studies using this, though I’m concerned about the different method/basis. SB – It’d be really neat to see what we can do with this data, and how the fitting looks. There could be some really high-value studies that we could do. PB – When we talk about “ForceBalance replacements”, are we thinking of going entirely through the OpenFF stack, or updating ForceBalance and our plugins? SB – The former: completely replacing ForceBalance with something built around PyTorch.
Some follow-up work related to WBO. Tried looking into TorsionNet from Pfizer; they released scripts to train a neural net but not the trained model.
JW – Worked with Danielle Bergazin on bespoke fitting for polymer hosts. Proposed a fix for trans-COOH in fragmenter, where we fall back to single-conformer WBO if ELF10 WBOs fail. SB – I responded to that issue. I think we should manually rotate the carboxylic acids in the input conformers if they have that problem. I did this in RDKitWrapper so we can probably lift code from there. JW – There’s some weird OE API stuff where we can access the ELF10 auto-corrector when we do charge calcs, but not when we do WBO calcs. SB – It’s probably best in the long run to have the “fix carboxylic acids” logic in the toolkit.
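The fallback JW proposes could be sketched roughly as follows. This is a minimal illustration of the pattern only, not fragmenter or toolkit code: `compute_wbo_with_fallback`, `ConformerSelectionError`, `elf10_wbo`, and `single_conformer_wbo` are hypothetical names, and the two stand-in calculators merely mimic a failure on a carboxylic acid.

```python
from typing import Callable, Dict, Tuple

BondOrders = Dict[Tuple[int, int], float]


class ConformerSelectionError(RuntimeError):
    """Hypothetical error raised when no conformer survives ELF10 selection."""


def compute_wbo_with_fallback(
    molecule: str,
    primary: Callable[[str], BondOrders],
    fallback: Callable[[str], BondOrders],
) -> Tuple[BondOrders, str]:
    """Try the primary (ELF10-style) calculation, then fall back if it fails."""
    try:
        return primary(molecule), "primary"
    except ConformerSelectionError:
        # Only the expected failure is caught; other errors still propagate.
        return fallback(molecule), "fallback"


# Stand-in calculators (hypothetical): real code would call a toolkit here.
def elf10_wbo(molecule: str) -> BondOrders:
    if "C(=O)O" in molecule:
        raise ConformerSelectionError("all carboxylic acid conformers rejected")
    return {(0, 1): 1.0}


def single_conformer_wbo(molecule: str) -> BondOrders:
    return {(0, 1): 0.95}


orders, route = compute_wbo_with_fallback("CC(=O)O", elf10_wbo, single_conformer_wbo)
print(route, orders)
```

Returning which route was taken makes it easy to log how often the fallback fires, which would help judge whether SB’s alternative (rotating the carboxylic acids in the input conformers) is needed.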
Started some work on speeding up the biopolymer refactor. Didn’t make a satisfactory amount of progress, so I may be quiet today and tomorrow so I can focus on it. Mostly tech support.
SB Tidying up infrastructure – Evaluator stuff (upgraded to support OMM 7.6, but not backwards-compatible; leads to some YANK issues; then made a new release). Fix for ThermoML changes. Tidying up nonbonded – instead of exclusively using the REST API, it can now also run locally. Bespokefit is moving forward – I’m working with JHorton to get it on conda-forge. Also working on multi-stage fits (like electrostatics, then vdW, then other stuff). Also working on where in the schemas people can set bespoke terms to generate. Wrote my own YANK substitute (AbSolv), since YANK has been unstable for a while. It also adds support for things like vsites and custom nonbonded. DCole group is working with it and providing feedback. If other folks want to give it a shot I’d love more feedback. It can also do nonequilibrium free energy calcs. Working on “can a GCN understand resonance forms?”. It seems like, in the original VCharge paper, you can average the resonance forms (with some method for determining which resonance forms are reasonable). I’m experimenting with this in the nagl package.
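One way to picture the resonance averaging mentioned above is to average a per-atom property over a set of resonance forms. The sketch below does this for formal charges. The function name, atom labels, and carboxylate example are illustrative assumptions, not the nagl API, and the step of deciding which resonance forms are reasonable is left out.

```python
from typing import Dict, List


def average_over_resonance_forms(forms: List[Dict[str, float]]) -> Dict[str, float]:
    """Average a per-atom property (here formal charge) over resonance forms."""
    if not forms:
        raise ValueError("need at least one resonance form")
    atoms = forms[0].keys()
    if any(form.keys() != atoms for form in forms):
        raise ValueError("all forms must describe the same atoms")
    return {atom: sum(form[atom] for form in forms) / len(forms) for atom in atoms}


# Two resonance forms of a carboxylate group (hypothetical atom labels).
carboxylate_forms = [
    {"C": 0.0, "O1": -1.0, "O2": 0.0},
    {"C": 0.0, "O1": 0.0, "O2": -1.0},
]
averaged = average_over_resonance_forms(carboxylate_forms)
print(averaged)
```

With equal weights, the two oxygens end up with the same averaged charge, so the result no longer depends on which resonance form happened to be drawn.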
|