2022-01-03 Core Developers meeting notes

Participants

  • @Jeffrey Wagner

  • @Pavan Behara

  • @Jeffry Setiadi

  • @David Dotson

  • @Simon Boothroyd

  • @Matt Thompson

Discussion topics

Item

Notes

Updates

  • SB

    • Graph convolutional NNs - Worked on Nagl, a lot like espaloma. This is on conda-forge now. This is in a good state to do charge model fitting. Can also handle resonance form enumeration, so it shouldn’t get hung up by graph issues. Looked into generating some electrostatics issues. Was going to make a dataset using Enamine/ZINC/etc, tried doing HF/6-31G* calcs, took a really long time. So I’m using some of RDKit’s automated fragmentation to reduce things from ~25 heavy atoms to ~8.

      • Resonance enumeration is charge-transfer enumeration, not Kekulé-based enumeration. It’s easy for carbon-based resonance, but not heteroatom resonance. This implementation is based on the Gilson/VCharge approach, looking for SMARTS for groups with known resonance. One downside is the combinatorial explosion in this approach, but clever averaging can get around this. Also, resonance systems will never span sp3 carbons, which makes it easier to throw out atoms and split biopolymers into self-contained resonance systems.

      • PB – Details of why charge calcs are taking so long?

        • SB – Tried running these with convergence criteria “gaussian-loose”. Using ~16 workers, only about 100 calculations finished in a few days, whereas I was hoping to get data for thousands.
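A minimal sketch of the "resonance systems never span sp3 carbons" point from SB's update, assuming a toy heavy-atom graph rather than Nagl's actual implementation. The function name, the atom/bond representation, and the malonate-like example are all hypothetical; the idea is just to drop sp3 carbons and take the connected components of what remains as candidate self-contained resonance systems.

```python
from collections import defaultdict


def split_resonance_systems(atoms, bonds):
    """Split a molecular graph into candidate resonance subsystems.

    atoms: dict mapping atom index -> (element, hybridization)
    bonds: iterable of (i, j) index pairs
    Returns a list of sorted atom-index lists, one per subsystem.
    """
    # sp3 carbons cannot take part in a resonance system, so discard them.
    kept = {
        i for i, (element, hyb) in atoms.items()
        if not (element == "C" and hyb == "sp3")
    }

    # Keep only bonds whose two ends both survived the filter.
    neighbors = defaultdict(set)
    for i, j in bonds:
        if i in kept and j in kept:
            neighbors[i].add(j)
            neighbors[j].add(i)

    # Connected components of the remaining graph (iterative depth-first search).
    systems, seen = [], set()
    for start in sorted(kept):
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            atom = stack.pop()
            component.append(atom)
            for nbr in neighbors[atom]:
                if nbr not in seen:
                    seen.add(nbr)
                    stack.append(nbr)
        systems.append(sorted(component))
    return systems


# Hypothetical example: two carboxylates joined by one sp3 carbon (heavy atoms only).
ATOMS = {
    0: ("O", "sp2"), 1: ("C", "sp2"), 2: ("O", "sp2"),
    3: ("C", "sp3"),
    4: ("C", "sp2"), 5: ("O", "sp2"), 6: ("O", "sp2"),
}
BONDS = [(0, 1), (1, 2), (1, 3), (3, 4), (4, 5), (4, 6)]

print(split_resonance_systems(ATOMS, BONDS))
```

Each returned group could then be enumerated independently, which keeps the combinatorial cost bounded by the largest single group rather than the whole molecule.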

    • Working with JHorton on whether we can swap out LJ functional form for something better (like double exponential). We’ve done some test fits and different functional forms can do BETTER than existing LJ. But it’s hard to try this out on things like YANK and other tools because they don’t support different functional forms. So I made AbSolv to enable testing these.

      • Generally, I’m not sure what the status of YANK/OMMTools is. Unclear maintainership/responsibility.
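For readers unfamiliar with the two functional forms mentioned above, here is a rough sketch comparing 12-6 Lennard-Jones with one commonly used double-exponential form. The notes do not say which exact form or parameters were fit, so the expression below and the default `alpha` and `beta` values are assumptions chosen for illustration only. Both functions are written in terms of the well depth `epsilon` and the minimum-energy distance `r_min`.

```python
import math


def lennard_jones(r, epsilon, r_min):
    """12-6 Lennard-Jones energy, written in terms of the minimum distance r_min."""
    x = (r_min / r) ** 6
    return epsilon * (x * x - 2.0 * x)


def double_exponential(r, epsilon, r_min, alpha=16.766, beta=4.427):
    """Double-exponential energy (assumed form; alpha and beta are placeholders).

    alpha controls the steepness of the repulsive wall and beta the decay
    of the attractive tail; the prefactors put the minimum at r_min with
    depth -epsilon.
    """
    x = r / r_min
    repulsion = beta / (alpha - beta) * math.exp(alpha * (1.0 - x))
    attraction = alpha / (alpha - beta) * math.exp(beta * (1.0 - x))
    return epsilon * (repulsion - attraction)


# Hypothetical parameters: epsilon in kcal/mol, distances in angstrom.
EPSILON, R_MIN = 0.2, 3.5
for r in (3.0, 3.5, 4.0, 6.0):
    print(r, lennard_jones(r, EPSILON, R_MIN), double_exponential(r, EPSILON, R_MIN))
```

Both forms share the same well depth and minimum position by construction, so differences show up in the softness of the repulsive wall and the shape of the tail. Unlike LJ, the double-exponential form stays finite as `r` approaches zero.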

    • As an aside, I like to look through big molecule datasets, but it’s hard to generate and look through giant PNGs. So I made a REST-based approach where you can visualize lots of molecules in the browser, sort them, filter, etc. I can share this if others are interested.

  • MT

    • Was off the last week, half-time the week before.

    • Worked on syncing up refactors - especially units - across codebases. Discovered some corners of interchange with old unit-handling logic, uneven test coverage. Will continue working on this.

  • DD

    • QCArchive

      • PRP had a major outage right before Christmas, and so we lost a lot of potential compute over the holiday (DNS issues at first, then pod management complexity, it’s back up now)

      • also have to limit usage of Lilac to avoid filling up disks; working on finishing this PR this week to get a QCFractal release out by next week

      • transitioning role of managing qca-dataset-submission, performing submission reviews, etc. to Pavan;
        will operate as support and continue to manage PRP and Lilac compute, but will take marching orders from Pavan on dataset operations

        • Pavan will be responsible for choosing dataset priority and compute tags for dataset routing

      • continuing to work closely with Ben; he is making good progress on QCFractal refactor

    • PLBenchmarks

      • picking this work back up this week; need to bust out draft of narrative doc so we can advance this forward with AWS folks

      • showed off architecture at our last core devs

  • JS

    • Working on host-guest optimizations. Making good progress on GBSA optimizations. Should be able to show results near the end of this month. Expecting the parameters to get better for binding. But fitting GBSA parameters for binding screws up HFE results. So MGilson and I came up with a new functional form and we’re looking into it.

      • JW – Is this using the new GBSA plugin that we worked on?

      • JS – No, using standard OBC, just changing per-atom parameters.

    • JS – Are other folks doing calculations with OFF GBSA?

      • JW – I’m not aware of other people doing GBSA work

      • SB – Only other person I know of is WWang, but JS would know more about that.

    • SB – Would love to hear more about fitting experiments. When are you planning to present results?

      • JS – I’d love to present this at an FF release call – I’ll contact PB or post in the ff-release channel to get on the agenda once I have results to share.

  • PB

    • Submitted a dataset with protonation states that are enumerated differently than those in the gen2 optimization sets.

      • SB – Did you manage to get pKaTyper working? I had looked into this but failed to get it working.

      • PB – I just used the “reasonable protomer” functionality from OE. I didn’t use pKaTyper but I saw it in the OE docs. I can give it a shot and report back.

      • PB – Whatever function we’re using to enumerate protomers in the OFF toolkit using the OE backend, it may use pKaTyper. (OFF code here).

        • PB – Another option is https://git.durrantlab.pitt.edu/jdurrant/dimorphite_dl/ , but it uses the Apache license

          • SB – When I used checkmol before, it was LGPL or something, and I asked the author to relicense under MIT, and he sent me a copy that was MIT instead of LGPL. So that’s another option for us moving forward.

          • JW – I worked with Jacob Durrant for several years and he’s super nice. Let me know if we want this under a different license!

        • DD – What’s wrong with Apache license? MDAnalysis is GPL2

        • JW – I don’t remember the details, but my recollection is that it’s not suitable for our work (we should only use MIT or BSD-3-Clause licensed deps). I need to look back into this.

        • MT – There was recently a court case in Italy about whether BSD is infectious, and I think it was

        • SB – Have we audited our dependency tree to ensure that everything’s compatible with our licensing guidelines?

        • JW – We haven’t, and I’m not sure which licenses are “compatible” with our guidelines.

        • MT – A few things:

          • It’s not clear that there’s anything wrong with having non-MIT/BSD deps

          • MT+SB – There’s one accelera package (a torchMD dep) that was looked at in some of JHorton’s work that may be restrictively licensed, which we may want to be careful about, since it may be academic-only. To the best of our understanding we’re not using this in production code.


    • Coordinated with Meghan Osato on torsion multiplicity, looking at the errors on torsion profiles for offending parameters.

    • Helped Jessica Maat with doing a fit on the cluster with her new improper parameters.

  • CC

    • I am traveling home today and will miss the core devs meeting, so here's a brief update.

      • Dipeptide TorsionDrive v1.1 (constraints on sidechain dihedrals for rotamers) has made a lot of progress. Recent jobs are taking a lot of wall time per job, unsure why.

        • DD – I’ve prioritized that dataset on lilac, and CC is running managers on TSCC.

      • Dipeptide TorsionDrive v2.0 (one rotamer for each of 26 sidechains) was not submitted because of errors reading carboxylate groups (Asp and Glu) from mol2 or sdf files with the OpenFF toolkit. This seems to be an issue with OpenEye's molecule input, which places a formal charge on each of the carboxylate atoms so that the total charge is -3. I will try to read these molecules from SMILES instead of mol2/sdf for this dataset, then make an issue for the toolkit tomorrow.

      • Pushing harder for remaining LiveCoMS section this week



  • JW

    • Worked on code cleanups, hierarchy metadata, and tests for biopolymer refactor. Hard to figure out where everything should go, still thinking about how something like Topology.from_pdb or from_openmm could recognize both biopolymers AND small molecules.

    • Handled incorrect vsite orientation issue (#1159) with PR 1160. We can make OFFTK 0.10.2 release ASAP.

    • Offline for a few hours Tuesday Jan 4 – Paperwork to do in Irvine

    • Will be working to get Diego up to speed on project management at the end of this week

      • DD – What’s Diego’s job description?

      • JW – I haven’t seen the exact doc, but it’s approximately

        • Keep pharma up to date about planning, milestones, deliverables

        • Manage roadmaps, check in with stakeholders and update timelines

        • Keep team morale high, organize workshops and meetups

        • Guide hiring/onboarding

        • Lots of boring back office stuff

  • PB – Is anyone going to OE CUP this year?

    • JS – It was a lot of fun the other year, before covid

    • JW – I’d consider it, want to see how omicron plays out.

Action items

Decisions