2021-10-25 Core Developers meeting notes

Participants

  • @Iván Pulido

  • @Matt Thompson

  • @Chapin Cavender

  • @Jeffrey Wagner

  • @Lily Wang

  • @Pavan Behara

Discussion topics

Item

Notes

Updates

  • IP

    • Met with JW to talk about the from_pdb api point. Reviewed current state of PR, looks like lots of monkey patching, some tests failing

    • Worked on loading metadata from PDB. Right now we lose it in RDKit roundtrips – This is going to be a problem because of some operations that don’t preserve order. We sanitize using RDKit after graph matching, and that may lose information about the order during this process

      • JW – We should guarantee the same atom order in to/from rdkit, though it WOULD be better to preserve the metadata entirely.
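The ordering concern above can be illustrated with a minimal sketch. It assumes each atom is tagged with its original index before any reordering operation, so the original order (and the PDB metadata attached to it) can be restored afterwards. The `restore_atom_order` helper, the `original_index` key, and the atom dictionaries are all hypothetical and are not toolkit or RDKit API.

```python
from typing import List


def restore_atom_order(
    reordered_atoms: List[dict],
    index_key: str = "original_index",
) -> List[dict]:
    """Return the atoms sorted back into their pre-round-trip order.

    Each atom dict is assumed to carry the index it had before the
    round trip under `index_key` (a hypothetical tag).
    """
    indices = [atom[index_key] for atom in reordered_atoms]
    # Refuse to guess if the tags are not a clean permutation of 0..n-1.
    if sorted(indices) != list(range(len(reordered_atoms))):
        raise ValueError("original indices must be a permutation of 0..n-1")
    return sorted(reordered_atoms, key=lambda atom: atom[index_key])


# Hypothetical PDB-style metadata, tagged before the round trip.
atoms = [
    {"original_index": 0, "name": "N", "residue": "ALA"},
    {"original_index": 1, "name": "CA", "residue": "ALA"},
    {"original_index": 2, "name": "C", "residue": "ALA"},
]

# Simulate an operation that does not preserve atom order.
shuffled = [atoms[2], atoms[0], atoms[1]]
restored = restore_atom_order(shuffled)
```

Here `restored` carries the metadata in the same order as `atoms`. The tag only helps if it survives the round trip, which is the part the discussion above flags as uncertain.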

  • CC

    • Worked on LiveCOMS review. Most of the NMR section is in, but I still need to follow up with some people to get their sections in.

    • Worked on library charges for amino acids from Rosemary. Getting hung up on which SMARTS to use for librarycharges – Sometimes will wind up with non-integral formal charges due to using different resonance states.

      • Eg, if I set up 5-mers and swap out the resonance states used in each residue, then OE gives me somewhat different charges.

      • JW – Possibly something with conformer generation? If OE isn’t treating resonance states equivalently, the physics parameters during conf gen may be different for each resonance state.

      • CC – Seems to be due to resonance states in which a formal charge is delocalized, NOT different kekulizations of an aromatic system.

      • CC – Chatter about this problem says to canonicalize the SMILES before processing the molecule, which makes sense to me.

      • JW – In files like the amber FF port, to assign equivalent parameters, we use wildcards (Ctrl-F in this file)

      • CC – Need to figure out WHICH resonance state(s) of, e.g., ARG to use for charge generation for the FF. Thinking an average of the possible states?

      • JW – That’s a good idea. I can work with you on this later this week. We’ll want to make sure we standardize this resonance handling in one place so other researchers use the same thing.
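As a rough illustration of the averaging idea above, here is a minimal sketch that averages per-atom partial charges over several resonance states. It assumes every state uses the same atom ordering and that charges were already computed for each state by some other tool. The function names and the toy charge values are made up for illustration and do not come from any real calculation.

```python
from typing import List, Sequence


def average_charges(charge_sets: Sequence[Sequence[float]]) -> List[float]:
    """Average per-atom charges over resonance states.

    Every entry in `charge_sets` holds the charges for one resonance
    state, with atoms in the same order in every state (an assumption).
    """
    if not charge_sets:
        raise ValueError("need at least one resonance state")
    n_atoms = len(charge_sets[0])
    if any(len(state) != n_atoms for state in charge_sets):
        raise ValueError("all states must have the same number of atoms")
    n_states = len(charge_sets)
    return [
        sum(state[i] for state in charge_sets) / n_states
        for i in range(n_atoms)
    ]


def has_integral_total(charges: Sequence[float], tol: float = 1e-6) -> bool:
    """Check that the charges sum to an integer within `tol`."""
    total = sum(charges)
    return abs(total - round(total)) < tol


# Toy values for a three-atom fragment with net charge -1, where the
# formal charge sits on a different atom in each resonance state.
state_a = [0.10, -0.30, -0.80]
state_b = [0.10, -0.80, -0.30]
averaged = average_charges([state_a, state_b])
```

If every state sums to the same integer total, the average does too, so the non-integral totals described above would have to come from mixing states inconsistently between residues rather than from the averaging itself.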

  • MT

    • Short week.

    • .prmtop writer now works for single molecules (from SMIRNOFF).

      • Worked on AMBER export last week. prmtop writer is done for single-molecule systems. Covered all the tricky improper stuff, including nonzero idivf.

      • .inpcrd writer already existed, so now can do energy tests with Amber

      • Can’t yet process proteins, but (using the feature branch) does not seem to be far off.

    • Got a lot of “here’s how we use ParmEd” use cases, mostly from Mobley.

    • Started porting Psi4 upstreams. Still assessing whether this is possible/how difficult it will be.

      • Most of my concerns are in a “technical” category, where we can engineer around it, but they’re not “clean” technical problems. One tricky thing is that LBurns has lots of pins to make her packages work, but these aren’t conda-forge. So there may be version conflicts on major things like mkl (LBurns pins 2019, c-f demands 2021).

      • Also the test/compatibility matrix isn’t super well defined, so I don’t know how many permutations we need to (or even CAN) test. So this will require some scoping/research before we go too deep. Will keep working on this.

      • MT – Could help to know what OpenFF needs – Windows compatibility (probably not)? Mac?

        • JW – We’ll need psi4 devs involved to determine scope/responsibility.

      • SB – JRGuerra doesn’t work for us anymore, but he may be able to make an advanced start on this. So we may want to chat with him to gauge interest/ability in taking on this work as part of a contract with QuanSight.

        • MT – I’d want to check with LBurns about this first

        • JW – Agree – Maybe a synchronous chat with LBurns to see if this is of interest at all, and if so, a synchronous chat with LBurns and JRGuerra.

  • LW

    • Difficulty training a force field – nothing programmy currently but jobs dying or exceeding time limit a lot, perhaps I should be starting from better estimates or asking for different resources? Currently only asking for free, preempt-able time

      • JW – If this is on the order of a few hundred/thousand dollars, then DM or OpenFF would probably be willing to fund paid queues for this. Please DM DMobley and me on this topic.

      • CC – I’ve found that, if I bump up the regularization strength a bit, that can accelerate convergence.

        • LW – Thanks. I haven’t tried that yet, just used the values from Sage.

        • CC – I get the sense that the regularization parameters weren’t tuned/experimented with yet

        • SB – Yeah, we’re still just using what HJang and LPWang plugged in initially. PB may have done some investigation on this.

        • PB – How many workers are you using?

          • LW – 40

          • PB – How many steps are they taking? This could correlate with pre-emption.

          • LW – Can’t count this at the moment, will look later.

        • SB – Valence fits or nonbonded fits?

          • LW – Nonbonded.

          • SB – These will take a while, I usually request 60 GPUs. The sage fit took about a week with this resource level.

          • LW – My experiments are using a much smaller set of observations

          • PB – UCI cluster doesn’t have a lot of free GPU resources.

          • LW – I could reduce the set even more.

    • Gave group meeting presentation on work up until now comparing AM1/BCC/-ELF10 charges to variants of RESP, finding that the AM1xx methods vary much less in magnitude along a chain, but differ more from RESP2 charges than other RESP variants

    • Interest in general openff.toolkit.topology.Molecule.from_qcschema? from_qcschema does not currently work with a typical qcschema molecule

      • JW – Worried about having a true guesser in our namespace.

      • LW – QCElemental has connectivity field with bond orders

      • JW – Oh, that helps a lot. Still a little concerned about corner cases where atoms can have multiple possible valences

      • SB – In our workflows, we always know the input graph, so this shouldn’t be a scientific issue for us

      • LW – I have had workflows where I used pure QCEl molecules for a long time, and I had to reach waaay back to the original chemical graph to get an OFFMol out at the end.

      • SB – The API point may be poorly labeled, or the error message may be able to be clearer about what the user needs to do to fix it.

        • Error message:

      • JW – We should improve this error message and link to relevant API points that make suitable QCEl/QCSchema molecules. Unfortunately most people will hit this at the END of a long workflow involving QC calculations, and will have to go back and recompute everything, so it will be hard to provide them this information at the beginning of said workflow.
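One way to surface the problem earlier, in the spirit of the error-message discussion above, is a small pre-flight check run before any QC calculations start. The sketch below is hypothetical and is not the toolkit's implementation. `describe_graph_source` is an illustrative name, and the `extras` key is an assumption about where a mapped SMILES would be stored. The `symbols`, `geometry`, and `connectivity` fields follow the QCSchema molecule layout.

```python
from typing import Any, Dict

# Assumption: the extras key under which a mapped SMILES is stored.
MAPPED_SMILES_KEY = "canonical_isomeric_explicit_hydrogen_mapped_smiles"


def describe_graph_source(qcschema_mol: Dict[str, Any]) -> str:
    """Report where a chemical graph could be read from, without guessing.

    Returns "mapped_smiles" or "connectivity", and raises a ValueError
    with an actionable message when neither is present.
    """
    extras = qcschema_mol.get("extras") or {}
    if extras.get(MAPPED_SMILES_KEY):
        return "mapped_smiles"
    if qcschema_mol.get("connectivity"):
        return "connectivity"
    raise ValueError(
        "This QCSchema molecule has no chemical graph: neither "
        f"extras[{MAPPED_SMILES_KEY!r}] nor a 'connectivity' field is set. "
        "Attach one of them when the molecule is first created, before "
        "running QC calculations, so the graph never has to be guessed."
    )


# A water molecule with bond orders in the connectivity field
# (atom index, atom index, bond order); geometry is a flat list in Bohr.
water = {
    "symbols": ["O", "H", "H"],
    "geometry": [0.0, 0.0, 0.0, 0.0, 1.43, 1.11, 0.0, -1.43, 1.11],
    "connectivity": [(0, 1, 1.0), (0, 2, 1.0)],
}
source = describe_graph_source(water)
```

Connectivity with bond orders still leaves the corner cases raised above (atoms with several possible valences or formal charges), so a check like this only reports which information is present and does not decide whether it is sufficient.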

    • Only working after 2 pm Pacific time Tues-Fri next week due to CZI workshop / GSoC meeting; will see if I can duck out for meetings like the biopolymer ff

    • This week: pushing out Psiresp (for calculating various RESP charges) as a real package with docs and things, having difficulty working with qcfractal/qcelemental as a temperamental stack so a minor refactor needed to accept properties computed by other means

  • PB

    • Worked with DD on improving the throughput on some of the theory benchmark calculations. Also, submitted some additional compute specs for that. Still waiting on a new qcfractal release, a fix for an issue with some b97-d3 specs was put in but a release was not made yet.

      • JW – It’d be good to ping BPritchard on this, but I think he usually delegates to DDotson (who’s out this week), so we may not get the release this week.

    • Sync up with SB/OM/DM on sage paper, will work on moving it this week. Also got some feedback from Hyesu on the theory benchmark draft, will work on finishing this and seeking formal review.

    • Ran into some segmentation errors with my ForceBalance runs and went down a spiral of deleting and rebuilding conda envs; it seems to be an issue with node failure (all of the failures were on the same node). Did some fitting experiments, will be working more on this this week. Also had a small working session with Trevor on deleting the micro outputs during a ForceBalance run. SB introduced a retain_micro_outputs option, but it didn't work for me because the current working directory during the run was somehow different. Didn't make a PR though.

    • JW – Maybe we could also put this in with LWang’s request to use paid queues

      • PB – DMobley already approved some use of paid queues for this

    • PB - Do nonbonded fits and valence fits have similar resource requirements?

      • SB – Nonbonded fits are way more expensive and basically require GPUs. We could invest in GROMACS support to get its better CPU performance than OpenMM (I think, not confirmed), but it would require a bit of work in evaluator.

        • MT – I’d believe that GROMACS has better CPU performance than OpenMM

  • SB

    • Working short hours, mostly meetings.

    • Small technical stuff, nothing very memorable.

    • Worked with JH to refit FF with an alternative vdW form. This alternative vdW form is much better suited for switching FE calcs. So we’ve been doing some fits, but we’ll need lots of compute to do FE calcs. Also our current infrastructure like YANK doesn’t support switching functional forms, which makes things hard, so I’ve been making some new infrastructure to do this.

      • Looking into how Interchange would be able to fit into this sort of experiment. Not sure how it could fit (or whether it’s in spec) because of the range of things I’m trying, and the need to interface with simulation engines. Would be happy to discuss this with MT.

      • MT – Would love to discuss. Agree that SMIRNOFF’s lack of support for non-12-6 potentials is troublesome. But it does seem like there would be a lot of downstream changes needed as well.

      • SB – Agree. One hard thing is that, looking at a parameterized system, it’s hard to know what’s representing the vdW terms. There’s nothing in the class structure/architecture that indicates what’s a vdW force. And so it’d be hard to do certain things like duplicating forces.

      • MT – Agree, our lack of hierarchy/structure in the nonbonded forces is going to be tough.

      • SB – I could imagine changes to the SMIRNOFF spec that would really help, but they’d be big changes.

      • MT – Experimented with Buckingham potentials earlier in Interchange. Made a potentialhandler from scratch for it, and the output logic was really complex (just trying to figure out whether a LJ handler was there, or if a buckingham handler was there, or what else might be around). So the SMIRNOFF spec could be changed to explicitly answer this question.

      • SB – Agree. Also, the SMIRNOFF spec says that ES and vdW could have different cutoffs, but that’s 1) not actually possible in some engines and 2) going to screw with FE calcs. So if there’s a one-to-many mapping from Interchange PotentialHandlers to OpenMM Forces, then it’d be great to have a record of which PotentialHandlers affected/made which forces, so that the provenance can reveal how values should be propagated by downstream tools.

      • MT – This is great to hear. Interchange currently does something like this, but it’s not super well-structured and deals with a lot of complexity. So it’d be good to hear more about specific needs. Especially when implementing an OpenMM IMPORTER, it was really messy to see how to tease apart a nonbondedforce into underlying potentialhandlers.

      • SB – That’s promising. I’d like to discuss more how Interchange.to_openmm_system could return the provenance from PotentialHandlers to Forces – Like some sort of map object.

      • MT – This gets complex because it’s not all one-to-many mappings – There are some many-to-one

      • JW – One thing that makes me nervous about this is situations like GBSAHandler also requiring info from electrostaticshandler

      • MT – I’ll keep documenting the needs/complexities here and see if a big picture of a solution starts to become clear.
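The "map object" idea discussed above could take many shapes. The following is a minimal sketch of one of them, a two-way record that copes with both one-to-many and many-to-one relationships. `ProvenanceMap` and its methods are hypothetical and are not part of Interchange or OpenMM, and the handler and force names are plain strings used only as labels.

```python
from collections import defaultdict
from typing import DefaultDict, Set


class ProvenanceMap:
    """Two-way record of which handlers contributed to which forces."""

    def __init__(self) -> None:
        self._handler_to_forces: DefaultDict[str, Set[str]] = defaultdict(set)
        self._force_to_handlers: DefaultDict[str, Set[str]] = defaultdict(set)

    def record(self, handler: str, force: str) -> None:
        """Note that `handler` contributed parameters to `force`."""
        self._handler_to_forces[handler].add(force)
        self._force_to_handlers[force].add(handler)

    def forces_from(self, handler: str) -> Set[str]:
        """All forces that `handler` contributed to (one-to-many)."""
        return set(self._handler_to_forces.get(handler, set()))

    def handlers_behind(self, force: str) -> Set[str]:
        """All handlers that contributed to `force` (many-to-one)."""
        return set(self._force_to_handlers.get(force, set()))


provenance = ProvenanceMap()
# Two handlers feeding one force: the many-to-one case.
provenance.record("vdW", "NonbondedForce")
provenance.record("Electrostatics", "NonbondedForce")
# An implicit-solvent force that also needs electrostatics information,
# which makes "Electrostatics" one-to-many.
provenance.record("GBSA", "GBSAOBCForce")
provenance.record("Electrostatics", "GBSAOBCForce")
```

A downstream tool could then ask `provenance.handlers_behind("NonbondedForce")` before duplicating or modifying that force. Tracking which individual parameters came from which handler would need a finer-grained structure than this.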

  • JW





Action items

Decisions