2021_12_08 Thompson/Osato/Wagner Check in

Participants

  • @Matt Thompson

  • Meghan Osato

  • @Jeffrey Wagner

Discussion topics

Item

Notes


Notebook feedback

  • MO –

    • Gromacs output file writing takes a long time

    • In one case, there were several molecules all grouped into (chain A, residue 150), and this messed up the output PDB, and the output prmtops.

      • Possible thing to work on today

    • I made an openff top from the omm top from modeller, but then I wasn’t sure if this was somehow tainted by the use of openmm’s modeller, so I wrote it out to pdb file, then I made an openff topology using from_molecules(water * n_waters...).

      • JW – It’s kinda redundant if we have to parameterize the whole system using openmmforcefields to place the solvent, and then we drop it back out to an openFF topology, and parameterize it again.

      • MT – It looks like modeller.addSolvent just needs vdW radii for the solute atoms. Could we just feed in something really crummy/element based to get vdw radii?

      • JW – Maybe PDBFixer is what we want here? It’s similar to an openMM modeller but it doesn’t seem to need force fields for all the components.

        • Possible thing to work on today

    • Confusion between topology positions, attributes called xyz, molecule conformers.

      • Could continue conformers/positions/xyz discussion today

    • Some things only worked because Sage had tip3p parameters built in.
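For the "(chain A, residue 150)" problem noted above, a quick diagnostic is to check whether any residue label is shared by more than one bonded molecule. The sketch below is a minimal, library-free illustration: the function names and the flat `residue_keys`/`bonds` inputs are hypothetical, and the example system (two waters plus one diatomic) is made up, since the notes don't record what the affected molecules were.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

ResidueKey = Tuple[str, int]  # (chain id, residue number); hypothetical layout


def find_molecules(n_atoms: int, bonds: Sequence[Tuple[int, int]]) -> List[int]:
    """Label every atom with a molecule id using union-find over the bonds."""
    parent = list(range(n_atoms))

    def find(i: int) -> int:
        # Walk up to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in bonds:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a
    return [find(i) for i in range(n_atoms)]


def residues_shared_by_molecules(
    residue_keys: Sequence[ResidueKey],
    bonds: Sequence[Tuple[int, int]],
) -> Dict[ResidueKey, int]:
    """Return {residue key: molecule count} for keys used by more than one molecule."""
    molecule_of_atom = find_molecules(len(residue_keys), bonds)
    molecules_per_residue = defaultdict(set)
    for key, molecule in zip(residue_keys, molecule_of_atom):
        molecules_per_residue[key].add(molecule)
    return {
        key: len(molecules)
        for key, molecules in molecules_per_residue.items()
        if len(molecules) > 1
    }


# Made-up example: two 3-atom waters both labelled (A, 150), one diatomic in (A, 1).
keys = [("A", 150)] * 6 + [("A", 1)] * 2
bonds = [(0, 1), (0, 2), (3, 4), (3, 5), (6, 7)]
print(residues_shared_by_molecules(keys, bonds))  # {('A', 150): 2}
```

Running a check like this before writing the PDB/prmtop would at least flag which residue labels need to be split; how to renumber them is left open here.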


PRs

  • JW – OFFTK #1151, 1153

    • MT – I already approved 1151, and reviewed 1153 live and approved.

    • JW – Thanks! Merged both.

  • MT – Nothing new from me

 

  • MT – Spent a decent amount of time yesterday working on from_openeye. Most of the time went to dealing with units. E.g., there’s an add_conformer method, and all it does is call _add_conformer, which makes it unclear where to check units/dimensionality. The current master branch takes 20-30 ms to load a C100 mol, while the refactor branch with the units refactor takes 150 ms, so I’m looking into this more. It seems like my_array * unit.nanometer is slow, but Quantity(my_array, unit.nanometer) is fast.

    • JW – This is great to know. I’m interested to keep up with these findings.

  • MT – User-defined data on molecules vs. stuff force fields will generate (in short: partial_bondorders/charges_from_molecules)

    • MT – Three options for interchange:

      • Don’t support this at all/you need to fork our code

      • Reach parity with the Toolkit

      • Include charge_from_molecules and friends in a spec

    • JW – I think that something like option 2 is best: Basically, allow charge_from_molecules, but have it just convert to librarycharges under the hood….

    • JW – Well, actually, we could instead do something like letting a Topology continue to have partial charges and partial bond orders, and have kwargs for create_openmm_system like overwrite_bond_orders_from_molecules=True, which could instead be set to a list of molecules that won’t have charges recalculated.

    • MT + JW – A big concern is what to do if the propertorsionhandler wants one partial bond order model, but bondhandler wants another. What gets written to the resulting topology?

    • JW – There are sort of two axes. First, how much do we let users/scientists increase the scope of our implementation, versus how much do we stick strictly to the spec? If we trust them to be reasonable then this could be affordable, but if we have to provide several custom implementation-level things a year we’ll wind up with tons of debt. Second, how much do we jump to implement for them in major tools vs. how much do we make them implement in their own fork?

    • JW – Maybe we should have a ChargeFromMoleculesHandler?

      • MT – Could be promising, but then would our molecule spec become part of the smirnoff spec? Would this somehow avoid whole-molecule graph matching?

    • (General) – These are kinda grasping at straws for a question that we can’t define. It’s kinda 3-fold:

      • Human: How do we provide enough functionality for scientists so that they use our stuff and don’t publish using wildcat implementations?

      • Technical: How do we make this in a way that’s computationally possible to actually do (eg no whole-protein librarycharges)?

      • Scientific: When are two implementations equivalent enough? (OE vs AT?)

  • JW – Re: Idea for a workshop to showcase the Feb release - This is a good idea, I just don’t have the bandwidth to lead it.

    • MT – I could take the lead on this.

    • JW – Awesome - Please do. Thanks!
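For the units timing MT reported above (my_array * unit.nanometer vs. Quantity(my_array, unit.nanometer)), a small timeit harness makes the comparison repeatable. This is only a sketch: `best_time_ms` and `compare_paths` are hypothetical helper names, and `ToyUnit`/`ToyQuantity` are stand-ins so the block runs with the standard library alone. They do not reproduce the real package's behavior or its timings; to measure the real thing, swap the two lambdas for the actual expressions quoted in the notes.

```python
import timeit
from dataclasses import dataclass
from typing import Callable, Dict


def best_time_ms(func: Callable[[], object], number: int = 200, repeat: int = 5) -> float:
    """Best-of-`repeat` wall time per call, in milliseconds."""
    totals = timeit.repeat(func, number=number, repeat=repeat)
    return min(totals) / number * 1e3


def compare_paths(
    paths: Dict[str, Callable[[], object]], number: int = 200, repeat: int = 5
) -> Dict[str, float]:
    """Time each named zero-argument callable and return {name: ms per call}."""
    return {name: best_time_ms(func, number, repeat) for name, func in paths.items()}


# --- Toy stand-ins (assumptions, NOT the real units package) ---------------
@dataclass(frozen=True)
class ToyUnit:
    name: str

    def __rmul__(self, values):
        # `some_list * unit` ends up here because list cannot multiply by a ToyUnit.
        return ToyQuantity(tuple(values), self)


@dataclass(frozen=True)
class ToyQuantity:
    values: tuple
    unit: ToyUnit


nanometer = ToyUnit("nanometer")
coords = [float(i) for i in range(300)]  # made-up flat coordinate list

paths = {
    "multiply": lambda: coords * nanometer,
    "constructor": lambda: ToyQuantity(tuple(coords), nanometer),
}

for name, ms in compare_paths(paths, number=50, repeat=3).items():
    print(f"{name}: {ms:.4f} ms per call")
```

Taking the minimum over repeats, rather than the mean, reduces noise from other processes, which matters when the difference being chased is tens of milliseconds per molecule load.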

 

 

Action items

Decisions