2021-04-20 Perses Demo Meeting notes

Date

Apr 20, 2021

Participants

  • Dominic Rufa

  • Ivy Zhang

  • Mike Henry

  • Iván Pulido

  • Jeffrey Wagner

Goals

  • Gain understanding of Perses, and how it can be better supported by OpenFF topology refactor

Discussion topics

Item

Notes

Perses intro

Which OpenFF API points are touched?

  • Mostly use Molecule + Topology and FF parameterization

Desires for future refactors

  • How are residues modified?

    • DR – Use OETK to get maximum common substructure

  • How are mutations+conformations generated?

OpenFF’s plans

Improve molecule, topology, and forcefield so that:

  • Users can load PDBs and parameterize a protein using a SMIRNOFF force field in under a minute

  • Users can access PDB-style hierarchy info for OpenFF molecules that have it (e.g., iterate over residues)

  • Allow some molecule modification/unnatural amino acid handling – Add chemical modifications to canonical amino acids
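As a rough illustration of the "iterate over residues" goal above, the sketch below groups atoms into residues using PDB-style metadata. The `Atom` record and `iter_residues` helper are hypothetical names invented for this example and are not OpenFF classes or functions; the sketch also assumes that atoms of one residue are contiguous, as they are in a PDB file.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass
class Atom:
    """Hypothetical atom record carrying PDB-style metadata (not an OpenFF class)."""
    name: str
    element: str
    chain_id: str
    residue_number: int
    residue_name: str


def iter_residues(atoms):
    """Yield ((chain_id, residue_number, residue_name), [atoms]) in input order.

    Consecutive atoms that share the same key are treated as one residue.
    """
    def residue_key(atom):
        return (atom.chain_id, atom.residue_number, atom.residue_name)

    for key, group in groupby(atoms, key=residue_key):
        yield key, list(group)


atoms = [
    Atom("N", "N", "A", 1, "ALA"),
    Atom("CA", "C", "A", 1, "ALA"),
    Atom("C", "C", "A", 1, "ALA"),
    Atom("N", "N", "A", 2, "GLY"),
    Atom("CA", "C", "A", 2, "GLY"),
]

residues = list(iter_residues(atoms))
for (chain_id, number, name), members in residues:
    print(chain_id, number, name, [a.name for a in members])
```

Keeping the hierarchy as metadata on atoms, rather than as a separate object tree, means molecules without residue information simply yield nothing special; that is one possible design, not a statement of what OpenFF will implement.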

So OpenFF wants to know:

  • What do your protein-modifying workflows look like?

    • DR + IZ – The generic way is to use the GeometryEngine class, where we take the System of the residue/small molecule we’re modifying… So we make a model that only uses the valence terms of the modification, and guess the geometry using the equilibrium lengths/angles/torsions. This process first guesses the positions of the heavy atoms, and then places the hydrogens. The initial guess has a lot of clashes, but you run a minimization before the simulation, and generally get a good starting point.

    • IZ – A big pain point right now is bringing in GLYCAM parameters. We can’t guess the coordinates of GLYCAM-parameterized residues because we can’t convert the parameters into an OpenMM force field, so we have to use tleap, which requires connectivity and coordinates. This is a chicken-and-egg problem, since we can’t get the coordinates without the system.

    • JW – Which tool is used to modify the topology?

      • DR – We use MDTraj topology to modify the topology.

      • IZ – We make the hybrid topology in MDTraj, where both chemical states are present. So, the endpoints are OpenMM topologies, and the hybrid topology is an MDTraj topology.

      • HybridTopologyFactory

        • IZ – Specifically, this may be particularly informative

      • JW – If we tried to replicate the functionality of the hybrid topology, which API points would be needed?

        • DR – Would need access to:

          • For each atom, know whether it comes from the “old” topology, the “new” topology, or both; if it has a mapping, provide that.

          • Be able to query Bond/Angle/Torsion/charges/LJs of the old and new topology, and know which ones are identical (defined as numerical identity – they could come from a different source).

            • DR – Eg:

    • (For JW and IP – How will we handle residue perception in glycosylation/glycan chains?)

  • What do your residue-accessing needs look like?

    • IZ – Access residue/atom by index.

  • How do we track overlapping substructure through molecule modifications?

    • DR – There aren’t hard-and-fast rules for getting the “best” substructure mapping. MCSS is just one approach, others might be more performant. So don’t hard-code this.

    • IZ – Eg, in a transformation between histidine/tyrosine/tryptophan, we find better convergence if we don’t map the sidechains at all.

  • Would Perses devs want a way to get a “hybrid” combined system at the end of parameterization?

    • JW – This would be very hard for us, but we could consider it if it’d be super helpful

    • (Perses devs/general) – This wouldn’t be a big help, so don’t worry about it. But a hybrid Topology would be great.

      • JW – This could be provided by our planned AtomTypedTopology class, but it won’t have any chemical functionality, it’ll just be equivalent to a fancy networkx graph that might hold coordinates and read/write to/from PDB.
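The hybrid-topology requirements DR listed earlier (per-atom provenance, an atom mapping, and knowing which valence terms are numerically identical between the old and new topologies) could be sketched as follows. Every name here (`HybridAtom`, `build_hybrid_atoms`, `identical_terms`) is hypothetical and belongs to neither Perses nor OpenFF. Parameters are assumed to be plain tuples of floats keyed by atom-index tuples, and "numerical identity" is approximated with a tight relative tolerance.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class HybridAtom:
    """Hypothetical record: index of this atom in the old and/or new topology."""
    old_index: Optional[int]
    new_index: Optional[int]

    @property
    def origin(self):
        if self.old_index is not None and self.new_index is not None:
            return "mapped"
        return "old" if self.old_index is not None else "new"


def build_hybrid_atoms(n_old, n_new, new_to_old):
    """List all old atoms (with their mapped partner, if any), then unique new atoms."""
    old_to_new = {old: new for new, old in new_to_old.items()}
    atoms = [HybridAtom(i, old_to_new.get(i)) for i in range(n_old)]
    atoms += [HybridAtom(None, j) for j in range(n_new) if j not in new_to_old]
    return atoms


def identical_terms(old_params, new_params, new_to_old, rel_tol=1e-9):
    """Return old-topology index tuples of terms whose parameters match numerically.

    Only terms whose atoms are all mapped are compared; the source of the
    parameters is irrelevant, only their values matter.
    """
    shared = []
    for new_key, new_vals in new_params.items():
        if not all(i in new_to_old for i in new_key):
            continue
        old_key = tuple(new_to_old[i] for i in new_key)
        old_vals = old_params.get(old_key, old_params.get(old_key[::-1]))
        if old_vals is None or len(old_vals) != len(new_vals):
            continue
        if all(math.isclose(o, n, rel_tol=rel_tol)
               for o, n in zip(old_vals, new_vals)):
            shared.append(old_key)
    return shared


# Illustrative data: two mapped atoms plus one unique atom on each side.
new_to_old = {0: 0, 1: 1}
hybrid_atoms = build_hybrid_atoms(n_old=3, n_new=3, new_to_old=new_to_old)
old_bonds = {(0, 1): (300.0, 0.153), (1, 2): (340.0, 0.109)}
new_bonds = {(0, 1): (300.0, 0.153), (1, 2): (320.0, 0.143)}
shared_bonds = identical_terms(old_bonds, new_bonds, new_to_old)
print([atom.origin for atom in hybrid_atoms], shared_bonds)
```

The mapping is passed in as data rather than computed, which matches DR's point that the choice of substructure mapping should not be hard-coded.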

Action items

Decisions