2022-08-23 Protein-ligand benchmarks meeting notes

Participants

  • @Diego Nolasco

  • @David Dotson

  • Jenke Scheen

  • Levi Naden

  • @Iván Pulido

  • @John Chodera

  • @Lorenzo D'Amore

  • @Mike Henry

  • @Jeffrey Wagner

Goals

Discussion topics

Item

Notes

fah-alchemy - current board status

  • DD:

    • fah-alchemy : Phase 1 - MVP

    • 1 week out from 9/1 deadline for biopolymer benchmarking

      • we are not going to have a deployed system in place by this deadline

    • 5 weeks out from 10/1 deadline for ASAP

      • we have some chance of making this deadline and having a minimally-usable system deployed, but it will be tight

        • JW – I’d approve pushing the OpenFF deadline back to be the same as the ASAP deadline (Oct 1). OpenFF needs are less “something that runs” and more “publication quality runs”

        • DN – Agree

        • JC – ASAP needs are basically “something that runs”, Oct 1 isn’t the end of the world but it would be really nice to have something for then.

        • DD – Ok, I’ll work with you to define what files I need in which format, etc… to make sure that we can launch the ASAP runs as soon as it’s ready.

        • JW – concerned about gufe#45; wasn’t aware there were showstoppers here; can’t delay release further, and can’t make breaking changes again very soon

          • MH – multi-molecule support is a fairly new concern

        • (General) – JW thinks that it’s time to cut the 0.11.0 release; icodes will be there. Topology.from_pdb won’t be in the initial release but can be incorporated into a minor/bugfix release whenever it’s ready, even if that’s just days or weeks later.

    • gufe#36 in review; need to unblock perses#1066

      • DD – MH, IP: can we meet after this call for a working session on #1066? Aim is to:

        • review how the Protocol system works in gufe@dag-rework, what features it provides, what constraints it operates under

        • walk through how nonequilibrium cycling is currently implemented in perses via openmmtools; what would it take to implement this as a gufe Protocol?

        • establish roles for perses#1066; who should drive implementation, who should review?

        • IP + MH – Yes, we’ll meet immediately after this call

    • also working to resolve gufe#42 (@David W.H. Swenson) and gufe#45 (@Richard Gowers), which will complete coverage of our GufeSerializable approach across existing gufe objects

      • (DS, in Slack) – Also not attending -- update on gufe#42 is that it is waiting on gufe#45, and we think we have a path forward for #45, but I won't be implementing it this week. We might ask @Benjamin Ries to put it together, if he has time. Otherwise, it'll get done after ACS. I'll try to do a review of gufe#36 in the near future; might be Wed night/Thurs or even Fri though.

      • JC – …

    • @David Dotson switching development effort to Executor (service API), Scheduler (compute), and ResultServer (storage) for FAH. This will be the focus of my effort for the next several weeks.

      • JC – Re: docs for original inputs and outputs - Are those API docs online? It will be important to have a reference draft of the API docs.

        • DD – I’m not sure how to proceed with this; it crosses jurisdictions between Perses and fah-alchemy. The base class will be in one and the implementation in the other.

        • DD – Actual implementation-level methods in base class are like BaseClass._create and have no docstring. Then the child classes implement the virtual function ChildClass.create (no underscore) and have to fill in the docstring.

        • JC – Are there dev docs for this pattern? There should be dev docs somewhere.

        • DD – I’ll add those.
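
The docstring pattern DD describes above can be sketched as follows. This is only one possible reading of the notes, not the actual Perses or fah-alchemy code: the method bodies, the `n_cycles` setting, and the returned dictionary are all hypothetical, and only the names `BaseClass._create` and `ChildClass.create` come from the discussion.

```python
class BaseClass:
    # Implementation-level method: deliberately left without a docstring,
    # as described in the notes. The body is a hypothetical placeholder.
    def _create(self, **settings):
        return {"kind": type(self).__name__, "settings": dict(settings)}


class ChildClass(BaseClass):
    def create(self, n_cycles=1):
        """Create a description of the work to run.

        Parameters
        ----------
        n_cycles : int
            Hypothetical setting, used only for illustration.
        """
        # The child class owns the user-facing docstring and delegates
        # the actual work to the undocumented base-class method.
        return self._create(n_cycles=n_cycles)
```

Under this reading, API docs generated from docstrings would only pick up the child-class methods, which is why developer docs explaining the pattern would be useful.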

ProtocolSettings taxonomy update

  • MH – Spent much of my time last week getting ready for ACS, so I should be able to move quickly this week. I’ll ask for DD and IP’s feedback offline this week.

    • DD – Sounds good.

protein-ligand-benchmark - 0.3.0 update

  • MB (out of office this week):

    • status of #52: are we blocked anywhere? Do we have docked structures with assay conditions, or should we proceed with what we currently have right now?

      • need CLI tool that produces prepared target, docked ligands

    • JC – Any communication with MB, IP?

      • IP – I worked with MB on automation for iterating over files and calling Schrodinger tools.

      • JC – Could you reach out to MB and ask if you can help it along?

      • IP – Will do

    • DD – I can’t remember whether the remaining items are all about just preparing at the appropriate pH or if there’s more to do. But we’ll need to talk to MB to know for sure.

Mixtures InChI (MInChI) as an approach for specifying assay conditions

  • JC – Re: Issue on PLB #59 - I was wondering whether there were existing standards/methods for specifying assay conditions. It looked like Mixture InChI was promising. Unfortunately, InChI is a case of “the implementation is the spec”, and MInChI is based on it.

    • DD – Is this needed for 0.3.0?

    • JC – I think MB should do this.

    • IP – Didn’t we decide not to include this csv?

    • JC – We decided to exclude targets with nonstandard conditions/weird pHs.

    • IP – Wouldn’t this delay the release?

    • JC – This will take 10 minutes.

update to milestones and deadlines - seeking approval

Additional topics

JW – have we jotted down STANDARDS for protein preparation? Stuff around protonation/capping/missing residues/etc

JC – documented in best practices for free energy calculations and best practices for benchmarking

  • I think those will evolve; best practices papers are intended to capture current understanding of what’s important

  • JW – one idea that was appealing is that we should have PDB files that are consumable by tleap, grompp, OpenMM.PDBFile

    • JC – basically, can we write a linter?

  • DD – To clarify: For OpenFF benchmarking we’re planning to use PLBenchmark; do you need a validation tool for other PDB files you want to shove through this system?

    • JW – this validation tool would be used for other benchmarking efforts, such as just equilibrium MD of proteins to reproduce NMR observables

    • JC – is the goal just to establish whether a PDB is simulatable with a given engine + FF?

    • JW – will be using OpenMM for a start

    • JC – think OpenFE may be the best destination for this kind of validation

    • JW – recognize that this is an area where we have similar but not identical problems with OpenFE; think it’s best we focus on our needs here first and then see where we can collaborate on longer-term solution
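
As a rough illustration of the "linter" idea raised above, the sketch below checks only that ATOM/HETATM records carry parseable coordinates in the standard PDB fixed columns. It is a minimal, standard-library-only example and not the validation tool discussed: the function name `lint_pdb_text` and the choice of checks are assumptions, and it does not test whether a file is actually accepted by tleap, grompp, or OpenMM.

```python
def lint_pdb_text(text):
    """Return a list of (line_number, message) problems found in PDB text."""
    problems = []
    has_atoms = False
    for lineno, line in enumerate(text.splitlines(), start=1):
        # Only coordinate records are checked in this sketch.
        if not line.startswith(("ATOM  ", "HETATM")):
            continue
        has_atoms = True
        if len(line) < 54:
            problems.append((lineno, "record too short to hold x, y, z"))
            continue
        # PDB fixed columns (1-based): x 31-38, y 39-46, z 47-54.
        for name, start, end in (("x", 30, 38), ("y", 38, 46), ("z", 46, 54)):
            try:
                float(line[start:end])
            except ValueError:
                problems.append((lineno, f"{name} coordinate is not a number"))
    if not has_atoms:
        # Line number 0 marks a file-level problem.
        problems.append((0, "no ATOM or HETATM records found"))
    return problems
```

A fuller check along the lines JW and JC describe would go further and attempt to load the file with each target engine and force field, reporting the first failure.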

Action items

Decisions