2021-04-19 Protein-Ligand Benchmarks Meeting notes

Date

Apr 19, 2021

Participants

  • @David Dotson

  • @Lorenzo D'Amore

  • @David Hahn

Goals

  • Demo of the existing PLBenchmarks workflow

Discussion topics

Item

Presenter

Notes

PLBenchmarks demo

David H

  • DH: document on Confluence

  • targets.yml is the index of targets used by the code

    • for each target directory, there is a ligands.yml that includes experimental affinity data

    • target.yml

    • See this map:

    • 00_data, 01_protein/crd, and 02_ligands/<lig>/crd must be defined; at the moment, the top files are manually curated

  • PLBenchmarks/metadata.py has some validation functions for checking internal consistency of index/top files with input data

  • 03_hybrid is generated by workflow

  • If there are charges in the SDF file, workflow will use those charges

  • DD: will consider whether we can start using openff-system for GROMACS parameterization

  • workflow3_solvate.py can generate many numbered replicates

    • the outputs of these would be used for interfacing with FAH

  • step 4 generates job inputs for SGE and SLURM, ready for energy minimization

  • step 5 submits to queuing system

  • step 6 checks simulations for some basic issues

  • step 7 analyzes results

    • analysis could take about 30 min per target, roughly 10 seconds per replicate edge

  • Existing results live in benchmarkpl:

  • Random components

    • initial velocities of atoms

    • placement of solvent, ions

  • DH: you will likely need to install git-lfs in order to clone the full PLBenchmarks repo datasets
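
Sketch for anyone reproducing the workflow: a quick pre-flight check that a target directory contains the inputs noted above as required (00_data, 01_protein/crd, and a crd directory for every ligand under 02_ligands). This is not the validation code in PLBenchmarks/metadata.py; the function name `missing_inputs` and the assumption that each ligand has its own subdirectory directly under 02_ligands are illustrative only.

```python
from pathlib import Path

# Directories the notes list as required for every target.
REQUIRED_TARGET_DIRS = ("00_data", "01_protein/crd")


def missing_inputs(target_dir):
    """Return the required input directories that are absent for one target.

    Hypothetical helper, not part of PLBenchmarks. Assumes each ligand has
    its own subdirectory directly under 02_ligands.
    """
    target_dir = Path(target_dir)
    missing = [d for d in REQUIRED_TARGET_DIRS if not (target_dir / d).is_dir()]

    ligands_root = target_dir / "02_ligands"
    if not ligands_root.is_dir():
        missing.append("02_ligands")
        return missing

    # Every ligand directory needs its own crd subdirectory.
    for lig_dir in sorted(p for p in ligands_root.iterdir() if p.is_dir()):
        if not (lig_dir / "crd").is_dir():
            missing.append(f"02_ligands/{lig_dir.name}/crd")
    return missing
```

Calling `missing_inputs("path/to/target")` returns an empty list when all required directories are present, so it could be run over each target before starting the workflow steps.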

Action items

@David Dotson will attempt to reproduce existing workflow on local infrastructure
@David Dotson will draft an initial design document for FAH workflow prototype; this will be the basis of iteration before investment in heavy implementation
@Lorenzo D'Amore will attempt to reproduce existing workflow on Janssen infrastructure, identify gaps in protocol or dataset issues

Decisions