2021-04-19 Protein-Ligand Benchmarks Meeting notes

Date

Apr 19, 2021

Participants

@David Dotson
@Lorenzo D'Amore
@David Hahn

Goals

Demo on existing PLBenchmarks workflow

Discussion topics

Item	Presenter	Notes
PLBenchmarks demo	David H

DH: document on Confluence
`targets.yml` is the index of targets used by the code
for each target directory, there is a `ligands.yml` that includes experimental affinity data
    `target.yml`
See this map:
`00_data`, `01_protein/crd` and `02_ligands/<lig>/crd` must be defined; at the moment top files manually curated
`PLBenchmarks/metadata.py` has some validation functions for checking internal consistency of index/top files with input data
`03_hybrid` is generated by workflow
If there are charges in the SDF file, workflow will use those charges
    DD: will consider whether we can start using `openff-system` for gromacs parameterization
`workflow3_solvate.py` can generate many numbered replicates
    the outputs of these would be used for interfacing with FAH
step 4 generates inputs for SGE, SLURM, ready for energy minimization
step 5 submits to queuing system
step 6 checks simulations for some basic issues
step 7 analyzes results
    could be 30mins per target, about 10 seconds per replicate edge
Existing results live in `benchmarkpl`:
Random components
    initial velocities of atoms
    placement of solvent, ions
DH: you will likely need to install `git-lfs` in order to clone the full PLBenchmarks repo datasets
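
The requirement noted above, that `00_data`, `01_protein/crd` and `02_ligands/<lig>/crd` must be defined for a target, could be checked before running the workflow with something like the sketch below. This is only an illustration, not the validation code in `PLBenchmarks/metadata.py`: the function name `missing_inputs` and the example path are hypothetical, and it assumes each ligand has its own subdirectory under `02_ligands`.

```python
from pathlib import Path


def missing_inputs(target_dir):
    """Return the required input directories that are absent under target_dir.

    Hypothetical helper: the directory names come from the meeting notes,
    the rest of the layout is an assumption.
    """
    target = Path(target_dir)
    missing = []

    # Directories the notes list as required for every target.
    for rel in ("00_data", "01_protein/crd"):
        if not (target / rel).is_dir():
            missing.append(rel)

    # Assumed layout: one subdirectory per ligand, each with its own crd/.
    ligands_root = target / "02_ligands"
    if not ligands_root.is_dir():
        missing.append("02_ligands")
    else:
        ligand_dirs = sorted(p for p in ligands_root.iterdir() if p.is_dir())
        if not ligand_dirs:
            # No ligand directories at all, so no crd/ can be defined.
            missing.append("02_ligands/<lig>/crd")
        for lig in ligand_dirs:
            if not (lig / "crd").is_dir():
                missing.append(f"02_ligands/{lig.name}/crd")

    return missing


if __name__ == "__main__":
    # Placeholder path; point this at a real target directory.
    print(missing_inputs("path/to/target"))
```

An empty list means all three kinds of input directory are present; it says nothing about whether the files inside them are consistent with the index or top files.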

Action items

@David Dotson will attempt to reproduce existing workflow on local infrastructure

@David Dotson will draft an initial design document for FAH workflow prototype; this will be the basis of iteration before investment in heavy implementation

@Lorenzo D'Amore will attempt to reproduce existing workflow on Janssen infrastructure, identify gaps in protocol or dataset issues
