2020-08-06 PLBenchmark update notes

Date

Aug 6, 2020

Participants

  • @Matt Thompson

  • @David Hahn

Goals

  •  

Discussion topics

Item

Notes

Keep code and data separate?


Yes, but how?

  • If split into two repos, the code repo needs to include some minimal test data

How to distribute data?

  • Can put the data in release tarballs, as PDBs can get quite large

  • Data used to be pushed to GitHub, but it got too big and is now stored on Google Drive
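
The release-tarball option mentioned above could look roughly like the following. This is a minimal sketch using only the Python standard library, not the project's actual release process; the function name `make_release_tarball` and the directory names in the usage note are hypothetical.

```python
import tarfile
from pathlib import Path


def make_release_tarball(data_dir, out_path):
    """Pack data_dir into a gzip-compressed tarball and return its member names.

    Hypothetical helper: write out_path outside data_dir so the archive
    does not end up containing itself.
    """
    data_dir = Path(data_dir)
    with tarfile.open(str(out_path), "w:gz") as tar:
        # arcname stores paths relative to the data directory's own name
        tar.add(str(data_dir), arcname=data_dir.name)
    # Re-open the archive to list what was actually written
    with tarfile.open(str(out_path), "r:gz") as tar:
        return sorted(tar.getnames())
```

For example, `make_release_tarball("sample_data", "sample_data.tar.gz")` would archive a (hypothetical) `sample_data` directory and return the list of archived paths, which makes it easy to check that a minimal test data set was packed as intended.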

Current structure

  • targets.yml contains metadata; for each target:

    • /data/: metadata for each ligand (edges.yml, ligands.yml, and protein.yaml)

    • /protein/: PDB for structure, GROMACS TOP for topology, and FF file (currently a modified 99sb, could be others/multiple in the future)

    • /ligands/: similar structure for each ligand, coordinates in SDF, Parsley force field

    • /hybrid/: defines which atoms are converted into others

    • Not stored, but could also be added: things like combined systems, solvated boxes
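
The per-target layout above can be sketched as a small path helper. This is only an illustration under assumptions: the subdirectory names come from the notes, but the idea that each target has its own directory under a common root, the helper names `target_paths` and `missing_subdirs`, and any root or target names are hypothetical.

```python
from pathlib import Path

# Subdirectory names as listed in the notes; everything else here is assumed.
SUBDIRS = ("data", "protein", "ligands", "hybrid")


def target_paths(root, target):
    """Return the expected subdirectories for one target (hypothetical helper)."""
    base = Path(root) / target
    return {name: base / name for name in SUBDIRS}


def missing_subdirs(root, target):
    """List the expected subdirectories that do not exist on disk."""
    return [
        name
        for name, path in target_paths(root, target).items()
        if not path.is_dir()
    ]
```

A check like `missing_subdirs(root, target)` could be run over every target before a release to catch incomplete entries; an empty list means all four subdirectories are present.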

Data needs

  • Avoid redundancy (i.e. only one coordinate file, convert on the fly as needed)

    • A single master ligand SDF would be useful for somebody looking through, but that would be redundant data

  • It would be nice to be able to easily access the data without using the command line (MT disagrees/is unsure)

  • Current directory structure is good for DH’s work, may not be the most accessible for users

  • Most important need here is getting ligand data out, with structures, since the force fields are accessible

    • Maybe topologies should not be provided, only the tools required to generate them?

      • Upside: Lighter data, better fit to the principle of this data

      • Downside: partial charges will be inconsistent, and probably something else will cause different topologies to be generated

  • MT: How much, and how regularly, does the data change?

    • DH: Often, e.g. changing ligand or protein structures, adding in more t, even more targets, especially if there are user contributions in the future.

Code needs

DH: Could use a review of the current code. Would also like to add some features to it

MT will review code

DH also built a workflow in his fork of pmx

Rename

PLBenchmarks needs a better name; John suggested Beryllium

Action items

Meet next week

@David Hahn will go through code (PR #3) (clean it up, provide some sample data in the repository, improve tests, adapt tests to sample data)
@Matt Thompson will provide feedback on the PR next week
Next week: draft a plan for releasing this and future PL data

Decisions