2020-08-11 PLBenchmarks Meeting notes

Date

Aug 11, 2020

Participants

  • @David Hahn

  • @Christopher Bayly

  • Gaetano Calabro

Goals

Discuss

  • Content of the dataset

  • Directory structure

  • File types

Discussion topics

Item

Notes

Content

  • CB: valid benchmark set:

    • what’s the protein? → identity (EC number)

    • what are the ligands? → SMILES

    • what are the activities?

    • everything else is an interpretation (methods, FF, poses, charges, ...)

  • Stages

    1. above data

    2. + structures (PDB + poses)

    3. + partial charges, FF parameters

    4. method, method parameters

  • Ligands:

    • Structure as SDF file

      • coordinates

      • partial charges?

      • activity?

      • reference?

    • Charges (CB: if you want to evaluate e.g. another pose, we want to keep the charges constant)

  • Protein:

    • Structure as PDB file (protein.pdb + all other crystal molecules in <find_a_name>.pdb)

      • generated with Gromacs gmx pdb2gmx

      • <find_a_name> : ‘water+other’, ‘water+cofactors’, ‘other’?

  • Hybrid:

    • hybrid struct based on ligand A

    • hybrid struct based on ligand B

  • Problems with current version:

    • partial charges

    • boxes (dimensions, number of mols)

    • position of waters and ions
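
A minimal sketch of how the staged content above could be captured as a record, added for illustration only. All class and field names here (BenchmarkTarget, Ligand, stage, ...) are hypothetical and were not decided in the meeting; the EC number, SMILES string and activity value in the usage lines are placeholders, and activity units are left open because the notes do not fix them.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Ligand:
    # Hypothetical field names; the notes only fix the kind of information.
    name: str
    smiles: str                                     # stage 1: ligand identity
    activity: Optional[float] = None                # stage 1: measured activity
    pose_file: Optional[str] = None                 # stage 2: e.g. an SDF with coordinates
    partial_charges: Optional[List[float]] = None   # stage 3


@dataclass
class BenchmarkTarget:
    ec_number: str                                  # stage 1: protein identity
    ligands: List[Ligand] = field(default_factory=list)
    protein_pdb: Optional[str] = None               # stage 2: protein structure
    force_field: Optional[str] = None               # stage 3: FF parameters
    method: Optional[Dict[str, str]] = None         # stage 4: method and its parameters

    def stage(self) -> int:
        """Return the highest stage (0-4) for which all earlier stages are complete."""
        checks = [
            bool(self.ec_number) and bool(self.ligands)
            and all(l.smiles and l.activity is not None for l in self.ligands),
            self.protein_pdb is not None
            and all(l.pose_file for l in self.ligands),
            self.force_field is not None
            and all(l.partial_charges for l in self.ligands),
            self.method is not None,
        ]
        level = 0
        for ok in checks:
            if not ok:
                break
            level += 1
        return level


# Placeholder values, not taken from the dataset.
target = BenchmarkTarget(
    ec_number="1.1.1.1",
    ligands=[Ligand(name="lig_a", smiles="CCO", activity=-7.5)],
)
print(target.stage())
```

Keeping stage 1 (identity, SMILES, activities) separate from the later stages mirrors CB's point that everything beyond it is an interpretation.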

File types

  • GC: very Gromacs centric

  • Is the SDF format lossless?

    • No, e.g. B-factor information is lost
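
For illustration only: per-ligand metadata such as the activity or reference fields raised under Content could travel with an SDF record as SD data items, which follow the molfile block of each record. The sketch below only formats that trailing part. The function name and the field names ("activity", "reference") are hypothetical, and no tag convention was agreed in the meeting. Information with no native slot in the format would need such a convention, otherwise it is lost.

```python
from typing import Mapping


def sd_data_items(fields: Mapping[str, str]) -> str:
    """Format key/value pairs as SD data items and close the record.

    Each item is a '> <name>' header line, the value, and a blank line.
    The record is terminated by a '$$$$' line.
    """
    parts = []
    for name, value in fields.items():
        parts.append(f"> <{name}>\n{value}\n\n")
    return "".join(parts) + "$$$$\n"


# Hypothetical tags and placeholder values; append the result after the
# 'M  END' line of a molfile block to obtain a complete SDF record.
print(sd_data_items({"activity": "-7.5", "reference": "placeholder"}))
```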

Remaining questions

  • Is there a general format/presentation for chimeric molecules?

    • GC: look in FEsetup

    • CB: unique naming of atoms might be useful (OETriposAtomNames)

  • Do we provide (Gromacs) topologies?

Action items

Decisions