Date

Participants

Goals

  • Consensus on input format for compounds to be optimized on the public QCArchive instance

    • ideally, this should be the same format used as input in the production benchmark on internal compounds

  • Update on software approach

  • Update on choice of method/basis for QM calculations

Discussion topics

Item / Presenter / Notes

Resources

David Dotson

  • DD: will set up a survey of the types of queues and resources available at the partners, and send it out soon.

  • XL: how hard is it to install Psi4? We will need it on our cluster.

  • DD: we have a lot of experience with this from our public archive runs; many production environments are available, so this should be very easy for everyone to set up. DD showed a live example of how easy the setup is.

Input format

Gary Tresadern

  • 7 parties interested in running calculations on public QCArchive

  • 1300-1400 compounds each, selected according to individual criteria (as discussed)

  • structures generally patented

  • GT: should it be a 3D SDF with a single conformation, or multiple?

  • DH: 3D SDFs better because SMILES don’t have chirality information

  • XL: may choose compounds similar to what’s in PubChem or elsewhere

  • JH: if we don’t specify everything, we’ll be relying on RDKit to generate conformations, etc.

  • DH: we’re planning for conformer generation in the workflow

  • JH: could fill in the gap if not provided

  • GT: default is that partners provide at least a single conformation, and we generate/fill in up to 10.

  • GT: what about charged compounds?

    • going to need to know the bonding for charged molecules in particular

  • JH: if it is charged, definitely want charge specified; all the initial fits weren’t done on charged molecules

  • GT: think you will be able to handle them; question remains on the basis set

  • JH: if the charges are defined in the file, that would help

  • GT: need to determine which fields in the SD file we use

    • just id, charge?

  • GT: a thousand neutral molecules in 3D with hydrogens; then, as part of the workflow, charge them with RDKit

  • DD: the public QCArchive submission can be used as a test approach with high visibility; can decide today on a reasonable, perhaps minimal input spec and see where problems arise

  • GT: neutral 3D input, take a week to look for a reasonable open-source ionization predictor

  • Conclusion: neutral 3D input, with hydrogens specified; deferred decision on protomer enumeration

    • we will pursue an open-source solution for protomer enumeration; if none is found, it is up to each partner whether they want to do it, and which tools they use
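
As a rough illustration of the agreed minimal spec (neutral, 3D, hydrogens specified), the sketch below checks each record of an SDF using only the Python standard library. It is not part of openff-benchmark: the names `RecordCheck`, `split_records`, `check_record` and `check_sdf` are hypothetical, it assumes V2000 molblocks, and the 3D and hydrogen checks are crude proxies. In practice a cheminformatics toolkit such as RDKit would be the more robust way to parse the files.

```python
import re
from dataclasses import dataclass

# Charge codes used in the V2000 atom block (code 4 is a doublet radical, charge 0).
ATOM_BLOCK_CHARGES = {0: 0, 1: 3, 2: 2, 3: 1, 4: 0, 5: -1, 6: -2, 7: -3}


@dataclass
class RecordCheck:
    """Result of checking one SDF record against the minimal input spec."""
    name: str
    is_3d: bool
    has_hydrogens: bool
    net_charge: int

    @property
    def ok(self) -> bool:
        return self.is_3d and self.has_hydrogens and self.net_charge == 0


def split_records(sdf_text: str) -> list[str]:
    """Split SDF text on '$$$$' delimiter lines, leaving each record's header lines intact."""
    chunks = re.split(r"^\$\$\$\$[ \t]*\r?\n?", sdf_text, flags=re.MULTILINE)
    return [c for c in chunks if c.strip()]


def check_record(record: str) -> RecordCheck:
    """Check a single V2000 record (assumption: fixed-width columns are respected)."""
    lines = record.splitlines()
    if len(lines) < 4 or "V2000" not in lines[3]:
        raise ValueError("expected a V2000 molblock with the counts line on line 4")
    n_atoms = int(lines[3][0:3])
    atom_lines = lines[4:4 + n_atoms]

    # Fixed-width columns: z in 21-30, element symbol in 32-34, charge code in 37-39.
    z_coords = [float(a[20:30]) for a in atom_lines]
    symbols = [a[31:34].strip() for a in atom_lines]
    codes = [int(a[36:39].strip() or "0") for a in atom_lines]

    # "M  CHG" property lines supersede atom-block charge codes when present.
    chg_lines = [ln for ln in lines if ln.startswith("M  CHG")]
    if chg_lines:
        net_charge = 0
        for ln in chg_lines:
            pairs = ln.split()[3:]  # atom index, charge, atom index, charge, ...
            net_charge += sum(int(v) for v in pairs[1::2])
    else:
        net_charge = sum(ATOM_BLOCK_CHARGES.get(c, 0) for c in codes)

    return RecordCheck(
        name=lines[0].strip(),
        # Crude proxies: a planar molecule lying in z = 0 is flagged as 2D, and a
        # compound with no hydrogens at all (e.g. CCl4) is flagged as lacking them.
        is_3d=any(abs(z) > 1e-6 for z in z_coords),
        has_hydrogens="H" in symbols,
        net_charge=net_charge,
    )


def check_sdf(sdf_text: str) -> list[RecordCheck]:
    """Check every record in an SDF string."""
    return [check_record(r) for r in split_records(sdf_text)]
```

Calling `check_sdf` on the contents of a partner's file returns one `RecordCheck` per compound; records where `ok` is false could be reported back by name before submission. Only the net formal charge is tested here, so zwitterions pass as neutral.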

Method/basis

Gary Tresadern

  • 6-31+G** is of interest; running a test set over the next few weeks

  • GT: sticking to the same basis set is definitely important

  • DD: should we ask Hyesu to present basis set conclusions at next call?

    • sounds good, will ask if she’s willing and ready

Action items

  • Joshua Horton will do a research cycle on existing open-source protomer enumeration software options
  • David Dotson will reach out to Hyesu to schedule a presentation on basis set performance findings in ~3 weeks
  • David Dotson will prepare MM compute spec for PhAlkEthOH dataset for our benchmarking tooling evaluation and testing
  • David Dotson will schedule a weekly implementation coordination meeting at the same time as this call, through December
  • Gary Tresadern will communicate conclusion on input format: neutral 3D SDF with hydrogens specified; individual choices on selection criteria for compounds
  • David Dotson will complete software approach proposal; split out component work into issues on openff-benchmark to coordinate development

Decisions
