
Date

Participants

Discussion topics

Item

Notes

Pharma partner/Roche benchmarking

  • DM – Xavier Lucas (Roche) wants to run local ANI jobs

  • JH – Should be possible. Will meet with DD and XL soon

  • JW – JH, are you interested in being involved, up to 5 hours/week, with the industry benchmarking paper?

    • JH – Yes – Seems to overlap with my goals, and I could push the bespoke workflow as well.

  • HX – What is the scope of the benchmarking?

    • DM – Similar to preprint on chemrxiv. Different pharma partners would do the same study on internal datasets.

    • https://chemrxiv.org/articles/preprint/Benchmark_Assessment_of_Molecular_Geometries_and_Energies_from_Small_Molecule_Force_Fields/12551867

    • HX – Will read. What’s the scope of the test set?

    • DM – 2000ish molecules, 20,000ish geometries

    • HX – We’re doing something similar with CNO compounds. Chemistry is a bit more constrained than paper set.

    • DM – We could use more simple molecules for our fitting, if you’re willing to share.

    • HX – Our dataset is focused on small fragments, which allows simpler enumeration of hierarchical torsions. Would love to check out your molecule set. Our set has fewer molecules.

      • OpenFF molecule set:

    • HX – When doing torsion drives, how are structures propagated along the scan?

      • JH – TorsionDrive method/wavefront propagation. https://chemistry.ucdavis.edu/news/driving-torsion-scans-wavefront-propagation

      • HX – Is there an initial MM minimization step before running QM?

      • JH – No, we take starting structures directly from RDKit/OpenEye.

      • HX – Is that done for each structure in the scan point?

      • JH – No, only the first scan point is done using a cheminformatics-derived starting structure. After that, each scan point is started from the QM-minimized structure of a neighbor.

      • JW – For generating starting structures, we’re making “OpenFF conformer generation” as the first tool in our CLI

      • JA – Could we do a semiempirical initial minimization to get a fast, good starting point for more detailed QM?

      • DM – Somewhat concerned that we’d be led into local minima

      • JH – Unsure whether this would be a systematic problem.
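
The propagation JH describes can be illustrated with a small toy sketch. This is not the TorsionDrive API: `wavefront_scan` and `toy_optimize` are hypothetical names, the "structure" is just a basin label, and the energy function is made up. It only shows the idea that a seeded grid point is optimized, its result is handed to both neighbors as their starting structure, and a neighbor's result is kept (and propagated further) only if it lowers the best energy found so far at that grid point.

```python
import math
from collections import deque


def wavefront_scan(seeds, optimize, spacing=30):
    """Toy wavefront propagation over a periodic torsion grid.

    seeds:    {grid angle in degrees: starting structure}
    optimize: hypothetical callable (structure, angle) -> (structure, energy)
              standing in for a constrained QM optimization
    spacing:  grid spacing in degrees; assumed to divide 360 evenly
    """
    best = {}  # angle -> (optimized structure, energy)
    queue = deque(seeds.items())
    while queue:
        angle, start = queue.popleft()
        structure, energy = optimize(start, angle)
        # Keep the result only if it improves on what this grid point has.
        if angle in best and energy >= best[angle][1] - 1e-9:
            continue
        best[angle] = (structure, energy)
        # Hand the optimized structure to both neighbors as a new start.
        for step in (-spacing, spacing):
            neighbor = (angle + step + 179) % 360 - 179  # wrap into (-180, 180]
            queue.append((neighbor, structure))
    return best


def toy_optimize(basin, angle):
    """Made-up surface: the 'optimization' stays in its starting basin."""
    return basin, basin * math.cos(math.radians(angle))


# One seed, as in the notes: every grid point is reached from the first one.
scan = wavefront_scan({0: 1}, toy_optimize)
for angle in sorted(scan):
    print(angle, scan[angle])
```

With a single seed every grid point inherits the seed's basin, which is the situation DM's local-minimum concern refers to; passing more than one seed lets a lower-energy result from one side overwrite a higher one from the other.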

General Check in

  • JH – Polishing up bespoke workflow.

    • Can currently submit a molecule from the command line, which triggers ANI scans. Unsure about the output spec.

    • Need to optimize FB settings – Some outputs seem wonky.

    • Fragmentation/graph matching still has some bugs.

    • Large dependence on initial conformer method (OE vs. RDK) – How to control for this?

      • DM – Ask Relay about how they handle this? They’re big RDK users.

      • JH – Would like to check whether full QM shows the same problem as ANI.

      • JW – Start recording these cases in an issue on bespokefit so that we can squarely attack this problem later.

    • DM – Ready for beta testers? Lots of pharma folks would be interested.

      • JW – Should we meet about the spec for the minimum viable product?

      • JH – Yes

      • (General) – JH should be involved in spec discussion for atom map refactor/implementation

  • JW – Atom map spec?

    • JW – System object development will require more extensive atom mapping

    • JH – Don’t really store atom map in molecule, store it nearby in a separate dictionary. Currently lose track of atom mapping during fragmentation. Would be preferable to keep track of this.

      • (General) – Broad adoption of bespoke workflow will require removal of the fragmenter OE dependency. But in the short term, fragmentation isn’t required.

        • For fragmenter backend replacement – JACS benchmark set has good diversity.
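
As a rough illustration of what keeping track of the atom map through fragmentation could look like when the map lives in a separate dictionary: the sketch below is not bespokefit or fragmenter code. `carry_atom_map` is a hypothetical helper, and it assumes the fragmentation step can report which parent atom each fragment atom came from.

```python
def carry_atom_map(parent_map, kept_parent_indices):
    """Translate a parent-molecule atom map onto a fragment.

    parent_map:          {parent atom index: map label}
    kept_parent_indices: parent atom indices retained in the fragment,
                         listed in fragment atom order (an assumption about
                         what the fragmentation step can provide)
    Returns {fragment atom index: map label} for the labeled atoms that
    survive fragmentation.
    """
    return {
        frag_idx: parent_map[parent_idx]
        for frag_idx, parent_idx in enumerate(kept_parent_indices)
        if parent_idx in parent_map
    }


# Hypothetical example: the four torsion atoms are labeled 1-4 on the parent,
# and the fragment keeps parent atoms 1, 2, 3, 4, 5 and 8.
parent_map = {2: 1, 3: 2, 4: 3, 5: 4}
fragment_map = carry_atom_map(parent_map, [1, 2, 3, 4, 5, 8])
print(fragment_map)
```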

Galileo meeting prep

Give example of bespokefit?

  • No easily-deployable conda package built, so let’s skip this.

QCSubmit demo

PB’s current understanding

  • Submissions begin as QCA dataset submission PR

  • DD has a workflow that prepares and submits those jobs.

JH – The problem that QCSubmit solves

  • Helps build datasets for submission. QCArchive is new technology: there were no input format converters, and the standards were ambiguous.

  • QM representations of molecules lose track of graph. This led to ambiguous interpretation of output molecules when we need to recover their graph.

  • QCSubmit objects are heavily validated Python objects. They include logic to avoid running torsion scans on, e.g., linear torsions.

  • Torsions are the most important thing to get right when looking at a molecule’s energy surface.

  • Users have inputs in a variety of formats.

    • If they have SMILES, then we need to generate “good” input coordinates.

    • If they have other 3D formats, the conversion pathway to QCSpec input is different.

  • Install instructions and source code for QCSubmit

  • Initial tasks for Pavan

    • Reproduce previous submission – OpenFF biaryl set
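
To make the linear-torsion point above concrete: a torsion over atoms 1-2-3-4 is ill-defined when either of its bond angles (1-2-3 or 2-3-4) is close to 180°. The sketch below is a geometric check under that assumption only. It is not QCSubmit's validation logic, which may work differently (for example on the molecular graph rather than on coordinates); `is_linear_torsion` and the 5° tolerance are hypothetical.

```python
import math


def angle_deg(a, b, c):
    """Angle a-b-c in degrees for three 3D points."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.sqrt(sum(x * x for x in v1)) * math.sqrt(sum(x * x for x in v2))
    # Clamp to guard against rounding just outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))


def is_linear_torsion(p1, p2, p3, p4, tol_deg=5.0):
    """True if either bond angle of the torsion is within tol_deg of 180."""
    return (
        angle_deg(p1, p2, p3) > 180.0 - tol_deg
        or angle_deg(p2, p3, p4) > 180.0 - tol_deg
    )


# Atoms 1-2-3 collinear: the torsion should be skipped.
print(is_linear_torsion((0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 1, 0)))
# Both bond angles are 90 degrees: the torsion is well defined.
print(is_linear_torsion((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 1, 0)))
```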

Action items

  •  

Decisions
