2020-04-16 QCFractal Meeting notes

Date

Apr 16, 2020

Participants

  • @Jeffrey Wagner

Discussion topics

Item

Notes

Identify important tasks in queue

JM – Does this include optimization and hessian datasets?

JM – OpenFF Gen 2 opt set 5 Bayer is highest priority, and then hessians after that

BJ – We don’t submit hessians until opts are mostly or fully complete(?)

HJ – Hessian calc for gen2 roche and coverage?

JM – Those should be complete

JM – Current grid opts can be put as low priority

JH + BP – Torsiondrives seem to be 90%+ complete, and the failed ones are probably serious issues that we want to discard

BP – Please let me know which ones are OK to give up on (after the FF fitting push)

(General) Bayer opts is the major set remaining

HJ – Bayer set may have large molecules, up to 30 heavy atoms

Evaluate timeline for computations

JW will determine timeline and tell HJ and JM

JW – We’re OK to stop torsiondrives while optimizations run

DGS – It’s not as easy as hitting “pause”

How to do tagging

BP will ask Doaa to DOI-tag the relevant data

HJ + JM – Current fitting will use all datasets with name “Gen 2”

DGS – For “snapshots in time”, I recommend serializing the dataset to JSON or MessagePack

Bulk download

Pull down collection, pull down results, and serialize

JH – I want to get snapshot w/o pointers

DGS – When you serialize and you don’t want pointers, use dataframe.to_dict

collection.data and collection.df – You don’t need .data, you need .df. Calling collection.df.to_dict gives a dictionary of all the Fractal objects stored in the dataframe.
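
A minimal sketch of the snapshot step described above, assuming the table has already been pulled down and converted with collection.df.to_dict. The helper names (snapshot_to_json, load_snapshot), the file name, and the example values are hypothetical and only illustrate the shape of the data; the Fractal download itself is not shown.

```python
import json
import os
import tempfile


def snapshot_to_json(table: dict, path: str) -> str:
    """Write a nested dict (e.g. the result of DataFrame.to_dict()) to a JSON file."""
    with open(path, "w") as fh:
        # default=str is a fallback for values that json cannot encode natively;
        # such values come back as strings when the snapshot is reloaded.
        json.dump(table, fh, indent=2, sort_keys=True, default=str)
    return path


def load_snapshot(path: str) -> dict:
    """Read a snapshot written by snapshot_to_json."""
    with open(path) as fh:
        return json.load(fh)


# Stand-in for collection.df.to_dict(), which returns {column: {row_index: value}}.
# Column names, row labels, and numbers here are made up for illustration.
example_table = {
    "column_a": {"entry-0": -1.5, "entry-1": -2.25},
    "column_b": {"entry-0": "COMPLETE", "entry-1": "ERROR"},
}

# Round-trip through a temporary file to show the snapshot holds plain values only.
with tempfile.TemporaryDirectory() as tmp:
    snapshot_path = snapshot_to_json(example_table, os.path.join(tmp, "snapshot.json"))
    restored = load_snapshot(snapshot_path)
```

With a real collection, example_table would be replaced by collection.df.to_dict(). Whether every stored object survives the default=str fallback in a useful form is something to check per dataset.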

JH – I want to get whole datasets, linking initial_molecule to final_molecule.

DGS – This would be something to talk to Doaa about.

DGS – Caching might work. Build up a cache of 1000 molecules
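
One possible reading of the caching suggestion, sketched under stated assumptions: molecules are requested by ID, missing IDs are fetched in batches of up to 1000, and anything already fetched is served from memory. BatchCache and fake_fetch are hypothetical names; fake_fetch stands in for whatever server query returns a mapping of ID to molecule, and is not a QCFractal call.

```python
from typing import Callable, Dict, Hashable, Iterable, List


class BatchCache:
    """In-memory cache that fetches missing items in fixed-size batches."""

    def __init__(self, fetch_many: Callable[[List[Hashable]], Dict], batch_size: int = 1000):
        self._fetch_many = fetch_many
        self._batch_size = batch_size
        self._store: Dict = {}

    def get_many(self, ids: Iterable[Hashable]) -> Dict:
        ids = list(ids)
        # De-duplicate while preserving order, then keep only IDs not yet cached.
        missing = [i for i in dict.fromkeys(ids) if i not in self._store]
        for start in range(0, len(missing), self._batch_size):
            chunk = missing[start:start + self._batch_size]
            self._store.update(self._fetch_many(chunk))
        # Raises KeyError if the fetch function did not return a requested ID.
        return {i: self._store[i] for i in ids}


# Hypothetical fetch function that records how many IDs each request carried.
calls = []


def fake_fetch(ids):
    calls.append(len(ids))
    return {i: f"molecule-{i}" for i in ids}


cache = BatchCache(fake_fetch, batch_size=1000)
first = cache.get_many(range(2500))   # three requests: 1000, 1000, 500
second = cache.get_many(range(10))    # served from the cache, no new request
```

The same structure could hold both initial and final molecules keyed by ID, which is one way to link them locally, though the notes leave that design open.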

Updates from MolSSI

 

Manager maintenance



User questions

 

Action items

Decisions