2024-11-18 Westbrook/Wagner Check-in meeting notes

Participants

  • @Brent Westbrook

  • @Jeffrey Wagner

Discussion topics

Item

Notes

General updates

  • JW

    • Optionally start attending alchemiscale meetings

      • BW – Sounds good. LW mentioned this as well. Happy to start going.

    • Organometallics handling

      • BW – Plan is to make a QM dataset of organometallics - Opts followed by singlepoints - then getting orbitals, dipoles, quadrupoles, some others.

      • JW – As far as I know there’s no way to put transition metals into CMILES (either it doesn’t exist or I don’t know of it).

      • BW – This came up in planning meetings… I don’t strictly need to mess with SMILES. The database I’m working with came as a CIF file. The CIF entries have fields called OESmiles, and I’ve been using those. Then I’ve been making mols from the coords + SMILES (JW – atom matching?). I’ve been able to put some toy datasets together and created json.bz2 files using QCSubmit, but haven’t submitted any calcs yet.

        • JW – Could use Gemmi to read the CIF.

        • JW – Are the OESmiles mapped?

        • BW – No, I’m ignoring conformers for now and just making mols from SMILES. We’ll want to use the actual conformers eventually.

        • BW – Was planning to interact directly with QCA/QCF code to work this set.

        • JW – So I think you have some freedom in how you want to store the mol graph on the QC entries/records. You could continue to use CMILES (with 3/10 confidence, I think there’s disagreement in the SMILES/InChI world about how to represent metals).

        • JW – Alternative to continuing use of CMILES would be to do something like explicitly enumerating bond orders and formal charges in record metadata. CMILES is probably technically simpler but you’ll need to pick and defend a convention for bonds to metals.

        • JW – So the big thing is to define the behavior and provide machinery that implements it. At a minimum, something like (input format) → QCA entry → OFFMol (or something else agreed upon). Once the behavior is defined it’ll be easy to accept it into the OFF ecosystem (OFFTK + QCSubmit), but I don’t want to have behavior in deployed software if we’ll later change it.

        • BW – Sounds good. What are constraints on solutions for current datasets and tools?

        • JW – Current toolchain could be RDKit+QCF, but the important thing would be that it helps us converge on desired behavior. Once behavior is defined we can implement in our ecosystem easily.

        • BW – Good sanity checks?

          • JW – Ensure that atom order is being preserved when mapping ordered conformer onto unordered SMILES

          • Roundtrip CMILES

          • Roundtrip through QC dataset format

          • Visual inspection

          • See which errors are coming out of any try/excepts and understand why.
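A minimal sketch of two of the sanity checks above (preserving atom order and roundtripping the graph), combined with JW's alternative of explicitly enumerating bond orders and formal charges. It uses only the Python standard library. `MolGraph` and `atom_order_mismatches` are hypothetical names, not part of QCSubmit, QCFractal, or the OpenFF Toolkit, and the Fe-C-O fragment is an arbitrary toy example rather than a proposed convention for bonds to metals.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class MolGraph:
    """Hypothetical explicit graph record (not a QCSubmit/QCFractal type)."""

    symbols: tuple          # element symbol per atom, in conformer order
    formal_charges: tuple   # one integer per atom
    bonds: tuple            # (atom_i, atom_j, bond_order) triples

    def to_json(self) -> str:
        # Tuples are written out as JSON arrays.
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, text: str) -> "MolGraph":
        raw = json.loads(text)
        # Convert JSON arrays back to tuples so equality checks work.
        return cls(
            symbols=tuple(raw["symbols"]),
            formal_charges=tuple(raw["formal_charges"]),
            bonds=tuple(tuple(bond) for bond in raw["bonds"]),
        )


def atom_order_mismatches(graph: MolGraph, conformer_symbols) -> list:
    """Return atom indices where the graph and conformer elements differ."""
    if len(graph.symbols) != len(conformer_symbols):
        raise ValueError("atom count differs between graph and conformer")
    return [
        index
        for index, (a, b) in enumerate(zip(graph.symbols, conformer_symbols))
        if a != b
    ]


# Toy fragment; the bond orders are illustrative only.
toy = MolGraph(
    symbols=("Fe", "C", "O"),
    formal_charges=(0, 0, 0),
    bonds=((0, 1, 2), (1, 2, 2)),
)

roundtripped = MolGraph.from_json(toy.to_json())
print(roundtripped == toy)
print(atom_order_mismatches(toy, ["Fe", "C", "O"]))
```

A real check would compare against the elements read from the CIF coordinates and roundtrip through the actual dataset format rather than plain JSON.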

  • BW – I cleaned up the YDS issues with the Zenodo upload, both by using retries and by adding a backup with time-limited GitHub artifacts (14 days before auto-delete)

    • JW – That’s awesome, thanks for telling me about that.
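The retry-then-backup pattern described above could look roughly like the following. This is a sketch under assumptions, not the actual YDS workflow: the function name, the attempt count, and the exponential backoff are illustrative choices, and `primary` and `backup` stand in for the Zenodo upload and the GitHub artifact upload.

```python
import time


def upload_with_backup(primary, backup, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call `primary` up to `attempts` times, then fall back to `backup`.

    Returns a (route, result) pair so the caller can tell which path succeeded.
    """
    for attempt in range(attempts):
        try:
            return "primary", primary()
        except Exception:
            # Back off before the next try; no wait after the final failure.
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    # All primary attempts failed; errors from the backup propagate to the caller.
    return "backup", backup()


if __name__ == "__main__":
    calls = {"n": 0}

    def flaky_upload():
        # Stand-in for an upload that fails twice before succeeding.
        calls["n"] += 1
        if calls["n"] < 3:
            raise ConnectionError("simulated upload failure")
        return "uploaded"

    print(upload_with_backup(flaky_upload, lambda: "artifact", base_delay=0.01))
```

Returning the route alongside the result makes it easy to log when the backup path was used, which matters if the backup expires after a fixed period.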

Trello

https://trello.com/b/dzvFZnv4/infrastructure

Action items

Decisions