Transition Metal Forcefield Phase 1


@Jennifer A Clark


@Lily Wang



Other stakeholders

Genentech, Chodera Lab


Provide the Chodera Lab and the Genentech group with the QM data necessary for training a machine-learned forcefield.

Time frame

12/01/2024 - 12/01/2025

Key outcomes

A dataset that covers:

  • Relevant levels of theory

  • Support for metal centers of interest

  • Coverage of ligand chemical space

  • QM output with properties of interest

Key metrics

Forcefield goals

  • RMSE of relative conformational energies of complexes versus DFT

  • RMSD of energy-minimized complexes at different spin states
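
The two forcefield metrics above could be computed along the following lines. This is a minimal sketch, not the project's agreed evaluation protocol: the function names are hypothetical, relative energies are taken with respect to the conformer that has the lowest QM energy (an assumption), and the RMSD assumes both structures share atom ordering and have already been superimposed.

```python
import math


def relative_energy_rmse(ff_energies, qm_energies):
    """RMSE between forcefield and QM relative conformer energies.

    Assumption: both lists describe the same conformers in the same order,
    and energies are made relative to the conformer with the lowest QM energy.
    """
    if len(ff_energies) != len(qm_energies) or not qm_energies:
        raise ValueError("energy lists must be non-empty and of equal length")
    # Index of the QM minimum, used as the common reference conformer.
    ref = min(range(len(qm_energies)), key=qm_energies.__getitem__)
    ff_rel = [e - ff_energies[ref] for e in ff_energies]
    qm_rel = [e - qm_energies[ref] for e in qm_energies]
    squared = [(f - q) ** 2 for f, q in zip(ff_rel, qm_rel)]
    return math.sqrt(sum(squared) / len(squared))


def coordinate_rmsd(coords_a, coords_b):
    """RMSD between two structures given as lists of (x, y, z) tuples.

    Assumption: identical atom ordering and structures already aligned;
    no rotational or translational fitting is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = [
        sum((a - b) ** 2 for a, b in zip(atom_a, atom_b))
        for atom_a, atom_b in zip(coords_a, coords_b)
    ]
    return math.sqrt(sum(squared) / len(squared))
```

In practice the RMSD would be evaluated separately for each spin state, comparing the forcefield-minimized geometry with the QM-optimized geometry of the same complex.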

Dataset goals

  • Provide optimized and off-optimum geometries for primary targets in different spin states, sufficient to achieve the forcefield goals.


In progress

GitHub repo

Slack channel


Designated meeting

TM FF Meeting

Released datasets



Problem Statement and Objective



Must have:

  • Dataset at an agreed-upon model chemistry covering:

    • Metal centers of primary interest: Pd, Fe, Zn, Mg, Cu, Li

    • Single metal centers

    • Ligand chemical space with organic compound elements: C, H, P, S, O, N, F, Cl, Br

    • Molecules with net charge in {+1, 0, −1} e

    • QM output with properties such as: energies, forces, partial charges, multipole moments, spin states

    • High-spin Fe complexes (e.g., up to S = 5/2)

    • Optimized and off-optimum structures
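
The must-have chemical space can be expressed as a simple membership check. The sketch below is illustrative only: the function names are hypothetical, "single metal centers" is interpreted as exactly one atom from the primary metal list (an assumption), and spin is converted to a multiplicity with the standard relation 2S + 1.

```python
from fractions import Fraction

# Element and charge sets copied from the must-have list.
PRIMARY_METALS = {"Pd", "Fe", "Zn", "Mg", "Cu", "Li"}
LIGAND_ELEMENTS = {"C", "H", "P", "S", "O", "N", "F", "Cl", "Br"}
ALLOWED_CHARGES = {-1, 0, 1}


def spin_multiplicity(total_spin):
    """Return the spin multiplicity 2S + 1 for a total spin such as "5/2"."""
    value = 2 * Fraction(total_spin) + 1
    if value.denominator != 1 or value < 1:
        raise ValueError("total spin must be a non-negative multiple of 1/2")
    return int(value)


def in_primary_scope(elements, charge):
    """Check whether a molecule falls inside the must-have chemical space.

    elements: list of element symbols, one per atom.
    charge: net molecular charge in units of e.
    """
    metals = [e for e in elements if e in PRIMARY_METALS]
    others = [e for e in elements if e not in PRIMARY_METALS]
    return (
        len(metals) == 1  # assumption: exactly one primary metal atom
        and all(e in LIGAND_ELEMENTS for e in others)
        and charge in ALLOWED_CHARGES
    )
```

For example, a neutral complex with one Fe atom and C, H, N, Cl ligand atoms passes the check, while a di-iron complex or a Rh complex does not; a high-spin Fe complex with S = 5/2 corresponds to a multiplicity of 6.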

Nice to have:

  • Dataset with model chemistry overlapping with SPICE (i.e., ωB97M-D3BJ/def2-TZVPPD) and OpenFF (i.e., B3LYP-D3BJ/DZVP) standards.

  • Dataset covering chemical space of secondary interest (in order of importance):

    • 2. Bi- and trimetallic centers

    • 3. Molecules with net charge in {+3, +2, +1, 0, −1, −2, −3} e

    • 4. Metal centers of secondary interest: Rh, Ir, Pt, Ni, Cr, Ag
      Better: make the dataset element-agnostic

    • 5. Transition states

  • QM output with properties such as: atomic spin density, orbital energies, electronic structure of complexes

Not in scope:


Project Approaches

