2024-02-01 Force Field Release Meeting notes

 Date

Feb 1, 2024

 Participants

  • Ken Takaba

  • @Chapin Cavender

  • @Jeffrey Wagner

  • @Brent Westbrook

  • @John Chodera

  • @Lily Wang

  • @Pavan Behara

Slides:

Recording: https://drive.google.com/file/d/1PkqI4AamT-Kx3JVFf0LWMCUAEJERFhDR/view?usp=drive_link

 Discussion topics

Item

Notes

Nucleic acid datasets and parameterization (KT)

  • Slide 6:

    • CC: what metric did you use for clustering?

      • KT: don’t remember exact details, but it’s in GitHub README. Based on internal coordinates

    • LW: what nonbonded terms?

      • KT: openff 2.0 vdW, fitted my own charge model

      • KT: challenges of validating RNA FF

        • 10 µs simulations look like they have the right behaviour, but if you keep running longer, you start seeing weird conformations. Not sure this is the current best approach

        • JC: I think you’re providing a good starting point

        • CC: I’d use the same kind of observables in benchmarking proteins and NMR, so similar to what you’re doing now. There’s not as much crystal data for RNA as proteins. Would also look at bigger motifs, more than single-stranded, so looking at e.g. tetraloops or 10-mers. Have you looked at those yet?

        • KT: not yet, since initial tests were failing

      • KT: I compared J-coupling observables to experiment and they looked good, but other metrics don’t, so they may not be enough

    • Slide 32: water models with AMBER

  • LW: where did you get your conformers?

    • KT: from a database. Experimental structures have to be stable, so they’re stacked, which biases them towards anti

    • JC: could we easily generate syn conformers ourselves to fill in the gaps (e.g. unfavourable areas)? e.g. with MD simulations?

    • KT: that was one of the motivations for creating the nucleoside dataset with torsion scans, but need to handle hydroxyl sugar interactions carefully. I don’t know the right way to handle this at the moment.

    • CC: my impression is a lot of these might come from NB interactions that aren’t calibrated properly, e.g. a too-favourable interaction between H-donor and PO4 group

      • KT: may not apply to my smaller benchmark sets, but larger ones might have effect

    • JC: could we bring in additional data that gives insight into balance between these interactions? Can help regularise LJ? Xtal-phase data could help with balance between interactions

      • CC: not sure there are good solution-phase datasets for this

  • KT: is OpenFF going to work on nucleosides in the future?

    • LW + CC: yes

    • CC: planning dataset of gas-phase data scanning torsions

  • LW: is there consensus on adding implicit solvent?

    • CC: not really a consensus; a sizable proportion include solvent considerations (implicit or explicit)

    • CC: can also start with gas-phase calculations and re-tune against NMR

    • KT: what I’m doing now is starting from espaloma 0.3 which is fit to QM data (including RNA) and re-tuning to 3J couplings

  • JC: who is currently generating datasets?

    • CC: nominally me

    • JC: how does deposition of new datasets work now? Could MWieder help?

    • JW: ML datasets live on a different server as they’re too big for OpenFF. We have a Tuesday submission meeting

    • KT: can anyone attend the Tuesday meetings?

    • JW: depends on dataset size

    • KT: is qca-dataset-submission still active?

    • JW: yes

  • JC: so things have now fragmented: there are multiple MolSSI QCFractal instances, where ML and OpenFF are separate and aren’t necessarily usable for the others

    • JW: OpenMM is using the ML instance for the SPICE 2.0 dataset. Last time we tried to do a large dataset, MolSSI asked for some funding.

    • JC: have things changed now that MolSSI’s circumstances have changed?

    • JW: large datasets would need to be discussed by lead team

    • JC: is OpenFF going to stick with small datasets generated at current level of theory?

  • LW: can we still use datasets contributed to other instances like the ML instance?

    • JW: yes we can, they just need CMILES

  • JC: what would be helpful? Individual people owning their own datasets, or a dedicated extra person?

    • JW: let’s discuss at governing board

 Action items

 Decisions