2024-02-01 Force Field Release Meeting notes

Date

Feb 1, 2024

Participants

Ken Takaba
@Chapin Cavender
@Jeffrey Wagner
@Brent Westbrook
@John Chodera
@Lily Wang
@Pavan Behara

Slides:

Recording: https://drive.google.com/file/d/1PkqI4AamT-Kx3JVFf0LWMCUAEJERFhDR/view?usp=drive_link

Discussion topics

Item	Notes

Nucleic acid datasets and parameterization (KT)	Slide 6:
CC: What metric did you use for clustering?
KT: Don’t remember the exact details, but it’s in the GitHub README. Based on internal coordinates.
LW: What nonbonded terms?
KT: OpenFF 2.0 vdW; I fitted my own charge model.
KT: Challenges of validating an RNA FF: 10 µs simulations look like they have the right behaviour, but if you keep running longer, you start seeing weird conformations. Not sure this is the current best approach.
JC: I think you’re providing a good starting point.
CC: I’d use the same kind of observables as in benchmarking proteins and NMR, so similar to what you’re doing now. There’s not as much crystal data for RNA as for proteins. Would also look at bigger motifs, more than single-stranded, e.g. tetraloops or 10-mers. Have you looked at those yet?
KT: Not yet, since initial tests were failing.
KT: I compared J-coupling observables to experiment and they looked good, but other metrics don’t, so J-couplings may not be enough.
Slide 32: water models with AMBER
LW: Where did you get your conformers?
KT: From database. Experimental datasets have to be stable, so they’re stacked, which is biased towards anti.
JC: Could we easily generate syn conformers ourselves to fill in the gaps (e.g. unfavourable areas), for example with MD simulations?
KT: That was one of the motivations for creating the nucleoside dataset with torsion scans, but we need to handle hydroxyl sugar interactions carefully. I don’t know the right way to handle this at the moment.
CC: My impression is that a lot of these might come from NB interactions that aren’t calibrated properly, e.g. a too-favourable interaction between an H-donor and the PO4 group.
KT: May not apply to my smaller benchmark sets, but it might have an effect on larger ones.
JC: Could we bring in additional data that gives insight into the balance between these interactions? Could that help regularise LJ?
Xtal-phase data could help with the balance between interactions.
CC: Not sure there are good solution-phase datasets for this.
KT: Is OpenFF going to work on nucleosides in the future?
LW + CC: Yes.
CC: Planning a dataset of gas-phase data scanning torsions.
LW: Is there consensus on adding implicit solvent?
CC: Not really a consensus; a sizable proportion include solvent considerations (implicit or explicit).
CC: Can also start with gas-phase calculations and re-tune against NMR.
KT: What I’m doing now is starting from espaloma 0.3, which is fit to QM data (including RNA), and re-tuning to 3J couplings.
JC: Who is currently generating datasets?
CC: Nominally me.
JC: How does deposition of new datasets work now? Could MWieder help?
JW: ML datasets live on a different server as they’re too big for OpenFF. We have a Tuesday submission meeting.
KT: Can anyone attend the Tuesday meetings?
JW: Depends on dataset size.
KT: Is qca-dataset-submission still active?
JW: Yes.
JC: So things have now fragmented: there are multiple MolSSI QCFractal instances, where ML and OpenFF are separate and aren’t necessarily usable for the others.
JW: OpenMM is using the ML instance for the SPICE 2.0 dataset. Last time we tried to do a large dataset, MolSSI asked for some funding.
JC: Have things changed now that MolSSI’s circumstances have changed?
JW: Large datasets would need to be discussed by the lead team.
JC: Is OpenFF going to stick with small datasets generated at the current level of theory?
LW: Can we still use datasets contributed to other instances, like the ML instance?
JW: Yes we can, they just need CMILES.
JC: What would be helpful? Individual people owning their own datasets, or a dedicated extra person?
JW: Let’s discuss at governing board.
