Approach III: training to FF data only

 

A valence fit to existing QM protein data only, experimenting with initial values from both AMBER and Sage.

Overview

Summary

One reason our protein force field candidates may not approach the performance of AMBER force fields is that the AMBER force fields were trained (over many years and generations) almost solely to protein data, with GAFF used to handle small molecules. Dataset issues on our end could include too little protein data or overweighting of small molecule data. This experiment trains only to protein FF data to see whether that improves performance. We experiment with two starting points: one from AMBER initial values, the other from Sage.

GitHub repo/branch

Status

Not started / In progress / Completed / Won't progress

 

 Milestones and metrics

| Stage | Milestone/Benchmark | Contributors | Deadline | Status |
|---|---|---|---|---|
| Fit Sage from protein data only (Null FF) | Starting from an up-to-date version of the protein-param-fit FF ( ), run a re-fit to protein data only, from Sage initial values (MSM for angles/bonds, torsions for torsions). | @Chapin Cavender (updating environment); @Brent Westbrook | Nov 22 | Not started |
| Fit Sage from protein data only (Specific FF) | Starting from an up-to-date version of the protein-param-fit FF ( ), run a re-fit to protein data only, from Sage initial values (MSM for angles/bonds, Sage or AMBER torsions for torsions). | @Chapin Cavender (updating environment); @Brent Westbrook | Nov 22 | Not started |
| Validate AMBER port on all training and testing data | Check the output OpenMM Simulation systems to confirm that all parameters are assigned identically under the OpenMM AMBER 14SB FF and the SMIRNOFF port. | @Lily Wang | | Not started |
| Fit AMBER from protein data only | Fit the AMBER port validated in the previous stage to protein data only. | @Lily Wang | | Not started |
| Small molecule benchmarks | Run small molecule QM benchmarks for the Null and Specific FFs. | | | Not started |
| Protein stability benchmarks | Run GB3 benchmarks for Null, Specific, and retrained AMBER. Can likely decide whether to progress based on performance over 5 µs. | | | Not started |
| Helix folding benchmark | Run helix folding benchmark for Null, Specific, and retrained AMBER. | | | Not started |
| Smaller peptide benchmarks | Run NMR scalar coupling benchmarks on smaller peptides. | | | Not started |

Not started / In progress / Completed

Passed / Failed
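The "Validate AMBER port" milestone comes down to comparing, term by term, the parameters assigned in two systems built for the same molecule. Below is a minimal sketch of that comparison for harmonic bonds only. It assumes the parameters have already been extracted into plain dictionaries keyed by atom-index pairs (for example from OpenMM's `HarmonicBondForce.getBondParameters`, with units stripped). The function name `compare_bond_parameters`, the tolerance, and the toy numbers are all hypothetical and not taken from either force field.

```python
import math


def compare_bond_parameters(reference, candidate, rel_tol=1e-6):
    """Return human-readable mismatches between two bond parameter sets.

    Each argument maps a pair of atom indices to a tuple of
    (equilibrium length, force constant) as plain floats in the same units.
    """

    def canonical(params):
        # Bonds are undirected, so (0, 1) and (1, 0) must compare equal.
        return {tuple(sorted(key)): value for key, value in params.items()}

    ref, cand = canonical(reference), canonical(candidate)
    mismatches = []
    for key in sorted(ref.keys() | cand.keys()):
        if key not in ref:
            mismatches.append(f"{key}: only in candidate")
        elif key not in cand:
            mismatches.append(f"{key}: only in reference")
        else:
            for name, a, b in zip(("length", "k"), ref[key], cand[key]):
                if not math.isclose(a, b, rel_tol=rel_tol):
                    mismatches.append(f"{key}: {name} differs ({a} vs {b})")
    return mismatches


# Hypothetical toy data: made-up values standing in for the two systems.
amber_bonds = {(0, 1): (0.101, 363000.0), (1, 2): (0.145, 282000.0)}
smirnoff_bonds = {(1, 0): (0.101, 363000.0), (1, 2): (0.150, 282000.0)}

for line in compare_bond_parameters(amber_bonds, smirnoff_bonds):
    print(line)
```

The same pattern should extend to angles and nonbonded terms by changing the key and value tuples. Torsions would need more care, since one atom quadruplet can carry several periodic terms, so the values would have to be collections of terms rather than a single tuple.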

 

Progress and findings

Curated data (or similar title)

 

Action items

  • Ask Chapin to check protein-param-fit is up-to-date

  • Ask Chapin to prioritise null vs specific

  • Ask Chapin for the Specific FF input files (and double-check the hard-coded params in b7s26)
