Chapin - 1001 Ways to Unfold a Protein
If you draw a black box around my postdoctoral work, I've spent three years turning several hundred liters of coffee into several hundred trajectories of proteins unfolding. Sometimes researchers do this on purpose because they want to understand the fascinating physics that drives protein folding. But in my case, each trajectory represents another piece of evidence that finding a single set of force field parameters that models both proteins and drug-like small molecules is, well, a hard problem. What makes this such a difficult endeavor?
First, we need to understand what we expect force fields to do and why they work. Molecular simulations are fundamentally techniques for taking averages over the configurations of molecular systems. More precisely, simulations approximate integrals from statistical mechanics that provide useful connections to macroscopic, experimental observations but are so horribly complex that, if you wrote them out fully on a chalkboard, their eldritch impenetrability would twist your mind to madness. The accuracy of your molecular simulation depends on how many configurations you use to take your average (the sampling problem) and how well your model for estimating the favorability of each configuration reproduces the underlying physics (the scoring problem). Methods for carrying out molecular simulations trade off speed (which lets you solve the sampling problem) against accuracy (which lets you solve the scoring problem). Force fields aren't as accurate as electronic structure methods like DFT for scoring the favorability of a single configuration, but they are fast enough to evaluate that you can accumulate many configurations from your system of interest. If you take your average over enough configurations, the errors per configuration tend to cancel out, and the averages estimated by force fields can do a remarkably good job at reproducing experiments on chemical systems.
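To make the error cancellation idea concrete, here is a minimal toy sketch in Python. It is not a molecular simulation: the "observable" is just a random number per configuration, and the cheap model's per-configuration error is assumed to be independent and zero-mean, which real force field errors are not guaranteed to be. The function name and all numbers are hypothetical and chosen only for illustration.

```python
import random
import statistics


def mean_abs_error_of_average(n_configs, noise_sd=1.0, n_trials=200, seed=0):
    """Typical error in the estimated average for a toy 'simulation'.

    Each trial draws n_configs configurations, scores them with a noisy
    model, and compares the noisy average to the exact average.
    """
    rng = random.Random(seed)
    errors = []
    for _ in range(n_trials):
        # Hypothetical "true" observable for each sampled configuration.
        true_values = [rng.gauss(5.0, 2.0) for _ in range(n_configs)]
        # Cheap model: true value plus an independent, zero-mean error
        # (an assumption made for this sketch).
        model_values = [v + rng.gauss(0.0, noise_sd) for v in true_values]
        errors.append(
            abs(statistics.fmean(model_values) - statistics.fmean(true_values))
        )
    return statistics.fmean(errors)


# The error in the average shrinks as more configurations are included.
for n in (1, 10, 100, 1000):
    print(n, round(mean_abs_error_of_average(n), 3))
```

Under these assumptions the error in the average falls off roughly as one over the square root of the number of configurations. A systematic (biased) error would not cancel this way, which is one reason the scoring problem does not simply disappear with more sampling.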
[Graphic: graphical abstract from https://pubs.acs.org/doi/10.1021/jz401931f .
Caption: The volume of 200 g water according to DFT, tap water, and MP2. Simple molecular mechanics models can get this right, but DFT struggles. Even MP2 water has a less accurate density than OPC3.]
Open Force Field's parameter sets, Parsley and Sage, do a good job at reproducing experiments on drug-like small molecules. What are the challenges in doing the same for proteins? Drug-like small molecules are small and rigid. They have relatively few degrees of freedom, most of those degrees of freedom are relatively rigid with only one to three accessible minima, and most non-rigid degrees of freedom are relatively independent of each other. These features make it easy to explore their configuration space and take good averages over it, letting the error cancellation that drives force field accuracy work its magic.
[Graphic: the Sage herb logo wearing a wizard hat waves a staff, sending blue lightning bolts at a Lewis structure of penicillin.
Caption: Sage casts cancellation of error at a small molecule drug.]
Proteins, like other polymers, are large and floppy. They have many degrees of freedom (two backbone dihedrals plus two-ish sidechain dihedrals per residue), many of those degrees of freedom are highly flexible with several accessible minima (the arginine sidechain has 60 rotamers), and these flexible degrees of freedom are often correlated with each other (as in Ramachandran maps and sidechain rotamers). These features make it difficult to explore the configuration space of anything longer than a short peptide of a few amino acids and obtain enough samples for error cancellation to kick in during the averaging process.
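A back-of-the-envelope count shows how quickly this gets out of hand. The Python sketch below multiplies the number of minima across degrees of freedom as if they were independent, which is a crude upper bound because correlations and steric clashes rule out many combinations. The specific numbers (five rotatable bonds for a small molecule, three minima per dihedral, a 50-residue protein) are illustrative assumptions; only the "four-ish dihedrals per residue" figure comes from the text above.

```python
import math


def log10_combinations(n_dof, minima_per_dof):
    """log10 of the number of ways to pick one minimum per degree of freedom."""
    return n_dof * math.log10(minima_per_dof)


# Illustrative assumptions, not measured values.
DIHEDRALS_PER_RESIDUE = 4  # two backbone plus two-ish sidechain dihedrals
MINIMA_PER_DIHEDRAL = 3    # hypothetical count of accessible minima

systems = {
    "small molecule (5 rotatable bonds)": 5,
    "tripeptide (3 residues)": 3 * DIHEDRALS_PER_RESIDUE,
    "small protein (50 residues)": 50 * DIHEDRALS_PER_RESIDUE,
}

for name, n_dof in systems.items():
    exponent = log10_combinations(n_dof, MINIMA_PER_DIHEDRAL)
    print(f"{name}: about 10^{exponent:.1f} combinations")
```

Working in log10 keeps the numbers printable. Even with these modest assumptions, the protein's count dwarfs anything a simulation could enumerate, while the small molecule's stays in the hundreds.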
[Graphic: the Sage herb logo wearing a wizard hat waves a staff, sending sad puffs of smoke at a ribbon diagram of GB3.
Caption: Sage needs more mana to cast this spell.]
A nice scientific feature that OpenFF software enables is reproducibility. We can propose a change to our training workflow, fit a new parameter set that targets the same training data and uses the same optimization protocol as a previous parameter set, and then benchmark it against the same validation data as the previous parameter set. This setup allows us to rigorously test whether the workflow change was helpful, and for Sage-style valence fits this process only takes about a week to go from idea to decision. When we add proteins into the mix, everything about this gets worse. Parameter fits take longer and need more memory, and benchmarks take an order of magnitude longer. It's much more difficult for us to iterate quickly on ideas.
[Graphic: A still from the Pixar short "For the Birds" showing several birds perched on a telephone line. One bird, comically larger than the others, adds so much weight that the other birds fall into it. The large bird is labeled "Proteins", and the small birds are labeled "Small molecules".]
Other molecular mechanics force field families (such as Amber, CHARMM, GROMOS, and OPLS) can successfully model protein structures and are widely used to study protein biology and design ligands targeting proteins. Why can't we just copy what those force fields do? General drug-like small molecule force fields from these families model proteins poorly for the reasons already enumerated, so these families model proteins by specializing: they train separate parameter sets, differentiated by protein-specific atom types and sometimes protein-specific functional forms. Modern protein force fields like those just listed work well because of error cancellation, so fixing only some of their errors can make overall performance worse. Specializing for proteins by targeting different training data and splitting parameter types also leaves the protein parameters inconsistent with the small molecule parameters. We're trying to minimize that inconsistency.