Relevant stuff from BPS | @Chapin Cavender | CC – Phil Biggin’s lab has curated some good datasets for cyclic peptides. Could be handy for our benchmarks. There are also some funky covalent modifications.
CC – A lot of talks and posters involved AlphaFold. Consensus seemed to be that it does a good job on single-chain proteins, with some discussion about its utility on larger proteins and complexes. Some people would feed in smaller segments of a protein to get a wider range of guesses, which they could stitch together to get more guesses. It also didn’t do well on point mutations or physical property predictions.
CC – Talk from Gabriel Rocklin (?) on measuring folding free energies of small peptides. They covalently attach the DNA encoding each protein to the protein itself, digest the protein at different protease concentrations, pull down the remaining protein, and sequence any DNA attached to it. The DNA pulled down corresponds only to protein that WASN’T degraded, so by doing this under different conditions they can get the pathways/kinetics/energies of folding. This is a high-throughput method that allows, for example, testing all single and double mutants. This could be handy for our benchmarking of folding free energies for small peptides.
MG – Could be a great dataset. Would be neat to look at synergy effects in double mutants.
CC – Yeah, I’m expecting them to publish the dataset soon, and then other people could mine it.
CC – LW gave a nice talk on graph charges. Got good questions about things like “benefits of training to AM1BCC instead of directly to QM” and “what’s next/what will this enable?”
CC – I did a poster presentation that went over pretty well; showed T4 lysozyme and toluene. A lot of folks had bounced off previously and weren’t sure whether it was ready for them to try again.
CC – I used language that said we want to treat small mols and proteins self-consistently. This went over a little poorly with AMBER/CHARMM folks, who say their stuff is self-consistent. So that may be a preview of what to expect from reviewers.
MG – The goal is to say “it’s self-consistent” without saying that other FFs aren’t.
MG + MS – Could be a good concrete point to show: parameterize lysine using GAFF vs. ff14 and see whether the energies/parameters are very different. Would expect vdW to be very similar.
CC – I expect the vdW would be very similar, but torsions could be quite different.
MG – Would be interesting to do a comparison of GAFF proteins vs. ff1X proteins.
CC – I may have done this before quickly. I’ll post a link on Slack if I can find it.
MT – I’d be a little concerned about downplaying the value of self-consistent small molecule + polymer treatment to appease other communities.
MS – Yeah, but we should keep a backup slide handy for when people ask in talks.
MT – Getting into an argument with AMBER may be a big time sink; it’s worth considering not engaging.
MG – I’m genuinely interested in the answer.
MS – Agree.
MT – Seems likely it will become a lot of back-and-forth and turn into more than a couple of hours.
MS – This is useful for professors in a lot of contexts. There are a lot of conversations/situations I’ll be in where we can advocate for us as long as we have some data.
DM – It’s worth noting that a lot of people criticised us for making a small molecule force field that wasn’t consistent with protein FFs. So the protein FF is a response to that audience, where we say “ok, so we made a consistent protein FF”. At some point we can get to the position of being the major FF, and then people will be so busy using our stuff that we won’t have to deflect this sort of thing.
MG + MS – Yeah, but on a scientific level it is a good thing to know/have data to answer.
|
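One way to make the GAFF vs. ff14 lysine comparison discussed above concrete is to compare torsion energy profiles rather than raw parameter tables, since two force fields can split the same profile across different periodic terms. The following is a minimal sketch, assuming the standard periodic torsion form k(1 + cos(nφ − γ)). The function names and the two term lists are hypothetical placeholders, not real GAFF or ff14SB parameters.

```python
import math


def torsion_energy(phi, terms):
    """Periodic torsion energy: sum of k * (1 + cos(n * phi - gamma)).

    phi and gamma are in radians; terms is a list of (k, n, gamma) tuples.
    """
    return sum(k * (1.0 + math.cos(n * phi - gamma)) for k, n, gamma in terms)


def profile(terms, n_points=72):
    """Energies at evenly spaced dihedral angles over [-pi, pi)."""
    step = 2.0 * math.pi / n_points
    return [torsion_energy(-math.pi + i * step, terms) for i in range(n_points)]


def profile_rmsd(terms_a, terms_b, n_points=72):
    """RMSD between two profiles after shifting each so its minimum is zero."""
    a = profile(terms_a, n_points)
    b = profile(terms_b, n_points)
    a_min, b_min = min(a), min(b)
    squared = [((x - a_min) - (y - b_min)) ** 2 for x, y in zip(a, b)]
    return math.sqrt(sum(squared) / n_points)


# Placeholder terms: (k in kcal/mol, periodicity, phase in radians).
# These are illustrative only and are NOT real GAFF or ff14SB values.
ff_a_terms = [(0.20, 3, 0.0)]
ff_b_terms = [(0.20, 3, 0.0), (0.50, 1, math.pi)]

rmsd_same = profile_rmsd(ff_a_terms, ff_a_terms)
rmsd_diff = profile_rmsd(ff_a_terms, ff_b_terms)
print(f"RMSD (same terms): {rmsd_same:.3f} kcal/mol")
print(f"RMSD (different terms): {rmsd_diff:.3f} kcal/mol")
```

In practice the term lists would be read from the parameterized lysine for each force field, one profile per rotatable bond. Shifting each profile to its own minimum keeps a constant offset between force fields from showing up as a difference.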
NMR benchmarks | @Chapin Cavender | CC – I’ve generated 500 ns trajectories for 13 of the peptides that we have the most extensive NMR data for. They range from 3 to 5 residues, the longest being penta-alanine. Writing scripts to analyze them and compute scalar couplings. Will share those numbers/figures ASAP; expect soon/before the grant is submitted.
CC – I’ve tried to run larger proteins like GB3. Those are crashing/getting NaNs with both AMBER and SMIRNOFF FFs, but not immediately. I need to debug this.
DM – Do hydrogens have nonzero LJ?
MS – Are crashes immediate?
JW – Happy to have a debugging session on this, let me know if that’s desired.
CC – Reading OpenMM top into OpenFF…
MT – So given an OpenMM topology with graph+elements… You’re converting into an OpenFF topology using the unique molecules path…
Pathway is protein + FF → Interchange → OpenMM System → OpenMM Modeller → OpenMM Modeller + water box → OpenFF topology + FF → OpenFF Interchange → OpenMM System
CC – Main question for the group is “once I get NMR data for peptides, will it be better to focus on the LiveCoMS preprint, or on the full protein trajectories?”
MG – Which path is safest/most predictable?
CC – LiveCoMS.
MG – I might advocate for full proteins over preprint, since that’s…
MS – If we can run it without it exploding, that’s a good thing to show.
MG – LiveCoMS will take a lot of editing; there are some other authors who aren’t taking responsibility for things that I think are their responsibilities.
MS – If we can get the review submitted within weeks of grant submission, that’s good timing.
MG – I don’t think I can guarantee that. I’ll have to prioritize some other stuff after the proposal is in. I also think that there is a lot more revision/discussion to have with other authors. Some authors have been responsive and cooperative, others less so. I don’t think we can get the next draft out to coauthors tomorrow. It will require far more than a day of revision. There’s something to be said for just sending things to preprint early on. Chapin, how does that sound?
CC – Heading to preprint next week?
MG – I may be able to get through it this weekend. If that can be done, then we might be able to preprint next week.
CC – Mar 3 is still target date?
(Some discussion of delay to posting/issuance of DOI, and whether to post directly to Zenodo to get an immediate DOI.)
DM – ChemRxiv is a bit variable, 1-3 days.
MS + MG – Rxiv is very predictable, has a published release schedule. Up to 72 hours, but if you time it well it can be near immediate.
MS – bioRxiv would be preferred for this subject.
CC – So I’ll make chi^2 bar plots for scalar couplings on the dataset from the DESRES group. I’ll compare to ff14SB and our protein-specific FFs. Will one figure be appropriate? (A bar plot for each FF, broken down by target (peptide) and the value of the calculated scalar coupling, with chi^2 on a separate axis. Alternatively, could do it as a table, or just use 3 bars (one per FF).)
CC – I’ll post drafts of these plots on Slack.
|
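The chi^2 comparison for scalar couplings described above can be sketched as follows. This is a minimal illustration, not the actual analysis script. It assumes couplings are computed per frame from a backbone dihedral with a Karplus relation, J = A cos²(φ + δ) + B cos(φ + δ) + C, and that chi^2 is the mean squared deviation from experiment normalized by an uncertainty σ. The function names, the Karplus coefficients, and all numbers below are placeholders, and whether to divide by the number of observables is a convention to settle for the figure.

```python
import math


def karplus(phi_deg, a, b, c, delta_deg=0.0):
    """Karplus relation J = A cos^2(theta) + B cos(theta) + C, theta = phi + delta."""
    theta = math.radians(phi_deg + delta_deg)
    return a * math.cos(theta) ** 2 + b * math.cos(theta) + c


def mean_coupling(phis_deg, a, b, c, delta_deg=0.0):
    """Average J over trajectory frames (average of J, not J of the average angle)."""
    values = [karplus(phi, a, b, c, delta_deg) for phi in phis_deg]
    return sum(values) / len(values)


def chi_squared(calculated, experimental, sigmas):
    """Mean of squared, uncertainty-normalized deviations over all observables."""
    if not calculated or not (len(calculated) == len(experimental) == len(sigmas)):
        raise ValueError("inputs must be non-empty and of equal length")
    terms = [
        ((j_calc - j_exp) / sigma) ** 2
        for j_calc, j_exp, sigma in zip(calculated, experimental, sigmas)
    ]
    return sum(terms) / len(terms)


# Placeholder data: phi angles (degrees) from a few frames, made-up Karplus
# coefficients (Hz), and a made-up experimental value and uncertainty (Hz).
phis = [-70.0, -65.0, -150.0, -60.0]
j_calc = mean_coupling(phis, a=7.0, b=-1.5, c=1.0, delta_deg=-60.0)
chi2 = chi_squared([j_calc], [6.0], [1.0])
print(f"<J> = {j_calc:.2f} Hz, chi^2 = {chi2:.2f}")
```

Computing one chi^2 per force field gives the 3-bar version of the figure. Grouping the observables by peptide or by coupling type before calling `chi_squared` gives the broken-down version.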