Anticipated Rosemary infrastructure | Chapin Cavender | The infrastructure team is updating its roadmap for the next year. Do we anticipate any infrastructure needs for the biopolymer force field, beyond the minimum viable product, that we should have on the roadmap?
Support for QM/MM calculations in QCArchive/QCSubmit
  Explicit representation of solvent sampled from an MD trajectory, like IPolQ
    Solvate di/tri/tetrapeptides in a specific water model (and possibly salt ions), run MD until the solvent ESP converges, then fit a small number of point charges to reproduce the solvent ESP
    QM solute with MM from the fit point charges
  Sample snapshots from a protein trajectory obtained with candidate force field parameters for subsequent retraining
    Choose snapshots by clustering or by filling gaps in phase space that are absent from existing training data
    QM for a subset of residues, MM for the rest of the protein and/or solvent
JW – Is the MM contribution primarily electrostatics?
MG – The idea is analogous to LWang’s experiment a few weeks ago, looking at changes in peptide charge as a function of flanking residues.
SB – What sorts of terms would be adding errors from gas-phase calcs? Would this approach really solve them?
SB – This could be handy, but it’s not clear that this is going to be needed/high priority. Would it be better to start with something like RESP2?
DM – We’re talking about two infrastructure needs:
  The “iPolQ approach” – Not a huge lift; we’ll eventually want to support this.
  The “QM/MM of a packed/folded protein approach” – Not clear that we’d end up using this.
The former may have the better cost/benefit ratio.
JW – Both of these may be the same/largely overlapping in infra needs. I’ll put these as candidates for addition to the roadmap, but I don’t really have much of an idea about the difficulty. I’ll begin working with Dotson to see how difficult this could be.
MG – This could also be remedied by polarizability.
SB – I’d like to make sure that there’s a good scientific rationale and a driver before we add this. So it’d be good to have done a feasibility study and to have a plan to make the small molecule + protein FF self-consistent.
DM – A week or so ago, we were discussing external electric fields in psi4. What was the context for that, and does it overlap with this? (https://openforcefieldgroup.slack.com/archives/CJQ4DCWN8/p1632352959026000)
MG – This was Willa Wang’s project. It’s being used to generate training data for polarizability. It came up because we had been doing calcs outside of the global QCA, and we decided it would be better to do them inside the global QCA. WWang mentioned that there may be inconsistencies in how wavefunctions get mapped to ESPs.
DM – OK, so it’d be good to have a clear idea of what’s already available and a specific plan for what we’ll additionally need.
CC – The big thing to me would be running MD and getting solvent distributions, then submitting those for calculation.
SB – That sounds like something where we should do the sampling locally to begin with, learn exactly which behavior we want, and then, once that’s established, find a place where it should be refactored and live in the longer run.
CC – That makes sense. I’ll start scouting this out.
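Illustration (not from the meeting): the "fit a small number of point charges to reproduce the solvent ESP" step listed under this item can be posed as a linear least-squares problem. The sketch below is only a minimal example under stated assumptions, not the IPolQ implementation or anything in QCSubmit: the charge site positions are fixed in advance, atomic units are used so the potential of a charge q at distance r is q/r, and all function names are hypothetical.

```python
import math


def coulomb_matrix(grid_points, charge_sites):
    """A[i][k] = 1 / |r_i - s_k| (atomic units assumed)."""
    return [[1.0 / math.dist(r, s) for s in charge_sites] for r in grid_points]


def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def fit_point_charges(grid_points, target_esp, charge_sites):
    """Least-squares charges at fixed sites that reproduce target_esp on the grid.

    Solves the normal equations (A^T A) q = A^T V with no restraints.
    """
    a = coulomb_matrix(grid_points, charge_sites)
    k = len(charge_sites)
    ata = [[sum(row[i] * row[j] for row in a) for j in range(k)] for i in range(k)]
    atv = [sum(row[i] * v for row, v in zip(a, target_esp)) for i in range(k)]
    return solve_linear(ata, atv)
```

Here `grid_points` would be points inside the solute region where the time-averaged solvent ESP was sampled, and `charge_sites` the positions chosen for the fitted charges. A production version would likely need restraints or regularization, since the normal equations become ill-conditioned when sites are close together.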
CC – I had thought think
JW – Would people want a “point at a residue and get it out as a capped molecule” functionality? This is on our roadmap, but I’m not sure how high-priority it is.
CC – I could see that being handy for setting up QM/MM calculations. But for now I’m building things from the bottom up.
DM – LW, have you made this before?
LW – Not yet. Currently I do this manually.
MG – What’s this needed for?
DM – This would be for parameter fitting/charge assignment to new polymer units.
LW – Oh, I do have a parameter/charge generator for new polymer subunits, but it’s kinda manual to say how and where to cap.
DM – So I see this as three problems:
  How do I break this thing into repeating units?
  How do I cleave out a single unit?
  How do I cap this thing that I cleaved out?
JW – I could see this being used mostly for standard proteins with PTMs, where the PTM would get excised and have charges assigned, and then those charges would be assigned back to the subunit in the protein.
LW – Currently polymetrizer tries to do this, but the cap addition could be more sophisticated.
CC – So it’s hard to infer what an appropriate cap would be? Like, this is where user input is required?
LW – Yes.
DM – So an algorithm could be “figure out if this is a protein; if so, then do ACE/NME, otherwise don’t / just use methyl”. Instead of the “is this a protein” check, fragmentation could be based on WBOs, where the relevant neighboring environment could be included in the excised fragment.
LW – Where would that go?
JW – I think this should live in its own repo until we understand the desired behavior.
SB – So the root of this discussion is “How do we assign charges to an unexpected thing in an otherwise standard residue chain”?
SB – Looking at the question of “how do we assign charges?”: one option is “cut and cap”, but another option is to have a quick neural network, trained on both small molecules and proteins, run on the entire protein. The featurization of this network would dictate how many bonds out it looks, and so the resulting parameters would be self-consistent. I think this should be considered alongside the “cut and cap” options.
SB – Beyond charges, the parameters themselves would come from a single FF. We shouldn’t think of it as “small molecule parameters” mixing into the “protein force field”. It’s going to be a single self-consistent force field.
DM – So, LW should make the automated cut-and-cap method available.
SB – Benchmarking infrastructure
  Comparing to observables: NMR, xtal
  Pair distribution functions
|
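Illustration (not from the meeting): DM's suggested rule for the capping step, "figure out if this is a protein; if so, ACE/NME, otherwise methyl", could look roughly like the sketch below. Everything here is an assumption made for the example: the function names, the cap labels, and especially the "is this a protein" test, which simply checks whether at least half of the residue names are standard amino acids. It does not reflect how polymetrizer works, and it does not cover the WBO-based alternative.

```python
# Three-letter codes of the 20 standard amino acids.
STANDARD_AMINO_ACIDS = frozenset({
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
})


def looks_like_protein(residue_names, threshold=0.5):
    """Hypothetical check: enough of the chain is standard amino acids."""
    if not residue_names:
        return False
    n_standard = sum(name.upper() in STANDARD_AMINO_ACIDS for name in residue_names)
    return n_standard / len(residue_names) >= threshold


def choose_caps(residue_names):
    """Return (N-terminal cap, C-terminal cap) labels for an excised unit."""
    if looks_like_protein(residue_names):
        return ("ACE", "NME")
    # Fallback for non-protein polymers: plain methyl caps on both ends.
    return ("CH3", "CH3")
```

For a mostly standard chain with one PTM, such as `["ALA", "SEP", "GLY"]`, this picks ACE/NME; for a chain with no standard residues it falls back to methyl. Cases where user input is still required, as LW noted, are not handled.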
Decisions for protein library charges | Chapin Cavender | ELF10 library charges for amino acids from Lily Wang, obtained by averaging over Ace-Val-X-Y-Z-Val-Nme
  What residues should have library charges?
  What SMIRKS strings should we use?
  How do we handle non-integer averaged charges?
SB – Infrastructure needs for benchmarking?
CC – I’m thinking these will be simple. We just need to calculate Kirkwood-Buff integrals, pair distribution functions, and NMR/xtal observables.
JW – Could you send me a slightly more detailed version of this to ensure that we can get these on the roadmap?
PB – Would these go in evaluator?
|
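Illustration (not from the meeting): one possible answer to "How do we handle non-integer averaged charges?" is to spread the residual evenly over all atoms so that the residue sums to the nearest integer. This is only a sketch of that one option, not a decision recorded in these notes; the function name is hypothetical and other schemes (for example, weighting the correction by charge variance) are equally plausible.

```python
def normalize_to_integer_charge(charges, target_total=None):
    """Shift every charge by the same amount so the total is an integer.

    charges: averaged per-atom charges for one residue.
    target_total: desired net charge; defaults to the nearest integer
        to the current sum.
    """
    if not charges:
        raise ValueError("need at least one charge")
    total = sum(charges)
    if target_total is None:
        target_total = round(total)
    correction = (target_total - total) / len(charges)
    return [q + correction for q in charges]


# Made-up averaged charges whose sum is slightly off from zero.
averaged = [0.31, -0.52, 0.18, 0.05]
library_charges = normalize_to_integer_charge(averaged)
```

Passing `target_total` explicitly is safer for charged residues, since a badly averaged set could otherwise round to the wrong integer.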