Dipeptide 2-D TorsionDrives | @Chapin Cavender | 2-D TorsionDrive on two proper torsions
Current status: the constraint was not propagated to the QCSubmit jobs because of a version error; resubmitting.
Alanine: higher levels of theory were not considered, for consistency with other calculations (currently B3LYP-D3BJ/DZVP?). The low-energy basins of the QM energy map are at gauche conformations, which so far look more likely to be controlled by sterics than by secondary structure.
Tryptophan rotamers: peaks and minima look like they’re in the same places, but there are differences between the rotamers, e.g. in the beta region.
DC: points out apparent jumps in energy in the QM map and asks whether they are hysteresis.
MG: the side chain of each rotamer is not restrained, so the apparent jumps may result from the side chain moving around. Future drives will be constrained.
CC: yes, tryptophan has a few degrees of freedom near the backbone and a bulky aromatic group at the end.
Proline rotamers
CC: only sampled phi up to ~60 degrees, since ring strain makes it difficult to go higher, as shown by the high energy barriers on either side of the basin.
CC: the rotamers are fairly similar; they differ only in barrier heights.
MM torsion profiles compared to offset QM profiles (OpenFF 2.0): molecule minimized with the torsion atoms frozen (side chains are not restrained).
Alanine: similar to QM.
Tryptophan: barrier heights differ, and the region is a lot more “flat”.
Proline: likewise looks similar.
CC: Fourier series fits to the target (which would be the target in ForceBalance).
SB: how does this differ from the FB loss function?
CC: this is essentially the FB loss function, except without weighting factors.
CC: basically taking away the rest of the force field and working only with the torsions.
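As a rough illustration of the unweighted fit CC describes, the sketch below fits a truncated Fourier series to a 1-D target profile. It is a minimal sketch under stated assumptions, not the procedure presented in the meeting: the function names (`fit_fourier`, `evaluate_fourier`, `unweighted_loss`), the reduction of the 2-D scan to 1-D, the 15-degree grid, and the synthetic target are all hypothetical. On a uniform grid covering a full period, projecting onto each cosine and sine term gives the same coefficients as an unweighted least-squares fit.

```python
import math

def fit_fourier(angles_deg, target, max_periodicity=3):
    """Fit offset + sum_n [a_n cos(n*phi) + b_n sin(n*phi)] to a target profile.

    Assumes a uniform grid over one full period, where projection onto
    each basis function equals the unweighted least-squares solution.
    """
    n_pts = len(angles_deg)
    phis = [math.radians(a) for a in angles_deg]
    offset = sum(target) / n_pts
    terms = []
    for n in range(1, max_periodicity + 1):
        a_n = 2.0 / n_pts * sum(e * math.cos(n * p) for e, p in zip(target, phis))
        b_n = 2.0 / n_pts * sum(e * math.sin(n * p) for e, p in zip(target, phis))
        terms.append((a_n, b_n))
    return offset, terms

def evaluate_fourier(offset, terms, angle_deg):
    """Evaluate the fitted series at one angle (degrees)."""
    phi = math.radians(angle_deg)
    return offset + sum(
        a_n * math.cos(n * phi) + b_n * math.sin(n * phi)
        for n, (a_n, b_n) in enumerate(terms, start=1)
    )

def unweighted_loss(angles_deg, target, offset, terms):
    """Mean squared residual with every grid point weighted equally."""
    residuals = [
        evaluate_fourier(offset, terms, a) - e
        for a, e in zip(angles_deg, target)
    ]
    return sum(r * r for r in residuals) / len(residuals)

# Hypothetical target: a synthetic profile standing in for the
# torsion-only energy residual on a 15-degree grid.
angles = list(range(-180, 180, 15))
synthetic_target = [
    1.5 + 2.0 * math.cos(math.radians(a)) + 0.5 * math.sin(3 * math.radians(a))
    for a in angles
]
offset, terms = fit_fourier(angles, synthetic_target)
loss = unweighted_loss(angles, synthetic_target, offset, terms)
```

The projection shortcut only holds for a complete, evenly spaced scan. For a partially sampled profile, such as the proline scans, a general least-squares solve over the sampled points would be needed instead.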
Concerns about minimization of MM conformers
SB: there might still be contributions from, e.g., angle gradients; more work might be needed to zero out the other terms.
MG: wonders about the quality of the other terms, since angle terms also contribute to the stiffness of a torsion.
DM: yes, but as a first goal we are aiming for a first-pass set of torsions that works with the other terms.
SB: concerned that side-chain motion during MM minimization confounds the torsion fit and makes it look worse.
DM: how do other force fields deal with this, e.g. AMBER?
CC: AMBER doesn’t let the MM structure minimize; they use the QM conformation.
DM: Chris Bayly has pointed out that without optimization you can end up with stiffer barriers from steric contributions.
SB: suggests restraints on most internal degrees of freedom; wonders whether the force contributed by the restraints can be tracked.
CC: the other terms in Sage/Parsley were derived with minimization.
Results of MM targets
CC: alanine looks periodic in 2D.
Comparing tryptophan rotamers
CC: proline looks roughly periodic.
Comparing torsion profiles by shape and magnitude: RMSE and normalized RMSE from superimposed profiles
CC: alanine and tryptophan are more similar to each other than either is to proline.
CC: in general, profiles differ more between side chains/residues than between rotamers.
MG: ignoring proline, because proline is always weird, the differences between side chains look similar to the differences between rotamers. The low proline rotamer difference is understandable because it’s pretty rigid.
CC: included tryptophan because the biggest difference between rotamers was expected for this particular amino acid.
DM: what’s the big picture?
CC: we know there’s coupling between side-chain and backbone dihedrals. This is a question about dataset generation: do we need to enumerate rotamers? Existing protocols use only one.
CC: looking at normalized RMSE, we probably need at least 2 rotamers each for a useful fitting target. Differences from Ala and Trp to Pro are ~20%.
MG: interested to see what happens when side chains are restrained.
MG: should we have side-chain-dependent backbone torsions one day?
MS: that sounds like basically CMAPs.
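The RMSE comparison above can be sketched as follows. The notes do not say how the profiles were superimposed or normalized, so this is only one plausible reading: each profile is shifted by its own mean (the constant offset that minimizes the RMSE between two profiles), and the RMSE is divided by the energy range of the first profile. The function names and both conventions are assumptions made for illustration.

```python
import math

def superimpose(profile):
    """Shift a profile so its mean is zero (assumed superposition convention)."""
    mean = sum(profile) / len(profile)
    return [e - mean for e in profile]

def rmse(profile_a, profile_b):
    """RMSE between two profiles on the same grid after superposition."""
    a0 = superimpose(profile_a)
    b0 = superimpose(profile_b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a0, b0)) / len(a0))

def normalized_rmse(profile_a, profile_b):
    """RMSE divided by the energy range of profile_a (assumed normalization)."""
    span = max(profile_a) - min(profile_a)
    return rmse(profile_a, profile_b) / span

# Hypothetical energies on a shared grid, in arbitrary units.
same_shape = rmse([0.0, 1.0, 2.0, 3.0], [5.0, 6.0, 7.0, 8.0])
scaled = normalized_rmse([0.0, 2.0], [0.0, 4.0])
```

Under these conventions, two profiles that differ only by a constant offset give an RMSE of zero, so the metric reflects differences in shape and magnitude rather than in absolute energy.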
CC: the tryptophan rotamers are similar in most places and differ mostly at angles near linear.
MS: what are the next steps?
CC: now need to scale up generation of QM datasets; this was a pilot feasibility study.
CC: resubmit with constrained side chains, then decide whether to include other rotamers.
MS: can these torsions be applied to other small molecules that have the same chemical environments?
CC: probably. An open question is whether to make these stereospecific: do we want to give people the generic amide torsion or the protein-specific one?
MG: well, a mirror-image protein should behave the same.
MS: differences between DDD and DLD stereo protein chains should come from sterics.
CC: agrees with the proposal not to write in specific stereochemistry.
MS: cyclic peptides should be treated as proteins rather than as small molecules.
DM: also agrees, no chiral SMARTS.
Practical considerations
MG: how long do you think this will take?
CC: it’ll probably go faster with the restraints. We’re getting about 2000 optimizations per day. About 600 grid points. We want 26 side chains; estimates ~50 days.
DM: suggests more compute resources. How soon do we want this? Knowing that helps with juggling free vs. paid compute.
CC: next week or so.
DM: suggests CC ping internal after the meeting and ask to spin up more compute, possibly enlisting Trevor Gokey.
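The throughput numbers quoted above can be checked with a back-of-envelope calculation. This is a sketch under assumptions the notes do not confirm: that the ~600 grid points apply to each side chain, and that one rotamer is run per side chain. The function name is hypothetical. With a single optimization per grid point the job would take well under 50 days, so the quoted estimate implies several optimizations per grid point (TorsionDrive typically re-optimizes a grid point from more than one starting structure) and/or more than one rotamer per side chain.

```python
def days_needed(n_side_chains, grid_points, opts_per_point, opts_per_day):
    """Days to finish all optimizations at a fixed daily throughput."""
    total_optimizations = n_side_chains * grid_points * opts_per_point
    return total_optimizations / opts_per_day

# Numbers from the notes; one optimization per grid point is an assumption.
naive_days = days_needed(26, 600, 1, 2000)

# Optimizations per grid point implied by the ~50 day estimate.
implied_opts_per_point = 50 * 2000 / (26 * 600)
```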
SB: this is all training data, right? What are the plans for benchmarking?
CC: looking at NMR observables for small peptides, which we have from the LiveCoMS review, and working out which are most helpful for us.
CC: will reach out to SB in January to start working on infrastructure needs with, e.g., evaluator.
SB: also need to work with the software scientists on this; it will be a huge need.
CC: the current plan is to write input files for an external program and have that external software run the calculation.
SB: also need to consider packaging and distribution, which might also need software scientist time.
CC: the simple plan is to use SHIFTX for chemical shifts and think about how to improve on that later; it’s an accepted standard.
MS: suggests ML predictions of chemical shifts later, but for now SHIFTX as a benchmark. Mentions Andrew White as an interesting alternative.
CC: yes, SHIFTX will be easier to compare with existing benchmarks for now.
|