2021-11-18 Meeting notes

Date

Nov 18, 2021

Participants

  • @Chapin Cavender

  • @Jeffrey Wagner

  • @Pavan Behara

  • @Michael Gilson

  • @David Mobley

  • @Daniel Cole

Goals

  • Models to represent solvent in quantum chemistry calculations

Discussion topics

Item

Presenter

Notes

Solvent in QC

@Chapin Cavender

  • Goal

    • Derive charges for a fixed-charge force field appropriate for condensed phase

    • Include solvent model in quantum chemistry training data

  • Current approach

    • RESP with HF/6-31G* or AM1-BCC overpolarizes a gas phase wavefunction

  • Prior data

  • Hypothesis

    • Including solvent in QC calculations is necessary for modeling charged molecules with a fixed-charge force field, including nucleic acids and intrinsically disordered proteins

    • SB – Kinda a stray thought that I have here – I wonder how we’re framing this / what are the goalposts? What means are we considering, and for what ends?

    • CC – I’m thinking that this discussion is about methods to reproduce conformational properties. One tough thing here is that I don’t think we’ll start seeing the need for this until we start benchmarking on large, charged molecules, so we may not have a good dataset exhibiting this problem in the short run.

    • SB –

    • JW – We kinda scared the pharma partners into using small molecules to ensure that their QC jobs would finish quickly. So I doubt that the partner benchmark has large, charged molecules.

    • SB – Apologies, I meant the protein benchmark paper/datasets

    • CC

    • DM – Do you have a definite example of the behavior you’re thinking of that we can reason with?

      • CC – I’m not super familiar with the problem. Something like ARG dipeptide might work. But generally I’d expect this to pop up for large, floppy molecules

      • DM – Could open this up to broader discussion on science twitter - “what’s the smallest molecule whose conformational preference is largely decided by charge?”

    • DC – My take on this is that it’s really important. It seems like, previously, we’d need to make a choice and stick with it. But now we can experiment with several methods.

    • PB – Are you looking for a dataset with charged molecules that are protein-like, or a dataset with implicit solvent done?

      • CC – Mostly looking for datasets with multiple-charged molecules (+2, +3, …)

      • PB – Some of the recent OpenMM datasets have big charges, like -8

      • Set of charges: [-8.0, -4.0, -3.0, -2.0, -1.0, 0.0, 1.0, 2.0]

    • MG – A way to make this a research project could be to do something like what MSchauperl did, and mix the results of vacuum and implicit solvent charge assignment. Then we could look at whether there’s an optimum in the middle.

    • CC – I’d like to ensure that we can measure this performance on a subset of molecules, before we pay the whole computational cost of making multiple protein FFs.

    • MG – I could almost see an argument that one atom won’t polarize the others to as great an extent in implicit solvent.

      • DM – We had an experience in working with compound series where we had to scale charges down to get agreement with experiment

      • MG – One really hard case would be a rigid molecule with large self-polarization, since then we can’t conformationally average things out.

      • JW – The implicit solvent calcs available to us are very slow, this will make it hard to generate these datasets.

      • SB – It seems like at least an order of magnitude slower

      • MG – How flexible is this limitation? Could we speed this up using our resources?

        • JW – Unfortunately, this seems unlikely. We haven’t been able to motivate the Psi4 devs with any incentives available to us.

      • CC – Could we use something like COSMO for faster implicit solvent?

        • DC – CRingrose looked into this. Gaussian09 PCM worked well and quickly. We also used the Psi4 PCM model and we liked that as well. I don’t recall the timing but I’d expect Gaussian to have been much faster.
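
MG's suggestion above, mixing vacuum and implicit solvent charges and looking for an optimum in between, could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `mix_charges`, the mixing parameter `lam`, and the charge values are hypothetical and were not discussed in the meeting. It assumes a simple linear interpolation, which preserves the net charge whenever both input sets carry the same net charge.

```python
def mix_charges(vacuum, solvent, lam):
    """Linearly interpolate per-atom charges.

    lam = 0.0 returns the vacuum charges, lam = 1.0 the implicit solvent charges.
    """
    if len(vacuum) != len(solvent):
        raise ValueError("charge sets must cover the same atoms")
    return [(1.0 - lam) * qv + lam * qs for qv, qs in zip(vacuum, solvent)]


# Hypothetical charges for a three-atom neutral fragment (illustrative numbers only).
vacuum_charges = [-0.60, 0.30, 0.30]
solvent_charges = [-0.80, 0.40, 0.40]

# Scan the mixing parameter. Each mixed set would then be scored against
# whatever benchmark is eventually chosen (scoring is not shown here).
scan = {
    lam: mix_charges(vacuum_charges, solvent_charges, lam)
    for lam in (0.0, 0.25, 0.5, 0.75, 1.0)
}
```

Scoring each entry of `scan` against a benchmark would show whether an intermediate `lam` outperforms either endpoint.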

  • Decisions

    • Implicit solvent

    • Explicit IPolQ-style charges

      • Requires sampling solvent configurations with MD trajectory

      • Requires fitting mean field charges to solvent ESP

      • Supported and threaded in Psi4

      • CC – Hidden “gotchas” are: this bakes in a FF water model, and the distribution of water molecules is dependent on the force field+conf.

      • DC – One question I have, which overlaps with CBayly’s previous comments, is that this would require several iterations.

      • CC – Yes

      • DM – We previously did an iterative approach like this, and ran into a lot of headaches: stuff like QM needing a highly variable number of iterations, and convergence failures. On the plus side, this does seem inherently sensible, and it preps the output for the environment we’ll be putting it in.

      • MG – I’m not super familiar with this, but it seems like a lot of work for questionable gain.

      • SB + MG – Are there formal benchmarks/arguments for the iterative approach in the literature?

        • CC – There are benchmarks involving the iPolQ FF in the literature, but they have several other confounding factors going on at the same time (like refitting other parameters). Lillian Chong has some papers on this.

      • SB – So, to help me understand the method, we do a simulation and get a solvent configuration around the molecule. Then do we do a QM minimization…

        • CC – I think each solvent configuration gets minimized in MM, then is dispatched to QM ESP calc…. The goal is to get an accurate representation of the condensed phase minimum, but it’s not clear which steps in the process should be using which sorts of QM vs. MM with and without solvent.

      • MG – …

      • CC – I’d see this as an exploratory approach, though we do have a “null” model for charges that won’t block the initial results.

      • DM – How does this work with the timeline? We want to have a biopolymer FF MVP in one year, and then you’ll work on more faculty-focused projects.

      • CC – That’s true, though this can be open-ended. So if I don’t finish it, then someone else will be set up to continue a good study.

      • SB – I’d like to see more about how we’re going to evaluate the effect of the choice.

      • DM – We’d need a dataset of charged ligands, since those are most likely to have significant effects. Maybe host-guest stuff?

      • MG – Lots of confounding factors to watch out for. If the conformational preference is slight, then our studies may be hindered by, e.g., errors from vdW parameters that are outside the realm of charge fitting.

      • CC –

      • MG – NMR data on charged peptides?

      • DM – Host-guest binding data may be useful here. Those have shown great improvement from using polarizable models.

      • JW – It can help to have BOTH redundancy to ensure project success, AND diversity of approaches so that we have a good dataset at the end.

      • MG + PB – Could use explicit solvent MD to generate conformers. That may bring in the best of all worlds. Then we could have some criteria for diversity like RMSD or ESP difference

        • PB – Peter Eastman just submitted a QM dataset of solvated amino acids, using explicit solvent simulations to get conformers, and the closest waters are included in the QM submissions.

        • CC – One tricky thing is that highly charged molecules order waters out to a large distance, e.g. RNA polymers can order waters out to 30 Å.

        • JW + MG – I think we could handle large water boxes for the MD part, we’d just need to be selective about how we grab inner shell waters when submitting QM jobs.

  • CC – Conclusions seem to be

    • This is worth studying

    • We need an experimental dataset

    • The pilot study for iPolQ/iterative approaches can start whenever I have time, and doesn’t require special infrastructure tasks

    • (General) – Agree with this summary
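
As a rough illustration of the inner-shell water selection that JW + MG raised above, a distance-based filter might look like the sketch below. The function name `select_inner_waters`, the cutoff value, and the coordinates are hypothetical and were not specified in the meeting. The sketch also assumes non-periodic coordinates, so a real MD frame would first need its periodic images handled.

```python
import math


def select_inner_waters(solute_xyz, waters, cutoff):
    """Return indices of waters with any atom within `cutoff` of any solute atom.

    solute_xyz: list of (x, y, z) tuples, in angstroms.
    waters: list of water molecules, each a list of (x, y, z) atom positions.
    Whole molecules are kept or dropped, so no water is ever split.
    """
    kept = []
    for index, water in enumerate(waters):
        if any(math.dist(atom, s) <= cutoff for atom in water for s in solute_xyz):
            kept.append(index)
    return kept


# Hypothetical coordinates, for illustration only.
solute = [(0.0, 0.0, 0.0)]
waters = [
    [(2.8, 0.0, 0.0), (3.4, 0.8, 0.0), (3.4, -0.8, 0.0)],  # near the solute
    [(9.0, 0.0, 0.0), (9.6, 0.8, 0.0), (9.6, -0.8, 0.0)],  # far from the solute
]
inner = select_inner_waters(solute, waters, cutoff=3.5)
```

With a long-range ordering concern like the one CC mentioned for RNA, the cutoff itself becomes a parameter to study, since a larger shell increases the QM cost.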





  • Update on LiveCOMS article

    • Most of it is in, still waiting on a few coauthors

  • Status of ff creation

    • CC – Working on charge assignment using AM1-BCC. Some problems with carboxylic acids due to OE’s ELF10 stuff.

      • SB – Manually flip H dihedrals

      • JW – I’ve tasked JMitchell with making an automatic flipper for this, though at low priority, so this may not get fixed in the next few weeks. Let me know if it’s needed sooner.

        • CC – This is waiting on the torsiondrive dataset, so there’s no huge rush.

        • JW – Ok, the ETA for this fix is about January at this pace. Let me know if it’s needed sooner.

    • Doing torsiondrives on phi and psi backbone dihedrals. Was blocked by some changes needed in QCF and QCSubmit; those are done now, and I expect to submit soon.



Action items

Decisions