2021-06-09 Bespoke Fitting meeting notes (sci)

Date

Jun 8, 2021

Participants

  • @Joshua Horton

  • @Daniel Cole

  • @Lee-Ping Wang

  • @Mateusz Bieniek

  • @Matt Thompson

  • @Chris Ringrose

  • @Simon Boothroyd

  • @Pavan Behara

  • @David Mobley

  • @Trevor Gokey

  • A Venkat (Cresset)

Goals

Discussion topics

Item

Notes

B68 potential parameterization

  • Go through proposed plan for fitting Buckingham-6-8 type potentials with OpenFF / QUBEKit infrastructure.

 

DC: Shows slides to introduce the project. The target functional form is the Buckingham damped 6-8 potential for nonbonded interactions, with the usual bonded terms and electrostatics. The dispersion terms in this form are more physical, and the form introduces more parameters, which should allow for more accuracy. It also helps with direct derivation from QM methods: we normally underestimate C6 parameters, which makes sense because we normally neglect the C8 term.
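
For reference, one common way to write a damped Buckingham 6-8 pair potential is sketched below. The notes do not spell out the damping function, so the Tang-Toennies form used here is an assumption; the symbols A, b, C6 and C8 follow the parameter names used later in these notes.

```latex
E(r) = A\,e^{-b r} \;-\; f_6(b r)\,\frac{C_6}{r^{6}} \;-\; f_8(b r)\,\frac{C_8}{r^{8}},
\qquad
f_n(x) = 1 - e^{-x}\sum_{k=0}^{n}\frac{x^{k}}{k!}
```

With this choice, f_n(x) behaves like x^(n+1)/(n+1)! for small x, so each damped dispersion term vanishes as r goes to 0 and E(0) = A stays finite.
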
DC: We already have a proof of concept of how to plug in custom functional forms in the SMIRNOFF format using smirnoff-plugins.
DC: JH has tested this by using the Buckingham 6-8 model to recreate water properties with the published Rowley model, but we see an error in the heat capacity that seems to be consistent between the B68 model and a normal tip3p_fb model.
LP: This may be due to the correction terms which are applied to account for the high-frequency vibrations. We need to look into whether the correction is being applied, or is possibly wrong.
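
A minimal sketch of the kind of correction LP refers to (described in more detail in the duplicate notes further down): for each high-frequency mode, subtract what the force field contributes and add back the heat capacity of a quantum harmonic oscillator. The function names and the peak wavenumbers are hypothetical placeholders, not the values tabulated in forcebalance, and the classical contribution per mode is an assumption that depends on whether the mode is flexible or constrained.

```python
import math

# Physical constants (SI, exact CODATA values).
H = 6.62607015e-34        # Planck constant, J s
C_CM = 2.99792458e10      # speed of light, cm / s
K_B = 1.380649e-23        # Boltzmann constant, J / K
R = K_B * 6.02214076e23   # gas constant, J / (mol K)


def quantum_mode_cv(wavenumber_cm: float, temperature: float) -> float:
    """Molar heat capacity of one quantum harmonic oscillator, J / (mol K)."""
    x = H * C_CM * wavenumber_cm / (K_B * temperature)
    # x^2 e^x / (e^x - 1)^2, rewritten with e^-x to avoid overflow at large x.
    return R * x * x * math.exp(-x) / math.expm1(-x) ** 2


def cv_correction(wavenumbers_cm, temperature, classical_cv_per_mode=R):
    """Sum over modes of (quantum contribution - force-field contribution).

    classical_cv_per_mode is R for a flexible classical harmonic mode and
    0.0 for a mode that is frozen by constraints (assumption of this sketch).
    """
    return sum(
        quantum_mode_cv(w, temperature) - classical_cv_per_mode
        for w in wavenumbers_cm
    )


# Hypothetical placeholder wavenumbers in cm^-1, for illustration only.
placeholder_peaks = [3400.0, 1650.0, 600.0]
print(cv_correction(placeholder_peaks, 298.15))                             # flexible model
print(cv_correction(placeholder_peaks, 298.15, classical_cv_per_mode=0.0))  # rigid model
```

At fixed temperature the result is a constant, which matches the description of the correction as a temperature-dependent constant added to the simulated heat capacity.
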
DC: The SMIRNOFF plugin seems to be working, as we can recreate the Rowley work to within error bars apart from the heat capacity. Now that this works, what should we do with the model?
DC: My main idea was to fit a transferable hydrocarbon force field using this form, as the intermolecular interactions will be dominated by vdW. Can we use evaluator to get a lot of properties to fit to?
DC: We also need some initial values for the b68 form which can come from literature or we can derive them from QM.
LP: I found that the ThermoML database evaluator draws from is missing a lot of older physical property measurements, so we may need to do some manual input.
SB: ThermoML might be okay for hydrocarbons, but this should be easy to check.

LP: I think that for hydrocarbons one of the challenges is going to be linear dependencies in the vdW parameters, as there is probably a subset of parameter space where parameters compensate for each other. We might get around this by being careful with data selection.

DC: Restraining to the initial QM parameters should also help us stay near the optimum values and avoid this.
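
A toy illustration of the point above, assuming a linearised fit: when two parameters have almost identical effects on the targets, an unrestrained least-squares fit lets them run off in opposite directions, while a harmonic restraint to prior (for example QM-derived) values keeps them close to the starting point. The sensitivity matrix, noise and restraint weight are all made-up numbers, and `restrained_least_squares` is a hypothetical helper, not part of any OpenFF or forcebalance API.

```python
import numpy as np


def restrained_least_squares(J, y, prior, weight):
    """Minimise |J p - y|^2 + weight * |p - prior|^2 in closed form."""
    n_params = J.shape[1]
    lhs = J.T @ J + weight * np.eye(n_params)
    rhs = J.T @ y + weight * prior
    return np.linalg.solve(lhs, rhs)


# Toy sensitivities: the two columns (parameters) are almost identical,
# so changes in one parameter can be compensated by the other.
J = np.array([[1.0, 1.0],
              [1.0, 1.001],
              [2.0, 2.0]])
prior = np.array([1.0, 1.0])                     # stand-in for QM-derived values
y = J @ prior + np.array([0.01, -0.01, 0.005])   # targets with small noise

unrestrained = np.linalg.lstsq(J, y, rcond=None)[0]
restrained = restrained_least_squares(J, y, prior, weight=0.1)

print("condition number of J^T J:", np.linalg.cond(J.T @ J))
print("unrestrained fit:", unrestrained)
print("restrained fit:  ", restrained)
```
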

DC: Parameter extraction: we can just extract the bond and angle terms from SMIRNOFF or QUBEKit, maybe with a small refit? Torsions will need a refit, but we need to check whether we need scaling or not.
LP: Have we checked that there are no humps when combining the damped C6 term with the exponential repulsion term?
DM: We should also check that there is no singularity at 0 for FEP calculations, so that we do not have to use a softcore.
DC: The damping function should by design make the dispersion terms go to 0, but we need to check how fast they go to zero when combined with a lambda term.
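
The two checks raised here (no humps, no singularity at r = 0) could be scripted along the following lines. This is a sketch under assumptions: it uses Tang-Toennies damping, which the notes do not confirm, and the parameter values at the bottom are arbitrary placeholders in arbitrary units. It does not include any lambda coupling.

```python
import math


def damped_inverse_power(r: float, b: float, n: int) -> float:
    """Return f_n(b r) / r**n for the Tang-Toennies damping function f_n."""
    x = b * r
    if x < 1.0:
        # Small-x tail series: f_n(x) = exp(-x) * sum_{k>n} x^k / k!.
        # Dividing by r^n analytically avoids cancellation and 0/0 at r = 0.
        tail = sum(x ** (k - n) / math.factorial(k) for k in range(n + 1, n + 30))
        return b ** n * math.exp(-x) * tail
    partial = sum(x ** k / math.factorial(k) for k in range(n + 1))
    return (1.0 - math.exp(-x) * partial) / r ** n


def b68_energy(r, A, b, C6, C8):
    """Damped Buckingham 6-8 pair energy (assumed functional form)."""
    return (A * math.exp(-b * r)
            - C6 * damped_inverse_power(r, b, 6)
            - C8 * damped_inverse_power(r, b, 8))


def count_stationary_points(energy, r_min, r_max, n_points=2000):
    """Count sign changes of the finite-difference slope on a uniform grid."""
    step = (r_max - r_min) / (n_points - 1)
    values = [energy(r_min + i * step) for i in range(n_points)]
    slopes = [hi - lo for lo, hi in zip(values, values[1:])]
    slopes = [s for s in slopes if s != 0.0]  # ignore exactly flat segments
    return sum(1 for s0, s1 in zip(slopes, slopes[1:]) if s0 * s1 < 0)


# Placeholder parameters, for illustration only.
params = dict(A=1.0e5, b=3.5, C6=600.0, C8=1000.0)
print("E(0) =", b68_energy(0.0, **params))
print("stationary points on [0, 10]:",
      count_stationary_points(lambda r: b68_energy(r, **params), 0.0, 10.0))
```

A single stationary point would correspond to the usual well; more than one would indicate a hump of the kind LP asks about.
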
DC: Moving on to extracting the dispersion terms: we have used the T-S method in the past, but we can also use the XDM method to get the C6, C8 and higher terms.
DC: The repulsive term is due to electron density overlap, and the b parameter is related to the decay rate of the electron density. If you use an AIM method like DDEC you can get the atomic decay rate.
DC: A combines a lot of quantum effects together but we can derive it from QM using a scaling relation to the free atom volume using a similar method to the T-S method. This does leave some parameters that need to be fit. We can also maybe derive a set of A, b, C6 and C8 which could be transferable. One idea would be to fit a set of bespoke force fields and then look at how similar the parameters are.
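
As a reminder of the kind of volume scaling meant here, the T-S relation for C6 rescales a free-atom coefficient by the squared ratio of the atom-in-molecule volume to the free-atom volume. The sketch below shows only that C6 relation; how A would be scaled is not specified in these notes. The function name is hypothetical, the volume ratio is a made-up number, and the free-atom C6 for carbon is a commonly quoted value that should be checked against reference data before use.

```python
def ts_scaled_c6(c6_free: float, volume_aim: float, volume_free: float) -> float:
    """Tkatchenko-Scheffler style scaling: C6_eff = (V_aim / V_free)^2 * C6_free."""
    if volume_aim <= 0.0 or volume_free <= 0.0:
        raise ValueError("atomic volumes must be positive")
    return (volume_aim / volume_free) ** 2 * c6_free


# Illustrative inputs: free-atom C6 for carbon in Hartree Bohr^6 (verify before
# use) and a hypothetical atom-in-molecule to free-atom volume ratio of 0.8.
c6_carbon_free = 46.6
print(ts_scaled_c6(c6_carbon_free, volume_aim=0.8, volume_free=1.0))
```
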

LP: Bespoke vs transferable: it's hard to define transferability, but if we could show some level of transferability that would be fantastic. This might be hard to do on a short time scale.

LP: I need to look at the Rowley paper but did it perform better? I think that if the water model did not use all of the degrees of freedom it might do better on solution properties as well. We should keep this in mind when optimising the model.

DC: Do we have any data that does not need a free energy?
SB: We have mixture enthalpies available which we can look at.

DC: We need some hydrocarbon physical property data, and we need to look at the literature for QM starting values and update QUBEKit.

Duplicate meeting notes, taken by JW

COPIED FROM https://openforcefield.atlassian.net/wiki/pages/createpage.action?spaceKey=IN&title=2021-06-09%20Bespoke%20workflow%20meeting%20notes

  • DC – Shows slides

  • DC – (Testing: Water Model slide) Not sure how to correct for constrained h bonds in heat capacity. Could use LPW advice

    • LPW – When classical models are used to calculate heat capacity, the results are always way off from experiment, especially with high frequency degrees of freedom. This is because of quantization effects. But when we use a rigid water model, a tiny amount is still thermally accessible. So the correction that we apply is that we take 7 experimental IR peaks of liquid water, and then we subtract the contribution from the FF, then add back the contribution from a quantum harmonic oscillator. In practice, this means adding a constant that depends on temperature. But those numbers are all stored in the text tables in forcebalance. So if you’re looking at the fluctuation in the instantaneous enthalpy, you’ll need to add those numbers manually.

    • DC – In the middle column, I’m using forcebalance to do a one-off calculation

    • LPW – Oh, then that’s worth looking further into.

    • DC – I’ll send the code

  • DC – Could we use openff-evaluator to get a load of hydrocarbon liquid properties? I suspect the answer is yes.

    • LPW – One thing I found when I was working on Parsley is that the ThermoML database does not have a lot of older physical property measurements with simple molecules (like ethanol). So there may be some need to do manual inputting of literature data.

    • SB – Agree, ThermoML lacks simple molecules. Though it does have refrigerants and long-chain alkanes. So it’s worth looking there to see whether it has suitable data for your use cases.

    • LPW – Re hydrocarbons: I haven’t tried to build a hydrocarbon FF before but what I’ve heard is that linear dependencies in vdW parameters can be fairly troublesome (C and H radii can move opposite to each other and give very similar results). So this may raise a question about temperature ranges and specific data. It may also help to have halogens on there as well to hold things stable.

    • DC – Also, maybe pinning to some QM-derived values would also help avoid this linear dependence.

  • On “Parameters from QM” slide

    • LPW – Re potential form – Are there some concerns that combining exponential decay with this form would lead to some undesired bumpiness in the energy landscape?

    • DC – I’ll have to look at this.

    • DM – Chodera would also ask that we ensure that there are no singularities in this functional form, because otherwise it won’t be suitable for some free energy calcs

    • DC – This form should smoothly go to 0, though I could see this having trouble for other reasons in free energy calcs. I’ll see what this looks like with a bunch of lambdas

  • (Post slides)

    • LPW – On the last slide, my thought from the outside is that, if we want to think about bespoke vs. transferable… transferability is often considered a “holy grail”, but it’s very hard to define what it is. If this was able to demonstrate transferability that would be great, but having said that, I don’t know if it’s possible to do that in 3 months. Though a limited example could be good.

    • DC – I don’t think we should feel constrained by “3 months”. CR has a lot of experience as a developer of qubekit, and JH will be around for a long time as well.

  • DM – Chris Rowley is fairly accessible, and is interested in OpenFF. The limiting factor with getting him involved was support for alternative functional forms. So if this is a good proof of concept, he’ll be interested in this as well.

  • JH – For C6 and C8 coefficients, Rowley has derived guesses for initial values for those from GAFF. So we might consider doing the same.

    • DC – Agree. That could be a good first approach, while we work on more detailed approaches.

  • DC – Is everything in place from the SMIRNOFF plugins side? SB and JH were seeing some potential issues with vsites and forcebalance

  • LPW – Do you think B68 keeps values tied more closely to QM?

    • DC – I think the paper was more of a proof-of-principle. They did radial distribution functions, but it doesn’t seem like they went far past what LPW did before

    • LPW – Is there some flexibility in the model that wasn’t exercised in the water study that might be needed in solution properties? Like if you compare B68 to TIP4P, B68 has this extra parameter. If it then does equally as well in fitting experimental properties of water, then the extra parameter may be useful to fit something else in some systems. Do we have any idea what this parameter looks like or where it would start making a difference? Maybe with ion interactions or other solution properties? This could be an interesting avenue of research.

    • DC – That is interesting. Are there other solution properties that don’t require free energies?

    • SB – Mixture enthalpies are available. These are available in the Sage training set.

    • DM – We’re enthusiastic about mixture data, since it doesn’t require gas-condensed phase transfer, so we don’t have to worry about solvation effects in charges/interactions.

  • DC – So, for next steps, it’s fairly clear what we need to do

    • Getting a liquid property database for hydrocarbons

    • Getting QM data (initial guesses?) from literature and figuring out a general way to derive them.

    • Fixing heat capacity issue

  • DM – May be good to look at Phil Huenenberger’s work, on chlorohydrocarbons.

    • SB – I believe that all of PH’s data is in ThermoML



Action items

Decisions