2020-05-14 Force Field Release meeting notes

Date

May 14, 2020

Participants

  • @Hyesu Jang

  • @Simon Boothroyd

  • @Jessica Maat

  • @David Mobley

  • @Jeffrey Wagner

  • @Owen Madin

  • @Christopher Bayly

  • @Lee-Ping Wang

  • @Matt Thompson

Discussion topics

Item

Presenter

Notes

 

@Simon Boothroyd @Owen Madin

  • Discussion of outliers from the mixture study

  • SB – Ketones always decreased in accuracy after my fitting.

  • CB – Did you see this for pure ketone density?

  • SB – Yes

  • SB – Derivatives of the objective function showed that the ketone-property derivatives were frequently in opposition to the derivatives of other properties

  • SB – We think it might be vdW parameters trying to compensate for issues in the charge model

  • CB – Some speculation – In all the mixtures that contribute to the first derivatives, are all the carbons ketone and ester carbons?

  • SB – There’s also some acids

  • CB – This is a good analysis. The pull in hydrogen-attached-to-sp3-carbon size is particularly interesting. In esters, the dipole will be oriented slightly toward the carbonyl oxygen, but will kinda be bisecting the O-C-O angle. For ketones, it’ll be right on the C=O bond. Also, in ketones, the dipole interaction will have its positive end on the carbonyl carbon. Thus, dipole interactions will be occluded by the adjacent alkyl groups. So, maybe the carbon radius is too large, but anything with an hbond donor won’t have this issue.

  • CB – Carbonyl oxygens in general don’t have an anisotropic charge surface. So maybe extra points will improve this dramatically.

  • SB – Is there something we can do to avoid building in fudge factors for this?

  • CB – We ARE building in artifacts, but we know that. But we have to exist within the constraints of our functional form. Until we have vsites, the big question is whether we should break out new atom types. I expect that the addition of amides will also have this sort of problem. For this problem, it seems like we either need to reduce the size of carbonyl carbons or oxygens.

  • LPW – I had a similar thought to Chris’s. I wonder if supporting Kantonen-style density-based vdW parameters could fix this.

  • CB – If density-based vdW calculations will work, then QM dimer energies should be an indicator of that.

  • JW – Density-based vdW parameter calculation may be prototyped in the short term, but won’t be in main in OFFTK any time soon

  • CB – I wonder if Simon’s work could become something like Victoria’s benchmarking, where we go and actively identify parameters which need special attention/splitting.

  • DM + LPW – Agree that these are active areas of investigation. We should keep this in mind for upcoming work, maybe on the year-ish timescale.

  • DM – Do these oxygens have different BCCs in AM1-BCC?

  • CB – I don’t recall. This would be in the original paper.

  • SB – I will continue looking into this and get as much data as we can. I expect that the BCC refits will be a good opportunity to revisit this question. I talked with Schauperl and MKG about how to order these studies. So BCC fits first, and then vdW type changes.

  • CB – The graph indicating that the H-bound-to-a-carbon-adjacent-to-an-electronegative-element-wants-to-be-smaller might indicate that there should be a differentiation there. This is a contentious parameter in some AMBER FFs. So this is an interesting data-driven justification.

  • LPW – So, for ketones, it’s trying to make rmin_half smaller, for everything else, it’s trying to make it bigger. Would the new parameter definition treat ketones differently from esters?

  • CB – It would say that the methylene group attached to carbonyl carbon needs smaller hydrogens. It would say that the ester hydrogens would need a smaller radius.

(restatement for clarification)

  • CB – The AMBER hydrogen parameter IS special for carbons attached to ester oxygen. But ketones don’t get this differentiation. So maybe we need hydrogen-attached-to-a-carbon-attached-to-a-carbonyl-carbon.

  • SB – Could split this H parameter and retry training.

  • DM – I’ve gotta run for another call. Very interested in finding appropriate ways to test this out, but not at the expense of stalling LJ refitting work when the path forward IS clear. So… I’d like to BOTH plunge ahead AND investigate this as time allows.
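
SB's observation above, that the ketone derivatives frequently oppose the derivatives of other properties, can be illustrated with a small sketch. It compares per-group gradient vectors by cosine similarity and flags groups that point against a reference group. The function names, group labels, and numbers are all hypothetical and purely illustrative; none of them come from the mixture study or from any fitting package.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two gradient vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def conflicting_groups(gradients, reference, threshold=0.0):
    """Names of groups whose gradient points against the reference group's gradient."""
    ref = gradients[reference]
    return sorted(
        name
        for name, grad in gradients.items()
        if name != reference and cosine_similarity(grad, ref) < threshold
    )


# Toy gradients of an objective with respect to two parameters.
# Labels and values are made up for illustration only.
toy_gradients = {
    "group_a": [1.0, -0.5],
    "group_b": [-0.8, 0.4],
    "group_c": [-0.6, 0.1],
    "group_d": [0.2, 0.3],
}

opposed = conflicting_groups(toy_gradients, "group_a")
```

A negative cosine means that a parameter step which improves one group makes the other worse, which is the kind of tension that might motivate splitting a parameter.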



@Hyesu Jang

  • valence parameter fitting update

https://docs.google.com/presentation/d/1uaozzqk0MdQWELswwh4xUvZe74sCy382F0XvwxC_wAQ/edit?usp=sharing
  • DM – In general, we’ll want to use smaller molecules, but fragmentation infrastructure wasn’t ready

  • CB – Interesting that there are two factors: More iterations, and each iteration taking longer

    • Two implications:

      • If the objective function landscape has changed (needing more iterations), then… something with Bayesian methods

      • If the objective function has changed, maybe a smaller number of parameters would reduce the number of minima

  • LPW – Agree. This optimization is not only taking longer per cycle, but also requiring more cycles. This probably means that there’s something different about the surface. My hypothesis is that larger molecules deviate more from the original geometry when we do MM optimizations, and this may be confusing the optimization.

  • JW – Large molecule – more sterics. Is there an experiment we can design to see if including large molecules improved things?

  • CB – If we’re going to look at large molecules, and there’s something pathological that happens with large molecules with an RMSD objective function, does this indicate that a different objective function is needed for large molecules?

  • CB – WRT iterations taking longer, this is where FB could benefit from parallelization

  • CB – Chemical diversity of dataset vs chem diversity in general – This is a super good point to identify.

  • CB – CS’s work indicates that including 30 heavy atoms may be important for capturing effects of larger molecules, especially when it comes to conjugation.

  • DM – The last point will be covered by Chaya’s automated fragmentation, which already considers WBO of bonds when fragmenting

  • LPW – WRT point 2, we already have this workflow parallelized. Batch size tries to ensure that workers will finish at the same time. What I suggested to Hyesu was to spread out the large molecules over a larger number of batches. So I don’t think that the architecture is limiting.

  • DM – Someone mentioned starting from multiple “starting point” force fields. Maybe we could start from all previous versions.

  • HJ – Yes, that’s something I’d like to try.
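
LPW's suggestion above, spreading the large molecules over more batches so that workers finish at about the same time, can be sketched as a greedy "largest first" assignment. This is only an illustration of the idea, not how ForceBalance builds its batches. The function name, the molecule labels, and the use of heavy-atom count as a cost proxy are assumptions.

```python
import heapq


def spread_into_batches(costs, n_batches):
    """Assign each item, most expensive first, to the currently lightest batch."""
    if n_batches < 1:
        raise ValueError("n_batches must be at least 1")
    # Min-heap of (current load, batch index), so the lightest batch pops first.
    heap = [(0.0, i) for i in range(n_batches)]
    heapq.heapify(heap)
    batches = [[] for _ in range(n_batches)]
    # Sort by descending cost; ties are broken by name for reproducibility.
    for name, cost in sorted(costs.items(), key=lambda kv: (-kv[1], kv[0])):
        load, idx = heapq.heappop(heap)
        batches[idx].append(name)
        heapq.heappush(heap, (load + cost, idx))
    return batches


# Hypothetical cost proxy: heavy-atom count per molecule (illustrative values).
toy_costs = {
    "mol_a": 30, "mol_b": 28, "mol_c": 10,
    "mol_d": 9, "mol_e": 8, "mol_f": 5,
}

batches = spread_into_batches(toy_costs, n_batches=2)
loads = [sum(toy_costs[m] for m in batch) for batch in batches]
```

With these toy costs the two largest molecules land in different batches, and the per-batch totals come out close to each other.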

Slide 3 – Three moieties that we could explicitly add training data for

  • DM – It would be good to ensure that in-ring torsions and out of ring torsions get different parameters

  • CB – My perspective is that endocyclic angles are messy, since ring effects influence them much more heavily than the local effects. So, when we train them together, we basically intend for them to be fit to EXOcyclic appearances, and hope that the ENDOcyclic appearances don’t screw it up too badly.

  • HJ – For torsion parameters that include both endo- and exocyclic rotation, we only drove exocyclic torsions.

  • DM – Benchmarking on preliminary results would be useful.

  • CB – I wonder if we can analyze dataset to find parameters that only have ENDOcyclic appearances.

  • LPW – I agree that endocyclic torsions may not be useful for parameterizing our FFs, and I wonder if they’re affecting the parameters in a BAD way.

  • CB – If we’re lucky, the parameters determined by EXOcyclic torsion scans will end up

  • JW – Maybe we could take all proper torsions with 4 heavy atoms and make ring- and non-ring variants of each (s99F has very few ring-specific SMARTS in torsions)
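
CB's idea above, analyzing the dataset to find parameters that only have ENDOcyclic appearances, could look roughly like the sketch below. It assumes that each torsion match has already been labeled with the parameter it was assigned and with whether its central bond is in a ring; in practice those labels would come from a cheminformatics toolkit. The function name, the record layout, and the parameter ids are hypothetical.

```python
from collections import defaultdict


def endocyclic_only_parameters(matches):
    """Return parameter ids whose every recorded match has an in-ring central bond.

    `matches` is an iterable of (parameter_id, central_bond_in_ring) pairs.
    """
    seen = defaultdict(set)
    for parameter_id, in_ring in matches:
        seen[parameter_id].add(bool(in_ring))
    # A parameter qualifies only if it was never seen on a non-ring bond.
    return sorted(pid for pid, flags in seen.items() if flags == {True})


# Toy records with placeholder ids; not taken from any real force field or dataset.
toy_matches = [
    ("tor_a", True), ("tor_a", False),
    ("tor_b", True), ("tor_b", True),
    ("tor_c", False),
]

ring_only = endocyclic_only_parameters(toy_matches)
```

Parameters returned by such a check would be candidates for separate treatment, since their training data would not include any exocyclic torsion scans.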