Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Goals

Discussion topics

  • Slides:

  • Item

    Presenter

    Notes

    I had a couple of short updates, one about redundant parameters in Sage and one about conformera with high energy sterics.

    Lexie

    View file
    name2023_07_19_fffitting.pptx

  • TG – 6X2 shouldn’t really exist

  • CBy – argues for having generic parameters

    • Cby – there’s a strong history of pulling parameters out of the air. We could have a generic parameter in there as a stop gap measure, that we’ve trained to fit to all the training data

    • CBy – could run FF through huge database like eMolecules (stuff like enamine is too realistic) to evaluate coverage. If nothing matches, we could have a case for getting rid of t123 because we have nothing that matches it

    • PB – we do have some generic parameters

    • CBy – we don’t want to have a Jenga stack of a FF

    • PB – there is a molecule in PubChem that’s quite similar (although t123 wouldn’t actually apply due to double bond)

    • CBy – could we have routine test on some large dataset to test parameter coverage

    • PB – yes, did suggest something like ChEMBL30

    • CBy – could look at all parameters that have low population

    • PB – in 2.1.0 I added in parameters to cover rare chemistries in Chembl and pubchem

  • TG – we can go through parameters and sanitise things, e.g. torsion with N connected to two things.

    • PB – also, could start with k value of 0.1 and fitting the force field from there, so we don’t get ridiculous force constants

      • PB – could be equivalent of MSM for torsion parameters. Currently starting from previous fits for initial values

      • Would need something like simulated annealing for proper sampling

    • TG – did you make t123a PB?

    • PB – yes

    • TG/PB – changed periodicity from 1 to 3 in changing from t123 to t123a

  • (conformer sterics)

  • CC – are you including the 1-4 LJ interactions?

    • General – this is probably included by default

    • PB – @Chapin Cavender Force groups in openmm
      'HarmonicAngleForce'
      'HarmonicBondForce'
      'NonbondedForce'
      'PeriodicTorsionForce'

      and Lexie is zeroing out the charges to make electrostatic contribution zero

  • CBy – atropisomerism is common in medicinal chemistry

    • e.g. https://en.wikipedia.org/wiki/Indometacin

    • The interactions that hinder rotation aren’t 1-4s, they are 1-5s, 1-6s

    • where the high steric interactions are a problem is where we parameterize torsions. The high sterics need to be screened out for the torsions, but it’s critical that the FF can account for these energies.

    • CBy likes the dexp form for softening the hard repulsions that lead to the high LJ (e.g. 12 kcal/mol). To get the minimum correct, we end up with the excessively high energies we see here

    • CBy – we need a way to encounter high steric interactions with pharmaceutical relevance, e.g. torsional transitions that we should get right

  • BS – we saw a lot of C=O interacting with S. It looked a lot like they were electrostatically attractive when the FF would make them repulsive. We found a lot of FF energies were messed up with these interactions

    • CBy – virtual sites would solve this – aromatic sulfur sigma holes are important here

  • CBy – you probably don’t want LJ interactions beyond 1-4, 1-5s for training torsions. The exception is if the SMIRKS specifies more – e.g. if there’s a methyl coming off an aromatic N, you know there could be key LJ 1-6+ interactions.

    NEB update

    Bill

    • View file
      nameLookAtNEB.pptx

    • LPW – @Bill, one comment I have on your contour plots, (I really like them!), is that the TD path might not be perpendicular to the contours. I think maybe the contours should be perpendicular to the driving angle in the TD pathway, because that is the constrained DoF. Does that make sense?

      • I think maybe the contours should be vertical or parallel to the y-axis at the TD path. (Sorry again, I can't speak without waking people up.)

    • CBy – use WBO for TD scan?

      • Potentially for every geometry with a WBO that changes significantly, we drive in the other direction instead. This also lets us avoid doing a full 2D scan

      • Would expect all conjugated single bonds to exhibit this behaviour

    • LPW – I wonder if it is possible to compute a 2D PES by scanning both the driving angle and the pucker (which could be done with geometric or with TD)? This would depend on there not being other DoFs in the system that could flip.

      • CBy – one issue could be that the improper term could become dependent on the proper, and we may need cross-terms for this

      • LPW – I agree with Christopher that we would want to do it as a case study here, and potentially as something to guide improving the functional form, but not for all the training molecules because the cost would be too high.

      • CBy – BS/TG, could you do regular driving from the other direction?

        • TG – yes

        • CBy – do you compute WBOs?

        • TG – yes, they’re free

    • TG – this initially started as a comparison between TD and NEB

      • I did AIMD on this molecule and I do see this bending at 300 K, so I know this is legitimate behaviour on part of TD, but have not been able to reproduce the bending with NEB, even with 500(?) images

      • In general I think TD is doing the right thing here, or at least sampling what we need

      • BS – what angles are you not sampling?

      • TG – if I only fit an FF on the NEB data, I wouldn’t capture any pucker

      • TG – current hypothesis is the geometric is the optimizer finding these images, not Psi4. Luckily geometric has NEB implemented so will see if it can find these puckers

    Action items

    •  

    Decisions

    ...

    Updates on NEB

    TGokey

    • Slides will be uploaded

    • Slide 3: dashed line shows driving different atoms for the same bond

    • Slide 4:

      • CB: if you let a ball roll downhill from the saddle point, it wouldn’t follow the TD path

      • BS: isn’t the minimum energy pathway TD?

      • TG: yes

      • CB: I’d be looking at perpendicular to the isocontour

      • JW: are they all minimum pathways? they all go to the same minimum through the same maximum

      • CB: Is the question that we’re asking: why are the two pathways different?

      • TG: yes, in part

      • BS: because there are springs connecting the nodes, my intuition is that there’s tension on the NEB making it more straight

      • CB: but is a bigger question which one gives better inputs for parametrizing FF?

      • TG: yes, we want to know if NEB offers up new information we could use, or generally characterise differences between NEB and TD

      • JW: if I had to guess which one would be more useful for fitting, I’d say TD; NEB doesn’t explore OOP bend at all

      • TG: I haven’t fit any actual FFs on this data, but results of FFs fit to NEB would have the improper be artificially stiff

    • Slide 8:

      • the arrows reflect the contribution of the angles to the gradient / the arrows are the force projected onto the two torsions

      • BS: are these energy gradients or forces? Why are the arrows pointing uphill?

      • TG: I think what this is saying that it wants the sp2 C to be more sp3-like

    • DLM: with a FF fit, our impropers might be so coarse that we might see no differences or poor results

      • CB: the data shows the improper and torsion are coupled. The OOP is very soft at 90 degrees, but stiff at 0 degrees. Isn’t this just coupling, which we don’t do

      • BS: at 90 degrees you’re 10 kcal above the minima, it might not matter

      • CB: I agree, it’s more important to have correct behaviour around the minima. But until we couple torsions, not sure how we can see differences from the NEB deformation

      • DLM: might be ok for our impropers to be softer in general. We saw before in general that our impropers were too stiff

      • CB: Yes, but if we have perfect stiffness around the minimum, it still wouldn’t be soft enough around the saddle point

      • DLM: shares heatmaps of valence angle vs improper angle – will share data, this would be important to eventually look at

      • BS: functional form is a single minimum?

      • DLM: yes, but can easily change it to have multiple minima

      • JW: …

      • DLM: we could allow minima away from zero

      • DLM: somewhere I have what we thought the form should be, we don’t need to change FF engines

      • TG – would look at 1-4s with impropers

    • Slide 9:

      • TD may be on a saddle point of a low energy, since derivative of total energy plot doesn’t match total gradient, but energy derivatives do match angle gradients

    • Slide 5:

      • CB: are we fitting to both energy and gradient? In theory both NEB and TD should give the same FF if we do

      • PB: no, we only fit to energies with TorsionProfile targets. If AbInitio we could do both

      • CB: if we fit the hessian at critical points e.g. 90 degrees, that would be very informative. Also if we fit forces at the same points as energies, in principle it shouldn’t matter if we sample the TD or NEB pathway

      • BS: in fact if we fit to forces you want to be off the minimum to get all the information

      • PB: fitting to ICHs is available in FB

      • TG: probably would want both equilibria and off-equilibria

      • CB: there’s a lot of ways to go off-equilibria, some of which can be suboptimal. The nice thing about your NEB is that the sampling of the pathway is quite good

      • TG: would worry that if we only fit to TD forces, or use TD forces first, we might end up with weird results

    • BS / DLM: eager to see results of FF fit

      • TG: have been putting this off as not sure how to quantify improvement

      • JW: we’re working on an internal benchmarking package

      • TG: will it include additional benchmarks?

      • JW: not to start with

      • TG: I’d like to be able to fit to a bunch of conformers and get an overall consensus, current benchmarks fit to QM minima

      • BS: one check could be reproducing the contour map surface

      • TG: yes, although the standard set is 1000 molecules. Storage could get expensive

      • DLM: happy to help figure out what success means (in chat)

    Action items

    •  

    Decisions