2020-03-20 Chemical Perception meeting notes

Date

Mar 20, 2020

Participants

  • @Hyesu Jang

  • @Josh Fass (Deactivated)

  • @Christopher Bayly

  • @David Mobley

  • @Jeffrey Wagner

  • @Jessica Maat (Deactivated)

  • @Karmen Condic-Jurkic

  • @Lee-Ping Wang

  • @Trevor Gokey

  • @Victoria Lim (Deactivated)

  • @Jeffry Setiadi

Record

 

Discussion topics

Item

Presenter

Notes

Typing update

Josh Fass

https://openforcefieldgroup.slack.com/archives/CG59TUTL2/p1584632573062600?thread_ts=1584632557.062500&cid=CG59TUTL2

JF – How do we police SMIRKS differences between toolkits?

JRW – Could police during FF release or during application. For the former, we’d need some sort of automated string analyzer, which may get somewhat complex, to find “forbidden” SMIRKS patterns. For the latter, we’d need to implement toolkit parameterization difference tests, which are already a goal on the infrastructure roadmap.

(General) JF should make a checklist of bad motifs which we will check against future releases, and replace if they are found.
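A minimal sketch of the release-time check discussed above, assuming plain substring matching is enough for a first pass. The function name, the parameter IDs, and the single checklist entry are hypothetical placeholders; the real list of bad motifs is still to be written by JF.

```python
from typing import Dict, List, Tuple


def find_forbidden_smirks(
    parameters: Dict[str, str], forbidden: Dict[str, str]
) -> List[Tuple[str, str, str]]:
    """Return (parameter_id, smirks, reason) for every checklist hit.

    Naive substring matching only: it does not parse the SMIRKS, so it
    can miss equivalent patterns that are written differently.
    """
    hits = []
    for param_id, smirks in parameters.items():
        for fragment, reason in forbidden.items():
            if fragment in smirks:
                hits.append((param_id, smirks, reason))
    return hits


# Placeholder checklist entry, NOT an actual known-bad motif.
example_forbidden = {"$(": "placeholder reason: recursive pattern"}

# Made-up parameters for illustration.
example_parameters = {
    "b1": "[#6X4:1]-[#6X4:2]",
    "b2": "[#6X4:1]-[$([#6]=[#8]):2]",
}

hits = find_forbidden_smirks(example_parameters, example_forbidden)
release_is_blocked = bool(hits)  # any hit would block the release
```

A string-level check like this only covers the "during FF release" option; toolkit parameterization difference tests would still be needed to catch differences at application time.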

CIB – For shared-bond SMIRKS, how about (~[*r5,r6])?

 

JF – What is the name for this group of atoms that frequently appears together?

CIB – These are electron withdrawing groups, frequently identified as EWG or EWA.

CIB – F, Cl, Br, I are together “halo” for halogen. N, O, and S are electron-withdrawing atoms, together with “halo”.

DLM – what’s the problem we’re looking to solve with message passing? Is SMIRKS manipulation too inefficient?

JF – Yes. Naive manipulation of SMIRKS is too inefficient. Message passing lets us make more structured moves in SMIRKS space.

JRW – Would the output of this be SMIRKS, or would it require a different typing engine?

JF – It wouldn’t immediately become SMIRKS, but we could reverse engineer the final types from this sort of optimization into SMIRKS that reproduce the fitting.

CIB – This seems to have a lot of promise for biopolymers. If a leucine came in with missing atoms, so that it looks like an alanine, would it be recognized as leucine or alanine?

JF – This model doesn’t have support for missing atoms or recognition of polymers by anything except connectivity.

CIB – So, is this most promising for large molecules?

JF – The cost of this typing approach is linear with respect to number of atoms, so it would be good for biopolymers

BenchmarkFF

Victoria Lim

https://docs.google.com/presentation/d/1LXSgV9Q6ewbFw5GT-SnSl3SSeIJF5XRH5M-xg5YHnME/edit#slide=id.g7d18d4eabc_0_5

DLM – OPLS3e doesn’t seem to be trying to do good energetics. They just find problems with minimized geometries and change the model to fix them.

CIB – I’m really happy that we’re doing OK on BOTH geometry and energetics. In fact, we’re doing better on energetics than OPLS3e, even though their geometries

LPW – How do you subtract reference energy for MM and QM?

VTL – Same geometries for both QM and MM. Different selection of reference conformer for different plots.

LPW – WRT picking the reference conformer, I’ve found that the choice of reference can lead to that sort of asymmetry in energy deltas. I usually pick the lowest-energy conformer in QM, and see the asymmetry. I think it’s because, if you have ANY MM energies that are low relative to the QM, then the MM minimization is biased toward finding those conformations. If they’re OVERESTIMATING the energy, that means that the MM minimization is overly confined to the neighborhood of the QM geometry, even though it energetically doesn’t want to be there.

VTL – I will review my method for picking the conformers and see where the asymmetries appear.

CIB – As opposed to free energy calcs (where asymmetry is indicative of a real problem), here I don’t think it’s so bad. What if we always looked at the differences from the lowest-energy QM conformer? Or what if we do the relative numbers multiple times for each molecule, taking the difference from EACH conformer of the molecule?
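A rough sketch of the two reference choices raised here: ddE relative to the lowest-energy QM conformer, and ddE repeated with each conformer as the reference. The sign convention (MM minus QM), the function names, and the energies are assumptions for illustration, not the benchmark's actual code or data.

```python
from typing import List


def dde_for_reference(qm: List[float], mm: List[float], ref: int) -> List[float]:
    """ddE_i = (E_MM[i] - E_MM[ref]) - (E_QM[i] - E_QM[ref])."""
    return [(m - mm[ref]) - (q - qm[ref]) for q, m in zip(qm, mm)]


def lowest_qm_reference(qm: List[float]) -> int:
    """Index of the lowest-energy QM conformer."""
    return min(range(len(qm)), key=lambda i: qm[i])


def dde_all_references(qm: List[float], mm: List[float]) -> List[List[float]]:
    """One row of ddEs per choice of reference conformer."""
    return [dde_for_reference(qm, mm, ref) for ref in range(len(qm))]


# Made-up energies for three conformers, in any consistent unit.
qm_energies = [0.0, 1.0, 3.0]
mm_energies = [0.5, 1.0, 2.0]

ref = lowest_qm_reference(qm_energies)
single_ref = dde_for_reference(qm_energies, mm_energies, ref)
all_refs = dde_all_references(qm_energies, mm_energies)
```

With a single reference every ddE inherits that conformer's error, which is one way an asymmetry can appear; the all-references table makes that dependence visible.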

KCJ – 4 fused-ring structs may have bad energies if hydrogens migrated, since then bonding would change

JRW – 1) RMSD calcs shouldn’t be possible if graph representations are different. 2) We should be able to check this against CMILES in the future
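A small sketch of the kind of connectivity check implied in point 1, assuming both structures use the same atom indexing and ignoring bond orders. The function names and bond lists are hypothetical; this is not CMILES and not an existing toolkit call.

```python
from typing import FrozenSet, Iterable, Set, Tuple


def bond_set(bonds: Iterable[Tuple[int, int]]) -> Set[FrozenSet[int]]:
    """Represent bonds as unordered atom-index pairs."""
    return {frozenset(bond) for bond in bonds}


def same_connectivity(
    bonds_a: Iterable[Tuple[int, int]], bonds_b: Iterable[Tuple[int, int]]
) -> bool:
    """True if both structures have identical bonded pairs."""
    return bond_set(bonds_a) == bond_set(bonds_b)


# Made-up example: atom 3 starts bonded to atom 2 and ends bonded to atom 1,
# as could happen if a hydrogen migrated during optimization.
initial_bonds = [(0, 1), (1, 2), (2, 3)]
final_bonds = [(0, 1), (2, 1), (1, 3)]

rmsd_is_meaningful = same_connectivity(initial_bonds, final_bonds)
```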

CIB – What if MM puts formal charges in the wrong place relative to initial geometry? WBOs may correct for this. In structure shown, the rings on the left are a delocalized system.

HJ – Just finished looking into fingerprinting methods, will share slide in #chemical-perception

LPW – Interested to see how further investigations turn out. I think we should keep optimization/benchmarking infrastructure different.

Meeting Note

 

Action items

@Josh Fass (Deactivated) will make a checklist of “bad patterns” to be checked during FF release, which will block the release if a forbidden SMIRKS is present. @Jeffrey Wagner will receive this checklist and add it to the release process.
Benchmarking dashboard could include ddEs calculated from ALL conformers of a molecule, instead of just one reference. The higher the energy of the QM structure, the less it should contribute to the “badness” score. (ForceBalance torsion profiles already taper torsion weights. Optimized geometries don’t, since we have no concept of several conformers of the same molecule.)
@Hyesu Jang will upload this meeting recording and share it with @Joshua Horton and @Jaime Rodríguez-Guerra (Deactivated)
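For the benchmarking item above, one possible way to down-weight high-energy QM conformers in a “badness” score. The exponential taper, the `scale` parameter, and the function names are assumptions for illustration; this is not the ForceBalance weighting scheme.

```python
import math
from typing import List


def taper_weights(qm_energies: List[float], scale: float = 1.0) -> List[float]:
    """Hypothetical weights that decay with QM energy above the minimum.

    `scale` is in the same units as the energies; larger values give a
    flatter taper.
    """
    e_min = min(qm_energies)
    return [math.exp(-(e - e_min) / scale) for e in qm_energies]


def weighted_badness(ddes: List[float], weights: List[float]) -> float:
    """Weighted mean of |ddE| over the conformers of one molecule."""
    return sum(w * abs(d) for w, d in zip(weights, ddes)) / sum(weights)


# Made-up values for three conformers of one molecule.
weights = taper_weights([0.0, 1.0, 3.0])
score = weighted_badness([0.0, -0.5, -1.5], weights)
```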

Decisions