2020-12-18 Chemical Perception meeting notes

Date

Dec 18, 2020

Participants

  • @Tobias Huefner

  • @David Mobley

  • @Pavan Behara

  • @Jeffrey Wagner

  • @Trevor Gokey

  • Caitlin Bannan

  • @Michael Gilson

  • @Jessica Maat

  • Chris Bayley (CBy)

Goals

  •  

Discussion topics

Time

Item

Presenter

Notes

10-15 minutes

 

Trevor

  • Just some updates, discussion points, etc. (no slides)

Short update on recent accomplishments:

  • Code updates:

    • Can read in OFF force field

    • Originally splitting on FF gradients, but now analyzing geometries directly and then splitting

    • Uses geomeTRIC to split force constants; works well for the butane example

    • Can also reproduce frequencies nicely

    • Currently has a bit of an overfitting issue; thresholds need to be adapted

    • JW: Gradient what exactly?

    • TG: Parameter gradient of the objective function

    • JW: Are you minimizing in the new force field after updating typing?

    • TG: Yes.

    • MG: Are you setting the obs. bond length to the eq. bond length?

    • TG: Yes, but FB is run afterwards.

    • MG: Does the gradient depend on the number of things you are fitting?

    • TG: Haven't looked at that explicitly; currently don't have a good answer for it.

    • JW: Is there a way to make bit vectors canonical? Can we distinguish CO and OC in typing? Make bonds canonical?

    • CB: Might be relevant when OR’ing things together. Could you have some sort of ordering?

    • TH: Don’t have ordering implemented yet. Many issues arise, like what happens with hybridization. Permutations complicated.

    • JW: Can you do reordering afterwards? There are multiple ways of getting to the same solution.

    • CB: Not an easy thing to do, many things to do. Branching is complicated to make canonical.

    • CB: Can I go back and forth from SMARTS to bits?

    • TG: Yes.

    • JW: Most specific things are at the bottom of the tree, right? Isn't it somewhat ambiguous which parameter is most specific?

    • TG: Somewhat yes, but it is ordered and there are rules (something, something).

    • TG showing typing tree

    • JW: We could take something like GAFF and make a typing tree from it.

    • TG: Might be difficult to do with the bit vectors, because GAFF types might have several subtle recursive things to consider.

    • CBy: What about chirality in your bit strings, would that be possible?

    • JW: There is a way of defining this based on SMILES logic that gets around the problems with the R/S formalism.

    • CBy: Can the infrastructure already handle something like that?

    • TG: Basically taking information/symbols from Chemper, would be possible I think

    • CB: Would be difficult, would need to change things in Chemper.

    • CBy: How far around an atom do you look for SMARTS definitions?

    • TG: Currently, atoms beyond those in the bond are not considered explicitly in the bits. It is hard to do, will do later.

    • TG: Once this works, I’d like to run this for a whole dataset. May lead to better force constants with Lee-Ping's code (geomeTRIC).

    • CBy: Would be good to do this on many molecules.

    • TG: Want to expand this to whole dataset

    • CBy: The difference between CH2 and CH3 in butane will only be visible when looking at many molecules, not butane alone. Can you group molecules together based on chemistry rules? One molecule alone may not be enough; too much overfitting.

    • TG: Use existing FF as a reference. Grouping based on bits. If there is a lot of chemistry in one parameter, there should be grouping in the data (something something)

    • JW: Parameter is also its position, not only pattern.

    • TG: Only join things adjacent to each other. Cannot really be done another way

    • JW: Let's say you make a FF out of 5 or 6 SMIRKS; hierarchy matters. Can your technology produce all permutations or only specific ones?

    • TG: It's on the list to consider permutations as well, yes.

    • CBy: Do you think we will have different minima in “chemistry space”? This could be an issue later, but should be kept in mind.

    • TG: Would be good to consider environment.

    • CBy: Are you still looking at gradient splitting?

    • TG: Still works, but computationally not cheap.

    • JW: It would be important to be able to load an existing FF and do things with it if we want to use it.

    • TG: Will produce lots of parameters. Will have to look at thresholds.

    • CB: Have you thought about using validation/training sets?

    • TG: Something I’m looking at, want to check individual splits on validation set.

    • CBy: Generally, you should soon start with groups of molecules. There is lots of science behind when to split.

    • TG: Will soon look into it.

    • CBy: You could look at things like the Akaike information criterion to penalize the number of parameters.

    • CB: Will check why charge does not show in patterns. Chirality will be more difficult though.

    • Next Meeting on Jan 8th, 2021
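
As an illustration of the Akaike information criterion (AIC) suggestion above, here is a minimal sketch of how a parameter split could be accepted or rejected. It is not Trevor's code: the function names `aic_least_squares` and `accept_split` are hypothetical, and it assumes a least-squares fit with independent Gaussian errors, for which the AIC reduces to n·ln(RSS/n) + 2k up to an additive constant (n data points, k fitted parameters).

```python
import math


def aic_least_squares(residuals, n_params):
    """AIC (up to an additive constant) for a least-squares fit.

    Assumes independent Gaussian errors. Requires a non-zero
    residual sum of squares, otherwise log(0) is undefined.
    """
    n = len(residuals)
    rss = sum(r * r for r in residuals)
    return n * math.log(rss / n) + 2 * n_params


def accept_split(res_before, k_before, res_after, k_after):
    """Hypothetical rule: keep a split only if it lowers the AIC."""
    return aic_least_squares(res_after, k_after) < aic_least_squares(res_before, k_before)


# Toy residuals (made-up numbers, not from any real fit).
before = [0.5, -0.5, 0.5, -0.5]        # one parameter
big_gain = [0.1, -0.1, 0.1, -0.1]      # after split, two parameters
small_gain = [0.49, -0.49, 0.49, -0.49]

split_helps = accept_split(before, 1, big_gain, 2)
split_marginal = accept_split(before, 1, small_gain, 2)
```

With these toy numbers the large improvement justifies the extra parameter while the marginal one does not, which is the kind of threshold behavior discussed for limiting overfitting. Evaluating the residuals on a validation set rather than the training set would be a separate check.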









Action items

Decisions