WW: I am splitting on four atom types. I was not sure whether that is fine, and I think TG’s work will answer that.
TG: Slides
MG: What are “Mean” and “Fit” on slide 5?
TG: WW gives me per-atom polarizabilities. I group them with my code and give back SMARTS patterns for all molecules. She then refits using those specific SMARTS groups, and the “Fit” column shows the refit values for those patterns.
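Note: the grouping step described above can be pictured with a minimal sketch. This is not TG's code. The function name `mean_by_pattern`, the SMARTS labels, and the numbers are all placeholders chosen for illustration, and the sketch only averages per-atom values within each pattern group, which is one plausible summary before a refit.

```python
from collections import defaultdict
from statistics import mean


def mean_by_pattern(atom_records):
    """Average per-atom polarizabilities within each pattern group.

    atom_records: iterable of (pattern_label, polarizability) pairs.
    The labels are treated as opaque strings; no SMARTS matching is done here.
    """
    groups = defaultdict(list)
    for label, alpha in atom_records:
        groups[label].append(alpha)
    return {label: mean(values) for label, values in groups.items()}


# Placeholder values only, not results from the meeting.
records = [
    ("[#6X4]", 1.0),
    ("[#6X4]", 1.2),
    ("[#8X2H1]", 0.6),
    ("[#1]", 0.4),
    ("[#1]", 0.5),
]
means = mean_by_pattern(records)
for label, value in sorted(means.items()):
    print(f"{label}: {value:.3f}")
```

In a real workflow the labels would come from matching each atom against the generated SMARTS patterns, and the group values would be starting points for the refit rather than final parameters.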
CaB: Do you think the alcohols are overshadowed by the carbonyls, or anything like that?
TG: I didn’t look at the molecules at first; I just ran my code and it seems to work. I can dig deeper now.
CaB: I think it would be great to see it start from element-level information and work upwards.
TG: Another thing WW is planning: instead of giving me polarizabilities per atom, she will give them per molecule, per atom, per electric field.
WW: We are looking at 10 molecules, but there are 400+ ESPs. Some of the refit values overlap with each other.
MG: Is it possible that the fit was typed by something else, since the values look very similar?
WW: I can recheck.
CaB: I see a few general parameters are low in the order, is that right?
TG: It is in the right order.
CaB: Does the same hierarchy apply to a new dataset?
TG: It applies only to this training data.
WW: I have 39 more molecules with many more ESPs.
CaB: For a proof of concept this looks great!
TG: Sure, 40 seems small but I think it would be a good start.
WW: Previously we were using a longer fitting procedure but I have optimized it now, and it takes less time.
TG: So if it’s minutes vs. hours, that’s great!
TG: I never got negative polarizabilities, so I think it’s doing well.
WW: If I use all 17 LJ types I see negative polarizabilities, and then I had to use the Psi4 workflow. With TG’s parameters I no longer see negative polarizabilities.
TG: Another point: I used a depth of 2, meaning a pattern can branch out up to two hops from the atom. That is why the splits might be too specific and there are more parameters. We can reduce the depth to get more general parameters, similar to LJ.
TG: There’s also a second stage of refitting BCCs, which may not jibe well with these SMARTS patterns.
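Note: a minimal sketch of what a depth limit of 2 means on a molecular graph. This is an illustration, not TG's implementation. The function name `atoms_within_hops` and the adjacency-dictionary representation are assumptions made for the example.

```python
from collections import deque


def atoms_within_hops(adjacency, start, max_hops):
    """Breadth-first search returning {atom_index: hop_distance}
    for every atom reachable from `start` in at most `max_hops` bonds."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        if distance[atom] == max_hops:
            continue  # do not expand beyond the depth limit
        for neighbor in adjacency[atom]:
            if neighbor not in distance:
                distance[neighbor] = distance[atom] + 1
                queue.append(neighbor)
    return distance


# Ethanol heavy atoms plus the hydroxyl hydrogen: C0-C1-O2-H3
ethanol = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

env_depth2 = atoms_within_hops(ethanol, start=0, max_hops=2)
env_depth1 = atoms_within_hops(ethanol, start=0, max_hops=1)
print(sorted(env_depth2))
print(sorted(env_depth1))
```

A smaller `max_hops` gives each atom a smaller environment, so fewer distinct environments exist and the resulting parameters are more general.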
MG: They can be compared with liquid-phase polarizabilities derived from refractive index, etc.
WW: Yeah, I can.
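Note: one standard way to get a liquid-phase molecular polarizability from a refractive index is the Lorentz-Lorenz relation, alpha = 3/(4 pi N) * (n^2 - 1)/(n^2 + 2), with N the number density (Gaussian units, giving a polarizability volume). The sketch below is an assumption about how such a comparison could be set up, not a procedure agreed in the meeting. The function name is hypothetical and the water inputs are approximate room-temperature values.

```python
import math

AVOGADRO = 6.02214076e23  # 1/mol


def lorentz_lorenz_polarizability(refractive_index, density_g_cm3, molar_mass_g_mol):
    """Return the molecular polarizability volume in cubic angstroms
    from the Lorentz-Lorenz relation."""
    # Number density in molecules per cm^3
    number_density = density_g_cm3 / molar_mass_g_mol * AVOGADRO
    n2 = refractive_index ** 2
    alpha_cm3 = 3.0 / (4.0 * math.pi * number_density) * (n2 - 1.0) / (n2 + 2.0)
    return alpha_cm3 * 1e24  # 1 cm^3 = 1e24 cubic angstroms


# Approximate inputs for liquid water near room temperature
alpha_water = lorentz_lorenz_polarizability(1.333, 0.997, 18.015)
print(f"{alpha_water:.2f} A^3")
```

A molecular value obtained this way could be compared against the sum of the fitted per-atom polarizabilities for the same molecule.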
MG: I don’t know that I have a strong signal to go in that direction; instead we can move ahead, and this looks like an elegant approach to use.
TG: Yep, I have the code to do the work, so it is up to you and WW how to make use of it.
TG: Recently the AMOEBA people reported a new model based on SMARTS; it’s worth checking out.
TG: I will release this code soon so anyone can use it for splitting; it may even be useful to Tobias for his work.
WW: I think I will use the refit polarizabilities to test how good the new SMARTS set is, then we can do the same on a larger training set. Does that sound good?
TG: Yeah, I am looking for a larger training data set, and it would be great if you could test these and let me know whether they are good. These were overfit, though, so I am not sure what your analysis will show, but in the general case we should see improvements.