| | | |
---|
15-20 minutes | Gradient-informed type-splitting | @Josh Fass | CCB – If that were combined with Chemper-proposed patterns, you might get more reasonable guesses at types.
DM – GB parameters are a nice toy system, but I’ll be more convinced by bonds. Though I understand that single-atom parameters are easier, since there are fewer degrees of freedom.
CIB – I like that this is a good proof of principle. It’s also good that the current direction will hit the “complexity explosion” and naturally need to integrate Chemper to constrain the space of possible SMIRKS elaborations.
DM – TG and I have been looking at separating into Gaussians, and using molecules which fall in the non-overlapping parts of the Gaussians to develop types.
CCB – I think Chemper could handle problems at this scale.
CIB – Maybe we could test these proposed solutions by seeing whether they could reproduce HJ’s work/decisions over the last year.
Slides: |
15-20 minutes
| Physics-based typing with Atoms-in-Molecules and Gaussian mixtures | @Tobias Huefner | CIB – I saw Bader talk a few times about his invention of the AIM approach. One interesting thing is the decomposition scheme. You’re trying to add value by using quantum descriptors instead of cheminformatics descriptors. We did the same sort of thing with WBOs (filling in a shortfall of cheminformatics representations using QM information). You’re also looking to bridge the physics and chemistry using clustering schemes. Having tried to do this sort of thing before, I’ve found lots of pitfalls. Could we find a simple toy system for this sort of thing? Maybe JF’s set?
TH – Agree. I was planning to start with LJ type distinction, but I could do GB or charge like JF.
JW – Shared test sets and benchmarking infrastructure?
|
15-20 minutes | Atom typing using set theory | @Trevor Gokey | CIB – I like this direction. Three questions/comments:
You are going to have to make a representation of bonds. One key advance that we want to bring into the FF is to move away from integer bond orders toward widespread use of WBOs. Floating-point values will be hard to represent in a bit vector.
TG – There’s a bit vector for the bond as well.
DM – We would plan for more of these to be replaced by WBOs.
CB – Is there a representation for “any” bond order?
TG – Yes: the bond bit vector “11111…”.
How will you handle an atom which is described by what it’s bound to? SMARTS can be recursive, which could increase complexity.
Bit vectors overdefine things which are mutually exclusive: an atom can be X1, X2, X3, or X4, but not more than one of them. Will this affect the representation? Is this unnecessarily complicated/high-dimensional?
CIB – How will you analyze the data? There are several methods you could choose from. With the data in bit vectors, you could use some grouping scheme, like a random forest, to figure out what is common among them. A decision tree (DT) gives a form of feature selection: the recurring themes in the tree.
Action items postponed to next week.
|
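
To make the bit-vector points from the set-theory discussion concrete, here is a minimal sketch of how allowed values could be stored as bits and compared with set operations. The field layouts (`BOND_ORDERS`, `CONNECTIVITY`) and the helper names (`encode`, `decode`, `any_of`, `matches`) are assumptions made up for illustration; they are not the encoding TG presented.

```python
# Hypothetical layout: one bit per allowed value. This ordering is an
# assumption for illustration, not the encoding discussed in the meeting.
BOND_ORDERS = ("single", "double", "triple", "aromatic")
CONNECTIVITY = ("X1", "X2", "X3", "X4")


def encode(allowed, universe):
    """Return an int whose set bits mark which values of `universe` are allowed."""
    bits = 0
    for value in allowed:
        bits |= 1 << universe.index(value)
    return bits


def decode(bits, universe):
    """Return the set of values whose bits are set."""
    return {value for i, value in enumerate(universe) if (bits >> i) & 1}


def any_of(universe):
    """All bits set: the wildcard that accepts every value ("1111...")."""
    return (1 << len(universe)) - 1


def matches(pattern, concrete):
    """True if every bit set in `concrete` is also set in `pattern` (subset test)."""
    return (concrete & ~pattern) == 0


single = encode(["single"], BOND_ORDERS)
double = encode(["double"], BOND_ORDERS)
single_or_double = single | double   # set union is bitwise OR
neither = single & double            # set intersection is bitwise AND (empty here)
wildcard = any_of(BOND_ORDERS)

# A concrete atom has exactly one connectivity, so its field is one-hot:
# the X1..X4 bits are mutually exclusive, which is the redundancy CIB raised.
x3 = encode(["X3"], CONNECTIVITY)
is_one_hot = x3 != 0 and (x3 & (x3 - 1)) == 0
```

In this sketch a pattern such as `single_or_double` matches any concrete bond whose bits are a subset of its own, and the all-ones vector plays the role of "any" bond order. A continuous quantity such as a WBO does not fit this scheme directly; the sketch does not attempt to represent one.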