2021-09-24 WBO/Impropers meeting notes

Date

Sep 24, 2021

Participants

  • @Pavan Behara

  • @Christopher Bayly

  • @David Mobley

  • @Simon Boothroyd

  • @Jessica Maat (Deactivated)

  • @Lily Wang

Goals

  •  

Discussion topics

Time

Item

Presenter

Notes


 

 

@Jessica Maat (Deactivated)

  • JM: I couldn’t find any matches from the QCA sets I downloaded.

  • PB: I talked to SB this week and ran a similar search for the WBO fits, and I did find some instances.

  • DM: It might be a substructure search issue then. You two can sync up on that.

  • JM/PB: Sure.

  • DM: So, we wanted to see all available molecules in the datasets that match the biphenyl pattern with no ortho substituents, and the range of WBOs they span. For PB, we are looking to refit Sage 2.1 as knowledge transfer from SB, and to include one or two WBO-interpolated torsions as an experiment.

  • CB: I want to see the planarity along with the WBO and torsion barrier.

  • DM: Sounds like something we can do when we’re doing a pass of the datasets.

  • CB: Yeah, plot the torsion angle at that minimum as a heat map: WBO on the x-axis, torsion angle on the y-axis, and color for the density (number of molecule matches).

  • DM: JM can make the heatmap CB was proposing, and PB can try out the fitting experiments.

  • JM: Okay, sure.

  • CB: So, this includes the r5-r6 rings as well, right? Let’s focus only on the r6-r6 rings.
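The heat map CB describes amounts to a 2D histogram over (WBO, torsion angle) pairs. A minimal sketch of the counting step follows, assuming each matched molecule has already been reduced to one WBO and one minimum-energy torsion angle. The function names, bin ranges, and the folding of angles into 0-90 degrees (deviation from planarity) are illustrative assumptions, not something decided in the meeting.

```python
def bin_index(value, lo, hi, n_bins):
    """Return the bin index of value within [lo, hi], or None if out of range."""
    if value < lo or value > hi:
        return None
    width = (hi - lo) / n_bins
    # The upper edge is folded into the last bin.
    return min(int((value - lo) / width), n_bins - 1)


def density_grid(records, wbo_range=(0.8, 1.6), angle_range=(0.0, 90.0),
                 n_wbo=8, n_angle=9):
    """Count molecules per (angle bin, WBO bin) cell.

    records: iterable of (wbo, torsion_angle_deg) pairs, one per matched molecule.
    Returns grid[row][col] with row = angle bin (y-axis), col = WBO bin (x-axis).
    The default ranges and bin counts are placeholders (assumptions).
    """
    grid = [[0] * n_wbo for _ in range(n_angle)]
    for wbo, angle in records:
        # Assumption: fold the angle into 0-90 degrees so that 0 means planar.
        folded = abs(angle) % 180.0
        folded = min(folded, 180.0 - folded)
        col = bin_index(wbo, wbo_range[0], wbo_range[1], n_wbo)
        row = bin_index(folded, angle_range[0], angle_range[1], n_angle)
        if row is not None and col is not None:
            grid[row][col] += 1
    return grid
```

The resulting grid can be passed to any plotting tool as an image, with the cell counts as the color scale. Restricting the input to r6-r6 matches would happen before this step, in the substructure search.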





@Pavan Behara

  • PB: I did a fit with an interpolated parameter, and it still doesn’t look that great.

  • DM: But in all your plots the interpolated fit works better.

  • CB: One conclusion from the distribution of WBOs is that our interpolated parameter may not be good and we should forget about it. Another conclusion is that this industry benchmark data is not representative of the chemical space we are considering. Take the JACS 15 dataset, the beta-secretase case: all those analogs are biphenyls made with the same kind of Suzuki coupling, and at least three quarters of them have no ortho substituents. So this experiment tells us that we are looking at a bad dataset that doesn’t cover much.
    We should also ask whether the problem is with AM1 or with the fitting procedure, and whether the fit was done correctly.

  • PB: Okay, will recheck the fitting process.

  • DM: I think we should still care about it, since we see a difference between the interpolated and general fits.

  • SB: Yeah. I agree with CB, we should check if the results have some noise due to bad fitting.

  • DM: Yeah, also we should check if we need sigmoidal interpolation.

  • CB: Maybe we have a lot of data accumulated around 1; we can bin it. Also, the r5-r6 case would be different from r6-r6.

  • SB: Yeah, I agree.

  • DM: Yeah, clean the data and do a refit.

  • PB: Will redo the fits after checking the outliers and binning the training data.
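To make the options in this discussion concrete, here is a rough sketch of linear WBO interpolation of a torsion force constant, a sigmoidal alternative of the kind DM mentions, and one possible reading of "binning" the training data (capping how many samples each WBO bin contributes so that the cluster near 1 does not dominate). All names are hypothetical. The sigmoid's functional form and the bin width, midpoint, steepness, and cap values are assumptions for illustration, not the procedure used in the fits.

```python
import math


def linear_k(wbo, k1, k2):
    """Linearly interpolate the force constant between bond orders 1 and 2."""
    return k1 + (k2 - k1) * (wbo - 1.0)


def sigmoid_k(wbo, k_low, k_high, midpoint=1.5, steepness=10.0):
    """Hypothetical sigmoidal form: saturates at k_low and k_high.

    midpoint and steepness are illustrative defaults, not fitted values.
    """
    return k_low + (k_high - k_low) / (1.0 + math.exp(-steepness * (wbo - midpoint)))


def cap_per_bin(samples, bin_width=0.05, max_per_bin=2):
    """Keep at most max_per_bin samples in each WBO bin.

    samples: list of (wbo, label) pairs. Samples are visited in sorted order,
    so the lowest-WBO entries in each bin are the ones kept.
    """
    kept = []
    counts = {}
    for wbo, label in sorted(samples):
        key = math.floor(wbo / bin_width)
        if counts.get(key, 0) < max_per_bin:
            counts[key] = counts.get(key, 0) + 1
            kept.append((wbo, label))
    return kept
```

Comparing fits with `linear_k` and `sigmoid_k` on the same capped training set would be one way to separate the effect of the interpolation form from the effect of the data density around 1.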

 

 

 

  • SB: Let’s map out the WBOs for drug-like molecules from the QCA datasets.

  • DM: JM, you can take the code from PB, look at David Hahn’s PL Benchmarks, and find the distributions of WBOs.

  • CB: You can look at the pharmaceutically relevant molecules from the industry benchmark as well as ChEMBL. That would be a great resource.

  • DM: JM did some work with eMolecules before; that’s also a good source.
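Once WBOs have been computed for each source, the comparison of distributions could start from a simple per-source summary like the sketch below. It assumes the WBOs are already available as plain lists of floats keyed by a source name. The function names and the "near single bond" window of 0.95 to 1.05 are hypothetical choices, not values agreed in the meeting.

```python
def summarize_wbos(wbos, near_single=(0.95, 1.05)):
    """Summarize one list of WBO values: count, range, and share near 1."""
    if not wbos:
        raise ValueError("need at least one WBO value")
    lo, hi = near_single
    n_near = sum(1 for w in wbos if lo <= w <= hi)
    return {
        "n": len(wbos),
        "min": min(wbos),
        "max": max(wbos),
        "span": max(wbos) - min(wbos),
        # Fraction of values inside the (assumed) single-bond window.
        "frac_near_single": n_near / len(wbos),
    }


def compare_sources(wbos_by_source):
    """Apply summarize_wbos to each source, e.g. one entry per dataset."""
    return {name: summarize_wbos(values) for name, values in wbos_by_source.items()}
```

A source whose span is narrow and whose fraction near 1 is high would be a weak test of an interpolated parameter, which is the concern CB raised about the benchmark data.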

Action items

Decisions