2024-08-12 Chemical Perception meeting notes

Participants

@Brent Westbrook
@David Mobley
@Alexandra McIsaac
@Chapin Cavender
@Lily Wang
@Trevor Gokey

Goals

Discussion topics

Item	Presenter	Notes
		Slides will be uploaded.
		TG: examples in besmarts broke recently and are still broken, but should be fixed very soon (within hours after this meeting).
		TG: number of bonds and angles is higher than in Sage, but of the same order of magnitude.
		TG: with torsions, splitting on periodicities finished, but splitting on k values did not. Got to ~4k torsions. Reduced chi values indicate we could be overfitting the data.
		TG: split n finished with 400 torsions; split k split them further, to 4k.
		LW: so Sage has fewer torsions but is more overfit than Split n?
		TG: gives a possible explanation ~5 min into the recording. It may be because the periodicities of Sage are re-set and a threshold was set at 5 (for k?), so many torsions of the Sage set may just go to 5.
		TG: BES is the Split n set, BESv2 is the Split k set.
		DM: are you training and testing on the same data, or is it transferable?
		TG: I am fitting to Gen 2 and looking at performance on that as well. I may look at splitting up the Gen 2 dataset to do some cross-validation.
		DM: if training and testing on the same set, you may wind up overfitting without seeing it.
		DM: could fit to Hessians and test on other benchmarks.
		TG: some issues; the Industry benchmark set has more sulfonamides than the training set, for example.
		LW: the XFF 20% dataset has reasonable coverage of ChEMBL.
		TG: MSM requires Hessians, so we should still be generating them.
		TG: on fitting protocol, what do people think of replacing torsion drives with ab initio targets?
		LW: …
		AMI: my experiments involved replacing all targets with ab initio targets; saw worse performance, but I didn't use torsiondrive (TD) data.
		BW: I ran some experiments with smee, one that broadly repeated AMI's experiment (without TD data) and one with TD data. Saw generally worse performance, but improvements with TD.
		CC: my experience is that using ab initio targets instead of TD gave worse results. I tried switching to a pairwise energy target and that appeared to give improved performance.
		TG: what was the relative weight compared to other objectives?
		CC: … (~48 min into recording)
		TG: sounds like I could get away with single points?
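
Sketch for the cross-validation point raised by DM and TG: one possible way to split a fitting dataset into train/test folds so that overfitting shows up as a gap between training and held-out error. This is only an illustration, not the protocol anyone in the meeting used. The function name, the `(molecule_id, record)` layout, and the choice to keep all records of one molecule in the same fold are assumptions made here.

```python
import random
from collections import defaultdict


def k_fold_by_molecule(records, k=5, seed=0):
    """Yield (train, test) record lists for k folds.

    `records` is an iterable of (molecule_id, record) pairs (hypothetical
    layout). All records of a molecule land in the same fold, so a molecule
    never appears in both the training and the held-out set.
    """
    # Group records by the molecule they belong to.
    groups = defaultdict(list)
    for molecule_id, record in records:
        groups[molecule_id].append(record)

    # Shuffle molecule IDs reproducibly, then deal them into k folds.
    molecule_ids = sorted(groups)
    random.Random(seed).shuffle(molecule_ids)
    folds = [molecule_ids[i::k] for i in range(k)]

    for i, held_out in enumerate(folds):
        test = [r for m in held_out for r in groups[m]]
        train = [
            r
            for j, fold in enumerate(folds)
            if j != i
            for m in fold
            for r in groups[m]
        ]
        yield train, test
```

Each fold would then be fit on `train` and scored on `test`; a held-out objective that is much worse than the training objective would be the overfitting signal DM describes.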

Action items

Decisions