2025-09-10 Cole Lab Check-In Meeting Notes

Participants

  1. @Finlay Clark

  2. @Daniel Cole

  3. @David Mobley

  4. @Jeffrey Wagner

  5. @Chapin Cavender

  6. @Matt Thompson

  7. Alice Allen

  8. Julia Rice

  9. Jennifer Clark

Slides

https://newcastle-my.sharepoint.com/:p:/g/personal/nfc78_newcastle_ac_uk/EYApx9ofrBxPsQjBqLfazHoBQs4OLGl8_qNf2kjOzRvg5g?e=xZyZvh

Recording

Recap: (Open) Cole Lab/OpenFF Check-In 10 September | Meeting | Microsoft Teams

Discussion topics

Item

Notes

Side chain analogues data

DC: Re conversation on Slack - data looks pretty good, modulo maybe some things like Trp that aren’t well represented. This could be a cool mini-project, and we might start working on this if nobody else does. Further requests while they’re talking to NIST.

JW – No feedback, apologies for not being able to speak on behalf of our organization.

 

SMEE-Valence Fits

  • FC will post slides here

  • DC (6) – Has Sage seen these mols before?

    • FC – Benchmarked on it, but not trained. Possibly at the edge of the domain of applicability given 10% of mols in dataset weren’t parameterizable

  • CC – When they ran Sage in the paper, did they specify how they assigned charges?

    • FC – Unsure. Would need to check. But failures weren’t due to charge assignment, rather missing SMIRKS.

  • JW (11) – So, ….

    • FC – Right, each mol generally has 1 torsion scanned, and the column/series that the point goes into is determined by the LEAST specific torsion parameter running through the torsiondrive

  • AA (12) – Have you looked at the variance in the parameters themselves?

    • FC – No, I’ll look into that. I was hoping to avoid parameters cancelling out due to relative energies.

  • CC (12) – Would this metric correlate with barrier height? It could be helpful to normalize ensemble RMSE by barrier height.

    • FC – Good idea

  • DC – Jensen-Shannon distance?

    • FC – I looked and got the same qualitative distribution

  • DC (19) – You’re saying the alkene barriers are still lower than QM?

    • FC – Yes

    • DC – OK, then do go ahead with the plan to fix them.

  • MT – Explain why we need ring types to get non-ring torsions right?

    • FC – When we didn’t separate this out, things in the training set would “contaminate” the angle. sys_bat has about 104 degrees, which is better but not as good as 106.

    • MT – This seems like a good candidate for splitting.

  • CC (20, chat) – I think Trevor's BESMARTS work also found that splitting off ring parameters helps non-ring parameters.

    • JAC – Lily mentioned that Brent did some work to also show that identifying ring atoms is important 

  • DC – Feel free to send data separately to convince me that we don’t need to look at alkenes. But in the meantime do write it up and turn to torsion angle coupling. Then OpenFF can take the type specificity stuff from there.

  • AA – Do you have a strategy for regularization?

    • FC – No strategy/plans. We seem to be getting a lot of parameter drift, so regularizing to 0 might help, or early stopping once loss flattens off. The fitting does run with a training+test set, and the loss of each is evaluated at every step.

    • DC – The prior fits we did with regularization ended up being pretty rubbish.

    • FC – Maybe regularizing to MSM vals would be better
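
A minimal sketch of the two ideas raised in the regularization discussion above: a harmonic penalty pulling parameters toward reference values (zeros, or prior values such as the MSM ones) and early stopping once the test loss flattens off. Everything here is an assumption for illustration; the function names and the `weight`, `patience`, and `min_delta` arguments are hypothetical and are not taken from descent-workflow or smee, and the notes do not say how the actual fits build their loss.

```python
from typing import Sequence


def l2_penalty(
    params: Sequence[float], reference: Sequence[float], weight: float
) -> float:
    """Harmonic penalty pulling each parameter toward a reference value.

    A reference of all zeros gives plain regularization toward 0; a reference
    of prior values (hypothetically, MSM-derived ones) regularizes toward
    that prior instead.
    """
    if len(params) != len(reference):
        raise ValueError("params and reference must have the same length")
    return weight * sum((p - r) ** 2 for p, r in zip(params, reference))


def should_stop(
    test_losses: Sequence[float], patience: int = 5, min_delta: float = 1e-4
) -> bool:
    """Return True once the test loss has flattened off.

    "Flattened" is taken here to mean: the best loss in the last `patience`
    steps is not lower than the best earlier loss by more than `min_delta`.
    """
    if len(test_losses) <= patience:
        return False  # not enough history to judge yet
    best_before = min(test_losses[:-patience])
    best_recent = min(test_losses[-patience:])
    return best_recent > best_before - min_delta
```

In a fit that already evaluates the test loss at every step, `should_stop` could be called on the running list of test losses after each step, and `l2_penalty` added to the training loss. The penalty weight would need tuning, given that earlier regularized fits reportedly performed poorly.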

Posting descent-workflow on socials


Finlay - I mentioned your cool descent-workflow repo at the staff meeting (since it's been really helpful for us learning to do smee fits/team cross-training in general, and a real win for open science). Folks at OMSF want to shout it out:

if you are pinging Finlay, see if he'd be down for a quick shoutout on OMSF socials

And they want to list it in the workflow directory they're populating:

Maybe Finlay’s repo can go to the workflow directory?

(Ethan Holz) Sounds good, if he wants to chat with me about the directory feel free to give him my email.

On one hand I definitely want you to get credit if you want it, but on the other I don't know if you want to risk randos coming by the issue tracker and asking for support on the repo/workflow. So I told them to hold off on mentioning/listing it until I check with you. 

 

FC – Workflow doesn’t run end-to-end yet, so I wouldn’t want it posted yet in workflow listing. Also, much of the code is copied from Simon and Josh H, and I’d like to acknowledge them. So please hold off until I have a chance to clean this up.

DC – Agree.

JW – Great, I’ll expect to hear from you in a few weeks when it’s ready, and if you don’t want it posted then just never give me the go-ahead.