2021-07-21 BespokeFit meeting notes

Date

Jul 19, 2021

Participants

  • @Joshua Horton

  • @Mateusz Bieniek

  • @David Mobley

  • @Daniel Cole

  • venkat a

  • @Pavan Behara

  • @Chris Ringrose

  • @Simon Boothroyd

  • @Jeffrey Wagner

Goals

Discussion topics

Time

Item

Presenter

Notes

15 mins

Bespokefit smirks bug

@Joshua Horton

  • https://openforcefield.atlassian.net/wiki/spaces/FF/pages/2017296385

  • DC – Is there some way to check/raise an error if a torsion doesn’t map back to the parent?

    • JH – Yes, I plan to implement this.

  • SB – I was wondering if there’s a way to use the mapping to do this check?

    • JH – I’m using chemper to make cluster graphs/MCSS to map back to the parents.

    • SB – This sounds good.

    • JH – The fragmenter fragments should be able to map back to the parents. I think there’s an edge case in here, but I’m working on it.

  • AV – Since you’re mapping every fragment back to the parent, will this add significantly to computational cost?

    • JH – It’s pretty quick right now, but it may get more expensive as they get more complex.

    • DM – If we look at large libraries of molecules, or large polymers, then the computational cost may become significant

    • JH – The underlying cost here is MCSS using RDKit

  • AV – We’re running a large benchmark right now, so we’ll see whether this problem manifests. I should be able to share this, but the timeline is uncertain since someone else is running it.

  • DC – We’re planning on running the Schrodinger JACS set. Does that overlap with AV’s set?

    • AV – We’re looking at molecules that are neutral and ANI-compatible. I’m not sure whether there will be overlap.

    • DC – Did the ANI torsion scans complete?

    • AV – Yes, though I’ve noticed that the fitting sometimes doesn’t complete properly, and sometimes we have to rerun the geometry optimization. I’m not yet sure how to automatically detect the cases that require a re-run, so I’ll be looking into this.
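The check DC asked about (raise an error if a torsion doesn’t map back to the parent) could look roughly like the sketch below. This is not BespokeFit’s implementation: it swaps the chemper/RDKit MCSS step for a plain backtracking subgraph match on a toy graph representation (element list plus bond list), and the names `map_fragment_to_parent` and `torsion_in_parent` are hypothetical. It also ignores capping atoms, bond orders, and symmetry, which a real fragment mapping would have to handle.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy representation: (element per atom index, list of bonded index pairs).
Graph = Tuple[List[str], List[Tuple[int, int]]]


def _neighbors(bonds: List[Tuple[int, int]], n_atoms: int) -> Dict[int, set]:
    """Build an adjacency map from a bond list."""
    adj = {i: set() for i in range(n_atoms)}
    for a, b in bonds:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def map_fragment_to_parent(fragment: Graph, parent: Graph) -> Optional[Dict[int, int]]:
    """Return one fragment -> parent atom index map, or None if no match exists."""
    f_elems, f_bonds = fragment
    p_elems, p_bonds = parent
    f_adj = _neighbors(f_bonds, len(f_elems))
    p_adj = _neighbors(p_bonds, len(p_elems))

    def extend(mapping: Dict[int, int]) -> Optional[Dict[int, int]]:
        if len(mapping) == len(f_elems):
            return dict(mapping)
        f_atom = len(mapping)  # fragment atoms are assigned in index order
        for p_atom in range(len(p_elems)):
            if p_atom in mapping.values() or p_elems[p_atom] != f_elems[f_atom]:
                continue
            # Every already-mapped fragment neighbour must be bonded in the parent too.
            if all(mapping[n] in p_adj[p_atom] for n in f_adj[f_atom] if n in mapping):
                mapping[f_atom] = p_atom
                found = extend(mapping)
                if found is not None:
                    return found
                del mapping[f_atom]  # backtrack
        return None

    return extend({})


def torsion_in_parent(torsion: Tuple[int, int, int, int], fragment: Graph, parent: Graph):
    """Translate a fragment torsion to parent indices, raising if it cannot be mapped."""
    mapping = map_fragment_to_parent(fragment, parent)
    if mapping is None:
        raise ValueError("fragment does not map onto the parent molecule")
    mapped = tuple(mapping[i] for i in torsion)
    p_adj = _neighbors(parent[1], len(parent[0]))
    # Guard against torsions whose atoms are not consecutively bonded in the parent.
    for a, b in zip(mapped, mapped[1:]):
        if b not in p_adj[a]:
            raise ValueError(f"torsion {torsion} is not a bonded path in the parent")
    return mapped
```

With a heavy-atom parent `(["C", "C", "C", "C", "O"], [(0, 1), (1, 2), (2, 3), (3, 4)])` and fragment `(["C", "C", "C", "O"], [(0, 1), (1, 2), (2, 3)])`, the fragment torsion `(0, 1, 2, 3)` maps to `(1, 2, 3, 4)` in the parent. A fragment containing an element the parent lacks raises `ValueError`. The backtracking search is exponential in the worst case, which is the same scaling concern DM raised for large molecules.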

15 mins

Thoughts on Executor refactor

@Simon Boothroyd

  • SB – It’d probably be an improvement to use a more RESTful approach.

    • The executor class is particularly well-suited for breaking into microservices. Right now it uses a lot of multiprocessing internally, which makes it slow and complex. So I propose that we break it into a set of microservices with RESTful APIs.

    • Each microservice would have its own storage and backend, which should make things easier to record and debug.

    • JH – This is super cool. So you’d set up a kinda persistent server/set of servers?

      • SB – Yes

    • SB – The other part of the proposal is to use QCEngine directly instead of QCFractal. This is similar to my QCEngine PR #305.

    • JH – Would this be a slight duplication of effort? We’d need to run our own molecule database

      • SB – It would be slightly duplicative, but we’d have control in the ways that we need (for example, identifying molecules by SMILES instead of geometry)

    • Should this go into the initial bespokefit release?

      • (General) – This should make it simpler than going through a bunch of multiprocessing queues and a QCF snowflake

    • Should we put this through QCEngine or geopt?

      • QCEngine is kinda unmaintained/slow on PR reviews. So we should either use this as a plugin or put it into geopt
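To make the proposal concrete, here is a minimal sketch of what one such microservice might look like, using only the Python standard library. Everything in it is an assumption for illustration: the `/tasks` routes, the JSON fields, and the in-memory `TASKS` dict (standing in for the service’s own storage) are hypothetical and not taken from BespokeFit or SB’s design. It only shows the shape of the idea: a persistent server, its own store, and tasks identified by SMILES.

```python
import http.client
import json
import threading
import uuid
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# In-memory stand-in for the service's own storage backend (hypothetical schema).
TASKS = {}
TASKS_LOCK = threading.Lock()


class OptimizationHandler(BaseHTTPRequestHandler):
    """Hypothetical 'optimizer' service: POST /tasks, GET /tasks/<id>."""

    def _send(self, status, payload):
        body = json.dumps(payload).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_POST(self):
        if self.path != "/tasks":
            return self._send(404, {"error": "unknown route"})
        length = int(self.headers.get("Content-Length", 0))
        request = json.loads(self.rfile.read(length) or b"{}")
        if "smiles" not in request:
            return self._send(422, {"error": "smiles is required"})
        task = {"id": uuid.uuid4().hex, "smiles": request["smiles"], "status": "waiting"}
        with TASKS_LOCK:
            TASKS[task["id"]] = task
        self._send(201, task)

    def do_GET(self):
        task_id = self.path.rsplit("/", 1)[-1]
        with TASKS_LOCK:
            task = TASKS.get(task_id)
        if not self.path.startswith("/tasks/") or task is None:
            return self._send(404, {"error": "task not found"})
        self._send(200, task)

    def log_message(self, *args):  # keep the example quiet
        pass


def start_service(port=0):
    """Start the service on a background thread; port 0 picks a free port."""
    server = ThreadingHTTPServer(("127.0.0.1", port), OptimizationHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def demo():
    """Submit one task and read it back; returns (created, fetched)."""
    server = start_service()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
    try:
        conn.request("POST", "/tasks", body=json.dumps({"smiles": "CCO"}),
                     headers={"Content-Type": "application/json"})
        created = json.loads(conn.getresponse().read())
        conn.request("GET", f"/tasks/{created['id']}")
        fetched = json.loads(conn.getresponse().read())
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
    return created, fetched
```

Calling `demo()` posts a task for `CCO` and fetches it back by id. In a real deployment each stage (fragmentation, QC generation, optimization) would presumably be its own service with persistent storage, and a worker would advance `status` beyond `"waiting"`; none of that is modelled here.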

10 mins

Meaning of negative force constants

@Daniel Cole @Mateusz Bieniek

  • MB https://docs.google.com/presentation/d/1oBGYM95Fnb0spjVLmIZ4aHw14SGbP3YF-mRwbuQXNAM/edit#slide=id.ge5e6fa3bf3_0_0

  • DC – MB has downloaded data from QCArchive and is feeding them into espaloma. Things look promising so far, and I’ll talk about that separately. But we’re seeing negative bond k’s when fitting using the modified Seminario method, mostly in nitrogen-oxygen bonds.

  • DC – I’m not sure how to interpret this. I found a paper on this that explained that a negative force constant in a ring can actually improve geometries.

  • For example, in the molecule pictured (image not included here), the 2.2 Å bond in MM reached a length of 1.6 Å in QM.

  • DM – How severe is this? We’ll probably want to address this before we try to publish anything, and we can bring in Chris Bayly

  • PB – Did we use modified Seminario in fitting Sage?

    • SB – We looked into it, but didn’t end up using modified Seminario in Sage.

  • DC – SB, could new data + the espaloma stuff go into a new paper?

    • SB – We’re pretty bandwidth-constrained right now, but this seems like a good idea and you could go ahead with it.
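One way to frame the interpretation question is to write out the terms involved. The forms below are the standard harmonic bond term and the Seminario-style projection of the interatomic Hessian block onto the bond direction. They are textbook expressions added here as an assumption about what is being fitted, not something taken from MB’s slides.

```latex
% Harmonic bond term: k is the curvature of the bond energy at r_0
E_{\text{bond}}(r) = \frac{k}{2}\,(r - r_0)^2,
\qquad
\frac{d^2 E_{\text{bond}}}{dr^2} = k

% Seminario-style bond force constant from the 3x3 Hessian block between atoms A and B:
% \lambda_i^{AB}, \hat{v}_i^{AB} are its eigenvalues/eigenvectors (block taken with a minus sign),
% \hat{u}^{AB} is the unit vector along the A-B bond
k_{AB} = \sum_{i=1}^{3} \lambda_i^{AB}\,\bigl|\hat{u}^{AB} \cdot \hat{v}_i^{AB}\bigr|
```

Under these assumptions, a negative k means the bond term has a maximum rather than a minimum at r_0, so on its own it pushes the two atoms away from r_0 and the geometry is held in place only by the other terms (angles, torsions, nonbonded). In the projection, k_AB comes out negative when the eigenvalues that dominate along the bond direction are negative.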

Action items

Decisions