2022-08-01 Meeting notes

 Date

Aug 1, 2022

 Participants

  • @Joshua Horton

  • @Pavan Behara

  • @David Mobley

  • @Daniel Cole

  • Venkata

  • @Jeffrey Wagner

  • @Matt Thompson

 Discussion topics

Time: 20 mins

Item: DEXP fitting results

Presenter: @Pavan Behara @Joshua Horton

Notes:

I have done the valence parameter fits with a reduced set of targets; here are the benchmarks for the dexp fit and a Sage refit of the same set of valence parameters, with the same training targets. The benchmarks are on a subset of the industry benchmark set that excludes S, P, F, and I, which brings the number of molecules down to around 37K from the original 73K set. The dexp fit and the Sage refit almost overlap with each other, but surprisingly vanilla Sage is still better than both. I uploaded the ForceBalance inputs and outputs, and the benchmarks, to the dexp repo created by Josh.
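
For reference, a minimal sketch of the element filter described above, assuming the openff-toolkit Molecule API; the SMILES strings here are placeholders, not the actual industry benchmark inputs (the real filtering lives with the ForceBalance inputs on the dexp repo).

    from openff.toolkit import Molecule

    # Atomic numbers of the excluded elements: S, P, F, I.
    EXCLUDED = {16, 15, 9, 53}

    def keep(smiles: str) -> bool:
        """Return True if the molecule contains none of the excluded elements."""
        mol = Molecule.from_smiles(smiles, allow_undefined_stereo=True)
        return not any(atom.atomic_number in EXCLUDED for atom in mol.atoms)

    # Placeholder inputs to show the behaviour; the thiol and the fluorobenzene are dropped.
    smiles_list = ["CCO", "CCS", "c1ccccc1F"]
    filtered = [s for s in smiles_list if keep(s)]
    print(filtered)  # ['CCO']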

  • DM – I think that seeing equivalent results is a really good thing. The previous force fields have had a long time to find a good optimum in parameter space, so the fact that this is getting equivalent results is great.

    • DC – Agree. Great. What is the training set for retraining?

    • PB – Sage training set, missing the S+P+F+I molecules. Then tested on the industry benchmark sets.

    • DC – Ok, I see how the test results could get worse

    • JH – This is also why the dE charts look similar to before: it’s the energies that are being fit to.

  • DC – Could also do a fit of the 1-4 scaling factor?

    • JH – I’ve tested refitting the 1-4 scaling factor for one molecule before, so the question now is whether this scales. (A rough sketch of a global scale14 change is included below, after the discussion notes.)

    • PB – Sure, I can do that.

    • JH – I’ll get in contact about this.

  • MT (chat) – Presumably this is a global 1-4 factor, not tuning different factors for different torsions?

    • DC – Yes

  • DC – After seeing the data, JH and I had a chat about further tests, but I think that knowing the training/test sets were different might rule out some of our ideas for follow-up experiments.

    • JH – It may be interesting to run torsion drives of sterically dominated torsions.

    • PB – I’d looked at sterically driven torsions before and made a way to decompose the energies, so you could use that set.

    • JH + DC – Could look at JACS fragments, using same metrics as bespoke paper.

    • JW – There’s some available capacity for QC compute. Feel free to submit new sets there if you have molecules in mind.

  • DC – JH, it’d be great to run the standard Sage physical property test set.

    • JH – Running an experiment with 10 hydration free energies from the Sage benchmarking set; if that works well I’ll run all ~500.

    • JH – Do we have Sage benchmarking results stored anywhere?

    • PB – OMadin would know this.

    • JH – Great. I’ll talk to him. In addition to HFEs there are also transfer free energies.

    • JH – Also, as a reminder, we made a DEXP TIP4P model starting from TIP4P-FB. This helped us set the global alpha and beta(?). Then I reran the water training to get the other parameters right. (The double-exponential form and where alpha and beta enter it are sketched below, after the discussion notes.)

    • DC – Yeah, I’d recommend getting pure water properties right before moving on. This will avoid criticism later on.

    • DC – So, basically, take alpha and beta from sage fit, and then fit the remaining parameters during water fitting

    • JH – This seems complex: alternating between optimizations of pure water and the whole training set would mean a lot of manual iteration. Wouldn’t it be better to train everything together?

    • DC – Let’s try training just the water and see how it looks after an optimization.

  • PB – Why move from TIP3P to TIP4P-FB/TIP4P-DEXP?

    • JH – There’s good training data in ForceBalance for the TIP4P model.

    • DC – Because we’re using a new functional form, we definitely need to retrain the water model, the question is just what we want it to look like.

  • DC – Re: the BespokeFit paper, I’m heading on holiday in about a week and a half, so we’ll plan to circulate a draft before then, likely the beginning of next week.

  • DM – PB, I, and others realized that when we generated the Sage training data, it was meant to be an EXPANSION of the Parsley training data, but it looks like the way we actually used it was as a REPLACEMENT for the Parsley training data. So we expect that the FF may improve further if we refit it to both datasets.

    • DC – That makes sense. I’m keen to use whatever standard is out there for our work.

  • JW – The bespokefit connectivity-change PR is getting near ready; I’m meeting with JMitchell this afternoon. Could we tag you as a reviewer when it’s good to go, JHorton?

    • JH – Yup, I’ll have a lot of time for this once the paper draft is out.
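
As mentioned in the 1-4 scaling discussion above, a rough sketch of what a single global scale14 change might look like via the openff-toolkit ForceField API. The 0.6 value is a placeholder, not a fitted number, and this is only illustrative; an actual refit would go through ForceBalance rather than a hand-edited offxml.

    from openff.toolkit import ForceField

    # Load Sage 2.0.0 and change the global 1-4 Lennard-Jones scaling (placeholder value).
    ff = ForceField("openff-2.0.0.offxml")
    ff["vdW"].scale14 = 0.6  # placeholder, not a fitted value
    # Electrostatics has an analogous scale14 attribute, but support for non-default
    # values may vary between toolkit versions.
    ff.to_file("dexp-scale14-test.offxml")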

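On the “global alpha and beta” point above: a sketch of the double-exponential pair potential as it is usually written, with alpha and beta shared globally and epsilon / r_min fit per atom type (analogous to the LJ well depth and minimum). This is my reading of the form; the dexp repo is the authoritative source for the exact expression, and the numbers below are placeholders.

    import math

    def dexp_energy(r, epsilon, r_min, alpha, beta):
        """Double-exponential pair energy with a minimum of depth -epsilon at r = r_min."""
        x = r / r_min
        return epsilon / (alpha - beta) * (
            beta * math.exp(alpha * (1.0 - x)) - alpha * math.exp(beta * (1.0 - x))
        )

    # Sanity check at the minimum (placeholder parameters): prints approximately -0.2.
    print(dexp_energy(1.5, epsilon=0.2, r_min=1.5, alpha=16.0, beta=4.5))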

 Action items

 Decisions