...

  • We will be less systematic in selecting the systems to include in the benchmark set. Instead, we aim to curate a diverse set of molecules with pure density and enthalpy of vaporization data points, and with binary enthalpy of mixing, excess molar volume, and binary mass density data points. Unlike for the training sets, a substance does not need to have all of these properties available to be included.

  • To test how well each of the produced force fields generalises, we initially aim to include binary mixtures of alcohols and alcohols, alcohols and esters (/acids), and esters (/acids) and esters (/acids).

    • This will likely be expanded to ethers and other additional moieties, but only after this initial set has been benchmarked against.

  • In an attempt to ensure that we are testing the performance of the refit parameters, rather than the full Parsley 1.0.0 force field, we will exclude any

    • aromatic compounds

    • compounds containing 3-4 membered rings.

    • compounds containing alkane chains greater than 6 atoms in length.

    Again, these exclusions will likely be relaxed in future benchmark sets.

  • This set will only contain mixtures in which neither component appears in the training set. Future data sets may then be complemented with mixtures that partially contain training data, to further explore interesting results highlighted by this initial set.
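The "neither component appears in the training set" rule can be sketched as a simple set filter. This is a minimal illustration only: the function names are hypothetical, components are assumed to be identified by canonical strings (for example canonical SMILES) so that plain string comparison is meaningful, and the example mixtures are made up rather than taken from the actual training or benchmark sets.

```python
from typing import FrozenSet, Iterable, List, Tuple

# A binary mixture is represented as a pair of component identifiers.
# ASSUMPTION: identifiers are canonicalised beforehand (e.g. canonical SMILES).
Mixture = Tuple[str, str]


def components_of(mixtures: Iterable[Mixture]) -> FrozenSet[str]:
    """Collect every component that appears in any of the given mixtures."""
    return frozenset(component for pair in mixtures for component in pair)


def count_trained_components(mixture: Mixture, trained: FrozenSet[str]) -> int:
    """Return how many of the two components (0, 1 or 2) were trained on."""
    return sum(1 for component in mixture if component in trained)


def select_unseen_mixtures(
    candidates: Iterable[Mixture], trained: FrozenSet[str]
) -> List[Mixture]:
    """Keep only the candidate mixtures with no component in the training set."""
    return [m for m in candidates if count_trained_components(m, trained) == 0]


# Illustrative (made-up) data: one training mixture and three candidates.
trained = components_of([("CCO", "COC(C)=O")])
candidates = [
    ("CCCO", "CCOC(C)=O"),  # neither component trained on -> kept
    ("CCO", "CCCO"),        # one component trained on -> dropped
    ("CCO", "COC(C)=O"),    # both components trained on -> dropped
]
unseen = select_unseen_mixtures(candidates, trained)
```

Keeping the overlap count as a separate function also makes it easy to build the follow-up sets mentioned above, which would keep mixtures with a count of exactly 1.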

Chosen Alcohol-Ester

...

Study Benchmark Set

The results shown on this page were generated against a data set which contained

  • ~110 pure data points (~60:40 split of $\rho(pure)$ and $H_{vap}$, ambient conditions) where none of the substances appeared in the test set.

  • ~320 mixture data points (with roughly equal numbers of $H_{mix}$, $\rho(x)$ and $V_{excess}(x)$ data points, ambient conditions, three compositions (~25%, ~50%, ~75%) per pair). This included mixtures where neither component appeared in the test set, and alcohol-alcohol and ester-ester mixtures where both components appeared in the test set (the training set only included alcohol-ester mixtures).
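One way to arrive at roughly three compositions per pair is to pick, for each target mole fraction, the nearest measured point within a tolerance. The sketch below is an assumption about how such a selection could be done, not the procedure actually used for this set; the function name, the default targets and the tolerance value are all hypothetical.

```python
from typing import List, Sequence


def pick_nearest_compositions(
    mole_fractions: Sequence[float],
    targets: Sequence[float] = (0.25, 0.5, 0.75),
    tolerance: float = 0.1,  # ASSUMPTION: arbitrary cut-off for "close enough"
) -> List[float]:
    """For each target, pick the closest unused measured mole fraction.

    A measured point is only accepted if it lies within `tolerance` of the
    target, and each measured point is used at most once.
    """
    chosen: List[float] = []
    for target in targets:
        remaining = [x for x in mole_fractions if x not in chosen]
        if not remaining:
            break
        best = min(remaining, key=lambda x: abs(x - target))
        if abs(best - target) <= tolerance:
            chosen.append(best)
    return chosen


# Made-up measured compositions for a single binary pair.
selected = pick_nearest_compositions([0.1, 0.22, 0.48, 0.7, 0.9])
sparse = pick_nearest_compositions([0.05, 0.5])
```

With a scheme like this, pairs that only have data far from a target simply contribute fewer than three points, which is consistent with the counts above being approximate.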

Attached files: pure_components.pdf, mixture_components.pdf, full_set.csv

...

The pure substances to include for $\rho_{pure}$ and $H_{vap}$ were then mainly chosen to be the components already selected for the mixture properties where data were available, and components close to those where not.

Chosen Set

The final set contains ~48 pure data points and ~900 mixture data points.

Attached files: pure_set_egfsefgse.pdf, excess_molar_volume_binary_egesges.pdf, enthalpy_of_mixing_binary_fefgsef.pdf, density_binary_qwffqewfe.pdf

Questions we want to answer via benchmarking/what benchmark sets can we use to achieve this?

  • In general, to what extent are LJ parameters trained on mixture data transferable to other sets of mixtures?

    • The first mixture benchmark set (MB1) targets this question. It consists of heat of vaporization, mixture density, and excess molar volume data for alcohol/ester, alcohol/alcohol, alcohol/acid, ester/acid, acid/acid and alcohol/ether mixtures that have nothing in common with the mixtures we trained on.

  • If we train LJ parameters on one type of mixture, how well do those parameters transfer to other types of mixture?

    • Since we included only alcohol/ester mixtures in our training data, MB1 will allow us to look at the transferability of LJ parameters to other types of mixtures.

      • Alcohol/alcohol mixtures: We train on mixtures that contain alcohols, but only mixtures of alcohols with esters. To what extent do the alcohol LJ parameters transfer to mixtures that only include alcohols?

      • Alcohol/ether mixtures: To what extent do the LJ parameters for alcohols, trained on ester mixtures, transfer to mixtures of alcohols and ethers (which have not been trained on at all)? Ethers should be similar to esters, so if they do quite poorly, this will be an issue.

  • How do benchmarked mixture properties vary as a function of composition?

    • Through benchmarking, can we identify the extent to which transferability affects mixture properties as a function of composition? For example, if we test against alcohol/ether mixtures and see worse performance as the ether concentration increases, then the alcohol parameters may be good while the ester parameters do not transfer well to ethers.

    • MB1 will allow us to explore this, since we should have good coverage over the range $x_A$ = 0.2-0.8. We should also consider adding some points in the 0.9-0.95 mole fraction region to check the behavior there. This could either be a separate set, or just something we break out in MB1.

  • Can we identify a “spectrum of transferability” for parameters in mixtures?

    • For example, within a benchmark set composed of mixtures that have at least one component in the training set (MB2), there are a large number of mixtures with tert-butanol. Assuming that tert-butanol is parameterized reasonably well, by examining the mixture properties and chemical similarity of the other moieties in these tert-butanol sets, can we identify how different a mixture can be before the transferability starts to degrade?

  • To what extent are mixture properties transferable from training on pure properties only? To what extent are pure properties transferable from training on mixture properties only?

    • Can we get a sense of the correlation between performance on mixture properties and pure properties (does low error in pure density imply low error in mixture density)? By benchmarking parameter sets trained on only pure data and on only mixture data against MB1 and PB1 (the basic pure data benchmark set), we can analyze this.

  • How do parameters trained on mixture densities perform on excess molar volumes?

    • By looking at the subset of MB1 that includes excess molar volumes, can we accurately reproduce these by training on mixture densities? If we can’t, that may point to excess molar volumes not being very useful for us.
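The link between mixture densities and excess molar volumes can be made concrete with the standard thermodynamic definition for a binary mixture, $V_{excess}(x) = (x_1 M_1 + x_2 M_2)/\rho(x) - x_1 M_1/\rho_1 - x_2 M_2/\rho_2$. The sketch below is a minimal illustration of that definition, not the analysis code used for this study; the function names are hypothetical and the numerical values in the example are rough illustrative numbers, not benchmark data.

```python
def excess_molar_volume(
    x1: float,       # mole fraction of component 1
    m1: float,       # molar mass of component 1 (g/mol)
    m2: float,       # molar mass of component 2 (g/mol)
    rho_mix: float,  # mixture density at composition x1 (g/cm^3)
    rho1: float,     # pure density of component 1 (g/cm^3)
    rho2: float,     # pure density of component 2 (g/cm^3)
) -> float:
    """Excess molar volume (cm^3/mol) of a binary mixture from densities."""
    if not 0.0 <= x1 <= 1.0:
        raise ValueError("x1 must be a mole fraction between 0 and 1")
    x2 = 1.0 - x1
    v_mixture = (x1 * m1 + x2 * m2) / rho_mix    # real molar volume
    v_ideal = x1 * m1 / rho1 + x2 * m2 / rho2    # ideal-mixing molar volume
    return v_mixture - v_ideal


def ideal_mixture_density(
    x1: float, m1: float, m2: float, rho1: float, rho2: float
) -> float:
    """Density the mixture would have if the excess molar volume were zero."""
    x2 = 1.0 - x1
    return (x1 * m1 + x2 * m2) / (x1 * m1 / rho1 + x2 * m2 / rho2)


# Illustrative (approximate, made-up) inputs for an equimolar mixture.
m1, m2, rho1, rho2 = 46.07, 88.11, 0.789, 0.902
rho_ideal = ideal_mixture_density(0.5, m1, m2, rho1, rho2)
v_ex_ideal = excess_molar_volume(0.5, m1, m2, rho_ideal, rho1, rho2)
v_ex_denser = excess_molar_volume(0.5, m1, m2, rho_ideal * 1.01, rho1, rho2)
```

Because the excess molar volume is a small difference between two much larger molar volumes, a small relative error in any of the three densities can change it noticeably, which is one reason a force field could do well on mixture densities and still do poorly on excess molar volumes.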

...