Datasets and shipped artifacts

elcf4R package

Overview

elcf4R now supports four household-oriented public data sources through a common normalized panel schema:

The package also ships compact example panels and saved benchmark results so the main vignettes run without external downloads. Raw source files stay in data-raw/ and are not redistributed through the package unless a compact derived artifact has been explicitly built and saved.

Two additional unshipped scaffolds are also available:

Supported dataset matrix

The current dataset surface is:

Dataset Reader Resolution Temperature in normalized panel Shipped example Shipped benchmark
iFlex elcf4r_read_iflex() hourly yes elcf4r_iflex_example elcf4r_iflex_benchmark_results
StoreNet (H6_W) elcf4r_read_storenet() 1 minute optional, source-dependent elcf4r_storenet_example elcf4r_storenet_benchmark_results
Low Carbon London elcf4r_read_lcl() 30 minutes no elcf4r_lcl_example elcf4r_lcl_benchmark_results
REFIT elcf4r_read_refit() user-selected resample no elcf4r_refit_example elcf4r_refit_benchmark_results
ELMAS not part of the common household reader set hourly no elcf4r_elmas_toy none

All four household readers return the same core columns:

Dataset-specific metadata columns are preserved when available.

Scaffolded, unshipped datasets

IDEAL and GX are intentionally documented separately from the core shipped household matrix.

Dataset Helper surface Level Current scaffold scope Shipped example Shipped benchmark Licence note
IDEAL elcf4r_download_ideal(), elcf4r_read_ideal() household aggregate-electricity hourly summaries from auxiliarydata.zip no no the current Edinburgh DataShare record states CC BY 4.0
GX elcf4r_download_gx(), elcf4r_read_gx() transformer/community SQLite or flat-export normalization to the common panel schema no no treat licence terms as dataset-record specific and recheck before redistribution

Notes:

Shipped example panels

The shipped examples are small normalized panels intended for package examples and vignette code.

example_sizes <- Filter(
  Negate(is.null),
  list(
    .elcf4r_example_size_row("elcf4r_iflex_example", "iflex"),
    .elcf4r_example_size_row("elcf4r_storenet_example", "storenet"),
    .elcf4r_example_size_row("elcf4r_lcl_example", "lcl"),
    .elcf4r_example_size_row("elcf4r_refit_example", "refit")
  )
)

if (length(example_sizes) == 0L) {
  data.frame()
} else {
  do.call(rbind, example_sizes)
}
#> data frame with 0 columns and 0 rows

These objects can be passed directly to:

Shipped benchmark result datasets

Each supported household dataset now has a saved benchmark-result object built from a fixed local cohort and a deterministic rolling-origin design.

benchmark_summary <- Filter(
  Negate(is.null),
  list(
    .elcf4r_benchmark_summary("elcf4r_iflex_benchmark_results", "iflex"),
    .elcf4r_benchmark_summary("elcf4r_storenet_benchmark_results", "storenet"),
    .elcf4r_benchmark_summary("elcf4r_lcl_benchmark_results", "lcl"),
    .elcf4r_benchmark_summary("elcf4r_refit_benchmark_results", "refit")
  )
)

if (length(benchmark_summary) == 0L) {
  data.frame()
} else {
  benchmark_summary <- do.call(rbind, benchmark_summary)
  benchmark_summary[, c("dataset", "method", "nmae", "nrmse", "smape", "mase")]
}
#> data frame with 0 columns and 0 rows

These shipped benchmark tables are poster-style artifacts. They are not intended to replace full local benchmarking on the raw datasets.

Rebuilding the shipped artifacts

Each shipped dataset is reproducible from a data-raw/ script:

The general pattern is:

  1. Place the original raw files in data-raw/.
  2. Read them through elcf4r_read_*().
  3. Build a normalized day index with elcf4r_build_benchmark_index().
  4. Save a compact example panel.
  5. Run elcf4r_benchmark() on a fixed cohort and save the result table.

This keeps the package lightweight while making the shipped examples and benchmark summaries reproducible.

Example: daily segments from a shipped panel

if (exists("elcf4r_iflex_example", inherits = FALSE)) {
  iflex_segments <- elcf4r_build_daily_segments(
    elcf4r_iflex_example,
    carry_cols = c("dataset", "participation_phase", "price_signal")
  )

  dim(iflex_segments$segments)
  head(iflex_segments$covariates[, c("entity_id", "date", "temp_mean", "price_signal")])
} else {
  data.frame()
}
#> data frame with 0 columns and 0 rows

Example: rerun a tiny benchmark locally

if (exists("elcf4r_lcl_example", inherits = FALSE)) {
  tiny_index <- elcf4r_build_benchmark_index(
    elcf4r_lcl_example,
    carry_cols = "dataset"
  )

  tiny_benchmark <- elcf4r_benchmark(
    panel = elcf4r_lcl_example,
    benchmark_index = tiny_index,
    methods = c("gam", "kwf"),
    cohort_size = 1,
    train_days = 10,
    test_days = 2,
    include_predictions = FALSE
  )

  tiny_benchmark$results[, c("entity_id", "method", "test_date", "nmae", "mase")]
} else {
  data.frame()
}
#> data frame with 0 columns and 0 rows