ForecastDataset

class openstef_core.datasets.ForecastDataset(data: DataFrame, sample_interval: timedelta = timedelta(minutes=15), forecast_start: datetime | None = None, target_column: str = 'load', *, horizon_column: str = 'horizon', available_at_column: str = 'available_at', standard_deviation_column: str = 'stdev') -> None

Bases: TimeSeriesDataset

Time series dataset containing probabilistic forecasts with quantile estimates.

Contains forecast results with column names following quantile naming convention (e.g., ‘quantile_P50’ for median). Enables consistent handling of probabilistic forecasts with uncertainty quantification.

Invariants

  • All columns must be valid quantile strings (e.g., ‘quantile_P10’)

  • Inherits all TimeSeriesDataset guarantees (sorted timestamps, consistent intervals)
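The quantile naming convention can be illustrated with a minimal sketch. The helper `parse_quantile_column` and its regular expression are hypothetical and written only for illustration; the library's own Quantile type may parse and validate names differently.

```python
import re
from typing import Optional

# Assumed pattern: 'quantile_P' followed by a percentage such as 10, 50 or 97.5.
# This is an illustration, not the library's actual validation logic.
_QUANTILE_PATTERN = re.compile(r"^quantile_P(\d+(?:\.\d+)?)$")


def parse_quantile_column(name: str) -> Optional[float]:
    """Return the quantile level in [0, 1] for a name like 'quantile_P50', else None."""
    match = _QUANTILE_PATTERN.match(name)
    if match is None:
        return None
    percent = float(match.group(1))
    if not 0.0 <= percent <= 100.0:
        return None
    return percent / 100.0


columns = ["load", "quantile_P10", "quantile_P50", "quantile_P90"]
# Keep only the columns that follow the quantile naming convention.
levels = [q for q in map(parse_quantile_column, columns) if q is not None]
print(levels)
```

Under these assumptions, 'quantile_P50' maps to the level 0.5, which matches the median reported by `dataset.quantiles[1]` in the example further down.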

Attributes

  • forecast_start: Timestamp indicating when the forecast period starts.

  • quantiles: List of Quantile values represented in the dataset.

Example

>>> import pandas as pd
>>> import numpy as np
>>> from datetime import timedelta
>>> forecast_data = pd.DataFrame({
...     'load': [100, np.nan],
...     'quantile_P10': [90, 95],
...     'quantile_P50': [100, 110],
...     'quantile_P90': [115, 125]
... }, index=pd.date_range('2025-01-01', periods=2, freq='h'))
>>> dataset = ForecastDataset(forecast_data, timedelta(hours=1))
>>> len(dataset.quantiles)
3
>>> dataset.quantiles[1]
0.5

See also

  • TimeSeriesDataset: Base class for time series datasets.

  • ForecastInputDataset: For preparing forecasting input data.

  • Quantile: Type for handling quantile values and naming conventions.

Parameters:
  • data (DataFrame)

  • sample_interval (timedelta)

  • forecast_start (datetime | None)

  • target_column (str)

  • horizon_column (str)

  • available_at_column (str)

  • standard_deviation_column (str)

__init__(data: DataFrame, sample_interval: timedelta = timedelta(minutes=15), forecast_start: datetime | None = None, target_column: str = 'load', *, horizon_column: str = 'horizon', available_at_column: str = 'available_at', standard_deviation_column: str = 'stdev') -> None

Initialize a time series dataset.

The dataset automatically detects whether it’s versioned based on column presence:

  • If horizon_column exists: versioned by forecast horizon

  • If available_at_column exists: versioned by availability time

  • Otherwise: regular time series
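The detection rule above can be sketched as a plain function over column names. `detect_versioning` and its return labels are hypothetical, and the precedence when both columns are present (horizon first) is an assumption taken from the order in which the rules are listed, not confirmed library behaviour.

```python
from typing import Iterable


def detect_versioning(
    columns: Iterable[str],
    horizon_column: str = "horizon",
    available_at_column: str = "available_at",
) -> str:
    """Classify a dataset by which versioning column is present, if any."""
    names = set(columns)
    # Assumption: the horizon column takes precedence when both are present.
    if horizon_column in names:
        return "horizon"
    if available_at_column in names:
        return "available_at"
    return "regular"


print(detect_versioning(["load", "horizon"]))
print(detect_versioning(["load", "available_at"]))
print(detect_versioning(["load"]))
```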

Parameters:
  • data (DataFrame) – DataFrame with DatetimeIndex containing the time series data.

  • sample_interval (timedelta) – Fixed interval between consecutive data points.

  • horizon_column (str) – Name of the column storing forecast horizons.

  • available_at_column (str) – Name of the column storing availability times.

  • is_sorted – Whether the data is sorted by timestamp.

  • forecast_start (datetime | None)

  • target_column (str)

  • standard_deviation_column (str)

Raises:

TypeError – If data index is not a pandas DatetimeIndex or if versioning columns have incorrect types.

forecast_start: datetime
target_column: str
quantiles: list[Quantile]
property target_series: Series | None

Extract the target time series from the dataset.

Returns:

Time series containing target values with original datetime index.

property median_series: Series

Extract the median (50th percentile) forecast series.

Returns:

Time series containing median forecast values with original datetime index.

Raises:

MissingColumnsError – If the median quantile column is not found.
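The lookup-or-raise behaviour described for median_series can be sketched with pandas alone. The function below and the local `MissingColumnsError` class are stand-ins written for illustration; the library's exception and internals may differ, and the column name 'quantile_P50' follows the naming convention stated at the top of this page.

```python
from datetime import timedelta

import pandas as pd


class MissingColumnsError(KeyError):
    """Local stand-in for the library exception of the same name (hypothetical)."""


def median_series(data: pd.DataFrame, median_column: str = "quantile_P50") -> pd.Series:
    """Return the median forecast column, keeping the original datetime index."""
    if median_column not in data.columns:
        raise MissingColumnsError(median_column)
    return data[median_column]


forecast = pd.DataFrame(
    {"quantile_P10": [90, 95], "quantile_P50": [100, 110], "quantile_P90": [115, 125]},
    index=pd.date_range("2025-01-01", periods=2, freq=timedelta(hours=1)),
)
median = median_series(forecast)
print(median)
```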

property standard_deviation_series: Series

Extract the standard deviation series if it exists.

Returns:

Time series containing standard deviation values with original datetime index.

Raises:

MissingColumnsError – If the standard deviation column is not found.

property quantiles_data: DataFrame

Extract DataFrame containing only the quantile forecast columns.

Returns:

DataFrame with quantile columns and original datetime index.

filter_quantiles(quantiles: list[Quantile]) -> Self

Select a subset of quantiles from the forecast dataset.

Parameters:
  • quantiles (list[Quantile]) – List of Quantile values to select.

Returns:

New ForecastDataset containing only the specified quantile columns.

Return type:

Self
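As a rough illustration of selecting a subset of quantiles, the sketch below filters a plain DataFrame by quantile level. The helpers `quantile_column` and `filter_quantile_columns` are hypothetical, plain floats stand in for Quantile values, and raising KeyError for a missing level is an assumption; the real method returns a new ForecastDataset and may handle missing quantiles differently.

```python
from datetime import timedelta
from typing import List

import pandas as pd


def quantile_column(level: float) -> str:
    """Format a level in [0, 1] as a column name, e.g. 0.5 -> 'quantile_P50' (assumed format)."""
    return f"quantile_P{level * 100:g}"


def filter_quantile_columns(data: pd.DataFrame, levels: List[float]) -> pd.DataFrame:
    """Return a new DataFrame holding only the columns for the requested levels."""
    wanted = [quantile_column(level) for level in levels]
    missing = [name for name in wanted if name not in data.columns]
    if missing:
        # Assumption: fail loudly instead of silently dropping unknown quantiles.
        raise KeyError(f"Missing quantile columns: {missing}")
    return data[wanted].copy()


forecast = pd.DataFrame(
    {"quantile_P10": [90, 95], "quantile_P50": [100, 110], "quantile_P90": [115, 125]},
    index=pd.date_range("2025-01-01", periods=2, freq=timedelta(hours=1)),
)
subset = filter_quantile_columns(forecast, [0.1, 0.9])
print(list(subset.columns))
```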

to_pandas() -> DataFrame

Convert the dataset to a pandas DataFrame with metadata stored in attrs.

Stores sample_interval, available_at_column, and horizon_column in the DataFrame’s attrs dictionary for later reconstruction.

Returns:

DataFrame with dataset data and metadata in attrs.

Return type:

DataFrame
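The idea of carrying metadata in `DataFrame.attrs` can be sketched as follows. The helper `attach_metadata` and the attrs key names are assumptions chosen to mirror the three fields listed above; the library may use different keys or serialise the values differently.

```python
from datetime import timedelta

import pandas as pd


def attach_metadata(
    data: pd.DataFrame,
    sample_interval: timedelta,
    horizon_column: str = "horizon",
    available_at_column: str = "available_at",
) -> pd.DataFrame:
    """Return a copy of the data with dataset metadata stored in attrs."""
    out = data.copy()
    # Key names are assumed for illustration only.
    out.attrs["sample_interval"] = sample_interval
    out.attrs["horizon_column"] = horizon_column
    out.attrs["available_at_column"] = available_at_column
    return out


frame = pd.DataFrame(
    {"quantile_P50": [100, 110]},
    index=pd.date_range("2025-01-01", periods=2, freq=timedelta(hours=1)),
)
exported = attach_metadata(frame, timedelta(hours=1))

# A consumer can later read the metadata back to rebuild the dataset.
restored_interval = exported.attrs["sample_interval"]
print(restored_interval)
```

pandas documents `attrs` as experimental, and it is not guaranteed to survive every operation or file format, so a round trip through storage should be verified before relying on it.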

classmethod from_timeseries(dataset: TimeSeriesDataset, target_column: str = 'load', forecast_start: datetime | None = None) -> Self

Create a ForecastDataset from a generic TimeSeriesDataset.

Parameters:
  • dataset (TimeSeriesDataset) – Input TimeSeriesDataset to convert.

  • target_column (str) – Name of the target column to forecast.

  • forecast_start (datetime | None) – Optional timestamp indicating forecast start.

Returns:

Instance of ForecastDataset with the specified target column.

Return type:

Self