EnsembleForecastDataset
- class openstef_core.datasets.EnsembleForecastDataset(data: DataFrame, sample_interval: timedelta = timedelta(minutes=15), forecast_start: datetime | None = None, target_column: str = 'load', *, horizon_column: str = 'horizon', available_at_column: str = 'available_at') -> None
Bases: TimeSeriesDataset
First stage output format for ensemble forecasters.
- Parameters:
data (DataFrame)
sample_interval (timedelta)
forecast_start (datetime | None)
target_column (str)
horizon_column (str)
available_at_column (str)
- __init__(data: DataFrame, sample_interval: timedelta = timedelta(minutes=15), forecast_start: datetime | None = None, target_column: str = 'load', *, horizon_column: str = 'horizon', available_at_column: str = 'available_at') -> None
Initialize a time series dataset.
The dataset automatically detects whether it is versioned, based on column presence:
- If horizon_column exists: versioned by forecast horizon
- If available_at_column exists: versioned by availability time
- Otherwise: regular time series
- Parameters:
data (DataFrame) – DataFrame with DatetimeIndex containing the time series data.
sample_interval (timedelta) – Fixed interval between consecutive data points.
horizon_column (str) – Name of the column storing forecast horizons.
available_at_column (str) – Name of the column storing availability times.
is_sorted – Whether the data is sorted by timestamp.
check_frequency – Whether to check that the data frequency matches sample_interval.
forecast_start (datetime | None)
target_column (str)
- Raises:
TypeError – If data index is not a pandas DatetimeIndex or if versioning columns have incorrect types.
ValueError – If data frequency does not match sample_interval.
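The column-presence rule described for `__init__` can be illustrated with a small standalone sketch. It does not call openstef_core: `detect_versioning` is a hypothetical helper, and the precedence of the horizon column when both versioning columns are present is an assumption taken from the order in which the rules are listed.

```python
from collections.abc import Iterable


def detect_versioning(
    columns: Iterable[str],
    horizon_column: str = "horizon",
    available_at_column: str = "available_at",
) -> str:
    """Classify a dataset by which versioning column it carries (hypothetical helper)."""
    present = set(columns)
    # Assumption: the horizon column takes precedence when both columns exist.
    if horizon_column in present:
        return "horizon"
    if available_at_column in present:
        return "available_at"
    # Neither versioning column is present: treat as a regular time series.
    return "regular"


by_horizon = detect_versioning(["load", "horizon"])
by_availability = detect_versioning(["load", "available_at"])
plain = detect_versioning(["load", "temperature"])
```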
- forecast_start: datetime
- target_column: str
- forecaster_names: list[str]
- quantiles: list[Quantile]
- property target_series: Series | None
Return the target series if available.
- static get_learner_and_quantile(feature_names: Index) -> tuple[list[str], list[Quantile]]
Extract base forecaster names and quantiles from feature names.
Column format is {learner}{ENSEMBLE_COLUMN_SEP}{quantile.format()}, e.g. lgbm__quantile_P50.
- Parameters:
feature_names (Index) – Index of feature names in the dataset.
- Returns:
Tuple containing a list of base forecaster names and a list of quantiles.
- Raises:
ValueError – If a column cannot be parsed or has an invalid quantile string.
- Return type:
tuple[list[str],list[Quantile]]
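The column format described for `get_learner_and_quantile` can be sketched in plain Python. This is not the library's implementation: the separator value `"__"` and the `quantile_P` prefix are inferred from the `lgbm__quantile_P50` example, plain floats stand in for `Quantile`, and returning names in order of first appearance is an assumption. `split_ensemble_columns` is a hypothetical name.

```python
ENSEMBLE_COLUMN_SEP = "__"      # assumed value, inferred from "lgbm__quantile_P50"
QUANTILE_PREFIX = "quantile_P"  # assumed prefix produced by quantile.format()


def split_ensemble_columns(feature_names: list[str]) -> tuple[list[str], list[float]]:
    """Split '{learner}{sep}{quantile}' column names into learners and quantiles."""
    learners: list[str] = []
    quantiles: list[float] = []
    for name in feature_names:
        # Split on the last separator so the quantile label is always the tail.
        learner, sep, label = name.rpartition(ENSEMBLE_COLUMN_SEP)
        if not sep or not learner or not label.startswith(QUANTILE_PREFIX):
            raise ValueError(f"Cannot parse ensemble column: {name!r}")
        try:
            # "quantile_P50" -> 50 -> 0.5 (percent notation is an assumption).
            quantile = float(label[len(QUANTILE_PREFIX):]) / 100
        except ValueError as exc:
            raise ValueError(f"Invalid quantile in column: {name!r}") from exc
        # Keep each learner and quantile once, in order of first appearance.
        if learner not in learners:
            learners.append(learner)
        if quantile not in quantiles:
            quantiles.append(quantile)
    return learners, quantiles


learners, quantiles = split_ensemble_columns(
    ["lgbm__quantile_P10", "lgbm__quantile_P50", "xgb__quantile_P10", "xgb__quantile_P50"]
)
```

A column without the separator or with a non-numeric quantile suffix raises `ValueError`, mirroring the documented error condition.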
- classmethod from_forecast_datasets(datasets: dict[str, ForecastDataset], target_series: Series | None = None, sample_weights: Series | None = None) -> Self
Create an EnsembleForecastDataset from multiple ForecastDatasets.
- Parameters:
datasets (dict[str, ForecastDataset]) – Dict of ForecastDatasets to combine.
target_series (Series | None) – Optional target series to include in the dataset.
sample_weights (Series | None) – Optional sample weights series to include in the dataset.
- Returns:
EnsembleForecastDataset combining all input datasets.
- Return type:
Self
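To make the combined layout concrete, the following pandas-only sketch mimics what combining several forecasters' outputs into one wide frame might look like, and how the columns for a single quantile can then be selected. It does not use openstef_core: `combine_forecasts` and `select_quantile` are hypothetical helpers, the `"__"` separator and `quantile_P50` labels are inferred from the documented column format, and the alignment behaviour (outer join on the index) is an assumption.

```python
from __future__ import annotations

import pandas as pd

ENSEMBLE_COLUMN_SEP = "__"  # assumed value, inferred from "lgbm__quantile_P50"


def combine_forecasts(
    forecasts: dict[str, pd.DataFrame],
    target: pd.Series | None = None,
    target_column: str = "load",
) -> pd.DataFrame:
    """Prefix each forecaster's quantile columns with its name and join them."""
    renamed = [
        frame.add_prefix(f"{name}{ENSEMBLE_COLUMN_SEP}")
        for name, frame in forecasts.items()
    ]
    combined = pd.concat(renamed, axis=1)  # aligns rows on the shared index
    if target is not None:
        combined[target_column] = target.reindex(combined.index)
    return combined


def select_quantile(combined: pd.DataFrame, quantile_label: str) -> pd.DataFrame:
    """Keep only the columns of every forecaster for one quantile label."""
    suffix = f"{ENSEMBLE_COLUMN_SEP}{quantile_label}"
    return combined[[col for col in combined.columns if col.endswith(suffix)]]


index = pd.date_range("2024-01-01", periods=3, freq="15min")
forecasts = {
    "lgbm": pd.DataFrame(
        {"quantile_P10": [1.0, 2.0, 3.0], "quantile_P50": [2.0, 3.0, 4.0]}, index=index
    ),
    "xgb": pd.DataFrame(
        {"quantile_P10": [1.5, 2.5, 3.5], "quantile_P50": [2.5, 3.5, 4.5]}, index=index
    ),
}
combined = combine_forecasts(forecasts, target=pd.Series([2.1, 3.1, 4.1], index=index))
p50 = select_quantile(combined, "quantile_P50")
```

Here `combined` holds one column per forecaster and quantile plus the target, while `p50` keeps only the two median columns, which is the shape a second-stage model for that quantile would consume.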
- get_base_predictions_for_quantile(quantile: Quantile) -> ForecastInputDataset
Get base forecaster predictions for a specific quantile.
- Parameters:
quantile (Quantile) – Quantile to select.
- Returns:
ForecastInputDataset containing predictions from all base forecasters at the specified quantile.
- Return type: