ForecastDataset
- class openstef_core.datasets.ForecastDataset(data: DataFrame, sample_interval: timedelta = timedelta(minutes=15), forecast_start: datetime | None = None, target_column: str = 'load', *, horizon_column: str = 'horizon', available_at_column: str = 'available_at', standard_deviation_column: str = 'stdev') -> None
Bases: TimeSeriesDataset
Time series dataset containing probabilistic forecasts with quantile estimates.
Contains forecast results with column names following quantile naming convention (e.g., ‘quantile_P50’ for median). Enables consistent handling of probabilistic forecasts with uncertainty quantification.
Invariants
All columns must be valid quantile strings (e.g., ‘quantile_P10’)
Inherits all TimeSeriesDataset guarantees (sorted timestamps, consistent intervals)
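The naming convention above can be illustrated with a small standalone sketch. It is not the library's Quantile type: the helper `parse_quantile_column` and the regular expression are hypothetical, inferred only from names such as 'quantile_P10' and 'quantile_P50', and they assume the number after `P` is a percentage.

```python
import re
from typing import Optional

# Assumed pattern, inferred from column names like 'quantile_P10'.
# The real Quantile type may accept or reject other spellings.
_QUANTILE_COLUMN = re.compile(r"^quantile_P(\d+(?:\.\d+)?)$")


def parse_quantile_column(name: str) -> Optional[float]:
    """Return the quantile as a fraction, or None for non-quantile columns."""
    match = _QUANTILE_COLUMN.match(name)
    if match is None:
        return None
    # 'quantile_P50' -> 50 percent -> 0.5
    return float(match.group(1)) / 100


columns = ["load", "quantile_P10", "quantile_P50", "quantile_P90"]
quantiles = [q for q in map(parse_quantile_column, columns) if q is not None]
```

Here `quantiles` holds one fraction per quantile column, in column order, and the `load` column is skipped because it does not match the assumed pattern.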
- Attrs:
forecast_start: Timestamp indicating when the forecast period starts.
quantiles: List of Quantile values represented in the dataset.
Example
>>> import pandas as pd
>>> import numpy as np
>>> from datetime import timedelta
>>> from openstef_core.datasets import ForecastDataset
>>> forecast_data = pd.DataFrame({
...     'load': [100, np.nan],
...     'quantile_P10': [90, 95],
...     'quantile_P50': [100, 110],
...     'quantile_P90': [115, 125]
... }, index=pd.date_range('2025-01-01', periods=2, freq='h'))
>>> dataset = ForecastDataset(forecast_data, timedelta(hours=1))
>>> len(dataset.quantiles)
3
>>> dataset.quantiles[1]
0.5
See also
TimeSeriesDataset: Base class for time series datasets.
ForecastInputDataset: For preparing forecasting input data.
Quantile: Type for handling quantile values and naming conventions.
- Parameters:
data (DataFrame)
sample_interval (timedelta)
forecast_start (datetime | None)
target_column (str)
horizon_column (str)
available_at_column (str)
standard_deviation_column (str)
- __init__(data: DataFrame, sample_interval: timedelta = timedelta(minutes=15), forecast_start: datetime | None = None, target_column: str = 'load', *, horizon_column: str = 'horizon', available_at_column: str = 'available_at', standard_deviation_column: str = 'stdev') -> None
Initialize a time series dataset.
The dataset automatically detects whether it’s versioned based on column presence:
- If horizon_column exists: versioned by forecast horizon
- If available_at_column exists: versioned by availability time
- Otherwise: regular time series
- Parameters:
data (DataFrame) – DataFrame with DatetimeIndex containing the time series data.
sample_interval (timedelta) – Fixed interval between consecutive data points.
forecast_start (datetime | None)
target_column (str)
horizon_column (str) – Name of the column storing forecast horizons.
available_at_column (str) – Name of the column storing availability times.
standard_deviation_column (str)
is_sorted – Whether the data is sorted by timestamp.
- Raises:
TypeError – If data index is not a pandas DatetimeIndex or if versioning columns have incorrect types.
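The column-based detection described for `__init__` can be sketched without the library. The function `detect_versioning` and its string labels are hypothetical. Checking the horizon column first follows the order of the list above and is an assumption about what happens when both columns are present.

```python
from typing import Iterable


def detect_versioning(
    columns: Iterable[str],
    horizon_column: str = "horizon",
    available_at_column: str = "available_at",
) -> str:
    """Classify a dataset by which versioning column it carries (illustrative only)."""
    names = set(columns)
    # Assumed precedence: the horizon column wins if both are present.
    if horizon_column in names:
        return "horizon"
    if available_at_column in names:
        return "available_at"
    return "regular"


kind_plain = detect_versioning(["load", "quantile_P50"])
kind_horizon = detect_versioning(["load", "horizon"])
kind_available = detect_versioning(["load", "available_at"])
```

The defaults mirror the `horizon_column` and `available_at_column` defaults in the signature, so a frame with neither column is treated as a regular time series.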
- forecast_start: datetime
- target_column: str
- quantiles: list[Quantile]
- property target_series: Series | None
Extract the target time series from the dataset.
- Returns:
Time series containing target values with original datetime index.
- property median_series: Series
Extract the median (50th percentile) forecast series.
- Returns:
Time series containing median forecast values with original datetime index.
- Raises:
MissingColumnsError – If the median quantile column is not found.
- property standard_deviation_series: Series
Extract the standard deviation series if it exists.
- Returns:
Time series containing standard deviation values with original datetime index.
- Raises:
MissingColumnsError – If the standard deviation column is not found.
- property quantiles_data: DataFrame
Extract DataFrame containing only the quantile forecast columns.
- Returns:
DataFrame with quantile columns and original datetime index.
- filter_quantiles(quantiles: list[Quantile]) -> Self
Select a subset of quantiles from the forecast dataset.
- Parameters:
quantiles (list[Quantile]) – List of Quantile values to select.
- Returns:
New ForecastDataset containing only the specified quantile columns.
- Return type:
Self
- to_pandas() -> DataFrame
Convert the dataset to a pandas DataFrame with metadata stored in attrs.
Stores sample_interval, available_at_column, and horizon_column in the DataFrame’s attrs dictionary for later reconstruction.
- Return type:
DataFrame
- Returns:
DataFrame with dataset data and metadata in attrs.
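To show what carrying metadata in `attrs` looks like, here is a minimal pandas-only sketch. It does not call `to_pandas()`. The keys match the names listed above, but storing the interval as a raw `timedelta` is an assumption, since the library may serialize it differently.

```python
from datetime import timedelta

import pandas as pd

sample_interval = timedelta(hours=1)

frame = pd.DataFrame(
    {"quantile_P50": [100.0, 110.0]},
    index=pd.date_range("2025-01-01", periods=2, freq=sample_interval),
)

# DataFrame.attrs is a plain dict attached to the frame.
# Assumed layout: one entry per piece of dataset metadata.
frame.attrs["sample_interval"] = sample_interval
frame.attrs["available_at_column"] = "available_at"
frame.attrs["horizon_column"] = "horizon"

# A consumer can read the metadata back when rebuilding a dataset.
restored_interval = frame.attrs["sample_interval"]
restored_horizon_column = frame.attrs["horizon_column"]
```

pandas documents `attrs` as experimental, and not every operation propagates it, so reading the metadata directly from the frame returned by `to_pandas()` is the safest pattern.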
- classmethod from_timeseries(dataset: TimeSeriesDataset, target_column: str = 'load', forecast_start: datetime | None = None) -> Self
Create a ForecastDataset from a generic TimeSeriesDataset.
- Parameters:
dataset (TimeSeriesDataset) – Input TimeSeriesDataset to convert.
target_column (str) – Name of the target column to forecast.
forecast_start (datetime | None) – Optional timestamp indicating forecast start.
- Returns:
Instance of ForecastDataset with the specified target column.
- Return type:
Self