EnergyComponentDataset
- class openstef_core.datasets.EnergyComponentDataset(data: DataFrame, sample_interval: timedelta = timedelta(minutes=15), *, horizon_column: str = 'horizon', available_at_column: str = 'available_at') -> None
Bases: TimeSeriesDataset
Time series dataset for energy generation by component type.
Validates that all required energy component columns (wind, solar, other) are present. Used for energy sector analysis and component-specific forecasting.
Invariants
Must contain columns for all energy component types
Inherits all TimeSeriesDataset guarantees (sorted timestamps, consistent intervals)
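The column invariant above can be illustrated with a minimal, dependency-free sketch. It is not the library's implementation: `ComponentType` is a hypothetical stand-in for `EnergyComponentType` (its values are taken from the column names listed above), and `missing_components` is a helper invented for illustration.

```python
from enum import Enum


class ComponentType(str, Enum):
    """Hypothetical stand-in for EnergyComponentType (assumed values)."""

    WIND = "wind"
    SOLAR = "solar"
    OTHER = "other"


def missing_components(columns):
    """Return the required component columns absent from `columns`, sorted by name."""
    required = {component.value for component in ComponentType}
    return sorted(required - set(columns))


# All three component columns present: nothing is missing.
print(missing_components(["wind", "solar", "other"]))

# Only 'wind' is a component column here, so two are reported as missing.
print(missing_components(["wind", "load"]))
```

A validator built along these lines would presumably raise an error when the returned list is non-empty; which exception the real class raises for a missing component column is not stated here.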
Example
>>> import pandas as pd
>>> from datetime import timedelta
>>> from openstef_core.datasets import EnergyComponentDataset
>>> energy_data = pd.DataFrame({
...     'wind': [50, 60],
...     'solar': [30, 40],
...     'other': [20, 25]
... }, index=pd.date_range('2025-01-01', periods=2, freq='h'))
>>> dataset = EnergyComponentDataset(energy_data, timedelta(hours=1))
>>> 'wind' in dataset.feature_names
True
>>> len(dataset.feature_names)
3
See also
TimeSeriesDataset: Base class for time series datasets.
ForecastInputDataset: For general forecasting input data.
EnergyComponentType: Enum defining required energy component types.
- Parameters:
data (DataFrame)
sample_interval (timedelta)
horizon_column (str)
available_at_column (str)
- __init__(data: DataFrame, sample_interval: timedelta = timedelta(minutes=15), *, horizon_column: str = 'horizon', available_at_column: str = 'available_at') -> None
Initialize a time series dataset.
The dataset automatically detects whether it’s versioned based on column presence:
- If horizon_column exists: versioned by forecast horizon
- If available_at_column exists: versioned by availability time
- Otherwise: regular time series
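The detection rule described above can be sketched as a small function over column names. This is an illustration under assumptions, not the library's code: `detect_versioning` and its return labels are hypothetical, and the precedence when both columns are present (horizon first) simply follows the order in which the cases are listed.

```python
def detect_versioning(columns, horizon_column="horizon", available_at_column="available_at"):
    """Classify a dataset by which versioning column is present.

    Hypothetical helper: the labels and the horizon-first precedence
    are assumptions made for this sketch.
    """
    if horizon_column in columns:
        return "horizon"
    if available_at_column in columns:
        return "available_at"
    return "regular"


# A frame with a 'horizon' column is treated as versioned by forecast horizon.
print(detect_versioning(["wind", "solar", "other", "horizon"]))

# Without either versioning column it is a regular time series.
print(detect_versioning(["wind", "solar", "other"]))
```

Because the column names are constructor parameters, a dataset that stores horizons under a different name would be detected by passing that name as `horizon_column`.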
- Parameters:
data (DataFrame) – DataFrame with DatetimeIndex containing the time series data.
sample_interval (timedelta) – Fixed interval between consecutive data points.
horizon_column (str) – Name of the column storing forecast horizons.
available_at_column (str) – Name of the column storing availability times.
is_sorted – Whether the data is sorted by timestamp.
data
sample_interval
horizon_column
available_at_column
- Raises:
TypeError – If data index is not a pandas DatetimeIndex or if versioning columns have incorrect types.