SimpleTargetProvider

class openstef_beam.benchmarking.SimpleTargetProvider(**data: Any) → None

Bases: TargetProvider[TypeVar, TypeVar], Generic

File-based target provider loading from YAML configs and Parquet datasets.

Implements standardized file loading with consistent path resolution and dataset concatenation. Ensures all datasets maintain identical sampling intervals and temporal alignment.
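The sampling-interval requirement can be pictured with a small standard-library check. This is an illustrative sketch only: `has_uniform_interval` is a hypothetical helper, not part of openstef_beam, and the provider itself works on the library's dataset types rather than plain lists of timestamps.

```python
from datetime import datetime, timedelta


def has_uniform_interval(timestamps: list[datetime], interval: timedelta) -> bool:
    """Return True when every consecutive pair of timestamps is `interval` apart.

    Hypothetical helper for illustration only; not part of openstef_beam.
    """
    return all(
        later - earlier == interval
        for earlier, later in zip(timestamps, timestamps[1:])
    )


start = datetime(2024, 1, 1)
quarter_hour = timedelta(minutes=15)

regular = [start + i * quarter_hour for i in range(4)]
gappy = [regular[0], regular[1], regular[3]]  # the 00:30 sample is missing

print(has_uniform_interval(regular, quarter_hour))  # True
print(has_uniform_interval(gappy, quarter_hour))    # False
```

A dataset with a gap, as in `gappy`, would not match the 15-minute interval passed as `data_sample_interval` in the example further down.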

Directory structure expected by SimpleTargetProvider:

The provider expects a hierarchical directory structure:
  • Root directory contains shared data files and target definitions.

  • Group subdirectories contain target-specific measurement and weather files.

  • Path templates use the {name} placeholder for target-specific file naming.
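As a rough illustration of the layout described above, the sketch below resolves a shared file and a target-specific file. It rests on assumptions the source does not confirm: that templates are filled with `str.format`, and that a target's files sit in a subdirectory named after its group. `resolve_target_file`, `"substations"` and `"station_a"` are made-up names.

```python
from pathlib import Path


def resolve_target_file(data_dir: Path, group: str, template: str, name: str) -> Path:
    """Build the path of a target-specific file.

    Assumes (not confirmed by the source) that the template is filled with
    str.format and that the file sits in a subdirectory named after the group.
    """
    return data_dir / group / template.format(name=name)


data_dir = Path("benchmark_data")

# Shared files are assumed to sit directly in the root directory.
profiles_file = data_dir / "standard_profiles.parquet"

# Target-specific file; the group and target names are hypothetical.
measurements_file = resolve_target_file(
    data_dir, "substations", "load_{name}.parquet", "station_a"
)

print(profiles_file.as_posix())      # benchmark_data/standard_profiles.parquet
print(measurements_file.as_posix())  # benchmark_data/substations/load_station_a.parquet
```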

Examples

Complete provider setup with data loading:

>>> from pathlib import Path
>>> from datetime import timedelta
>>> from openstef_beam.benchmarking import SimpleTargetProvider
>>> from openstef_beam.evaluation.metric_providers import RMAEProvider, RCRPSProvider
>>> provider = SimpleTargetProvider(
...     data_dir=Path("./benchmark_data"),
...     measurements_path_template="load_{name}.parquet",
...     weather_path_template="weather_{name}.parquet",
...     profiles_path="standard_profiles.parquet",
...     prices_path="energy_prices.parquet",
...     targets_file_path="energy_targets.yaml",
...     data_sample_interval=timedelta(minutes=15),
...     metrics=[RMAEProvider(), RCRPSProvider()],
...     use_profiles=True,
...     use_prices=True,
... )

Parameters:

data (Any)

data_dir: Path
measurements_path_template: str
weather_path_template: str
profiles_path: str
prices_path: str
targets_file_path: str
use_profiles: bool
use_prices: bool
data_sample_interval: timedelta
metrics: list[MetricProvider] | Callable[[TypeVar(T, bound= BenchmarkTarget)], list[MetricProvider]]
property get_target_class: type[T]

Returns the class type of the target.

get_targets(filter_args: F | None = None) → list[T]

Load all available benchmark targets.

Parameters:
  • filter_args (Optional[TypeVar(F)]) – Provider-specific filtering criteria.

Returns:

Complete list of targets with validated time constraints and metadata.

Raises:
  • FileNotFoundError – When target data source is inaccessible.

  • ValidationError – When target definitions violate constraints.

Return type:

list[TypeVar(T, bound= BenchmarkTarget)]

get_metrics_for_target(target: T) → list[MetricProvider]

Returns the list of metrics to use for evaluation of a target.

Parameters:
  • target (TypeVar(T, bound= BenchmarkTarget)) – The target to get metrics for

Returns:

A list of metric providers to use for evaluating predictions for this target

Return type:

list[MetricProvider]
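Because the `metrics` field accepts either a fixed list or a callable taking a target, the per-target lookup plausibly reduces to a dispatch like the one below. This is a minimal sketch, not the library's implementation: `Target` is a stand-in dataclass, `resolve_metrics` is a hypothetical function, and plain strings take the place of `MetricProvider` instances.

```python
from __future__ import annotations

from collections.abc import Callable
from dataclasses import dataclass


@dataclass
class Target:
    """Stand-in for a benchmark target; only `name` is used here."""

    name: str


def resolve_metrics(
    metrics: list[str] | Callable[[Target], list[str]],
    target: Target,
) -> list[str]:
    """Return per-target metrics from either a fixed list or a factory callable."""
    if callable(metrics):
        return metrics(target)
    return metrics


fixed = ["rmae", "rcrps"]


def by_target(target: Target) -> list[str]:
    # Hypothetical rule: targets named "solar..." skip the probabilistic metric.
    return ["rmae"] if target.name.startswith("solar") else ["rmae", "rcrps"]


print(resolve_metrics(fixed, Target("station_a")))       # ['rmae', 'rcrps']
print(resolve_metrics(by_target, Target("solar_park")))  # ['rmae']
```

Passing a callable is useful when different targets need different metric sets; a plain list applies the same metrics everywhere.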

get_measurements_for_target(target: T) → VersionedTimeSeriesDataset

Load ground truth measurements from target-specific Parquet file.

Returns:

The loaded measurements data.

Return type:

VersionedTimeSeriesDataset

Parameters:

target (TypeVar(T, bound= BenchmarkTarget))

get_predictors_for_target(target: T) → VersionedTimeSeriesDataset

Combine weather, profiles, and prices into aligned predictor dataset.

Concatenates datasets feature-wise with inner join to ensure temporal alignment. Only includes datasets that are enabled via configuration flags.

Returns:

Combined predictor dataset with all enabled features.

Return type:

VersionedTimeSeriesDataset

Parameters:

target (TypeVar(T, bound= BenchmarkTarget))
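The effect of a feature-wise inner join can be shown with plain pandas. This is a sketch under stated assumptions: ordinary `DataFrame` objects stand in for `VersionedTimeSeriesDataset`, the column names and values are invented, and the library's own concatenation routine is not shown here.

```python
import pandas as pd

# Two feature sets whose time ranges only partly overlap.
weather_index = pd.date_range("2024-01-01 00:00", periods=4, freq="15min")
prices_index = pd.date_range("2024-01-01 00:15", periods=4, freq="15min")

weather = pd.DataFrame({"temperature": [3.0, 3.1, 3.3, 3.2]}, index=weather_index)
prices = pd.DataFrame({"price": [41.0, 40.5, 39.8, 42.1]}, index=prices_index)

# Column-wise concatenation; join="inner" keeps only timestamps present in both frames.
predictors = pd.concat([weather, prices], axis=1, join="inner")

print(list(predictors.columns))  # ['temperature', 'price']
print(len(predictors))           # 3
```

Only the three timestamps shared by both inputs (00:15, 00:30, 00:45) survive, so every remaining row has a value for every feature.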

get_weather_for_target(target: T) → VersionedTimeSeriesDataset

Load weather features from target-specific Parquet file.

Returns:

The loaded weather data.

Return type:

VersionedTimeSeriesDataset

Parameters:

target (TypeVar(T, bound= BenchmarkTarget))

get_profiles() → VersionedTimeSeriesDataset

Load shared energy profiles data from configured Parquet file.

Return type:

VersionedTimeSeriesDataset

Returns:

The loaded energy profiles data.

get_prices() → VersionedTimeSeriesDataset

Load shared energy pricing data from configured Parquet file.

Return type:

VersionedTimeSeriesDataset

Returns:

The loaded energy pricing data.

get_evaluation_mask_for_target(target: T) → DatetimeIndex | None

Get the evaluation mask for a target.

Parameters:
  • target (TypeVar(T, bound= BenchmarkTarget)) – The target to get the evaluation mask for

Returns:

A DatetimeIndex representing the evaluation mask, or None if no mask is defined

Return type:

DatetimeIndex | None
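One plausible way a caller might consume the `DatetimeIndex | None` return value is sketched below. The source does not describe how the mask is applied, so treat this purely as an assumption: `apply_mask` is a hypothetical helper, and the series of error values is invented.

```python
from __future__ import annotations

import pandas as pd

index = pd.date_range("2024-01-01", periods=6, freq="60min")
errors = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], index=index)


def apply_mask(series: pd.Series, mask: pd.DatetimeIndex | None) -> pd.Series:
    """Keep only timestamps in `mask`; a mask of None leaves the series untouched.

    Hypothetical helper for illustration only; not part of openstef_beam.
    """
    if mask is None:
        return series
    return series[series.index.isin(mask)]


mask = index[2:4]  # evaluate only the third and fourth hours

print(apply_mask(errors, mask).sum())  # 7.0
print(len(apply_mask(errors, None)))   # 6
```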

model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': False, 'extra': 'ignore', 'protected_namespaces': ()}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.