MetricProvider

class openstef_beam.evaluation.metric_providers.MetricProvider(**data: Any) -> None

Bases: BaseConfig

Base class for forecast metric computation.

Provides a standardized interface for computing performance metrics on probabilistic forecasts. Handles processing across multiple quantiles and allows filtering to specific quantiles of interest.

Subclasses implement compute_deterministic() to provide specific metric calculations for individual quantiles. The base class handles the iteration and organization.

Example

Creating a custom metric provider

>>> from openstef_beam.evaluation.metric_providers import MetricProvider
>>> from openstef_core.types import Quantile
>>> import numpy as np
>>>
>>> class CustomMaeProvider(MetricProvider):
...     def compute_deterministic(self, y_true, y_pred, quantile):
...         return {"mae": float(np.mean(np.abs(y_true - y_pred)))}
>>>
>>> # Use with specific quantiles only
>>> provider = CustomMaeProvider(quantiles=[Quantile(0.1), Quantile(0.9)])
>>>
>>> # Or process all available quantiles
>>> provider_all = CustomMaeProvider()

Implementation guide:

Subclasses should override compute_deterministic() to return a dictionary mapping metric names to computed values for a single quantile.

For global metrics that don’t depend on individual quantiles, override compute_probabilistic() instead to process all quantiles together.
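As an illustration of a global metric, the sketch below computes the share of observations that fall inside the band spanned by the lowest and highest quantile. It is a standalone function, not a MetricProvider subclass. The function name interval_coverage, the metric name "coverage", and the layout of y_pred (one row per sample, one column per quantile) are assumptions made for the example. Plain sequences are used so the sketch runs without extra dependencies.

```python
from typing import Sequence


def interval_coverage(
    y_true: Sequence[float],
    y_pred: Sequence[Sequence[float]],
    quantiles: Sequence[float],
) -> dict[str, dict[str, float]]:
    """Share of observations inside the band of the outermost quantiles.

    Assumption: y_pred[i][j] is the forecast for sample i at quantiles[j].
    """
    # Column indices of the lowest and highest quantile.
    lo = min(range(len(quantiles)), key=lambda j: quantiles[j])
    hi = max(range(len(quantiles)), key=lambda j: quantiles[j])

    # One flag per sample: is the observation inside [lower, upper]?
    inside = [row[lo] <= y <= row[hi] for y, row in zip(y_true, y_pred)]

    # NaN for empty input is a choice of this sketch.
    coverage = sum(inside) / len(inside) if inside else float("nan")

    # A single value for all quantiles, stored under the "global" key.
    return {"global": {"coverage": coverage}}


result = interval_coverage(
    y_true=[1.0, 2.0, 3.0, 4.0],
    y_pred=[[0.5, 1.5], [2.5, 3.0], [2.0, 3.5], [3.0, 5.0]],
    quantiles=[0.1, 0.9],
)
print(result)  # {'global': {'coverage': 0.75}}
```

A metric like this needs two quantile columns at once, which is why it would belong in compute_probabilistic() rather than compute_deterministic().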

Parameters:

data (Any)

quantiles: list[Quantile] | None

__call__(subset: ForecastDataset) -> dict[Quantile | Literal['global'], dict[str, Annotated[float, BeforeValidator(func=_convert_none_to_nan, json_schema_input_type=PydanticUndefined)]]]

Process an evaluation subset and return metrics.

Extracts predictions and ground truth from the subset, then computes metrics for all relevant quantiles.

Parameters:
  • subset (ForecastDataset) – Evaluation subset containing predictions and ground truth data.

Returns:

QuantileMetricsDict mapping quantile keys to computed metric values.

Return type:

dict[Union[Quantile, Literal['global']], dict[str, float]]

compute_probabilistic(y_true: ndarray[tuple[Any, ...], dtype[floating]], y_pred: ndarray[tuple[Any, ...], dtype[floating]], quantiles: ndarray[tuple[Any, ...], dtype[floating]]) -> dict[Quantile | Literal['global'], dict[str, Annotated[float, BeforeValidator(func=_convert_none_to_nan, json_schema_input_type=PydanticUndefined)]]]

Compute probabilistic metrics from predictions for multiple quantiles.

The default behaviour is to call compute_deterministic() for each quantile and return the metrics prefixed by the quantile value.

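Parameters:
  • y_true (ndarray[tuple[Any, ...], dtype[floating]])

  • y_pred (ndarray[tuple[Any, ...], dtype[floating]])

  • quantiles (ndarray[tuple[Any, ...], dtype[floating]])

Returns: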

QuantileMetricsDict mapping quantile-prefixed metric names to computed values.

Return type:

dict[Union[Quantile, Literal['global']], dict[str, float]]
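The default behaviour of compute_probabilistic() described above can be pictured with the following rough sketch. It is not the library's implementation. The helper per_quantile_metrics, the example metric function mae, and the column-per-quantile layout of y_pred are assumptions. Results are keyed by the plain quantile value here, whereas the real method uses Quantile keys.

```python
from typing import Callable, Sequence

Metrics = dict[str, float]
DeterministicFn = Callable[[Sequence[float], Sequence[float], float], Metrics]


def per_quantile_metrics(
    y_true: Sequence[float],
    y_pred: Sequence[Sequence[float]],
    quantiles: Sequence[float],
    compute_deterministic: DeterministicFn,
) -> dict[float, Metrics]:
    """Call a per-quantile metric function once for every quantile column."""
    results: dict[float, Metrics] = {}
    for j, q in enumerate(quantiles):
        # Assumption: column j of y_pred holds the forecast for quantiles[j].
        column = [row[j] for row in y_pred]
        results[q] = compute_deterministic(y_true, column, q)
    return results


def mae(y_true: Sequence[float], y_pred: Sequence[float], quantile: float) -> Metrics:
    """Mean absolute error; the quantile argument is unused for this metric."""
    errors = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return {"mae": sum(errors) / len(errors)}


metrics = per_quantile_metrics(
    y_true=[1.0, 2.0, 3.0],
    y_pred=[[0.5, 2.0], [1.5, 3.0], [2.5, 4.0]],
    quantiles=[0.1, 0.9],
    compute_deterministic=mae,
)
print(metrics)  # {0.1: {'mae': 0.5}, 0.9: {'mae': 1.0}}
```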

property metric_names: frozenset[str]

Declared metric names that this provider produces.

Override in subclasses to enable eager metric-name validation (e.g. in the hyperparameter tuner).
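To show what eager validation against metric_names could look like, here is a minimal sketch. The class MaeRmseProvider is a hypothetical stand-in that only mimics the property and is not derived from the real base class. The helper check_metric_name is likewise invented for the example and is not part of openstef_beam.

```python
class MaeRmseProvider:
    """Hypothetical stand-in for a MetricProvider subclass (not the real class)."""

    @property
    def metric_names(self) -> frozenset[str]:
        # Names this provider promises to emit.
        return frozenset({"mae", "rmse"})


def check_metric_name(provider: MaeRmseProvider, name: str) -> None:
    """Hypothetical helper: fail fast if `name` is not declared by the provider."""
    declared = provider.metric_names
    if name not in declared:
        raise ValueError(f"Unknown metric {name!r}; declared: {sorted(declared)}")


provider = MaeRmseProvider()
check_metric_name(provider, "mae")  # declared, so no error is raised

try:
    check_metric_name(provider, "mape")
except ValueError as exc:
    print(exc)  # Unknown metric 'mape'; declared: ['mae', 'rmse']
```

Declaring the names up front lets a caller reject a misspelled metric before any forecasts are evaluated, instead of discovering the problem after a long run.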

compute_deterministic(y_true: ndarray[tuple[Any, ...], dtype[floating]], y_pred: ndarray[tuple[Any, ...], dtype[floating]], quantile: float) -> dict[str, Annotated[float, BeforeValidator(func=_convert_none_to_nan, json_schema_input_type=PydanticUndefined)]]

Compute metrics for a single quantile prediction.

Must be implemented by subclasses that provide deterministic metrics (per quantile).

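Parameters:
  • y_true (ndarray[tuple[Any, ...], dtype[floating]])

  • y_pred (ndarray[tuple[Any, ...], dtype[floating]])

  • quantile (float)

Return type: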

dict[str, float]
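As an example of a metric that does use the quantile argument, the sketch below computes the pinball (quantile) loss for a single quantile. It is written as a standalone function with the same argument order as compute_deterministic(). The function name pinball_loss and the metric key are assumptions. It accepts any sequence of floats, so 1-D NumPy arrays should work as well as lists.

```python
from typing import Sequence


def pinball_loss(
    y_true: Sequence[float],
    y_pred: Sequence[float],
    quantile: float,
) -> dict[str, float]:
    """Mean pinball loss of a single-quantile forecast."""
    losses = []
    for t, p in zip(y_true, y_pred):
        diff = t - p
        # Under-forecasts (diff >= 0) are weighted by quantile,
        # over-forecasts by (1 - quantile).
        losses.append(quantile * diff if diff >= 0 else (quantile - 1.0) * diff)

    # NaN for empty input is a choice of this sketch.
    value = sum(losses) / len(losses) if losses else float("nan")
    return {"pinball_loss": float(value)}


metrics = pinball_loss(y_true=[10.0, 20.0], y_pred=[8.0, 24.0], quantile=0.5)
print(metrics)  # {'pinball_loss': 1.5}
```

In a provider subclass, the body of this function would go into compute_deterministic(), and the base class would take care of calling it for each quantile.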

model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': False, 'extra': 'ignore', 'protected_namespaces': ()}

Configuration for the model. It should be a dictionary conforming to pydantic.config.ConfigDict.