MetricProvider
- class openstef_beam.evaluation.metric_providers.MetricProvider(**data: Any) -> None
Bases: BaseConfig
Base class for forecast metric computation.
Provides a standardized interface for computing performance metrics on probabilistic forecasts. Handles processing across multiple quantiles and allows filtering to specific quantiles of interest.
Subclasses implement compute_deterministic() to provide specific metric calculations for individual quantiles. The base class handles the iteration and organization.
Example
Creating a custom metric provider
>>> from openstef_beam.evaluation.metric_providers import MetricProvider
>>> from openstef_core.types import Quantile
>>> import numpy as np
>>>
>>> class CustomMaeProvider(MetricProvider):
...     def compute_deterministic(self, y_true, y_pred, quantile):
...         return {"mae": float(np.mean(np.abs(y_true - y_pred)))}
>>>
>>> # Use with specific quantiles only
>>> provider = CustomMaeProvider(quantiles=[Quantile(0.1), Quantile(0.9)])
>>>
>>> # Or process all available quantiles
>>> provider_all = CustomMaeProvider()
- Implementation guide:
Subclasses should override compute_deterministic() to return a dictionary mapping metric names to computed values for a single quantile.
For global metrics that don’t depend on individual quantiles, override compute_probabilistic() instead to process all quantiles together.
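A global metric of this kind can be illustrated with a standalone function that mirrors the compute_probabilistic() signature. This is a minimal sketch, not the library's implementation: the function name compute_probabilistic_sketch, the metric name "mean_pinball_loss", and the choice of pinball loss are all assumptions made for illustration. Only the argument shapes and the 'global' result key come from the documentation above.

```python
import numpy as np


def compute_probabilistic_sketch(y_true, y_pred, quantiles):
    """Hypothetical global metric: pinball loss averaged over samples and quantiles.

    Mirrors the documented shapes: y_true is (num_samples,), y_pred is
    (num_samples, num_quantiles), quantiles is (num_quantiles,).
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    quantiles = np.asarray(quantiles, dtype=float)

    # Broadcast the 1D truth against every quantile column.
    errors = y_true[:, np.newaxis] - y_pred
    # Pinball loss: q * error when under-predicting, (q - 1) * error otherwise.
    losses = np.maximum(quantiles * errors, (quantiles - 1.0) * errors)

    # A single entry under the 'global' key, since the value is not tied to one quantile.
    return {"global": {"mean_pinball_loss": float(losses.mean())}}
```

In a real subclass, the same body would live in an overridden compute_probabilistic(self, y_true, y_pred, quantiles) method.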
- Parameters:
data (Any)
- __call__(subset: ForecastDataset) -> dict[Quantile | Literal['global'], dict[str, Annotated[float, BeforeValidator(func=_convert_none_to_nan, json_schema_input_type=PydanticUndefined)]]]
Process an evaluation subset and return metrics.
Extracts predictions and ground truth from the subset, then computes metrics for all relevant quantiles.
- compute_probabilistic(y_true: ndarray[tuple[Any, ...], dtype[floating]], y_pred: ndarray[tuple[Any, ...], dtype[floating]], quantiles: ndarray[tuple[Any, ...], dtype[floating]]) -> dict[Quantile | Literal['global'], dict[str, Annotated[float, BeforeValidator(func=_convert_none_to_nan, json_schema_input_type=PydanticUndefined)]]]
Compute probabilistic metrics from predictions for multiple quantiles.
The default behaviour is to call compute_deterministic() for each quantile and return the metrics prefixed by the quantile value.
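The per-quantile iteration described above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the library's code: the helper names mae and compute_per_quantile are hypothetical, and keying the result by the plain float quantile is an assumption inferred from the return annotation (the library uses Quantile objects).

```python
import numpy as np


def mae(y_true, y_pred, quantile):
    """Stand-in for compute_deterministic(): metrics for one quantile column."""
    return {"mae": float(np.mean(np.abs(y_true - y_pred)))}


def compute_per_quantile(y_true, y_pred, quantiles, compute_deterministic=mae):
    """Hypothetical default loop: one deterministic call per quantile column."""
    results = {}
    for index, quantile in enumerate(quantiles):
        # Column `index` of y_pred holds the predictions for this quantile.
        results[float(quantile)] = compute_deterministic(
            y_true, y_pred[:, index], float(quantile)
        )
    return results
```

Each entry in the returned dictionary holds the metrics for one quantile, so a subclass only has to define the single-quantile calculation.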
- Parameters:
y_true (ndarray[tuple[Any, ...], dtype[floating]]) – True values, 1D array of shape (num_samples,).
y_pred (ndarray[tuple[Any, ...], dtype[floating]]) – Predicted values, 2D array of shape (num_samples, num_quantiles).
quantiles (ndarray[tuple[Any, ...], dtype[floating]]) – Quantiles used for prediction, 1D array of shape (num_quantiles,).
- Returns:
QuantileMetricsDict mapping quantile-prefixed metric names to computed values.
- Return type:
- property metric_names: frozenset[str]
Declared metric names that this provider produces.
Override in subclasses to enable eager metric-name validation (e.g. in the hyperparameter tuner).
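As a rough illustration of why declaring metric names helps, the sketch below uses a stand-in class instead of a real MetricProvider subclass. RmseProviderSketch and validate_target_metric are hypothetical names, and the check shown is only an assumption about what eager validation might look like, not the tuner's actual logic.

```python
class RmseProviderSketch:
    """Stand-in for a MetricProvider subclass (hypothetical, not the library class)."""

    @property
    def metric_names(self) -> frozenset[str]:
        # Declare up front which metric names this provider produces.
        return frozenset({"rmse"})


def validate_target_metric(provider, target: str) -> None:
    """Hypothetical eager check: fail before any evaluation is run."""
    if target not in provider.metric_names:
        known = ", ".join(sorted(provider.metric_names))
        raise ValueError(f"Unknown metric {target!r}; provider declares: {known}")
```

With the names declared, a misspelled target metric can be rejected at configuration time instead of after a full evaluation run.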
- compute_deterministic(y_true: ndarray[tuple[Any, ...], dtype[floating]], y_pred: ndarray[tuple[Any, ...], dtype[floating]], quantile: float) -> dict[str, Annotated[float, BeforeValidator(func=_convert_none_to_nan, json_schema_input_type=PydanticUndefined)]]
Compute metrics for a single quantile prediction.
Must be implemented by subclasses that provide deterministic metrics (per quantile).
- Parameters:
- Return type:
- model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': False, 'extra': 'ignore', 'protected_namespaces': ()}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.