ForecastingModel#

class openstef_models.models.ForecastingModel(**data: Any) → None[source]#

Bases: BaseModel, Predictor[TimeSeriesDataset, ForecastDataset]

Complete forecasting pipeline combining preprocessing, prediction, and postprocessing.

Orchestrates the full forecasting workflow by managing feature engineering, model training/prediction, and result postprocessing. Automatically handles the differences between single-horizon and multi-horizon forecasters while ensuring data consistency and validation throughout the pipeline.

Invariants

  • fit() must be called before predict()

  • Forecaster and preprocessing horizons must match during initialization

Important

The cutoff_history parameter is crucial when using lag-based features in preprocessing. For example, a lag-14 transformation creates NaN values for the first 14 days of data. Set cutoff_history to exclude these incomplete rows from training. You must configure this manually based on your preprocessing pipeline since lags cannot be automatically inferred from the transforms.
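As a standalone illustration (using pandas directly, not the library's transforms), the following sketch shows how a lag-14 feature produces NaN rows at the start of the data, and how a matching cutoff_history window would exclude them:

```python
from datetime import timedelta

import pandas as pd

# Hypothetical data: 30 days of a "load" series with a lag-14 feature.
index = pd.date_range("2025-01-01", periods=30, freq="D")
df = pd.DataFrame({"load": range(30)}, index=index)
df["load_lag14"] = df["load"].shift(14)  # first 14 rows become NaN

# cutoff_history must match the maximum lag in preprocessing.
cutoff_history = timedelta(days=14)
cutoff_time = df.index[0] + cutoff_history

# Keep only rows with complete lag features.
complete = df[df.index >= cutoff_time]
assert complete["load_lag14"].notna().all()
```

If cutoff_history were shorter than the largest lag, the retained rows would still contain NaN feature values and degrade training.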

Example

Basic forecasting workflow:

>>> from openstef_models.models.forecasting.constant_median_forecaster import (
...     ConstantMedianForecaster, ConstantMedianForecasterConfig
... )
>>> from datetime import timedelta
>>> from openstef_core.types import LeadTime
>>>
>>> # Note: This is a conceptual example showing the API structure
>>> # Real usage requires implemented forecaster classes
>>> forecaster = ConstantMedianForecaster(
...     config=ConstantMedianForecasterConfig(horizons=[LeadTime.from_string("PT36H")])
... )
>>> # Create and train model
>>> model = ForecastingModel(
...     forecaster=forecaster,
...     cutoff_history=timedelta(days=14),  # Match your maximum lag in preprocessing
... )
>>> model.fit(training_data)
>>>
>>> # Generate forecasts
>>> forecasts = model.predict(new_data)
Parameters:

data (Any)

preprocessing: TransformPipeline[TimeSeriesDataset]#
forecaster: Forecaster#
postprocessing: TransformPipeline[ForecastDataset]#
target_column: str#
data_splitter: DataSplitter#
cutoff_history: timedelta#
evaluation_metrics: list[MetricProvider]#
tags: dict[str, str]#
property config: ForecasterConfig#

Returns the configuration of the underlying forecaster.

property is_fitted: bool#

Check if the predictor has been fitted.

fit(data: TimeSeriesDataset, data_val: TimeSeriesDataset | None = None, data_test: TimeSeriesDataset | None = None) → ModelFitResult[source]#

Train the forecasting model on the provided dataset.

Fits the preprocessing pipeline and underlying forecaster. Handles both single-horizon and multi-horizon forecasters appropriately.

The data splitting follows this sequence:

  1. Split the test set from the full data (using test_splitter)

  2. Split the validation set from the remaining train+val data (using val_splitter)

  3. Train on the final training set

Parameters:
  • data (TimeSeriesDataset) – Historical time series data with features and target values.

  • data_val (TimeSeriesDataset | None) – Optional validation data. If provided, splitters are ignored for validation.

  • data_test (TimeSeriesDataset | None) – Optional test data. If provided, splitters are ignored for test.

Returns:

ModelFitResult containing training details and metrics.

Return type:

ModelFitResult
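The three-step splitting sequence above can be sketched conceptually as follows (this is not the library's actual DataSplitter API; the helper and fractions are illustrative):

```python
# Stand-in for time-ordered samples.
rows = list(range(100))

def split_tail(data, fraction):
    """Split the last `fraction` of rows off as a hold-out set."""
    cut = int(len(data) * (1 - fraction))
    return data[:cut], data[cut:]

# 1. Split the test set from the full data
train_val, test = split_tail(rows, 0.2)
# 2. Split validation from the remaining train+val data
train, val = split_tail(train_val, 0.25)
# 3. Train on the final training set (here: 60/20/20 rows)
```

Passing data_val or data_test explicitly bypasses the corresponding split.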

predict(data: TimeSeriesDataset, forecast_start: datetime | None = None) → ForecastDataset[source]#

Generate forecasts using the trained model.

Transforms input data through the preprocessing pipeline, generates predictions using the underlying forecaster, and applies postprocessing transformations.

Parameters:
  • data (TimeSeriesDataset) – Input time series data for generating forecasts.

  • forecast_start (datetime | None) – Starting time for forecasts. If None, uses data end time.

Returns:

Processed forecast dataset with predictions and uncertainty estimates.

Raises:

NotFittedError – If the model hasn’t been trained yet.

Return type:

ForecastDataset

prepare_input(data: TimeSeriesDataset, forecast_start: datetime | None = None) → ForecastInputDataset[source]#

Prepare input data for forecasting by applying preprocessing and filtering.

Transforms raw time series data through the preprocessing pipeline, restores the target column, and filters out incomplete historical data to ensure training quality.

Parameters:
  • data (TimeSeriesDataset) – Raw time series dataset to prepare for forecasting.

  • forecast_start (datetime | None) – Optional start time for forecasts. If provided and earlier than the cutoff time, overrides the cutoff for data filtering.

Returns:

Processed forecast input dataset ready for model prediction.

Return type:

ForecastInputDataset
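The cutoff-override rule for forecast_start can be sketched as follows (the function name and values are hypothetical, not the library's internals):

```python
from datetime import datetime, timedelta

# Cutoff time derived from the start of the data plus cutoff_history.
data_start = datetime(2025, 1, 1)
cutoff_history = timedelta(days=14)
cutoff_time = data_start + cutoff_history

def effective_cutoff(forecast_start=None):
    # forecast_start only overrides the cutoff when it is earlier.
    if forecast_start is not None and forecast_start < cutoff_time:
        return forecast_start
    return cutoff_time
```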

score(data: TimeSeriesDataset) → SubsetMetric[source]#

Evaluate model performance on the provided dataset.

Generates predictions for the dataset and calculates evaluation metrics by comparing against ground truth values. Uses the configured evaluation metrics to assess forecast quality at the maximum forecast horizon.

Parameters:
  • data (TimeSeriesDataset) – Time series dataset containing both features and target values for evaluation.

Returns:

Evaluation metrics including configured providers (e.g., R2, observed probability) computed at the maximum forecast horizon.

Return type:

SubsetMetric
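For illustration, an R2-style metric of the kind a configured MetricProvider might compute at the maximum horizon could look like this (a simplified standalone sketch, not the library's implementation):

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Hypothetical ground truth vs. forecast at the maximum horizon.
actual = [10.0, 12.0, 11.0, 13.0]
predicted = [10.5, 11.5, 11.0, 12.5]
score = r2_score(actual, predicted)
```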

model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'protected_namespaces': (), 'ser_json_inf_nan': 'null'}#

Configuration for the model; should be a dictionary conforming to pydantic's ConfigDict.

model_post_init(context: Any, /) → None#

This function is meant to behave like a BaseModel method to initialise private attributes.

It takes context as an argument since that’s what pydantic-core passes when calling it.

Parameters:
  • self (BaseModel) – The BaseModel instance.

  • context (Any) – The context.

Return type:

None