Models#

OpenSTEF’s forecasting system is built from composable components. At the lowest level, a Forecaster wraps a single ML algorithm. Transforms handle feature engineering and postprocessing. A Model binds these together into a trainable unit. Higher-level Workflows add lifecycle management, and Presets provide opinionated defaults for production. You choose the level of abstraction that matches your use case.

Note

For how models are selected automatically via metalearning, see Metalearning. For how models are evaluated in backtesting, see BEAM.

Component Overview#

OpenSTEF’s model system is composed of five components, each building on the one below. You can enter at any level depending on how much control you need.

graph LR
    subgraph Workflow["CustomForecastingWorkflow"]
        subgraph Model["ForecastingModel"]
            direction TB
            A[PreProcess Pipeline] --> B[Forecaster] --> C[PostProcess Pipeline]
        end
        D[Callbacks]
        F[Lifecycle Management]
    end

    P["Presets (Config + Factory)"] -.->|creates| Workflow

    classDef primary fill:#00D9C5,stroke:#1E3A5F,stroke-width:2px,color:#000
    classDef secondary fill:#1E3A5F,stroke:#00D9C5,stroke-width:2px,color:#fff
    classDef accent fill:#e6f7f5,stroke:#00D9C5,stroke-width:2px,color:#000
    classDef factory fill:#fff3cd,stroke:#856404,stroke-width:2px,color:#000

    class B secondary
    class A,C primary
    class D,F accent
    class P factory
    

Forecasters#

Forecasters are pure ML predictors. They wrap a specific algorithm (XGBoost, LightGBM, GBLinear, etc.), receive a preprocessed ForecastInputDataset, and return a ForecastDataset. No feature engineering or postprocessing happens inside a Forecaster - it is solely responsible for the mathematical prediction step.

All forecasters implement the Forecaster interface with fit() and predict() methods.
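To make the fit/predict contract concrete, here is a minimal sketch. It is not OpenSTEF's actual interface: `ForecasterLike` and `ToyMedianForecaster` are hypothetical names, and plain lists stand in for ForecastInputDataset and ForecastDataset.

```python
from statistics import median
from typing import List, Optional, Protocol, Sequence


class ForecasterLike(Protocol):
    """Hypothetical stand-in for the Forecaster interface: fit, then predict."""

    def fit(self, features: Sequence[Sequence[float]], target: Sequence[float]) -> None: ...

    def predict(self, features: Sequence[Sequence[float]]) -> List[float]: ...


class ToyMedianForecaster:
    """Predicts the training median for every row and ignores the features."""

    def __init__(self) -> None:
        self._median: Optional[float] = None

    def fit(self, features, target) -> None:
        # Only the prediction step lives here: no feature engineering.
        self._median = median(target)

    def predict(self, features) -> List[float]:
        if self._median is None:
            raise RuntimeError("call fit() before predict()")
        return [self._median for _ in features]


forecaster: ForecasterLike = ToyMedianForecaster()
forecaster.fit([[1.0], [2.0], [3.0]], [10.0, 12.0, 50.0])
print(forecaster.predict([[4.0], [5.0]]))  # [12.0, 12.0]
```

Because callers depend only on the two methods, any object with the same shape can be swapped in without touching the surrounding code.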

Transforms#

Transforms are standalone pre- and postprocessing steps - lag features, holiday indicators, datetime features, quantile sorting, and more. They compose into a TransformPipeline which applies them sequentially. Transforms are stateless or carry minimal fitted state (e.g., scalers).
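The sequential composition can be sketched as follows. This is an illustration under assumptions, not the TransformPipeline API: `ToyPipeline`, `AddLagFeature`, and `MinMaxScaleLoad` are hypothetical names, and rows are plain dictionaries. One transform is stateless, the other carries a small fitted state.

```python
class AddLagFeature:
    """Stateless transform: adds the previous row's load as a lag feature."""

    def fit(self, rows):
        return self  # nothing to learn

    def transform(self, rows):
        out, previous = [], None
        for row in rows:
            out.append({**row, "load_lag_1": previous})
            previous = row["load"]
        return out


class MinMaxScaleLoad:
    """Fitted transform: remembers the training minimum and maximum of 'load'."""

    def fit(self, rows):
        loads = [row["load"] for row in rows]
        self.lo, self.hi = min(loads), max(loads)
        return self

    def transform(self, rows):
        span = (self.hi - self.lo) or 1.0  # avoid division by zero for constant loads
        return [{**row, "load_scaled": (row["load"] - self.lo) / span} for row in rows]


class ToyPipeline:
    """Applies each step in order; the output of one step feeds the next."""

    def __init__(self, steps):
        self.steps = list(steps)

    def fit_transform(self, rows):
        for step in self.steps:
            rows = step.fit(rows).transform(rows)
        return rows

    def transform(self, rows):
        for step in self.steps:
            rows = step.transform(rows)
        return rows


pipeline = ToyPipeline([AddLagFeature(), MinMaxScaleLoad()])
train = pipeline.fit_transform([{"load": 10.0}, {"load": 20.0}, {"load": 30.0}])
new = pipeline.transform([{"load": 40.0}])  # reuses the fitted min and max
print(train[1])
print(new[0])
```

Note that the scaler reuses the state fitted on the training rows, so a new load above the training maximum scales to a value greater than 1.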

Model#

The Model (ForecastingModel) binds a preprocessing TransformPipeline, a Forecaster, and a postprocessing TransformPipeline into a single saveable unit. This is the core trainable object:

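# preprocess_pipeline, forecaster, and postprocess_pipeline are assumed to be
# constructed beforehand (two TransformPipeline objects and a Forecaster).
model = ForecastingModel(
    preprocessing=preprocess_pipeline,  # applied to the raw input
    forecaster=forecaster,  # the wrapped ML algorithm
    postprocessing=postprocess_pipeline,  # applied to the forecast
    target_column="load",  # column to predict
)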

The model’s fit() and predict() methods accept raw TimeSeriesDataset objects and handle the full pipeline internally.
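As a rough sketch of what "handle the full pipeline internally" means, the toy class below chains preprocessing, prediction, and postprocessing. Everything in it is hypothetical (`ToyForecastingModel`, the `P10`/`P50`/`P90` keys, the helper functions); it mirrors the structure described above, not the real ForecastingModel implementation.

```python
class ToyQuantileForecaster:
    """Returns fixed offsets around the training mean, deliberately out of order."""

    def fit(self, features, target):
        self.mean = sum(target) / len(target)

    def predict(self, features):
        # P10 is placed above P50 on purpose so postprocessing has work to do.
        return [
            {"P10": self.mean + 1.0, "P50": self.mean, "P90": self.mean + 5.0}
            for _ in features
        ]


def hour_feature(raw_rows):
    """Preprocessing: turn raw rows into a feature matrix."""
    return [[row["hour"]] for row in raw_rows]


def sort_quantiles(predictions):
    """Postprocessing: reorder values so that P10 <= P50 <= P90."""
    names = ["P10", "P50", "P90"]
    return [dict(zip(names, sorted(p[n] for n in names))) for p in predictions]


class ToyForecastingModel:
    """Hypothetical composition: preprocess -> forecaster -> postprocess."""

    def __init__(self, preprocessing, forecaster, postprocessing):
        self.preprocessing = preprocessing
        self.forecaster = forecaster
        self.postprocessing = postprocessing

    def fit(self, raw_rows, target):
        self.forecaster.fit(self.preprocessing(raw_rows), target)
        return self

    def predict(self, raw_rows):
        raw_forecast = self.forecaster.predict(self.preprocessing(raw_rows))
        return self.postprocessing(raw_forecast)


model = ToyForecastingModel(hour_feature, ToyQuantileForecaster(), sort_quantiles)
model.fit([{"hour": 8}, {"hour": 9}, {"hour": 10}], [10.0, 20.0, 30.0])
forecast = model.predict([{"hour": 11}])
print(forecast)  # [{'P10': 20.0, 'P50': 21.0, 'P90': 25.0}]
```

The caller passes raw rows to both `fit()` and `predict()` and never invokes the transforms directly, which is the property the paragraph above describes.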

Workflow#

The Workflow (CustomForecastingWorkflow) wraps a Model and adds lifecycle management: callbacks for MLflow storage, model reuse logic, model selection, performance monitoring, experiment tagging, and run naming. This is where operational concerns live - the Model stays focused on prediction.
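The separation between operational hooks and prediction logic can be illustrated with a small callback pattern. This is a sketch under assumptions: the hook names `on_fit_start` and `on_fit_end`, the `run_name` argument, and all class names are hypothetical and do not describe CustomForecastingWorkflow's actual callback interface.

```python
class Callback:
    """Hypothetical hook interface; the method names are illustrative."""

    def on_fit_start(self, workflow):
        pass

    def on_fit_end(self, workflow):
        pass


class EventLog(Callback):
    """Records lifecycle events, standing in for storage or monitoring."""

    def __init__(self):
        self.events = []

    def on_fit_start(self, workflow):
        self.events.append("fit_start")

    def on_fit_end(self, workflow):
        self.events.append(f"fit_end:{workflow.run_name}")


class MeanModel:
    """Trivial model so the example runs; it knows nothing about callbacks."""

    def fit(self, target):
        self.mean = sum(target) / len(target)

    def predict(self, n):
        return [self.mean] * n


class ToyWorkflow:
    """Wraps a model and notifies callbacks around the training step."""

    def __init__(self, model, callbacks, run_name):
        self.model = model
        self.callbacks = list(callbacks)
        self.run_name = run_name

    def fit(self, target):
        for callback in self.callbacks:
            callback.on_fit_start(self)
        self.model.fit(target)
        for callback in self.callbacks:
            callback.on_fit_end(self)


log = EventLog()
workflow = ToyWorkflow(MeanModel(), [log], run_name="demo")
workflow.fit([1.0, 2.0, 3.0])
print(log.events)  # ['fit_start', 'fit_end:demo']
print(workflow.model.predict(2))  # [2.0, 2.0]
```

Adding or removing a callback changes what happens around training without any edit to the model class.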

Presets#

For production use, create_forecasting_workflow() is an opinionated factory that constructs a fully-wired CustomForecastingWorkflow from a ForecastingWorkflowConfig. Presets cover the majority of production use cases with sensible defaults for preprocessing, feature engineering, and callbacks.

Warning

OpenSTEF is not opinionated by default - the full configuration surface is exposed at every level. Presets add opinions for convenience. For research or experimentation, use the raw Workflow API for full configurability.

Model Selection Guide#

Most forecasters in OpenSTEF support multi-quantile forecasting, producing probabilistic predictions at configurable quantiles. The exceptions are the Median and Base Case forecasters, which produce only a single quantile.

Forecaster Comparison#

| Model | Strengths | Best For | Quantiles | Extrapolation |
|---|---|---|---|---|
| XGBoost | Non-linear pattern capture; robust | General-purpose | Multi | No |
| LightGBM | Fast training; low memory | General-purpose; large datasets | Multi | No |
| GBLinear | Extrapolates beyond training range | Congestion management | Multi | Yes |
| LGBM Linear | Non-linear splits + linear leaves | Partial extrapolation | Multi | Partial |
| Ensemble (via openstef-meta) | Complementary model combination | Best accuracy | Multi | Partial |
| Constant Quantile | No features needed at prediction | Fallback | Multi | N/A |
| Median | Robust; minimal assumptions | Stable loads; baseline | Single | N/A |
| Base Case | Zero cost; persistence | Baseline reference | Single | N/A |

When to Use What#

General-purpose forecasting: Start with XGBoost or LightGBM. Both capture non-linear patterns in load data well (weather interactions, time-of-day effects, calendar patterns), and LightGBM trains faster on large datasets.

Congestion management: Use GBLinear. Tree-based models cannot predict values outside the range of target values seen during training - a critical limitation when forecasting peak loads that may exceed historical maxima. GBLinear’s linear structure allows natural extrapolation.
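The extrapolation limit can be demonstrated with two toy predictors in plain Python. Both are stand-ins chosen for illustration: a single-split regression tree in place of a boosted tree ensemble, and an ordinary least-squares line in place of GBLinear. The data is made up.

```python
def fit_stump(xs, ys):
    """Fit a one-split regression tree: each leaf predicts its mean target."""
    best = None
    for threshold in sorted(set(xs))[1:]:
        left = [y for x, y in zip(xs, ys) if x < threshold]
        right = [y for x, y in zip(xs, ys) if x >= threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((y - left_mean) ** 2 for y in left)
        sse += sum((y - right_mean) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x < threshold else right_mean


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    covariance = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    variance = sum((x - x_mean) ** 2 for x in xs)
    slope = covariance / variance
    intercept = y_mean - slope * x_mean
    return lambda x: slope * x + intercept


# Synthetic training data following load = 2 * x + 1, with x in [0, 5].
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0, 11.0]

stump = fit_stump(xs, ys)
line = fit_line(xs, ys)

# Query far outside the training range.
print("tree-like prediction at x=10:", stump(10.0))
print("linear prediction at x=10:", line(10.0))
print("largest target seen in training:", max(ys))
```

The stump returns a leaf mean, which can never exceed the largest training target, while the line continues the fitted trend. Summing many trees changes the granularity of the prediction but not this bound, since every tree still outputs constants learned from the training targets.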

Best accuracy (production): Use the Ensemble approach (available via openstef-meta). Combining tree-based and linear forecasters exploits their complementary strengths - trees capture non-linear interactions while linear models provide extrapolation capability and stability.

Stable/predictable loads: The Median forecaster provides a robust baseline with minimal complexity. Useful for loads with very low variance or as a sanity-check reference.

Fallback/degraded mode: The Constant Quantile forecaster learns fixed quantile values per hour of day. It requires no input features at prediction time, making it suitable as a last-resort fallback when data pipelines fail.
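The idea of fixed quantile values per hour of day can be sketched as a lookup table. This is a hypothetical illustration, not the Constant Quantile forecaster's implementation: the class name, the default quantiles, and the linear-interpolation quantile rule are all assumptions.

```python
from collections import defaultdict


def empirical_quantile(values, q):
    """Linear-interpolation quantile of a non-empty list, with q in [0, 1]."""
    ordered = sorted(values)
    position = q * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


class ToyConstantQuantileForecaster:
    """Learns one value per (hour of day, quantile) and replays it at prediction."""

    def __init__(self, quantiles=(0.1, 0.5, 0.9)):
        self.quantiles = quantiles
        self.table = {}

    def fit(self, hours, loads):
        by_hour = defaultdict(list)
        for hour, load in zip(hours, loads):
            by_hour[hour].append(load)
        self.table = {
            hour: {q: empirical_quantile(values, q) for q in self.quantiles}
            for hour, values in by_hour.items()
        }
        return self

    def predict(self, hours):
        # Only the hour of the forecast timestamp is needed, no weather or
        # lag features. An hour never seen in training raises KeyError.
        return [self.table[hour] for hour in hours]


fallback = ToyConstantQuantileForecaster().fit(
    hours=[8, 8, 8, 20, 20, 20],
    loads=[10.0, 20.0, 30.0, 40.0, 50.0, 60.0],
)
prediction = fallback.predict([8, 20])
print(prediction)
```

Since the table depends only on the clock, the forecast remains available even when every upstream feature source is down.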

Choosing Your Abstraction Level#

| Use Case | Recommended Component | Why |
|---|---|---|
| Production deployment | Presets (create_forecasting_workflow()) | Sensible defaults, MLflow integration, model reuse |
| Custom preprocessing research | ForecastingModel | Full control over transforms without lifecycle overhead |
| Novel algorithm development | Forecaster | Implement the interface, plug into any higher level |
| Operational monitoring | CustomForecastingWorkflow | Add callbacks without changing model logic |

Most users start with Presets and only drop down to lower levels when they need custom behavior. The component boundaries are designed so you can replace one piece (e.g., swap a Forecaster or add a Transform) without rewriting the rest of the pipeline.