EnsembleForecastingWorkflowConfig#

class openstef_meta.presets.EnsembleForecastingWorkflowConfig(**data: Any) -> None#

Bases: BaseConfig

Configuration for ensemble forecasting workflows.

Parameters:

data (Any)

kind: Literal['ensemble']#

Discriminator tag for config type.

model_id: TypeAliasType#

Unique identifier for the forecasting model.

run_name: str | None#

Optional name for this workflow run; can be used for versioning.

ensemble_type: Literal['learned_weights', 'stacking', 'rules']#

base_models: Sequence[Literal['lgbm', 'gblinear', 'xgboost', 'lgbm_linear']]#

combiner_model: Literal['lgbm', 'rf', 'xgboost', 'logistic', 'gblinear']#

quantiles: list[Quantile]#

List of quantiles to predict for probabilistic forecasting.

sample_interval: timedelta#

Time interval between consecutive data samples.

horizons: list[LeadTime]#

List of forecast horizons to predict.

location: LocationConfig#

Location information for the forecasting workflow.

xgboost_hyperparams: XGBoostHyperParams#

Hyperparameters for XGBoost forecaster.

gblinear_hyperparams: GBLinearHyperParams#

Hyperparameters for GBLinear forecaster.

lgbm_hyperparams: LGBMHyperParams#

Hyperparameters for LightGBM forecaster.

lgbmlinear_hyperparams: LGBMLinearHyperParams#

Hyperparameters for LightGBM Linear forecaster.

combiner_lgbm_hyperparams: LGBMCombinerHyperParams#

Hyperparameters for LightGBM combiner.

combiner_rf_hyperparams: RFCombinerHyperParams#

Hyperparameters for Random Forest combiner.

combiner_xgboost_hyperparams: XGBCombinerHyperParams#

Hyperparameters for XGBoost combiner.

combiner_logistic_hyperparams: LogisticCombinerHyperParams#

Hyperparameters for Logistic Regression combiner.

combiner_stacking_lgbm_hyperparams: LGBMHyperParams#

Hyperparameters for LightGBM stacking combiner.

combiner_stacking_gblinear_hyperparams: GBLinearHyperParams#

Hyperparameters for GBLinear stacking combiner.

target_column: str#

Name of the target variable column in datasets.

energy_price_column: str#

Name of the energy price column in datasets.

radiation_column: str#

Name of the radiation column in datasets.

wind_speed_column: str#

Name of the wind speed column in datasets.

pressure_column: str#

Name of the pressure column in datasets.

temperature_column: str#

Name of the temperature column in datasets.

relative_humidity_column: str#

Name of the relative humidity column in datasets.

selected_features: FeatureSelection#

Feature selection for which features to include/exclude.

predict_history: timedelta#

Amount of historical data available at prediction time.

cutoff_history: timedelta#

Amount of historical data to exclude from training and prediction because lag-based preprocessing leaves its features incomplete. With lag transforms (e.g., lag-14), the first N days contain NaN values, so set this to your maximum lag duration (e.g., timedelta(days=14)). If you use lags, this should equal predict_history. The default of 0 disables the cutoff and assumes preprocessing creates no invalid rows, which keeps the same behaviour as openstef 3.0.
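
Why a lag transform leaves the first rows unusable, and how a matching cutoff removes them, can be illustrated with a minimal standard-library sketch. The helper `add_lag` and the toy series are hypothetical and not part of openstef; only the idea of discarding the first cutoff_history worth of rows is taken from the description above.

```python
from datetime import datetime, timedelta
from typing import Optional

def add_lag(series: dict[datetime, float], lag: timedelta) -> dict[datetime, Optional[float]]:
    """Return, for each timestamp, the value observed `lag` earlier (None if unavailable)."""
    return {ts: series.get(ts - lag) for ts in series}

# Toy daily series covering 20 days (hypothetical data).
start = datetime(2024, 1, 1)
load = {start + timedelta(days=i): float(i) for i in range(20)}

# A 14-day lag has nothing to look back at during the first 14 days.
lagged = add_lag(load, timedelta(days=14))

# Assumption: the cutoff simply drops everything before start + cutoff_history.
cutoff_history = timedelta(days=14)
usable = {ts: v for ts, v in lagged.items() if ts >= start + cutoff_history}
```

In this toy case the 14 rows without a lagged value are exactly the rows the cutoff removes, which is why the cutoff is set to the maximum lag duration.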

completeness_threshold: float#

Minimum fraction of data that should be available for making a regular forecast.
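
As a rough illustration of a completeness check, the sketch below computes the fraction of non-missing values and compares it to a threshold. The function `completeness`, the sample data, and the threshold value are assumptions for illustration; how the library actually measures completeness (for example per column or with weights) is not stated here.

```python
import math

def completeness(values: list[float]) -> float:
    """Fraction of entries that are present (not NaN); 0.0 for an empty input."""
    if not values:
        return 0.0
    present = sum(1 for v in values if not math.isnan(v))
    return present / len(values)

completeness_threshold = 0.5  # hypothetical value, not necessarily the library default
measurements = [1.0, float("nan"), 2.0, 3.0]

# A regular forecast is only attempted when enough data is available.
can_forecast = completeness(measurements) >= completeness_threshold
```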

flatliner_threshold: timedelta#

Duration for which the load has to be constant to detect a flatliner.

detect_non_zero_flatliner: bool#

If True, flatliners are also detected on non-zero values (median of the load).

predict_nonzero_flatliner: bool#

If True, predict the median of load measurements instead of zero (only for flatliner model).
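
The interplay of flatliner_threshold and detect_non_zero_flatliner can be sketched as follows. This is a simplified illustration under stated assumptions, not the library's detector: `is_flatliner` is a hypothetical helper that only looks at the trailing run of identical values, and it does not model the median comparison mentioned above.

```python
from datetime import datetime, timedelta

def is_flatliner(
    samples: list[tuple[datetime, float]],
    threshold: timedelta,
    detect_non_zero: bool = False,
) -> bool:
    """Return True if the most recent samples stayed constant for at least `threshold`."""
    if not samples:
        return False
    last_ts, last_value = samples[-1]
    # By default only a constant zero load counts as a flatliner.
    if last_value != 0 and not detect_non_zero:
        return False
    # Walk backwards to find where the constant run started.
    constant_since = last_ts
    for ts, value in reversed(samples):
        if value != last_value:
            break
        constant_since = ts
    return last_ts - constant_since >= threshold

# Hypothetical 15-minute data: one series ends in zeros, the other in a constant 7.0.
start = datetime(2024, 1, 1)
step = timedelta(minutes=15)
zero_tail = [(start + i * step, v) for i, v in enumerate([5.0, 3.0, 0.0, 0.0, 0.0, 0.0, 0.0])]
nonzero_tail = [(start + i * step, v) for i, v in enumerate([5.0, 7.0, 7.0, 7.0, 7.0, 7.0])]
```

With a one-hour threshold, `zero_tail` is flagged, while `nonzero_tail` is flagged only when `detect_non_zero=True`.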

shifters: list[Shifter]#

List of feature shifts to align aggregation intervals. Each Shifter can target different features with different aggregation periods.

rolling_aggregate_features: list[TypeAliasType]#

Rolling aggregate(s) of the load to use as features in the model.

clip_features: FeatureSelection#

Feature selection for which features to clip to their learned range.

nan_on_outlier_features: FeatureSelection#

Feature selection for which features to replace out-of-range values with NaN. Defaults to no features (disabled).
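
The difference between clip_features and nan_on_outlier_features can be shown with a small sketch. It assumes the "learned range" is the minimum and maximum seen during fitting; the helper names and data are hypothetical and the library's transforms may define the range differently.

```python
import math

def fit_range(train: list[float]) -> tuple[float, float]:
    """Learn the value range from training data (assumed here to be min and max)."""
    return min(train), max(train)

def clip_to_range(values: list[float], low: float, high: float) -> list[float]:
    """Pull out-of-range values back to the nearest bound."""
    return [min(max(v, low), high) for v in values]

def nan_on_outlier(values: list[float], low: float, high: float) -> list[float]:
    """Replace out-of-range values with NaN instead of clipping them."""
    return [v if low <= v <= high else math.nan for v in values]

low, high = fit_range([0.0, 10.0, 5.0])
new_values = [-3.0, 4.0, 12.0]

clipped = clip_to_range(new_values, low, high)   # out-of-range values become 0.0 and 10.0
masked = nan_on_outlier(new_values, low, high)   # out-of-range values become NaN
```

Clipping keeps every row but distorts extreme values; masking leaves it to the downstream model or imputation step to handle the gap.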

max_day_lags: int#

Maximum number of days to look back for day-based lags. Default is 14 days (two weekly cycles). Set to 7 for a single weekly cycle.
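
To make the link between max_day_lags and lag durations concrete, here is a minimal sketch that assumes one lag per whole day up to the maximum. The function `day_lags` is hypothetical; which lags the library actually generates is not specified here.

```python
from datetime import timedelta

def day_lags(max_day_lags: int) -> list[timedelta]:
    """One lag per day, from 1 day up to and including `max_day_lags` days (assumed scheme)."""
    return [timedelta(days=d) for d in range(1, max_day_lags + 1)]

two_weeks = day_lags(14)
one_week = day_lags(7)

# The longest lag is the amount of history that lag features cannot fill.
longest_lag = max(two_weeks)
```

Under this assumption the longest lag for the default setting is 14 days, which matches the timedelta(days=14) example given for cutoff_history.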

forecaster_sample_weights: dict[str, SampleWeightConfig]#

Per-forecaster sample weighting configuration. Use weight_exponent=0 to produce uniform weights.

combiner_sample_weight: SampleWeightConfig#

Sample weighting configuration for the forecast combiner. Defaults to weight_exponent=0 (uniform weights).
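
Why weight_exponent=0 yields uniform weights can be seen from a simple power-law weighting sketch. The formula below, which scales the absolute target and raises it to the exponent, is an assumption for illustration; `sample_weights` is a hypothetical helper and the actual SampleWeightConfig may include further options.

```python
def sample_weights(targets: list[float], weight_exponent: float) -> list[float]:
    """Weight each sample by (|target| / max|target|) ** weight_exponent (assumed formula)."""
    if not targets:
        return []
    magnitudes = [abs(t) for t in targets]
    scale = max(magnitudes) or 1.0  # avoid division by zero for an all-zero target
    return [(m / scale) ** weight_exponent for m in magnitudes]

targets = [0.0, 5.0, -10.0]

uniform = sample_weights(targets, weight_exponent=0)   # anything to the power 0 is 1
weighted = sample_weights(targets, weight_exponent=1)  # proportional to |target|
```

With an exponent of 0 every sample gets weight 1.0, while larger exponents emphasise samples with large absolute target values.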

data_splitter: DataSplitter#

Configuration for splitting data into training, validation, and test sets.

evaluation_metrics: list[MetricProvider]#

List of metric providers for evaluating model performance.

mlflow_storage: MLFlowStorage | None#

Configuration for MLflow experiment tracking and model storage.

model_reuse_enable: bool#

Whether to enable reuse of previously trained models.

model_reuse_max_age: timedelta#

Maximum age of a model to be considered for reuse.
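
How model_reuse_enable and model_reuse_max_age could combine into a reuse decision is sketched below. The function `can_reuse_model` and the timestamps are hypothetical; the library may apply additional checks before reusing a stored model.

```python
from datetime import datetime, timedelta, timezone

def can_reuse_model(
    trained_at: datetime,
    now: datetime,
    model_reuse_enable: bool,
    model_reuse_max_age: timedelta,
) -> bool:
    """Reuse a stored model only if reuse is enabled and the model is not too old."""
    return model_reuse_enable and (now - trained_at) <= model_reuse_max_age

now = datetime(2024, 1, 10, tzinfo=timezone.utc)
trained_recently = datetime(2024, 1, 8, tzinfo=timezone.utc)
trained_long_ago = datetime(2023, 12, 1, tzinfo=timezone.utc)
max_age = timedelta(days=7)  # hypothetical value
```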

model_selection_enable: bool#

Whether to enable automatic model selection based on performance.

model_selection_metric: tuple[Union[Quantile, Literal['global']], str, TypeAliasType]#

Metric to monitor for model performance when retraining.

model_selection_old_model_penalty: float#

Penalty to apply to the old model’s metric to bias selection towards newer models.
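
One way such a penalty can bias selection is sketched below. It assumes a multiplicative penalty on positive scores and a known metric direction; neither is confirmed by the description above, and `prefer_new_model` is a hypothetical helper rather than the library's selection logic.

```python
def prefer_new_model(
    new_score: float,
    old_score: float,
    old_model_penalty: float,
    higher_is_better: bool,
) -> bool:
    """Compare the new model against a penalised version of the old model's score."""
    if higher_is_better:
        # Assumed: shrink the old model's score so the new model wins more easily.
        return new_score >= old_score / old_model_penalty
    # Assumed: inflate the old model's error so the new model wins more easily.
    return new_score <= old_score * old_model_penalty

# Error metric (lower is better): the new model is slightly worse than the old one.
with_penalty = prefer_new_model(1.05, 1.0, old_model_penalty=1.1, higher_is_better=False)
without_penalty = prefer_new_model(1.05, 1.0, old_model_penalty=1.0, higher_is_better=False)
```

In this sketch a penalty above 1 lets a slightly worse new model replace the old one, while a penalty of exactly 1 compares the raw scores.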

model_performance_callback_enabled: bool#

Whether to enable the ModelPerformanceCallback that evaluates model performance at the end of fitting.

model_performance_callback_metric_threshold: tuple[Union[Quantile, Literal['global']], str, TypeAliasType, float]#

Metric to monitor for model performance threshold at the end of fitting.

verbosity: Literal[0, 1, 2, 3, True]#

Verbosity level: 0=silent, 1=warning, 2=info, 3=debug.

tags: dict[str, str]#

Optional metadata tags for the model.

experiment_tags: dict[str, str]#

Optional metadata tags for experiment tracking.

model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': False, 'extra': 'ignore', 'protected_namespaces': ()}#

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.