EnsembleForecastingWorkflowConfig#
- class openstef_meta.presets.EnsembleForecastingWorkflowConfig(**data: Any) -> None#
Bases: BaseConfig
Configuration for ensemble forecasting workflows.
- Parameters:
data (Any)
- model_id: ModelIdentifier#
Unique identifier for the forecasting model.
- location: LocationConfig#
Location information for the forecasting workflow.
- xgboost_hyperparams: XGBoostHyperParams#
Hyperparameters for XGBoost forecaster.
- gblinear_hyperparams: GBLinearHyperParams#
Hyperparameters for GBLinear forecaster.
- lgbm_hyperparams: LGBMHyperParams#
Hyperparameters for LightGBM forecaster.
- lgbmlinear_hyperparams: LGBMLinearHyperParams#
Hyperparameters for LightGBM linear forecaster.
- combiner_lgbm_hyperparams: LGBMCombinerHyperParams#
Hyperparameters for LightGBM combiner.
- combiner_rf_hyperparams: RFCombinerHyperParams#
Hyperparameters for Random Forest combiner.
- combiner_xgboost_hyperparams: XGBCombinerHyperParams#
Hyperparameters for XGBoost combiner.
- combiner_logistic_hyperparams: LogisticCombinerHyperParams#
Hyperparameters for Logistic Regression combiner.
- combiner_stacking_lgbm_hyperparams: LGBMHyperParams#
Hyperparameters for LightGBM stacking combiner.
- combiner_stacking_gblinear_hyperparams: GBLinearHyperParams#
Hyperparameters for GBLinear stacking combiner.
- selected_features: FeatureSelection#
Feature selection for which features to include/exclude.
- cutoff_history: timedelta#
Amount of historical data to exclude from training and prediction because lag-based preprocessing leaves its features incomplete. With lag transforms (e.g., lag-14), the first N days contain NaN values, where N is the largest lag in days. Set this to match your maximum lag duration (e.g., timedelta(days=14)). The default of 0 assumes that preprocessing creates no invalid rows. Note: when using lags, this should equal predict_history. The cutoff is disabled by default to keep the same behaviour as openstef 3.0.
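To make the effect of lag transforms concrete, here is a minimal sketch in plain Python. It does not use the openstef API: `apply_lag` and `drop_cutoff` are hypothetical helpers and the daily load values are made up. It only shows why the first rows become invalid and how a cutoff equal to the maximum lag removes them.

```python
from __future__ import annotations

from datetime import datetime, timedelta


def apply_lag(series: dict[datetime, float], lag: timedelta) -> dict[datetime, float | None]:
    """Hypothetical helper: value observed `lag` earlier, or None if unavailable."""
    return {ts: series.get(ts - lag) for ts in series}


def drop_cutoff(rows: dict[datetime, float | None], cutoff_history: timedelta) -> dict[datetime, float | None]:
    """Hypothetical helper: drop rows within `cutoff_history` of the first timestamp."""
    first = min(rows)
    return {ts: value for ts, value in rows.items() if ts >= first + cutoff_history}


# 21 days of made-up daily load values.
start = datetime(2024, 1, 1)
load = {start + timedelta(days=i): float(i) for i in range(21)}

# A 14-day lag has no source value for the first 14 rows.
lagged = apply_lag(load, timedelta(days=14))

# Matching the cutoff to the maximum lag leaves only complete rows.
valid = drop_cutoff(lagged, cutoff_history=timedelta(days=14))
```

With a cutoff shorter than the lag, some of the remaining rows would still hold missing values, which is why the two durations are kept equal.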
- completeness_threshold: float#
Minimum fraction of data that should be available for making a regular forecast.
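As an illustration of the threshold, the sketch below computes the fraction of non-missing values and compares it with the threshold. It is written under assumptions: `completeness` and `can_forecast` are hypothetical names, and both the definition of completeness and the `>=` comparison may differ from what the library actually does.

```python
import math


def completeness(values: list) -> float:
    """Fraction of entries that are neither None nor NaN (assumed definition)."""
    if not values:
        return 0.0
    present = sum(1 for v in values if v is not None and not math.isnan(v))
    return present / len(values)


def can_forecast(values: list, completeness_threshold: float) -> bool:
    """Assumed rule: a regular forecast needs completeness at or above the threshold."""
    return completeness(values) >= completeness_threshold


# Made-up input with two of four values missing.
measurements = [1.0, float("nan"), 3.0, None]
fraction = completeness(measurements)
```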
- flatliner_threshold: timedelta#
Duration for which the load has to be constant to detect a flatliner.
- detect_non_zero_flatliner: bool#
If True, flatliners are also detected on non-zero values (median of the load).
- predict_nonzero_flatliner: bool#
If True, predict the median of load measurements instead of zero (only for flatliner model).
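The flatliner settings above can be illustrated with a simplified sketch. It is not the library's detector: `is_flatliner` is a hypothetical function that flags the trailing window when every sample in it has the same value, and it treats any constant non-zero value as a flatliner when `detect_non_zero` is set, whereas the description above refers to the median of the load.

```python
from datetime import datetime, timedelta


def is_flatliner(samples, flatliner_threshold, detect_non_zero=False):
    """Hypothetical check on (timestamp, load) pairs sorted by time.

    Returns True when the load was constant over the trailing
    `flatliner_threshold`; non-zero constants count only if requested.
    """
    if not samples:
        return False
    window_start = samples[-1][0] - flatliner_threshold
    # Not enough history to cover the whole threshold window.
    if samples[0][0] > window_start:
        return False
    window = [value for ts, value in samples if ts >= window_start]
    if any(value != window[0] for value in window):
        return False
    return detect_non_zero or window[0] == 0.0


# Made-up 15-minute samples spanning exactly 24 hours.
start = datetime(2024, 1, 1)
stamps = [start + timedelta(minutes=15 * i) for i in range(97)]
zero_load = [(ts, 0.0) for ts in stamps]
stuck_load = [(ts, 5.0) for ts in stamps]
normal_load = [(ts, float(i % 4)) for i, ts in enumerate(stamps)]
```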
- shifters: list[Shifter]#
List of feature shifts to align aggregation intervals. Each Shifter can target different features with different aggregation periods.
- rolling_aggregate_features: list[AggregationFunction]#
If not None, rolling aggregate(s) of load will be used as features in the model.
- clip_features: FeatureSelection#
Feature selection for which features to clip to their learned range.
- nan_on_outlier_features: FeatureSelection#
Feature selection for which features to replace out-of-range values with NaN. Defaults to no features (disabled).
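The difference between clipping and NaN replacement can be sketched as follows. The helpers are hypothetical, and the "learned range" is assumed here to be the minimum and maximum seen in the training data, which may not match the library's definition.

```python
import math


def learn_range(train: list[float]) -> tuple[float, float]:
    """Assumed definition of the learned range: min and max of the training values."""
    return min(train), max(train)


def clip_to_range(values: list[float], low: float, high: float) -> list[float]:
    """Pull out-of-range values back to the nearest bound."""
    return [min(max(v, low), high) for v in values]


def nan_outside_range(values: list[float], low: float, high: float) -> list[float]:
    """Replace out-of-range values with NaN instead of clipping them."""
    return [v if low <= v <= high else math.nan for v in values]


# Made-up training and prediction values for one feature.
low, high = learn_range([1.0, 5.0, 3.0])
clipped = clip_to_range([0.0, 2.0, 9.0], low, high)
masked = nan_outside_range([0.0, 2.0, 9.0], low, high)
```

Clipping keeps every row usable but hides how far a value strayed from the training range; NaN replacement marks the value as missing for whatever handles missing data downstream.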
- max_day_lags: int#
Maximum number of days to look back for day-based lags. Default is 14 days (two weekly cycles). Set to 7 for a single weekly cycle.
- forecaster_sample_weights: dict[str, SampleWeightConfig]#
Per-forecaster sample weighting configuration. Use weight_exponent=0 to produce uniform weights.
- combiner_sample_weight: SampleWeightConfig#
Sample weighting configuration for the forecast combiner. Defaults to weight_exponent=0 (uniform weights).
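The weighting formula itself is not given on this page. The sketch below assumes weights of the form `(|y| / scale) ** weight_exponent`, with `scale` taken as the largest absolute target; both the formula and the `sample_weights` helper are illustrative assumptions. It shows why an exponent of 0 yields uniform weights under that assumption.

```python
def sample_weights(targets: list[float], weight_exponent: float) -> list[float]:
    """Assumed weighting: normalised absolute target raised to `weight_exponent`."""
    # Fall back to 1.0 so an all-zero (or empty) target list does not divide by zero.
    scale = max((abs(y) for y in targets), default=0.0) or 1.0
    return [(abs(y) / scale) ** weight_exponent for y in targets]


# Made-up target values.
targets = [0.0, 5.0, 10.0]
uniform = sample_weights(targets, weight_exponent=0)
proportional = sample_weights(targets, weight_exponent=1)
```

Any number raised to the power 0 is 1 in Python, including `0.0 ** 0`, so every sample receives the same weight regardless of its target value.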
- data_splitter: DataSplitter#
Configuration for splitting data into training, validation, and test sets.
- mlflow_storage: MLFlowStorage | None#
Configuration for MLflow experiment tracking and model storage.
- model_selection_metric: tuple[Quantile | Literal['global'], str, MetricDirection]#
Metric to monitor for model performance when retraining.
- model_selection_old_model_penalty: float#
Penalty to apply to the old model’s metric to bias selection towards newer models.
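How the penalty enters the comparison is not specified here. One plausible reading, sketched below, is a multiplicative penalty on the old model's metric: a value above 1 makes the old model look worse, so a new model that is slightly behind can still be selected. The `pick_model` function, the `lower_is_better` flag, and the assumption of positive metric values are all hypothetical.

```python
def pick_model(
    old_metric: float,
    new_metric: float,
    old_model_penalty: float,
    lower_is_better: bool = True,
) -> str:
    """Return "new" or "old" under an assumed multiplicative penalty on the old metric."""
    if lower_is_better:
        # Inflate the old model's error before comparing.
        return "new" if new_metric <= old_metric * old_model_penalty else "old"
    # Deflate the old model's score before comparing.
    return "new" if new_metric >= old_metric / old_model_penalty else "old"


# Made-up error values: the new model is 5% worse than the old one.
with_penalty = pick_model(old_metric=1.0, new_metric=1.05, old_model_penalty=1.1)
without_penalty = pick_model(old_metric=1.0, new_metric=1.05, old_model_penalty=1.0)
```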
- model_performance_callback_enabled: bool#
Whether to enable the ModelPerformanceCallback that evaluates model performance at the end of fitting.