openstef.model.regressors package¶

Submodules¶

openstef.model.regressors.arima module¶

This module contains the SARIMAX regressor, a wrapper around the statsmodels implementation.

class openstef.model.regressors.arima.ARIMAOpenstfRegressor(backtest_max_horizon=1440, order=(0, 0, 0), seasonal_order=(0, 0, 0, 0), trend=None)¶

Bases: OpenstfRegressor

Wrapper around the statsmodels implementation of the (S)ARIMA(X) model.

Fitting a statsmodels ARIMA model produces a result object, which is used to perform the computations required for forecasting (see https://www.statsmodels.org/dev/generated/statsmodels.tsa.arima.model.ARIMAResults.html).

To make a prediction, the result object’s historic data must first be updated, i.e. the past values of the target/endogenous data and the features/exogenous data. This applies the fitted parameters to new data that is unrelated to the original training data. The update is performed by the method update_historic_data.

In the following code, the statsmodels and scikit-learn terminology for the variables is used interchangeably:

the features ‘x’ are equivalent to the exogenous data: ‘exog’ for short.
the target ‘y’ is equivalent to the endogenous data: ‘endog’ for short.

More information is available at https://www.statsmodels.org/stable/endog_exog.html.

property can_predict_quantiles¶: Indicates whether this model can make quantile predictions.

property feature_names¶: The names of the features used to train the model.

fit(x, y, **kwargs)¶

Fits the regressor.

Parameters:

x – Feature matrix
y – Labels
kwargs – model-specific keywords

Returns:

Fitted model

get_feature_importance()¶

Because the report needs ‘weight’ and ‘gain’ as importance metrics, the values are stored under these names.

‘weight’ corresponds to the coefficient values
‘gain’ corresponds to the p-value of the nullity test of each coefficient

predict(x, quantile=0.5, **kwargs)¶

Makes a prediction. Only available after the model has been trained.

Parameters:

x – Feature matrix
kwargs – model-specific keywords

Returns:

Prediction

predict_quantile(start, end, exog, quantile)¶

Quantile prediction.

It relies on the parameters’ confidence intervals.

Parameters:

start (int, str, or datetime, optional) – Zero-indexed observation number at which to start forecasting, i.e., the first forecast is start. Can also be a date string to parse or a datetime type. Default is the zeroth observation.
end (int, str, or datetime, optional) – Zero-indexed observation number at which to end forecasting, i.e., the last forecast is end. Can also be a date string to parse or a datetime type. However, if the dates index does not have a fixed frequency, end must be an integer index if you want out of sample prediction. Default is the last observation in the sample.
exog (pd.DataFrame) – Exogenous data (features).
quantile (float) – The quantile for the confidence interval.

Returns:

The quantile prediction.

Return type:

pd.Series
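The link between a quantile and a confidence interval can be illustrated with a short sketch. It assumes normally distributed forecast errors, which this page does not state, and the function names are hypothetical rather than part of openSTEF or statsmodels. The lower and upper bounds of a two-sided (1 - alpha) interval are the alpha/2 and 1 - alpha/2 quantiles of the forecast distribution.

```python
from statistics import NormalDist


def gaussian_quantile(mean: float, std_error: float, quantile: float) -> float:
    """Return the requested quantile of a normal forecast distribution.

    Assumption: forecast errors are Gaussian with the given standard error.
    """
    if not 0.0 < quantile < 1.0:
        raise ValueError("quantile must lie strictly between 0 and 1")
    z = NormalDist().inv_cdf(quantile)  # standard normal quantile
    return mean + z * std_error


def interval_bounds(mean: float, std_error: float, alpha: float) -> tuple:
    """Two-sided (1 - alpha) interval expressed as a pair of quantiles."""
    lower = gaussian_quantile(mean, std_error, alpha / 2)
    upper = gaussian_quantile(mean, std_error, 1 - alpha / 2)
    return lower, upper
```

Under this assumption the 0.5 quantile coincides with the point forecast, and the interval is symmetric around it.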

score(x, y)¶

Compute R2 score with backtesting strategy.

The backtest is performed by the time series cross-validator of scikit-learn, which in the kth split returns the first k folds as the training set and the (k+1)th fold as the test set (see https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.TimeSeriesSplit.html).

For each split, the historic data must be updated with (x_past, y_past).
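The expanding-window splitting described above can be sketched in plain Python. This is an illustration of the fold layout only, written to follow what TimeSeriesSplit does with its default settings (no gap, no maximum training size); the function name is hypothetical and the code is not taken from openSTEF.

```python
def expanding_window_splits(n_samples: int, n_splits: int):
    """Yield (train_indices, test_indices) pairs with a growing training window."""
    test_size = n_samples // (n_splits + 1)
    if test_size == 0:
        raise ValueError("too few samples for the requested number of splits")
    first_test_start = n_samples - n_splits * test_size
    for test_start in range(first_test_start, n_samples, test_size):
        # Everything before the test fold is training data (the "past").
        train = list(range(test_start))
        test = list(range(test_start, test_start + test_size))
        yield train, test
```

In a backtest, the training indices of each split would play the role of (x_past, y_past), and the test indices the period to forecast and score.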

set_fit_request(*, x: bool | None | str = '$UNCHANGED$') → ARIMAOpenstfRegressor¶

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x parameter in fit.
Returns:: self – The updated object.
Return type:: object
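The four request options can be summarised with a simplified model. The sketch below is not scikit-learn's implementation; the function name, arguments, and error type are hypothetical and only mimic the behaviour described in the list above for a single parameter.

```python
def route_metadata(request, param, provided):
    """Return the keyword arguments a meta-estimator would forward.

    request:  True, False, None, or a str alias (the value given to set_*_request)
    param:    name of the parameter on the sub-estimator's method
    provided: metadata the user passed to the meta-estimator
    """
    if request is True:
        # Requested: forward it if the user supplied it, otherwise ignore.
        return {param: provided[param]} if param in provided else {}
    if request is False:
        # Not requested: never forwarded.
        return {}
    if request is None:
        # Not requested, and supplying it is treated as an error.
        if param in provided:
            raise ValueError(f"'{param}' was provided but its routing is unset")
        return {}
    if isinstance(request, str):
        # Alias: the user supplies the value under the alias name.
        return {param: provided[request]} if request in provided else {}
    raise TypeError("request must be True, False, None or a str alias")
```

For example, with a request of "features_matrix" for the parameter x, a value passed as features_matrix=... would reach the method as x=....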

set_predict_request(*, quantile: bool | None | str = '$UNCHANGED$', x: bool | None | str = '$UNCHANGED$') → ARIMAOpenstfRegressor¶

Request metadata passed to the predict method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

quantile (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for quantile parameter in predict.
x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x parameter in predict.

Returns:

self – The updated object.

Return type:

object

set_score_request(*, x: bool | None | str = '$UNCHANGED$') → ARIMAOpenstfRegressor¶

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x parameter in score.
Returns:: self – The updated object.
Return type:: object

update_historic_data(x_past, y_past)¶

Apply the fitted parameters to new data unrelated to the original training data. This method works by side effect: it replaces the stored result object.

Creates a new result object using the current fitted parameters, applied to a completely new dataset that is assumed to be unrelated to the model’s original data. The new results can then be used for analysis or forecasting. It should be called before forecasting, to place the historic data immediately before the first forecast timestamp, using:

New observations from the modeled time-series process.

New observations of exogenous regressors.

Parameters:

x_past (pd.DataFrame) – The exogenous (features) data.
y_past (pd.DataFrame) – The endogenous (target) data.
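The idea of keeping the fitted parameters while swapping the history can be shown with a toy model. The class below is purely illustrative: it is a hand-written AR(1) model with one exogenous input, not the statsmodels result object, and all names and coefficients are assumptions. statsmodels result objects offer an apply method for this purpose; whether the wrapper calls it is not stated on this page.

```python
class ToyArx:
    """Toy model y[t] = phi * y[t-1] + beta * x[t] with fixed, already fitted parameters."""

    def __init__(self, phi: float, beta: float):
        self.phi = phi
        self.beta = beta
        self.last_y = None  # most recent observed target value

    def update_historic_data(self, x_past, y_past):
        # Side effect only: parameters stay untouched, the history is replaced.
        self.last_y = y_past[-1]

    def forecast(self, x_future):
        if self.last_y is None:
            raise RuntimeError("call update_historic_data before forecasting")
        predictions = []
        previous = self.last_y
        for x in x_future:
            previous = self.phi * previous + self.beta * x
            predictions.append(previous)
        return predictions
```

Calling update_historic_data with a different history changes the forecasts without any refitting, which is the behaviour the method description refers to.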

openstef.model.regressors.custom_regressor module¶

This module defines the custom regressor.

class openstef.model.regressors.custom_regressor.CustomOpenstfRegressor¶

Bases: OpenstfRegressor

A custom regressor allows loading any custom model that is not included with openSTEF.

abstract static objective()¶

Return type:: Type[RegressorObjective]

set_fit_request(*, x: bool | None | str = '$UNCHANGED$') → CustomOpenstfRegressor¶

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x parameter in fit.
Returns:: self – The updated object.
Return type:: object

set_predict_request(*, x: bool | None | str = '$UNCHANGED$') → CustomOpenstfRegressor¶

Request metadata passed to the predict method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x parameter in predict.
Returns:: self – The updated object.
Return type:: object

abstract static valid_kwargs()¶

Return type:: list[str]

openstef.model.regressors.custom_regressor.create_custom_objective(custom_model_path)¶

openstef.model.regressors.custom_regressor.is_custom_type(model_type)¶

openstef.model.regressors.custom_regressor.load_custom_model(custom_model_path)¶

Load the external custom model.

Return type:: CustomOpenstfRegressor
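One common way to load a class from a path string is shown below. It assumes that the path is a dotted import path of the form package.module.ClassName; the format actually expected by load_custom_model is not stated here, and the function name is hypothetical.

```python
import importlib


def load_class_from_path(dotted_path: str):
    """Import the module part of the path and return the named attribute."""
    module_path, _, class_name = dotted_path.rpartition(".")
    if not module_path:
        raise ValueError("expected a path of the form 'package.module.ClassName'")
    module = importlib.import_module(module_path)
    return getattr(module, class_name)
```

A loader of this kind would typically also check that the returned class derives from the expected base class before instantiating it.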

openstef.model.regressors.dazls module¶

This module defines the DAZLS model.

class openstef.model.regressors.dazls.Dazls¶

Bases: BaseEstimator

DAZLS model.

The model carries out wind and solar power prediction for unseen target substations using training data from other substations with known components.

fit(features, target)¶

Fit the model.

This function scales the inputs of the domain and adaptation models of the DAZLS model and then fits both models. The features are separated into domain_model_input and adaptation_model_input, which are used together with target to fit the models.

Parameters:

features – inputs for domain and adaptation model (domain_model_input, adaptation_model_input)
target – the expected output (y_train)

model_: Pipeline¶

predict(x)¶

Make a prediction.

The prediction uses the test data x. The attributes domain_model_input_columns and adaptation_model_input_columns are used to split x into test data for the domain model and the adaptation model, respectively.

There is an option available to return the domain model and adaptation model predictions separately to more easily investigate the effectiveness of the models.

Parameters:

x (array) – domain_model_test_data, adaptation_model_test_data
return_sub_preds – Flag indicating whether to return the predictions of the domain model and the adaptation model separately. (Default: False.)

Returns:

The output prediction after both models.

Return type:

prediction
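The column split and the optional separate outputs can be sketched as follows. The helper names are hypothetical, rows are plain dictionaries instead of a DataFrame, and scaling is left out. Feeding the domain model output into the adaptation model is an assumption made for this illustration; the page only states that both models are applied.

```python
def split_columns(rows, domain_cols, adaptation_cols):
    """Split each row into the inputs of the domain and adaptation models."""
    domain = [[row[c] for c in domain_cols] for row in rows]
    adaptation = [[row[c] for c in adaptation_cols] for row in rows]
    return domain, adaptation


def two_stage_predict(rows, domain_cols, adaptation_cols,
                      domain_model, adaptation_model, return_sub_preds=False):
    """Apply the domain model, then the adaptation model, row by row."""
    domain_in, adaptation_in = split_columns(rows, domain_cols, adaptation_cols)
    domain_pred = [domain_model(features) for features in domain_in]
    # Assumption: the adaptation model sees its own inputs plus the domain output.
    adaptation_pred = [
        adaptation_model(features + [d])
        for features, d in zip(adaptation_in, domain_pred)
    ]
    if return_sub_preds:
        return domain_pred, adaptation_pred
    return adaptation_pred
```

Returning both intermediate results makes it easier to see which of the two models contributes most to an error.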

score(truth, prediction)¶

Evaluates the predictions against the real values.

Parameters:

truth – real values
prediction – predicted values

Returns:

RMSE and R2 scores
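For reference, the two metrics can be computed as in the following sketch. It uses the standard definitions of RMSE and R2 in plain Python; the function names are chosen for illustration and the code is not taken from the Dazls class.

```python
import math


def rmse(truth, prediction):
    """Root mean squared error between real and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(truth, prediction)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r2_score(truth, prediction):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Undefined (division by zero) when all real values are identical.
    """
    mean_truth = sum(truth) / len(truth)
    ss_res = sum((t - p) ** 2 for t, p in zip(truth, prediction))
    ss_tot = sum((t - mean_truth) ** 2 for t in truth)
    return 1.0 - ss_res / ss_tot
```

A perfect prediction gives an RMSE of 0 and an R2 of 1; lower R2 values indicate a worse fit.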

set_fit_request(*, features: bool | None | str = '$UNCHANGED$', target: bool | None | str = '$UNCHANGED$') → Dazls¶

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

features (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for features parameter in fit.
target (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for target parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_predict_request(*, x: bool | None | str = '$UNCHANGED$') → Dazls¶

Request metadata passed to the predict method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x parameter in predict.
Returns:: self – The updated object.
Return type:: object

set_score_request(*, prediction: bool | None | str = '$UNCHANGED$', truth: bool | None | str = '$UNCHANGED$') → Dazls¶

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

prediction (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for prediction parameter in score.
truth (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for truth parameter in score.

Returns:

self – The updated object.

Return type:

object

openstef.model.regressors.flatliner module¶

class openstef.model.regressors.flatliner.FlatlinerRegressor(quantiles=None)¶

Bases: OpenstfRegressor, RegressorMixin

property can_predict_quantiles: bool¶

Indicates whether the model can predict particular quantiles.

Return type:: bool

property feature_names: list¶

The names of the features used to train the model.

Return type:: list

feature_names_: List[str] = []¶

fit(x, y, **kwargs)¶

Fits flatliner model.

Parameters:

x (DataFrame) – Feature matrix
y (Series) – Labels

Return type:

RegressorMixin

Returns:

Fitted flatliner model

predict(x, quantile=0.5, **kwargs)¶

Makes a prediction for a desired quantile.

Parameters:

x (DataFrame) – Feature matrix
quantile (float) – Quantile for which a prediction is desired. Only quantiles for which a model has been trained are available. This is a quantile-model-specific keyword.

Return type:

array

Returns:

Prediction

Raises:

ValueError – If no model is trained for the requested quantile.
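The error condition can be pictured as a lookup in a mapping from trained quantiles to models. The sketch below is a generic illustration with hypothetical names; it does not show how the regressor stores its models.

```python
def select_quantile_model(models: dict, quantile: float):
    """Return the model trained for the quantile, or raise ValueError."""
    if quantile not in models:
        available = sorted(models)
        raise ValueError(
            f"No model trained for quantile {quantile}; available: {available}"
        )
    return models[quantile]
```

Listing the available quantiles in the message helps the caller pick a valid value.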

set_fit_request(*, x: bool | None | str = '$UNCHANGED$') → FlatlinerRegressor¶

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x parameter in fit.
Returns:: self – The updated object.
Return type:: object

set_predict_request(*, quantile: bool | None | str = '$UNCHANGED$', x: bool | None | str = '$UNCHANGED$') → FlatlinerRegressor¶

Request metadata passed to the predict method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

quantile (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for quantile parameter in predict.
x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x parameter in predict.

Returns:

self – The updated object.

Return type:

object

openstef.model.regressors.gblinear_quantile module¶

class openstef.model.regressors.gblinear_quantile.GBLinearQuantileOpenstfRegressor(quantiles=(0.9, 0.5, 0.1), missing_values=nan, imputation_strategy='mean', fill_value=None, weight_scale_percentile=95, weight_exponent=1, weight_floor=0.1, validation_fraction=0.2, no_fill_future_values_features=None, clipped_features=None, learning_rate=0.15, num_boost_round=500, early_stopping_rounds=10, reg_alpha=0.0001, reg_lambda=0.1, updater='shotgun', feature_selector='shuffle', top_k=0)¶

Bases: OpenstfRegressor

TO_IGNORE_FEATURES: List[str] = ['Month', 'Quarter']¶

TO_KEEP_FEATURES: List[str] = ['T-7d']¶

property can_predict_quantiles: bool¶

Indicates whether the model can predict particular quantiles.

Return type:: bool

property feature_names: list¶

The names of the features used to train the model.

Return type:: list

fit(x, y, **kwargs)¶

Fits the regressor.

Parameters:

x (DataFrame) – Feature matrix
y (Series) – Labels
kwargs – model-specific keywords

Return type:

OpenstfRegressor

Returns:

Fitted model

is_fitted_: bool = False¶

predict(x, quantile=0.5, **kwargs)¶

Makes a prediction. Only available after the model has been trained.

Parameters:

x (DataFrame) – Feature matrix
kwargs – model-specific keywords

Return type:

array

Returns:

Prediction

set_fit_request(*, x: bool | None | str = '$UNCHANGED$') → GBLinearQuantileOpenstfRegressor¶

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x parameter in fit.
Returns:: self – The updated object.
Return type:: object

set_predict_request(*, quantile: bool | None | str = '$UNCHANGED$', x: bool | None | str = '$UNCHANGED$') → GBLinearQuantileOpenstfRegressor¶

Request metadata passed to the predict method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

quantile (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for quantile parameter in predict.
x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x parameter in predict.

Returns:

self – The updated object.

Return type:

object

openstef.model.regressors.lgbm module¶

class openstef.model.regressors.lgbm.LGBMOpenstfRegressor(*, boosting_type='gbdt', num_leaves=31, max_depth=-1, learning_rate=0.1, n_estimators=100, subsample_for_bin=200000, objective=None, class_weight=None, min_split_gain=0.0, min_child_weight=0.001, min_child_samples=20, subsample=1.0, subsample_freq=0, colsample_bytree=1.0, reg_alpha=0.0, reg_lambda=0.0, random_state=None, n_jobs=None, importance_type='split', **kwargs)¶

Bases: LGBMRegressor, OpenstfRegressor

LGBM Regressor which implements the Openstf regressor API.

property can_predict_quantiles¶

Indicates whether the model can predict particular quantiles.

e.g. XGBQuantileOpenstfRegressor

property feature_names¶

Retrieve the model input feature names.

Returns:: The list of feature names

gain_importance_name = 'gain'¶

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

callbacks (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for callbacks parameter in fit.
categorical_feature (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for categorical_feature parameter in fit.
eval_init_score (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for eval_init_score parameter in fit.
eval_metric (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for eval_metric parameter in fit.
eval_names (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for eval_names parameter in fit.
eval_sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for eval_sample_weight parameter in fit.
eval_set (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for eval_set parameter in fit.
feature_name (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for feature_name parameter in fit.
init_model (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for init_model parameter in fit.
init_score (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for init_score parameter in fit.
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

Returns:

self – The updated object.

Return type:

object

Request metadata passed to the predict method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

num_iteration (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for num_iteration parameter in predict.
pred_contrib (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for pred_contrib parameter in predict.
pred_leaf (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for pred_leaf parameter in predict.
raw_score (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for raw_score parameter in predict.
start_iteration (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for start_iteration parameter in predict.
validate_features (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for validate_features parameter in predict.

Returns:

self – The updated object.

Return type:

object

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → LGBMOpenstfRegressor¶

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
Returns:: self – The updated object.
Return type:: object

weight_importance_name = 'split'¶

openstef.model.regressors.linear module¶

This module contains the linear regressor.

class openstef.model.regressors.linear.LinearOpenstfRegressor(missing_values=nan, imputation_strategy=None, fill_value=0)¶

Bases: LinearRegressor, OpenstfRegressor

Linear Regressor which implements the Openstf regressor API.

property can_predict_quantiles¶: Indicates whether this model can make quantile predictions.

property feature_names¶: The names of the features used to train the model.

fit(x, y, **kwargs)¶: Fit model.

set_fit_request(*, x: bool | None | str = '$UNCHANGED$') → LinearOpenstfRegressor¶

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x parameter in fit.
Returns:: self – The updated object.
Return type:: object

set_predict_request(*, x: bool | None | str = '$UNCHANGED$') → LinearOpenstfRegressor¶

Request metadata passed to the predict method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x parameter in predict.
Returns:: self – The updated object.
Return type:: object

class openstef.model.regressors.linear.LinearRegressor(missing_values=nan, imputation_strategy=None, fill_value=0)¶

Bases: MissingValuesHandler

Linear Regressor wrapped in the metamodel MissingValuesHandler.

This regressor can handle missing values by imputation strategy.

Parameters:

missing_values – int, float, str, np.nan or None, default=np.nan The placeholder for the missing values. All occurrences of missing_values will be imputed. For pandas’ dataframes with nullable integer dtypes with missing values, missing_values should be set to np.nan, since pd.NA will be converted to np.nan.
imputation_strategy – str, default=None The imputation strategy.
    If None, no imputation is performed.
    If “mean”, replace missing values using the mean along each column. Can only be used with numeric data.
    If “median”, replace missing values using the median along each column. Can only be used with numeric data.
    If “most_frequent”, replace missing values using the most frequent value along each column. Can be used with strings or numeric data. If there is more than one such value, only the smallest is returned.
    If “constant”, replace missing values with fill_value. Can be used with strings or numeric data.
fill_value – str or numerical value, default=None When strategy == “constant”, fill_value is used to replace all occurrences of missing_values. If left to the default, fill_value will be 0 when imputing numerical data and “missing_value” for strings or object data types.
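
The imputation strategies described above can be sketched for a single numeric column using only the standard library. This is an illustration of the documented behaviour, not the code used by MissingValuesHandler; impute_column is a hypothetical name, and the sketch assumes missing values are encoded as NaN.

```python
import math
from collections import Counter
from statistics import mean, median
from typing import List, Optional


def impute_column(values: List[float], strategy: Optional[str], fill_value: float = 0.0) -> List[float]:
    """Replace NaN entries in one numeric column according to `strategy`."""
    if strategy is None:
        return list(values)  # no imputation is performed
    observed = [v for v in values if not math.isnan(v)]
    if strategy == "mean":
        replacement = mean(observed)
    elif strategy == "median":
        replacement = median(observed)
    elif strategy == "most_frequent":
        counts = Counter(observed)
        top = max(counts.values())
        # Ties are broken by taking the smallest of the most frequent values.
        replacement = min(v for v, c in counts.items() if c == top)
    elif strategy == "constant":
        replacement = fill_value
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")
    return [replacement if math.isnan(v) else v for v in values]


column = [1.0, float("nan"), 3.0, 3.0, 1.0, 8.0]
for strategy in ("mean", "median", "most_frequent", "constant"):
    print(strategy, impute_column(column, strategy))
```

In this column the values 1.0 and 3.0 both occur twice, so “most_frequent” fills the gap with 1.0, the smaller of the two.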

set_fit_request(*, x: bool | None | str = '$UNCHANGED$') → LinearRegressor¶

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x parameter in fit.
Returns:: self – The updated object.
Return type:: object

set_predict_request(*, x: bool | None | str = '$UNCHANGED$') → LinearRegressor¶

Request metadata passed to the predict method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x parameter in predict.
Returns:: self – The updated object.
Return type:: object

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → LinearRegressor¶

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
Returns:: self – The updated object.
Return type:: object

openstef.model.regressors.linear_quantile module¶

class openstef.model.regressors.linear_quantile.LinearQuantileOpenstfRegressor(quantiles=(0.9, 0.5, 0.1), alpha=0.0, solver='highs', missing_values=nan, imputation_strategy='mean', fill_value=None, weight_scale_percentile=95, weight_exponent=1, weight_floor=0.1, no_fill_future_values_features=None, clipped_features=None)¶

Bases: OpenstfRegressor, RegressorMixin

FEATURE_IGNORE_LIST: Set[str] = {'IsSunday', 'IsWeekDay', 'IsWeekendDay', 'Month', 'Quarter'}¶

alpha: float¶

property can_predict_quantiles: bool¶

Attribute that indicates whether the model can predict particular quantiles.

Return type:: bool

feature_clipper_: FeatureClipper¶

property feature_names: list¶

The names of the features used to train the model.

Return type:: list

fit(x, y, **kwargs)¶

Fits linear quantile model.

Parameters:

x (DataFrame) – Feature matrix
y (Series) – Labels

Return type:

RegressorMixin

Returns:

Fitted LinearQuantile model

imputer_: MissingValuesTransformer¶

is_fitted_: bool = False¶

models_: Dict[float, QuantileRegressor]¶

predict(x, quantile=0.5, **kwargs)¶

Makes a prediction for a desired quantile.

Parameters:

x (DataFrame) – Feature matrix
quantile (float) – Quantile for which a prediction is desired. Only quantiles for which a model was trained are available. This is a keyword specific to quantile models.

Return type:

array

Returns:

Prediction

Raises:

ValueError – If no model is trained for the requested quantile.
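
The quantile lookup and its error case can be sketched as follows. QuantileLookupSketch is a hypothetical stand-in that keeps one predictor per trained quantile in a dictionary, in the spirit of the models_ attribute listed above; the toy predictors simply shift their input and stand in for fitted models.

```python
from typing import Callable, Dict, List, Sequence

Predictor = Callable[[Sequence[float]], List[float]]


class QuantileLookupSketch:
    """Hypothetical stand-in that keeps one fitted predictor per quantile."""

    def __init__(self, models: Dict[float, Predictor]):
        self.models_ = models

    def predict(self, x: Sequence[float], quantile: float = 0.5) -> List[float]:
        # Only quantiles that were trained can be requested.
        if quantile not in self.models_:
            raise ValueError(f"No model trained for requested quantile: {quantile}")
        return self.models_[quantile](x)


# Toy predictors standing in for fitted quantile models (assumption for the sketch).
sketch = QuantileLookupSketch(
    {
        0.1: lambda x: [v - 1.0 for v in x],
        0.5: lambda x: list(x),
        0.9: lambda x: [v + 1.0 for v in x],
    }
)

print(sketch.predict([1.0, 2.0], quantile=0.9))
try:
    sketch.predict([1.0, 2.0], quantile=0.75)
except ValueError as error:
    print(error)
```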

quantiles: tuple[float, ...]¶

set_fit_request(*, x: bool | None | str = '$UNCHANGED$') → LinearQuantileOpenstfRegressor¶

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x parameter in fit.
Returns:: self – The updated object.
Return type:: object

set_predict_request(*, quantile: bool | None | str = '$UNCHANGED$', x: bool | None | str = '$UNCHANGED$') → LinearQuantileOpenstfRegressor¶

Request metadata passed to the predict method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

quantile (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for quantile parameter in predict.
x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x parameter in predict.

Returns:

self – The updated object.

Return type:

object

solver: str¶

x_scaler_: StandardScaler¶

y_scaler_: StandardScaler¶

openstef.model.regressors.median module¶

This module contains the median regressor.

class openstef.model.regressors.median.MedianRegressor¶

Bases: OpenstfRegressor, RegressorMixin

Median regressor implementing the OpenSTEF regressor API. Note that this is an autoregressive model, meaning that it uses the previous predictions to predict the next value.

This regressor is good for predicting two types of signals:

Signals with very slow dynamics compared to the sampling rate, possibly with a lot of noise.
Signals that switch between two or more states, which are random in nature or depend on unknown features, but tend to be stable in each state. An example of this may be waste heat delivered from an industrial process. Using a median over the last few timesteps adds some hysteresis to avoid triggering on noise.

Tips for using this regressor:

Set the lags to be evenly spaced and at a frequency matching the frequency of the input data. For example, if the input data is at 15 minute intervals, set the lags to be at 15 minute intervals as well.
Use a small training dataset, since there are no actual parameters to train.
Set the frequency of the input data index to avoid inferring it. Inference might be a problem if we get very small chunks of data in training or validation sets.
Use only one training horizon, since the regressor will use the same lags for all training horizons.
Allow for missing data by setting completeness_threshold to 0. If the prediction horizon is larger than the context window there will be a lot of NaNs in the input data, but the autoregression solves that.
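
The autoregressive behaviour and the hysteresis effect described above can be sketched in a few lines. This is a simplified illustration, not the MedianRegressor implementation: rollout_median is a hypothetical helper that works on a plain list of past values and feeds each prediction back in as the newest lag.

```python
from statistics import median
from typing import List


def rollout_median(history: List[float], n_lags: int, horizon: int) -> List[float]:
    """Forecast `horizon` steps, feeding each prediction back in as the newest lag."""
    window = list(history[-n_lags:])
    predictions = []
    for _ in range(horizon):
        prediction = median(window)
        predictions.append(prediction)
        # Slide the window: drop the oldest value, append the new prediction.
        window = window[1:] + [prediction]
    return predictions


# A signal resting at 10.0 with one noisy spike: the median ignores the spike.
history = [10.0, 10.0, 55.0, 10.0, 10.0]
print(rollout_median(history, n_lags=5, horizon=3))  # [10.0, 10.0, 10.0]

# After a real state change most lags sit at the new level, so the forecast follows.
switched = [10.0, 10.0, 50.0, 50.0, 50.0]
print(rollout_median(switched, n_lags=5, horizon=3))  # [50.0, 50.0, 50.0]
```

Because predictions replace measurements that are not yet available, the forecast can extend beyond the context window without leaving gaps.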

property can_predict_quantiles: bool¶

Attribute that indicates whether the model can predict particular quantiles.

e.g. XGBQuantileOpenstfRegressor

Return type:: bool

property feature_names: list¶

Retrieve the model input feature names.

Return type:: list
Returns:: The list of feature names

fit(x, y, **kwargs)¶

This model does not have any hyperparameters to fit, but it does need to know the feature names of the lag features and the order of these.

Lag features are expected to be evenly spaced and match the frequency of the input data. The lag features are expected to be named in the format T-<lag_in_minutes>min or T-<lag_in_days>d. For example, T-1min, T-2min, T-3min or T-1d, T-2d.

Which lag features are used is determined by the feature engineering step.
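
A minimal sketch of how lag feature names in this format could be parsed and checked for even spacing is shown below. The regular expression and the helper names lag_in_minutes and evenly_spaced are assumptions made for the illustration, as is the restriction to whole-number lags; the library's own parsing may differ.

```python
import re
from typing import List

# Matches names such as "T-15min" or "T-2d" (assumed integer lags only).
LAG_PATTERN = re.compile(r"^T-(\d+)(min|d)$")
MINUTES_PER_DAY = 1440


def lag_in_minutes(name: str) -> int:
    """Convert a lag feature name to its lag expressed in minutes."""
    match = LAG_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a lag feature name: {name!r}")
    amount, unit = int(match.group(1)), match.group(2)
    return amount * (MINUTES_PER_DAY if unit == "d" else 1)


def evenly_spaced(names: List[str]) -> bool:
    """Check that consecutive lags are separated by a single common step."""
    lags = sorted(lag_in_minutes(name) for name in names)
    steps = {later - earlier for earlier, later in zip(lags, lags[1:])}
    return len(steps) <= 1


print(lag_in_minutes("T-15min"), lag_in_minutes("T-2d"))
print(evenly_spaced(["T-15min", "T-30min", "T-45min"]))
print(evenly_spaced(["T-15min", "T-45min", "T-1d"]))
```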

Return type:: RegressorMixin

property frequency: int¶

Retrieve the model input frequency.

Return type:: int
Returns:: The frequency of the model input

predict(x, **kwargs)¶

Predict the median of the lag features for each time step in the context window.

Parameters:: x (pd.DataFrame) – The input data for prediction. This should be a pandas dataframe with lag features.
Returns:: The predicted median for each time step in the context window. If any lag feature is NaN, this will be ignored. If all lag features are NaN, the regressor will return NaN.
Return type:: np.array
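
The NaN handling described for predict can be sketched per time step as follows. nan_median is a hypothetical helper operating on one row of lag values; the actual method works on a pandas dataframe, which this dependency-free sketch does not reproduce.

```python
import math
from statistics import median
from typing import List


def nan_median(lags: List[float]) -> float:
    """Median of the lag values in one row, ignoring NaN entries."""
    valid = [value for value in lags if not math.isnan(value)]
    # If every lag is NaN there is nothing to take the median of.
    return median(valid) if valid else float("nan")


nan = float("nan")
rows = [
    [1.0, 2.0, 9.0],  # all lags present
    [4.0, nan, 6.0],  # one lag missing: it is ignored
    [nan, nan, nan],  # all lags missing: the result is NaN
]
print([nan_median(row) for row in rows])
```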

set_fit_request(*, x: bool | None | str = '$UNCHANGED$') → MedianRegressor¶

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x parameter in fit.
Returns:: self – The updated object.
Return type:: object

set_predict_request(*, x: bool | None | str = '$UNCHANGED$') → MedianRegressor¶

Request metadata passed to the predict method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x parameter in predict.
Returns:: self – The updated object.
Return type:: object

openstef.model.regressors.regressor module¶

class openstef.model.regressors.regressor.OpenstfRegressor¶

Bases: BaseEstimator

This class defines the interface to which all ML models within OpenSTEF should adhere.

Required methods are indicated by abstractmethods, for which concrete implementations of ML models should have a definition. Common functionality which is required for the automated pipelines in OpenSTEF is defined in this class.
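
The shape of this interface can be sketched with the standard abc module. RegressorInterfaceSketch and MeanSketch are hypothetical names, the real class derives from BaseEstimator rather than ABC, and representing the feature matrix as a dictionary of columns is an assumption made to keep the example dependency-free.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class RegressorInterfaceSketch(ABC):
    """Hypothetical, dependency-free analogue of the interface described above."""

    @property
    @abstractmethod
    def can_predict_quantiles(self) -> bool:
        """Whether the model can predict particular quantiles."""

    @property
    @abstractmethod
    def feature_names(self) -> List[str]:
        """Names of the input features seen during fit."""

    @abstractmethod
    def fit(self, x: Dict[str, List[float]], y: List[float], **kwargs):
        """Fit the model and return it."""

    @abstractmethod
    def predict(self, x: Dict[str, List[float]], **kwargs) -> List[float]:
        """Predict one value per input row."""


class MeanSketch(RegressorInterfaceSketch):
    """Toy model that predicts the training mean for every row."""

    def __init__(self) -> None:
        self._feature_names: List[str] = []
        self._mean = 0.0

    @property
    def can_predict_quantiles(self) -> bool:
        return False

    @property
    def feature_names(self) -> List[str]:
        return self._feature_names

    def fit(self, x, y, **kwargs):
        self._feature_names = list(x.keys())
        self._mean = sum(y) / len(y)
        return self

    def predict(self, x, **kwargs):
        n_rows = len(next(iter(x.values())))
        return [self._mean] * n_rows


model = MeanSketch().fit({"T-1d": [1.0, 2.0], "temp": [5.0, 6.0]}, [10.0, 20.0])
print(model.feature_names, model.predict({"T-1d": [3.0], "temp": [7.0]}))
```

A subclass that leaves any of the abstract members undefined cannot be instantiated, which is how missing implementations surface early.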

abstract property can_predict_quantiles: bool¶

Attribute that indicates whether the model can predict particular quantiles.

e.g. XGBQuantileOpenstfRegressor

Return type:: bool

abstract property feature_names: list¶

Retrieve the model input feature names.

Return type:: list
Returns:: The list of feature names

abstract fit(x, y, **kwargs)¶

Fits the regressor.

Parameters:

x (DataFrame) – Feature matrix
y (DataFrame) – Labels
kwargs – model-specific keywords

Return type:

RegressorMixin

Returns:

Fitted model

get_feature_importance()¶

Get feature importance.

Return type:: Optional[DataFrame]
Returns:: DataFrame with feature importance.

abstract predict(x, **kwargs)¶

Makes a prediction. Only available after the model has been trained.

Parameters:

x (DataFrame) – Feature matrix
kwargs – model-specific keywords

Return type:

array

Returns:

Prediction

score(X, y)¶: Makes score method from RegressorMixin available.

set_fit_request(*, x: bool | None | str = '$UNCHANGED$') → OpenstfRegressor¶

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x parameter in fit.
Returns:: self – The updated object.
Return type:: object

set_predict_request(*, x: bool | None | str = '$UNCHANGED$') → OpenstfRegressor¶

Request metadata passed to the predict method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x parameter in predict.
Returns:: self – The updated object.
Return type:: object

openstef.model.regressors.xgb module¶

class openstef.model.regressors.xgb.XGBOpenstfRegressor(*, objective='reg:squarederror', **kwargs)¶

Bases: XGBRegressor, OpenstfRegressor

XGB Regressor which implements the Openstf regressor API.

property can_predict_quantiles¶

Attribute that indicates whether the model can predict particular quantiles.

e.g. XGBQuantileOpenstfRegressor

property feature_names¶

Retrieve the model input feature names.

Returns:: The list of feature names

fit(x, y, *, callbacks=None, eval_metric=None, **kwargs)¶

Fit gradient boosting model.

Note that calling fit() multiple times will cause the model object to be re-fit from scratch. To resume training from a previous checkpoint, explicitly pass xgb_model argument.

Parameters:

X – Feature matrix. See py-data for a list of supported types.

When the tree_method is set to hist, internally, the QuantileDMatrix will be used instead of the DMatrix for conserving memory. However, this has performance implications when the device of input data is not matched with algorithm. For instance, if the input is a numpy array on CPU but cuda is used for training, then the data is first processed on CPU then transferred to GPU.
y (array) – Labels
sample_weight – instance weights
base_margin – Global bias for each instance. See /tutorials/intercept for details.
eval_set – A list of (X, y) tuple pairs to use as validation sets, for which metrics will be computed. Validation metrics will help us track the performance of the model.
verbose – If verbose is True and an evaluation set is used, the evaluation metric measured on the validation set is printed to stdout at each boosting stage. If verbose is an integer, the evaluation metric is printed at each verbose boosting stage. The last boosting stage / the boosting stage found by using early_stopping_rounds is also printed.
xgb_model – file name of stored XGBoost model or ‘Booster’ instance XGBoost model to be loaded before training (allows training continuation).
sample_weight_eval_set – A list of the form [L_1, L_2, …, L_n], where each L_i is an array like object storing instance weights for the i-th validation set.
base_margin_eval_set – A list of the form [M_1, M_2, …, M_n], where each M_i is an array like object storing base margin for the i-th validation set.
feature_weights – Weight for each feature, defines the probability of each feature being selected when colsample is being used. All values must be greater than 0, otherwise a ValueError is thrown.

gain_importance_name = 'total_gain'¶

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

callbacks (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for callbacks parameter in fit.
eval_metric (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for eval_metric parameter in fit.
x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x parameter in fit.

Returns:

self – The updated object.

Return type:

object

Request metadata passed to the predict method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

base_margin (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for base_margin parameter in predict.
iteration_range (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for iteration_range parameter in predict.
output_margin (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for output_margin parameter in predict.
validate_features (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for validate_features parameter in predict.

Returns:

self – The updated object.

Return type:

object

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → XGBOpenstfRegressor¶

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
Returns:: self – The updated object.
Return type:: object

weight_importance_name = 'weight'¶

openstef.model.regressors.xgb_multioutput_quantile module¶

class openstef.model.regressors.xgb_multioutput_quantile.XGBMultiOutputQuantileOpenstfRegressor(quantiles=(0.9, 0.5, 0.1), gamma=0.0, colsample_bytree=1.0, subsample=1.0, min_child_weight=0, max_depth=6, learning_rate=0.22, alpha=0.0, max_delta_step=0.5, arctan_smoothing=0.055, early_stopping_rounds=None)¶

Bases: OpenstfRegressor

Model that provides multioutput quantile regression with XGBoost, by default using the arctan loss function.

Arctan loss:

Reference: LaurensSluyterman/XGBoost_quantile_regression. The key idea is to use a smooth approximation of the pinball loss, the arctan pinball loss, that has a relatively large second derivative.

The approximation is given by: $$L^{(\text{arctan})}_{\tau, s}(u) = \left(\tau - 0.5 + \frac{\arctan(u/s)}{\pi}\right) u + \frac{s}{\pi}$$
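
To see how closely the formula tracks the pinball loss, the sketch below evaluates both on a grid of residuals. It assumes that u is the residual (observed value minus prediction) and that the reference loss is the standard pinball loss; the function names are illustrative and not part of the package.

```python
import math


def pinball_loss(u: float, tau: float) -> float:
    """Standard pinball loss for a residual u (assumed: observed minus predicted)."""
    return tau * u if u >= 0 else (tau - 1.0) * u


def arctan_pinball_loss(u: float, tau: float, s: float) -> float:
    """Smooth approximation of the pinball loss, following the formula above."""
    return (tau - 0.5 + math.atan(u / s) / math.pi) * u + s / math.pi


def largest_gap(tau: float, s: float) -> float:
    """Largest absolute difference between both losses on a grid over [-2, 2]."""
    grid = [step / 10 for step in range(-20, 21)]
    return max(abs(arctan_pinball_loss(u, tau, s) - pinball_loss(u, tau)) for u in grid)


for s in (0.5, 0.055, 0.001):
    print(f"s={s}: largest gap on the grid = {largest_gap(0.9, s):.5f}")
```

At u = 0 the pinball loss is zero while the approximation equals s/π, so the gap shrinks in proportion to s.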

Some important settings:

The parameter $s$ in the loss function determines the amount of smoothing. A smaller value gives a closer approximation but also a much smaller second derivative. A larger value gives more conservative quantiles: when $\tau$ is larger than 0.5, the quantile becomes larger, and vice versa. Values between 0.05 and 0.1 appear to work well. It may be a good idea to optimize this parameter.
Set min-child-weight to zero. The second derivatives can be a lot smaller than 1 and this parameter may prevent any splits.
Use a relatively small max-delta-step. We used a default of 0.5. This prevents excessive steps that could happen due to the relatively small second derivative.
For the same reason, use a slightly lower learning rate of 0.05.
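
The remarks about the second derivative can be made concrete. Differentiating the arctan loss twice with respect to u gives 2 / (π s (1 + (u/s)²)²), which does not depend on the quantile; this closed form is derived here from the formula and is not quoted from the source. The sketch below evaluates it and cross-checks it against a finite-difference estimate.

```python
import math


def arctan_pinball_loss(u: float, tau: float, s: float) -> float:
    """Arctan pinball loss for residual u, quantile tau and smoothing s."""
    return (tau - 0.5 + math.atan(u / s) / math.pi) * u + s / math.pi


def arctan_loss_hessian(u: float, s: float) -> float:
    """Closed-form second derivative of the loss with respect to u."""
    return 2.0 / (math.pi * s * (1.0 + (u / s) ** 2) ** 2)


def finite_difference_hessian(u: float, tau: float, s: float, h: float = 1e-3) -> float:
    """Central-difference estimate, used only to cross-check the closed form."""
    return (
        arctan_pinball_loss(u + h, tau, s)
        - 2.0 * arctan_pinball_loss(u, tau, s)
        + arctan_pinball_loss(u - h, tau, s)
    ) / h**2


for s in (0.05, 0.1):
    for u in (0.0, 0.5, 1.0):
        print(f"s={s}, u={u}: second derivative = {arctan_loss_hessian(u, s):.2e}")
```

For residuals of order 1 the second derivative is far below 1, and smaller still for the smaller s. Since XGBoost's min_child_weight is a lower bound on the summed second derivatives in a child node, a nonzero value can block splits in that regime.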

property can_predict_quantiles¶

Attribute that indicates whether the model can predict particular quantiles.

e.g. XGBQuantileOpenstfRegressor

estimator_: TransformedTargetRegressor¶

property feature_names¶

Retrieve the model input feature names.

Returns:: The list of feature names

fit(x, y, eval_set=None, verbose=0, **kwargs)¶

Fits xgb quantile model.

Parameters:

x (array) – Feature matrix.
y (array) – Labels.
eval_set (Optional[Sequence[Tuple[array, array]]]) – Evaluation set to monitor training performance.
verbose (Union[bool, int, None]) – Verbosity level (disabled by default).

Return type:

OpenstfRegressor

Returns:

Fitted XGBQuantile model.

predict(x, quantile=0.5)¶

Makes a prediction for a desired quantile.

Parameters:

x (array) – Feature matrix.
quantile (float) – Quantile for which a prediction is desired. Only quantiles for which a model was trained are available. This is a keyword specific to quantile models.

Return type:

array

Returns:

Prediction

Raises:

ValueError – If no model is trained for the requested quantile.

quantile_indices_: Dict[float, int]¶

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

eval_set (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for eval_set parameter in fit.
verbose (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for verbose parameter in fit.
x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_predict_request(*, quantile: bool | None | str = '$UNCHANGED$', x: bool | None | str = '$UNCHANGED$') → XGBMultiOutputQuantileOpenstfRegressor¶

Request metadata passed to the predict method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

quantile (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for quantile parameter in predict.
x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x parameter in predict.

Returns:

self – The updated object.

Return type:

object

openstef.model.regressors.xgb_multioutput_quantile.replicate_for_multioutput(y, num_quantiles)¶

Replicates a 1D array to a 2D array for multioutput regression.

Parameters:

y (array) – 1D array.
num_quantiles (int) – Number of columns in the output array.

Return type:

array

Returns:

2D array with shape (len(y), num_quantiles)
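
The documented behaviour (a 1D array repeated into num_quantiles identical columns) can be reproduced with NumPy. The function below is a stand-in written for illustration under that reading of the docstring; it is not the package's own code.

```python
import numpy as np


def replicate_columns(y, num_quantiles):
    """Repeat a 1D array column-wise into shape (len(y), num_quantiles)."""
    y = np.asarray(y)
    return np.repeat(y[:, np.newaxis], num_quantiles, axis=1)


replicated = replicate_columns([1.0, 2.0, 3.0], num_quantiles=2)
print(replicated.shape)
```

Each column is a copy of the original target, so a multi-output model can be trained with one output per quantile against the same labels.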

openstef.model.regressors.xgb_quantile module¶

class openstef.model.regressors.xgb_quantile.XGBQuantileOpenstfRegressor(quantiles=(0.9, 0.5, 0.1), gamma=0.0, colsample_bytree=1.0, subsample=1.0, min_child_weight=1, max_depth=6, learning_rate=0.3, alpha=0.0, max_delta_step=0)¶

Bases: OpenstfRegressor

property can_predict_quantiles¶

Attribute that indicates whether the model predicts particular quantiles.

e.g. XGBQuantileOpenstfRegressor

property feature_names¶

Retrieve the model input feature names.

Returns: The list of feature names.

fit(x, y, **kwargs)¶

Fits xgb quantile model.

Parameters:

x (array) – Feature matrix
y (array) – Labels

Return type:

OpenstfRegressor

Returns:

Fitted XGBQuantile model

classmethod get_feature_importances_from_booster(booster)¶

Gets feature importances from an XGB booster.

This is based on the feature_importances_ property defined in dmlc/xgboost.

Parameters: booster (Booster) – Booster object; most of the time the median model (quantile=0.5) is preferred.
Return type: ndarray
Returns: Ndarray with normalized feature importances.
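
As a rough sketch of what "normalized feature importances" can mean here: assume the booster yields a mapping from feature name to a raw score, with unused features missing from the mapping. The scores can then be aligned to the feature order and scaled to sum to one. The helper name and the input dictionary are illustrative assumptions, not the package's implementation.

```python
import numpy as np


def normalize_importances(scores, feature_names):
    """Align raw scores to `feature_names` and scale them to sum to 1.

    Features absent from `scores` get an importance of 0.
    """
    raw = np.array([scores.get(name, 0.0) for name in feature_names], dtype=float)
    total = raw.sum()
    if total == 0:
        return raw  # No feature was used; avoid dividing by zero.
    return raw / total


importances = normalize_importances(
    {"temperature": 3.0, "windspeed": 1.0},
    feature_names=["temperature", "windspeed", "radiation"],
)
```

Here the unused feature keeps an importance of zero and the remaining values sum to one.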

predict(x, quantile=0.5)¶

Makes a prediction for a desired quantile.

Parameters:

x (array) – Feature matrix
quantile (float) – Quantile for which a prediction is desired. Note that only quantiles for which a model is trained are available, and that this is a quantile-model specific keyword.

Return type:

array

Returns:

Prediction

Raises:

ValueError – In case no model is trained for the requested quantile.

set_fit_request(*, x: bool | None | str = '$UNCHANGED$') → XGBQuantileOpenstfRegressor¶

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters: x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x parameter in fit.
Returns: self – The updated object.
Return type: object

set_predict_request(*, quantile: bool | None | str = '$UNCHANGED$', x: bool | None | str = '$UNCHANGED$') → XGBQuantileOpenstfRegressor¶

Request metadata passed to the predict method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

quantile (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for quantile parameter in predict.
x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x parameter in predict.

Returns:

self – The updated object.

Return type:

object
