openstef.metrics package¶

Submodules¶

openstef.metrics.figure module¶

This module contains all functions for generating figures.

openstef.metrics.figure.convert_to_base64_data_uri(path_in, path_out, content_type)¶

Reads a file, converts it to a data URI, then writes the data URI to a file.

Parameters:

path_in (str) – Path of the file that will be converted
path_out (str) – Path of the file containing the data uri
content_type (str) – Content type of the data uri according to (https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Type).

Return type:

None
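The read, encode and write steps can be sketched with the standard library. This is a minimal illustration, not the package's implementation; the name to_data_uri is hypothetical, and the data:<content type>;base64,<payload> layout is the usual data URI form.

```python
import base64
from pathlib import Path


def to_data_uri(path_in: str, path_out: str, content_type: str) -> None:
    """Hypothetical stand-in showing the read/encode/write steps."""
    # Read the raw bytes and base64-encode them.
    encoded = base64.b64encode(Path(path_in).read_bytes()).decode("ascii")
    # A data URI has the form data:<content type>;base64,<payload>.
    Path(path_out).write_text(f"data:{content_type};base64,{encoded}")
```

A call such as to_data_uri("figure.png", "figure.uri", "image/png") would then write a single-line data URI to figure.uri.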

openstef.metrics.figure.plot_data_series(data, predict_data=None, horizon=47, names=None)¶

Plots passed data and optionally prediction data for specified horizon.

Parameters:

data (Union[list[DataFrame], list[Series]]) – There are two options to use this function. Either pass a list of pandas.DataFrame where each dataframe contains a load column and a horizon column. Or pass a list of pandas.Series with unique indexing.
predict_data (Union[list[DataFrame], list[Series]]) – Similar to data, but for prediction data instead. When passing a list of pandas.DataFrame the column forecast should exist. Can be set to None.
horizon (int) – This function will select only data matching this horizon. Defaults to 47.
names (list[str]) – The names that will be used in the legend of the plot. If None is passed, the names are built automatically based on the number of series passed.

Return type:

Figure

Returns:

A line plot of each passed data series.

Raises:

ValueError – If names is None and the number of series is greater than 3.

openstef.metrics.figure.plot_feature_importance(feature_importance)¶

Creates a treemap plot based on feature importance and weights.

Parameters: feature_importance (DataFrame) – A DataFrame describing the feature importances and weights of the trained model.
Return type: Figure
Returns: A treemap of the features.

openstef.metrics.metrics module¶

This module contains all metrics to assess forecast quality.

openstef.metrics.metrics.arctan_loss(y_true, y_pred, taus, s=0.1)¶

Compute the arctan pinball loss.

Note that XGBoost outputs the predictions in a slightly peculiar manner. Suppose we have 100 data points and we predict 10 quantiles. The predictions will be an array of size (1000 x 1). We first resize this to a (100 x 10) array where each row corresponds to the 10 predicted quantiles for a single data point. We then use a for-loop (over the 10 columns) to calculate the gradients and second derivatives. Legibility was chosen over efficiency, so this part can be made more efficient.

Parameters:

y_true – An array containing the true observations.
y_pred – An array containing the predicted quantiles.
taus – A list containing the true desired coverage of the quantiles.
s – A smoothing parameter.

Returns:

grad – An array containing the (negative) gradients with respect to y_pred.
hess – An array containing the second derivatives with respect to y_pred.
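The reshaping step described above can be sketched in plain Python. This is only an illustration of the idea: the helper name rows_per_data_point is hypothetical, and it assumes the flat predictions are ordered data point by data point (row-major).

```python
def rows_per_data_point(flat_preds, n_quantiles):
    """Split a flat prediction list into one row per data point."""
    if len(flat_preds) % n_quantiles != 0:
        raise ValueError("length must be a multiple of n_quantiles")
    return [
        flat_preds[i : i + n_quantiles]
        for i in range(0, len(flat_preds), n_quantiles)
    ]


# 3 data points with 2 quantiles each give 3 rows of 2 predictions.
rows = rows_per_data_point([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], n_quantiles=2)

# Looping over the quantile columns, as the description above does.
columns = [[row[q] for row in rows] for q in range(2)]
```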

openstef.metrics.metrics.bias(realised, forecast)¶

Function that calculates the absolute bias in % based on the true and predicted values.

Parameters:

realised (Series) – Realised load.
forecast (Series) – Forecasted load.

Return type:

float

Returns:

Bias

openstef.metrics.metrics.frac_in_stdev(realised, forecast, stdev)¶

Function that calculates the fraction of measurements that are within one stdev of our predictions.

Return type: float
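As a rough sketch of what such a fraction looks like, the snippet below counts the points whose absolute error is smaller than the corresponding standard deviation. The name fraction_within, the per-point stdev values and the strict comparison are assumptions made for illustration.

```python
def fraction_within(realised, forecast, stdev):
    """Fraction of points with |realised - forecast| below stdev."""
    hits = sum(
        abs(r - f) < s for r, f, s in zip(realised, forecast, stdev)
    )
    return hits / len(realised)


# Absolute errors are 1, 0, 3 and 1; three of them are below 2.
share = fraction_within(
    [10.0, 12.0, 15.0, 20.0], [11.0, 12.0, 18.0, 19.0], [2.0] * 4
)
```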

openstef.metrics.metrics.franks_skill_score(realised, forecast, basecase, range_=None)¶

Calculate Franks skill score.

Return type: float

openstef.metrics.metrics.franks_skill_score_peaks(realised, forecast, basecase)¶

Calculate Franks skill score on positive peaks.

Return type: float

openstef.metrics.metrics.get_eval_metric_function(metric_name)¶

Gets a metric if it is available.

Parameters: metric_name (str) – Name of the metric.
Return type: Callable
Returns: Function to calculate the metric.
Raises: KeyError – If the metric is not available.
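A lookup of this kind is commonly a dictionary access, which raises KeyError for unknown names. The sketch below shows that pattern; the METRICS registry, get_metric and mean_absolute_error are hypothetical names and do not reflect the package's internals.

```python
def mean_absolute_error(realised, forecast):
    """Mean of the absolute differences between two sequences."""
    return sum(abs(r - f) for r, f in zip(realised, forecast)) / len(realised)


# Hypothetical registry mapping metric names to functions.
METRICS = {"mae": mean_absolute_error}


def get_metric(metric_name):
    """Return the metric function; raises KeyError if it is unknown."""
    return METRICS[metric_name]
```

Calling get_metric("mae") returns the function itself, so it can be applied directly to a pair of sequences.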

openstef.metrics.metrics.mae(realised, forecast)¶

Function that calculates the mean absolute error based on the true and predicted values.

Return type: float

openstef.metrics.metrics.nsme(realised, forecast)¶

Function that calculates the Nash-Sutcliffe model efficiency based on the true and predicted values.

Parameters:

realised (Series) – Realised load.
forecast (Series) – Forecasted load.

Return type:

float

Returns:

Nash-Sutcliffe model efficiency
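For reference, the textbook Nash-Sutcliffe efficiency is one minus the ratio of the squared forecast errors to the squared deviations of the observations from their mean. The sketch below follows that definition on plain sequences; the function name is hypothetical and the package's own handling of pandas Series may differ.

```python
def nash_sutcliffe(realised, forecast):
    """Textbook Nash-Sutcliffe efficiency: 1 - SSE / variance sum."""
    mean_realised = sum(realised) / len(realised)
    squared_errors = sum((r - f) ** 2 for r, f in zip(realised, forecast))
    squared_deviations = sum((r - mean_realised) ** 2 for r in realised)
    return 1 - squared_errors / squared_deviations
```

A perfect forecast scores 1, and a forecast that always equals the mean of the realised values scores 0.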

openstef.metrics.metrics.r_mae(realised, forecast)¶

Function that calculates the relative mean absolute error based on the true and predicted values.

The range is based on the load range of the previous two weeks.

Return type: float
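One way to read "relative" here is the mean absolute error divided by the load range. The sketch below assumes the realised values passed in already cover the reference period, so the range is simply their maximum minus their minimum; relative_mae is a hypothetical name.

```python
def relative_mae(realised, forecast):
    """MAE divided by the range (max - min) of the realised values."""
    mae = sum(abs(r - f) for r, f in zip(realised, forecast)) / len(realised)
    load_range = max(realised) - min(realised)
    return mae / load_range
```

Note that a constant realised series has a range of zero, which this sketch does not guard against.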

openstef.metrics.metrics.r_mae_highest(realised, forecast, percentile=0.95)¶

Function that calculates the relative mean absolute error for the 5 percent highest realised values.

The range is based on the load range of the previous two weeks.

Raises: ValueError – If the lengths of the realised and forecast arrays are not equal.
Return type: float
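The selection of the highest values and the length check can be sketched as follows. This is an illustration under stated assumptions: relative_mae_highest is a hypothetical name, the threshold uses a simple nearest-rank rule (a library may interpolate instead), and the range is taken over all realised values passed in.

```python
def relative_mae_highest(realised, forecast, percentile=0.95):
    """Relative MAE restricted to the highest realised values."""
    if len(realised) != len(forecast):
        raise ValueError("realised and forecast must have the same length")
    ordered = sorted(realised)
    # Nearest-rank threshold (an assumption; a library may interpolate).
    threshold = ordered[int(percentile * (len(ordered) - 1))]
    # Keep only the pairs whose realised value reaches the threshold.
    pairs = [(r, f) for r, f in zip(realised, forecast) if r >= threshold]
    mae = sum(abs(r - f) for r, f in pairs) / len(pairs)
    return mae / (max(realised) - min(realised))
```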

openstef.metrics.metrics.r_mae_lowest(realised, forecast, quantile=0.05)¶

Function that calculates the relative mean absolute error for the 5 percent lowest realised values.

The range is based on the load range of the previous two weeks.

Return type: float

openstef.metrics.metrics.r_mne_highest(realised, forecast)¶

Function that calculates the relative mean negative error for the 5 percent highest realised values.

The range is based on the load range of the previous two weeks. This measure quantifies how much we underestimate peaks.

Return type: float

openstef.metrics.metrics.r_mpe_highest(realised, forecast)¶

Function that calculates the relative mean positive error for the 5 percent highest realised values.

The range is based on the load range of the previous two weeks. This measure quantifies how much we overestimate peaks.

Return type: float

openstef.metrics.metrics.rmse(realised, forecast)¶

Function that calculates the Root Mean Square Error based on the true and predicted values.

Parameters:

realised (Series) – Realised load.
forecast (Series) – Forecasted load.

Return type:

float

Returns:

Root Mean Square Error
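The standard definition is the square root of the mean of the squared errors. A minimal sketch on plain sequences, with a hypothetical function name:

```python
import math


def root_mean_square_error(realised, forecast):
    """Square root of the mean squared difference."""
    squared = [(r - f) ** 2 for r, f in zip(realised, forecast)]
    return math.sqrt(sum(squared) / len(squared))
```

For errors of 3 and 4, the mean squared error is 12.5, so the result is the square root of 12.5.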

openstef.metrics.metrics.skill_score(realised, forecast, mean)¶

Function that calculates the skill score.

This indicates model performance relative to a reference, in this case the mean of the realised values. The range is based on the load range of the previous two weeks.

Return type: float
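A skill score of this kind is often written as one minus the ratio of the forecast error to the error of the reference. The sketch below assumes the mean absolute error is the underlying error measure and that the reference is a constant mean value; both are assumptions, as is the function name.

```python
def skill_score_vs_mean(realised, forecast, mean):
    """1 - MAE(forecast) / MAE(constant reference at `mean`)."""
    n = len(realised)
    mae_forecast = sum(abs(r - f) for r, f in zip(realised, forecast)) / n
    mae_reference = sum(abs(r - mean) for r in realised) / n
    return 1 - mae_forecast / mae_reference
```

A score above 0 means the forecast beats the reference, and a score of 1 means a perfect forecast.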

openstef.metrics.metrics.skill_score_positive_peaks(realised, forecast, mean)¶

Calculates skill score on positive peaks.

Return type: float

openstef.metrics.metrics.xgb_quantile_eval(preds, dmatrix, quantile=0.2)¶

Custom evaluation metric that equals the quantile regression loss (also known as pinball loss).

Quantile regression is regression that estimates a specified quantile of target’s distribution conditional on given features.

Parameters:

preds (ndarray) – Predicted values
dmatrix (DMatrix) – xgboost.DMatrix of the input data.
quantile (float) – Target quantile.

Return type:

Tuple

Returns:

Loss information
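The pinball loss itself can be written in a few lines. The sketch below works on plain sequences of labels and predictions rather than an xgboost.DMatrix, and returns a bare float instead of a tuple; the function name is hypothetical.

```python
def pinball_loss(labels, preds, quantile=0.2):
    """Mean pinball (quantile regression) loss."""
    losses = []
    for y, p in zip(labels, preds):
        error = y - p
        # Under-prediction is weighted by quantile,
        # over-prediction by (1 - quantile).
        losses.append(max(quantile * error, (quantile - 1) * error))
    return sum(losses) / len(losses)
```

With quantile=0.2, predicting 2 units too high costs 1.6 while predicting 2 units too low costs 0.4, which pushes the fitted value towards the lower quantile.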

openstef.metrics.metrics.xgb_quantile_obj(preds, dmatrix, quantile=0.2)¶

Quantile regression objective function.

Computes first-order derivative of quantile regression loss and a non-degenerate substitute for second-order derivative.

Substitute is returned instead of zeros, because XGBoost requires non-zero second-order derivatives. See dmlc/xgboost#1825 for why it is possible to use this trick. However, make sure that the max_delta_step hyperparameter is small enough to satisfy: 0.5 * max_delta_step <= min(quantile, 1 - quantile).

Parameters:

preds (ndarray) – numpy.ndarray
dmatrix (DMatrix) – xgboost.DMatrix
quantile (float) – float between 0 and 1

Return type:

tuple[ndarray, ndarray]

Returns:

Gradient and Hessian

Reasoning for the hessian: https://gist.github.com/Nikolay-Lysenko/06769d701c1d9c9acb9a66f2f9d7a6c7#gistcomment-2322558
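The gradient of the pinball loss and the max_delta_step condition above can be sketched as follows. This is not the package's code: the function names are hypothetical, plain sequences stand in for numpy.ndarray and xgboost.DMatrix, and the constant 1.0 used for the second derivative is an assumed substitute.

```python
def quantile_grad_hess(preds, labels, quantile=0.2):
    """Gradient of the pinball loss w.r.t. preds, plus a Hessian stand-in."""
    grad = []
    for p, y in zip(preds, labels):
        if p < y:
            grad.append(-quantile)      # prediction too low
        elif p > y:
            grad.append(1 - quantile)   # prediction too high
        else:
            grad.append(0.0)
    # The true second derivative is zero almost everywhere, so a
    # non-zero constant is used instead (assumed value: 1.0).
    hess = [1.0] * len(preds)
    return grad, hess


def max_delta_step_ok(max_delta_step, quantile):
    """Check 0.5 * max_delta_step <= min(quantile, 1 - quantile)."""
    return 0.5 * max_delta_step <= min(quantile, 1 - quantile)
```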

openstef.metrics.reporter module¶

Defines reporter class.

class openstef.metrics.reporter.Report(feature_importance_figure, data_series_figures, metrics, signature)¶

Bases: object

Dataclass to hold a report describing the training process.

class openstef.metrics.reporter.Reporter(train_data=None, validation_data=None, test_data=None, quantiles=None)¶

Bases: object

Reporter class that generates reports describing the training process.

generate_report(model)¶

Generate a report on a given model.

Parameters: model (OpenstfRegressor) – The model to create a report on.
Return type: Report
Returns: Report object containing info about the model.

static get_fiabilities(quantiles, y_true)¶

Return type: dict

static get_metrics(y_pred, y_true)¶

Calculate the metrics for a prediction.

Parameters:

y_pred (array) – np.array
y_true (array) – np.array

Return type:

dict

Returns:

Metrics for the prediction

static write_report_to_disk(report, report_folder)¶

Write report to disk, e.g. for viewing the report of the latest models using Grafana.
