openstef.metrics package

Submodules

openstef.metrics.figure module

This module contains all functions for generating figures.

openstef.metrics.figure.convert_to_base64_data_uri(path_in, path_out, content_type)

Reads a file, converts it to a data URI, then writes the data URI to a file.

Parameters:
Return type:

None
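The read, encode, write sequence can be sketched with the standard library alone. The helper names `to_data_uri` and `convert_file` are hypothetical, and the `data:<content_type>;base64,<payload>` layout is the common data URI form, assumed here rather than taken from the package source.

```python
import base64
from pathlib import Path


def to_data_uri(raw: bytes, content_type: str) -> str:
    """Build a base64 data URI (hypothetical helper, not the package's code)."""
    payload = base64.b64encode(raw).decode("ascii")
    return f"data:{content_type};base64,{payload}"


def convert_file(path_in: str, path_out: str, content_type: str) -> None:
    """Read path_in, encode it as a data URI, and write the URI to path_out."""
    uri = to_data_uri(Path(path_in).read_bytes(), content_type)
    Path(path_out).write_text(uri, encoding="ascii")
```

Keeping the encoding step separate from the file I/O makes it possible to check the URI format without touching the disk.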

openstef.metrics.figure.plot_data_series(data, predict_data=None, horizon=47, names=None)

Plots passed data and optionally prediction data for specified horizon.

Parameters:
  • data (Union[list[DataFrame], list[Series]]) – There are two options to use this function. Either pass a list of pandas.DataFrame where each dataframe contains a load column and a horizon column. Or pass a list of pandas.Series with unique indexing.

  • predict_data (Union[list[DataFrame], list[Series]]) – Similar to data, but for prediction data instead. When passing a list of pandas.DataFrame the column forecast should exist. Can be set to None.

  • horizon (int) – This function will select only data matching this horizon. Defaults to 47.

  • names (list[str]) – The names that will be used in the legend of the plot. If None is passed, the names are built automatically based on the number of series passed.

Return type:

Figure

Returns:

A line plot of each passed data series.

Raises:

ValueError – If names is None and the number of series is greater than 3.

openstef.metrics.figure.plot_feature_importance(feature_importance)

Creates a treemap plot based on feature importance and weights.

Parameters:

feature_importance (DataFrame) – A DataFrame describing the feature importances and weights of the trained model.

Return type:

Figure

Returns:

A treemap of the features.

openstef.metrics.metrics module

This module contains all metrics to assess forecast quality.

openstef.metrics.metrics.arctan_loss(y_true, y_pred, taus, s=0.1)

Compute the arctan pinball loss.

Note that XGBoost outputs the predictions in a slightly peculiar manner. Suppose we have 100 data points and we predict 10 quantiles. The predictions will be an array of size (1000 x 1). We first resize this to a (100 x 10) array where each row corresponds to the 10 predicted quantiles for a single data point. We then use a for-loop (over the 10 columns) to calculate the gradients and second derivatives. Legibility was chosen over efficiency, so this part can be made more efficient.

Parameters:
  • y_true – An array containing the true observations.

  • y_pred – An array containing the predicted quantiles.

  • taus – A list containing the true desired coverage of the quantiles.

  • s – A smoothing parameter.

Returns:
  • grad – An array containing the (negative) gradients with respect to y_pred.

  • hess – An array containing the second derivatives with respect to y_pred.
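The reshaping step described in the note can be illustrated without XGBoost. This is only a sketch of the data layout: the function name `reshape_predictions` is hypothetical, and row-major order (all quantiles of the first data point come first) is an assumption based on the description above.

```python
def reshape_predictions(flat, n_quantiles):
    """Group a flat list of predictions into one row per data point.

    Assumes row-major order: the first n_quantiles values belong to the
    first data point, the next n_quantiles to the second, and so on.
    """
    if len(flat) % n_quantiles != 0:
        raise ValueError("length of flat must be a multiple of n_quantiles")
    return [flat[i:i + n_quantiles] for i in range(0, len(flat), n_quantiles)]


# 100 data points x 10 quantiles, flattened into 1000 values.
flat = list(range(1000))
rows = reshape_predictions(flat, 10)

# One list per quantile column; a per-quantile loop would iterate over these.
columns = [[row[j] for row in rows] for j in range(10)]
```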

openstef.metrics.metrics.bias(realised, forecast)

Function that calculates the absolute bias in % based on the realised and forecasted values.

Parameters:
  • realised (Series) – Realised load.

  • forecast (Series) – Forecasted load.

Return type:

float

Returns:

Bias

openstef.metrics.metrics.frac_in_stdev(realised, forecast, stdev)

Function that calculates the fraction of measurements that are within one stdev of our predictions.

Return type:

float

openstef.metrics.metrics.franks_skill_score(realised, forecast, basecase, range_=1.0)

Calculate Franks skill score.

Return type:

float

openstef.metrics.metrics.franks_skill_score_peaks(realised, forecast, basecase)

Calculate Franks skill score on positive peaks.

Return type:

float

openstef.metrics.metrics.get_eval_metric_function(metric_name)

Gets a metric if it is available.

Parameters:

metric_name (str) – Name of the metric.

Return type:

Callable

Returns:

Function to calculate the metric.

Raises:

KeyError – If the metric is not available.
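A name-based lookup of this kind is commonly built on a dictionary. The sketch below shows the pattern only: the `METRICS` registry, its single entry, and the function name `get_metric` are hypothetical and do not reflect which metric names the package actually accepts.

```python
from typing import Callable, Dict, Sequence


def mae(realised: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute error, used here only to populate the registry."""
    return sum(abs(f - r) for r, f in zip(realised, forecast)) / len(realised)


# Hypothetical registry; the real mapping in the package may differ.
METRICS: Dict[str, Callable] = {"mae": mae}


def get_metric(metric_name: str) -> Callable:
    """Return the metric function for metric_name, or raise KeyError."""
    try:
        return METRICS[metric_name]
    except KeyError as error:
        raise KeyError(f"No metric named '{metric_name}'") from error
```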

openstef.metrics.metrics.mae(realised, forecast)

Function that calculates the mean absolute error based on the realised and forecasted values.

Return type:

float

openstef.metrics.metrics.nsme(realised, forecast)

Function that calculates the Nash-Sutcliffe model efficiency based on the realised and forecasted values.

Parameters:
  • realised (Series) – Realised load.

  • forecast (Series) – Forecasted load.

Return type:

float

Returns:

Nash-Sutcliffe model efficiency
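For reference, the textbook Nash-Sutcliffe efficiency is one minus the ratio of the squared forecast error to the squared deviation of the realised values from their mean. The sketch below implements that standard definition on plain lists; it is not taken from the package source, and the name `nash_sutcliffe` is hypothetical.

```python
def nash_sutcliffe(realised, forecast):
    """Textbook Nash-Sutcliffe efficiency: 1 - SS_residual / SS_total."""
    mean_realised = sum(realised) / len(realised)
    ss_residual = sum((f - r) ** 2 for r, f in zip(realised, forecast))
    ss_total = sum((r - mean_realised) ** 2 for r in realised)
    if ss_total == 0:
        raise ValueError("realised values are constant; efficiency is undefined")
    return 1 - ss_residual / ss_total
```

A perfect forecast scores 1, and a forecast that always equals the mean of the realised values scores 0.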

openstef.metrics.metrics.r_mae(realised, forecast)

Function that calculates the relative mean absolute error based on the realised and forecasted values.

The range is based on the load range of the previous two weeks.

Return type:

float
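The relation between the mean absolute error and its relative variant can be sketched as follows. Dividing by the difference between the highest and lowest realised value is an assumption about how "load range" is computed; the function names are hypothetical and the code works on plain lists rather than pandas Series.

```python
def mean_absolute_error(realised, forecast):
    """Average of the absolute differences between forecast and realised."""
    return sum(abs(f - r) for r, f in zip(realised, forecast)) / len(realised)


def relative_mae(realised, forecast):
    """MAE divided by the range of the realised values (assumed definition)."""
    load_range = max(realised) - min(realised)
    if load_range == 0:
        raise ValueError("realised values are constant; range is zero")
    return mean_absolute_error(realised, forecast) / load_range


realised = [10.0, 20.0, 30.0, 40.0]
forecast = [12.0, 18.0, 33.0, 37.0]
error = mean_absolute_error(realised, forecast)   # absolute errors: 2, 2, 3, 3
relative_error = relative_mae(realised, forecast)  # error / (40 - 10)
```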

openstef.metrics.metrics.r_mae_highest(realised, forecast, percentile=0.95)

Function that calculates the relative mean absolute error for the 5 percent highest realised values.

The range is based on the load range of the previous two weeks.

Raises:

ValueError – If the lengths of the realised and forecast arrays are not equal.

Return type:

float

openstef.metrics.metrics.r_mae_lowest(realised, forecast, quantile=0.05)

Function that calculates the relative mean absolute error for the 5 percent lowest realised values.

The range is based on the load range of the previous two weeks.

Return type:

float

openstef.metrics.metrics.r_mne_highest(realised, forecast)

Function that calculates the relative mean negative error for the 5 percent highest realised values.

The range is based on the load range of the previous two weeks. This measure quantifies how much we underestimate peaks.

Return type:

float

openstef.metrics.metrics.r_mpe_highest(realised, forecast)

Function that calculates the relative mean positive error for the 5 percent highest realised values.

The range is based on the load range of the previous two weeks. This measure quantifies how much we overestimate peaks.

Return type:

float

openstef.metrics.metrics.rmse(realised, forecast)

Function that calculates the Root Mean Square Error based on the realised and forecasted values.

Parameters:
  • realised (Series) – Realised load.

  • forecast (Series) – Forecasted load.

Return type:

float

Returns:

Root Mean Square Error
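The standard definition of the root mean square error is the square root of the mean squared difference between forecast and realised values. The sketch below uses plain lists instead of pandas Series, and the name `root_mean_square_error` is hypothetical.

```python
import math


def root_mean_square_error(realised, forecast):
    """Square root of the mean squared difference (standard definition)."""
    if len(realised) != len(forecast):
        raise ValueError("realised and forecast must have the same length")
    squared_errors = [(f - r) ** 2 for r, f in zip(realised, forecast)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```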

openstef.metrics.metrics.skill_score(realised, forecast, mean)

Function that calculates the skill score.

This indicates model performance relative to a reference, in this case the mean of the realised values. The range is based on the load range of the previous two weeks.

Return type:

float
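A common way to express performance relative to a reference is one minus the ratio of the model error to the reference error. The sketch below uses the mean absolute error for both terms, which is an assumption; the package may use a different error measure, and the name `skill_score_sketch` is hypothetical.

```python
def mean_absolute_error(realised, forecast):
    """Average of the absolute differences between forecast and realised."""
    return sum(abs(f - r) for r, f in zip(realised, forecast)) / len(realised)


def skill_score_sketch(realised, forecast, mean):
    """1 - MAE(model) / MAE(reference), with a constant reference forecast."""
    reference = [mean] * len(realised)
    reference_error = mean_absolute_error(realised, reference)
    if reference_error == 0:
        raise ValueError("reference forecast is perfect; score is undefined")
    return 1 - mean_absolute_error(realised, forecast) / reference_error
```

Under this definition a score of 1 means a perfect forecast, 0 means the model is no better than the reference, and negative values mean it is worse.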

openstef.metrics.metrics.skill_score_positive_peaks(realised, forecast, mean)

Calculates skill score on positive peaks.

Return type:

float

openstef.metrics.metrics.xgb_quantile_eval(preds, dmatrix, quantile=0.2)

Custom evaluation metric equal to the quantile regression loss (also known as pinball loss).

Quantile regression is regression that estimates a specified quantile of target’s distribution conditional on given features.

Parameters:
  • preds (ndarray) – Predicted values

  • dmatrix (DMatrix) – xgboost.DMatrix of the input data.

  • quantile (float) – Target quantile.

Return type:

str

Returns:

Loss information

See also: https://gist.github.com/Nikolay-Lysenko/06769d701c1d9c9acb9a66f2f9d7a6c7
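The pinball loss itself can be written in a few lines. This sketch follows the standard definition and returns a plain float; it does not use xgboost.DMatrix and does not reproduce the return format of the function documented above. The name `pinball_loss` is hypothetical.

```python
def pinball_loss(y_true, y_pred, quantile):
    """Mean pinball (quantile) loss over all observations.

    Under-predictions (y > p) are weighted by quantile,
    over-predictions (y < p) by (1 - quantile).
    """
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        total += max(quantile * diff, (quantile - 1) * diff)
    return total / len(y_true)
```

For a low quantile such as 0.2, predicting too high is penalised more heavily than predicting too low, which pushes the fitted value towards the lower part of the distribution.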

openstef.metrics.metrics.xgb_quantile_obj(preds, dmatrix, quantile=0.2)

Quantile regression objective function.

Computes first-order derivative of quantile regression loss and a non-degenerate substitute for second-order derivative.

A substitute is returned instead of zeros, because XGBoost requires non-zero second-order derivatives. See this page: dmlc/xgboost#1825 to see why it is possible to use this trick. However, be sure that the hyperparameter named max_delta_step is small enough to satisfy: 0.5 * max_delta_step <= min(quantile, 1 - quantile).

Parameters:
  • preds (ndarray) – numpy.ndarray

  • dmatrix (DMatrix) – xgboost.DMatrix

  • quantile (float) – float between 0 and 1

Return type:

tuple[ndarray, ndarray]

Returns:

Gradient and Hessian

See also: https://gist.github.com/Nikolay-Lysenko/06769d701c1d9c9acb9a66f2f9d7a6c7

Reasoning for the hessian: https://gist.github.com/Nikolay-Lysenko/06769d701c1d9c9acb9a66f2f9d7a6c7#gistcomment-2322558
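The derivative of the pinball loss and the max_delta_step condition stated above can be sketched as follows. The gradient values follow from differentiating the pinball loss with respect to the prediction; the constant Hessian of 1.0 is an assumed placeholder for the "non-degenerate substitute" and may differ from what the package returns. Both function names are hypothetical, and plain lists stand in for numpy arrays and xgboost.DMatrix.

```python
def quantile_gradient(y_true, y_pred, quantile):
    """First derivative of the pinball loss plus a constant Hessian substitute."""
    grad = []
    for y, p in zip(y_true, y_pred):
        if p < y:
            grad.append(-quantile)      # under-prediction
        elif p > y:
            grad.append(1 - quantile)   # over-prediction
        else:
            grad.append(0.0)            # loss is not differentiable here
    hess = [1.0] * len(grad)            # assumed placeholder, never zero
    return grad, hess


def max_delta_step_is_small_enough(max_delta_step, quantile):
    """Check 0.5 * max_delta_step <= min(quantile, 1 - quantile)."""
    return 0.5 * max_delta_step <= min(quantile, 1 - quantile)


grad, hess = quantile_gradient([10.0, 8.0, 5.0], [8.0, 10.0, 5.0], 0.2)
```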

openstef.metrics.reporter module

Defines reporter class.

class openstef.metrics.reporter.Report(feature_importance_figure, data_series_figures, metrics, signature)

Bases: object

Dataclass to hold a report describing the training process.

class openstef.metrics.reporter.Reporter(train_data=None, validation_data=None, test_data=None, quantiles=None)

Bases: object

Reporter class that generates reports describing the training process.

generate_report(model)

Generate a report on a given model.

Parameters:

model (OpenstfRegressor) – the model to create a report on

Return type:

Report

Returns:

Report object containing info about the model

static get_fiabilities(quantiles, y_true)
Return type:

dict

static get_metrics(y_pred, y_true)

Calculate the metrics for a prediction.

Parameters:
  • y_pred (array) – np.array

  • y_true (array) – np.array

Return type:

dict

Returns:

Metrics for the prediction

static write_report_to_disk(report, report_folder)

Write report to disk, e.g. for viewing the report of the latest models using Grafana.

Module contents