openstef.metrics package
Submodules
openstef.metrics.figure module
This module contains all functions for generating figures.
- openstef.metrics.figure.convert_to_base64_data_uri(path_in, path_out, content_type)
Reads a file, converts it to a data URI, then writes the data URI to a file.
- Parameters:
path_in (str) – Path of the file that will be converted.
path_out (str) – Path of the file that will contain the data URI.
content_type (str) – Content type of the data URI, according to https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Type.
- Return type:
None
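The read, convert, write steps can be sketched with the standard library alone. This is a minimal illustration, not the package's implementation: the helper names `to_base64_data_uri` and `convert_file_to_data_uri` are hypothetical, and base64 encoding is assumed from the function name.

```python
import base64
from pathlib import Path


def to_base64_data_uri(raw: bytes, content_type: str) -> str:
    """Encode raw bytes as a base64 data URI with the given content type."""
    encoded = base64.b64encode(raw).decode("ascii")
    return f"data:{content_type};base64,{encoded}"


def convert_file_to_data_uri(path_in: str, path_out: str, content_type: str) -> None:
    """Read path_in, build the data URI, and write it to path_out as text."""
    raw = Path(path_in).read_bytes()
    data_uri = to_base64_data_uri(raw, content_type)
    Path(path_out).write_text(data_uri, encoding="ascii")


# Example: encode a short text payload (hypothetical input).
example_uri = to_base64_data_uri(b"hello", "text/plain")
```

A data URI built this way can be embedded directly in HTML, for example as the `src` of an `img` tag when the content type is `image/png`.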
- openstef.metrics.figure.plot_data_series(data, predict_data=None, horizon=47, names=None)
Plots the passed data and, optionally, prediction data for the specified horizon.
- Parameters:
data (Union[list[DataFrame], list[Series]]) – There are two ways to use this function: either pass a list of pandas.DataFrame where each dataframe contains a load column and a horizon column, or pass a list of pandas.Series with unique indexing.
predict_data (Union[list[DataFrame], list[Series]]) – Similar to data, but for prediction data instead. When passing a list of pandas.DataFrame, the column forecast should exist. Can be set to None.
horizon (int) – This function will select only data matching this horizon. Defaults to 47.
names (list[str]) – The names that will be used in the legend of the plot. If None is passed, the names will be built automatically based on the number of series passed.
- Return type:
Figure
- Returns:
A line plot of each passed data series.
- Raises:
ValueError – If names is None and the number of series is greater than 3.
- openstef.metrics.figure.plot_feature_importance(feature_importance)
Creates a treemap plot based on feature importance and weights.
- Parameters:
feature_importance (DataFrame) – A DataFrame describing the feature importances and weights of the trained model.
- Return type:
Figure
- Returns:
A treemap of the features.
openstef.metrics.metrics module
This module contains all metrics to assess forecast quality.
- openstef.metrics.metrics.arctan_loss(y_true, y_pred, taus, s=0.1)
Compute the arctan pinball loss.
Note that XGBoost outputs the predictions in a slightly peculiar manner. Suppose we have 100 data points and we predict 10 quantiles. The predictions will be an array of size (1000 x 1). We first resize this to a (100 x 10) array where each row corresponds to the 10 predicted quantiles for a single data point. We then use a for-loop (over the 10 columns) to calculate the gradients and second derivatives. Legibility was chosen over efficiency, so this part can be made more efficient.
- Parameters:
y_true – An array containing the true observations.
y_pred – An array containing the predicted quantiles.
taus – A list containing the true desired coverage of the quantiles.
s – A smoothing parameter.
- Returns:
grad – An array containing the (negative) gradients with respect to y_pred.
hess – An array containing the second derivatives with respect to y_pred.
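The reshaping step described in the note can be illustrated in plain Python. This sketch assumes row-major order (the quantiles of one data point sit next to each other in the flat array), which the description implies but does not state; `reshape_predictions` is a hypothetical helper, and with NumPy the same step is a single reshape call.

```python
def reshape_predictions(flat_preds, n_points, n_quantiles):
    """Split a flat list of length n_points * n_quantiles into one row per data point.

    Assumption: row-major layout, so each row holds the quantiles of one point.
    """
    if len(flat_preds) != n_points * n_quantiles:
        raise ValueError("length of flat_preds must equal n_points * n_quantiles")
    return [
        flat_preds[i * n_quantiles:(i + 1) * n_quantiles]
        for i in range(n_points)
    ]


# 3 data points with 2 predicted quantiles each, flattened into 6 values.
flat = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
rows = reshape_predictions(flat, n_points=3, n_quantiles=2)

# Iterate over the quantile columns, mirroring the per-quantile loop in the note.
columns = [[row[j] for row in rows] for j in range(2)]
```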
- openstef.metrics.metrics.bias(realised, forecast)
Function that calculates the absolute bias in % based on the true values and the prediction.
- Parameters:
realised (Series) – Realised load.
forecast (Series) – Forecasted load.
- Return type:
float
- Returns:
Bias
- openstef.metrics.metrics.frac_in_stdev(realised, forecast, stdev)
Function that calculates the fraction of measurements that are within one stdev of our predictions.
- Return type:
float
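As a rough illustration of this metric, the sketch below counts the measurements that fall strictly inside the band forecast ± stdev and divides by the total. It assumes that stdev is given per data point and that the band is exclusive at its edges; both are assumptions, and `frac_within_stdev` is a hypothetical name.

```python
def frac_within_stdev(realised, forecast, stdev):
    """Fraction of points where |realised - forecast| is strictly below stdev."""
    if not (len(realised) == len(forecast) == len(stdev)):
        raise ValueError("all inputs must have the same length")
    if len(realised) == 0:
        raise ValueError("inputs must not be empty")
    hits = sum(
        1 for r, f, s in zip(realised, forecast, stdev) if abs(r - f) < s
    )
    return hits / len(realised)


# Three of the four measurements lie within one stdev of the forecast.
fraction = frac_within_stdev(
    realised=[10, 20, 30, 40],
    forecast=[11, 25, 29, 40],
    stdev=[2, 2, 2, 2],
)
```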
- openstef.metrics.metrics.franks_skill_score(realised, forecast, basecase, range_=1.0)
Calculate Franks skill score.
- Return type:
float
- openstef.metrics.metrics.franks_skill_score_peaks(realised, forecast, basecase)
Calculate Franks skill score on positive peaks.
- Return type:
float
- openstef.metrics.metrics.get_eval_metric_function(metric_name)
Gets a metric if it is available.
- Parameters:
metric_name (str) – Name of the metric.
- Return type:
Callable
- Returns:
Function to calculate the metric.
- Raises:
KeyError – If the metric is not available.
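A name-to-function lookup of this kind is commonly built on a dictionary. The sketch below is only an illustration of that pattern: the registry `METRICS`, its two entries, and the function `get_metric` are hypothetical and say nothing about which metric names the package actually accepts.

```python
import math


def mae(realised, forecast):
    """Mean absolute error of two equally long sequences."""
    return sum(abs(r - f) for r, f in zip(realised, forecast)) / len(realised)


def rmse(realised, forecast):
    """Root mean square error of two equally long sequences."""
    return math.sqrt(
        sum((r - f) ** 2 for r, f in zip(realised, forecast)) / len(realised)
    )


# Hypothetical registry mapping metric names to callables.
METRICS = {"mae": mae, "rmse": rmse}


def get_metric(metric_name):
    """Return the metric function, or raise KeyError for an unknown name."""
    try:
        return METRICS[metric_name]
    except KeyError:
        available = ", ".join(sorted(METRICS))
        raise KeyError(
            f"Unknown metric '{metric_name}'. Available: {available}"
        ) from None


metric = get_metric("mae")
value = metric([1, 2], [2, 4])
```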
- openstef.metrics.metrics.mae(realised, forecast)
Function that calculates the mean absolute error based on the true values and the prediction.
- Return type:
float
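For reference, the mean absolute error is the average of |realised - forecast| over all points. A minimal plain-Python sketch of that definition follows; the name `mean_absolute_error` is hypothetical, and plain sequences are used instead of pandas Series.

```python
def mean_absolute_error(realised, forecast):
    """Average absolute difference between realised and forecasted values."""
    if len(realised) != len(forecast):
        raise ValueError("realised and forecast must have the same length")
    if len(realised) == 0:
        raise ValueError("inputs must not be empty")
    return sum(abs(r - f) for r, f in zip(realised, forecast)) / len(realised)


# Absolute errors are 10, 10 and 30, so the result is 50 / 3.
error = mean_absolute_error([100, 200, 300], [110, 190, 330])
```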
- openstef.metrics.metrics.nsme(realised, forecast)
Function that calculates the Nash-Sutcliffe model efficiency based on the true values and the prediction.
- Parameters:
realised (Series) – Realised load.
forecast (Series) – Forecasted load.
- Return type:
float
- Returns:
Nash-Sutcliffe model efficiency
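The Nash-Sutcliffe efficiency is conventionally defined as one minus the ratio of the squared forecast errors to the squared deviations of the realised values from their mean. The sketch below follows that textbook definition; `nash_sutcliffe_efficiency` is a hypothetical name, and the handling of a constant realised series is an assumption.

```python
def nash_sutcliffe_efficiency(realised, forecast):
    """1 - sum((r - f)^2) / sum((r - mean(r))^2), per the textbook definition."""
    if len(realised) != len(forecast):
        raise ValueError("realised and forecast must have the same length")
    mean_realised = sum(realised) / len(realised)
    residual = sum((r - f) ** 2 for r, f in zip(realised, forecast))
    variance = sum((r - mean_realised) ** 2 for r in realised)
    if variance == 0:
        # Assumption: the score is undefined for a constant realised series.
        raise ValueError("realised values must not all be equal")
    return 1.0 - residual / variance


perfect = nash_sutcliffe_efficiency([1, 2, 3], [1, 2, 3])
mean_only = nash_sutcliffe_efficiency([1, 2, 3], [2, 2, 2])
```

A score of 1 corresponds to a perfect forecast, while a score of 0 means the forecast is no better than always predicting the mean of the realised values.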
- openstef.metrics.metrics.r_mae(realised, forecast)
Function that calculates the relative mean absolute error based on the true values and the prediction.
The range is based on the load range of the previous two weeks.
- Return type:
float
- openstef.metrics.metrics.r_mae_highest(realised, forecast, percentile=0.95)
Function that calculates the relative mean absolute error for the 5 percent highest realised values.
The range is based on the load range of the previous two weeks.
- Raises:
ValueError – If the lengths of the realised and forecast arrays are not equal.
- Return type:
float
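One way to read this metric is: take the points whose realised value lies at or above the given percentile, compute their mean absolute error, and divide by the load range. The sketch below does that in plain Python. Several details are assumptions: the threshold uses linear interpolation, the comparison is inclusive, the range is taken over the series passed in, and a zero range yields NaN. The names `linear_quantile` and `relative_mae_highest` are hypothetical.

```python
def linear_quantile(values, q):
    """Quantile with linear interpolation between the two nearest ranks."""
    ordered = sorted(values)
    position = q * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    weight = position - lower
    value = ordered[lower] + weight * (ordered[upper] - ordered[lower])
    # Clamp so floating-point rounding can never exceed the upper neighbour.
    return min(value, ordered[upper])


def relative_mae_highest(realised, forecast, percentile=0.95):
    """MAE on the highest realised values, divided by the load range."""
    if len(realised) != len(forecast):
        raise ValueError("realised and forecast must have the same length")
    load_range = max(realised) - min(realised)
    if load_range == 0:
        return float("nan")  # Assumption: undefined for a constant series.
    threshold = linear_quantile(realised, percentile)
    peaks = [(r, f) for r, f in zip(realised, forecast) if r >= threshold]
    peak_mae = sum(abs(r - f) for r, f in peaks) / len(peaks)
    return peak_mae / load_range


# Loads 1..100; the forecast is exact except for values above 95, where it is 10 too low.
realised = list(range(1, 101))
forecast = [r - 10 if r > 95 else r for r in realised]
score = relative_mae_highest(realised, forecast)
```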
- openstef.metrics.metrics.r_mae_lowest(realised, forecast, quantile=0.05)
Function that calculates the relative mean absolute error for the 5 percent lowest realised values.
The range is based on the load range of the previous two weeks.
- Return type:
float
- openstef.metrics.metrics.r_mne_highest(realised, forecast)
Function that calculates the relative mean negative error for the 5 percent highest realised values.
The range is based on the load range of the previous two weeks. This measure quantifies how much we underestimate peaks.
- Return type:
float
- openstef.metrics.metrics.r_mpe_highest(realised, forecast)
Function that calculates the relative mean positive error for the 5 percent highest realised values.
The range is based on the load range of the previous two weeks. This measure quantifies how much we overestimate peaks.
- Return type:
float
- openstef.metrics.metrics.rmse(realised, forecast)
Function that calculates the Root Mean Square Error based on the true values and the prediction.
- Parameters:
realised (Series) – Realised load.
forecast (Series) – Forecasted load.
- Return type:
float
- Returns:
Root Mean Square Error
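The root mean square error is the square root of the mean of the squared differences. A minimal plain-Python sketch of that definition follows; the name `root_mean_square_error` is hypothetical, and plain sequences stand in for pandas Series.

```python
import math


def root_mean_square_error(realised, forecast):
    """Square root of the mean squared difference between the two sequences."""
    if len(realised) != len(forecast):
        raise ValueError("realised and forecast must have the same length")
    if len(realised) == 0:
        raise ValueError("inputs must not be empty")
    squared = sum((r - f) ** 2 for r, f in zip(realised, forecast))
    return math.sqrt(squared / len(realised))


# Every forecast is off by 2, so the RMSE is 2.
error = root_mean_square_error([2, 4, 6, 8], [4, 2, 8, 6])
```

Because the errors are squared before averaging, a single large miss raises the RMSE more than it raises the mean absolute error.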
- openstef.metrics.metrics.skill_score(realised, forecast, mean)
Function that calculates the skill score.
This indicates model performance relative to a reference, in this case the mean of the realised values. The range is based on the load range of the previous two weeks.
- Return type:
float
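A common form of a skill score compares the model's error with the error of a reference forecast. The sketch below uses mean absolute error for both and a constant reference value; that choice is an assumption made for illustration, and `skill_score_sketch` is a hypothetical name.

```python
def skill_score_sketch(realised, forecast, reference):
    """1 - MAE(forecast) / MAE(constant reference); assumed form, for illustration."""
    if len(realised) != len(forecast):
        raise ValueError("realised and forecast must have the same length")
    n = len(realised)
    mae_forecast = sum(abs(r - f) for r, f in zip(realised, forecast)) / n
    mae_reference = sum(abs(r - reference) for r in realised) / n
    if mae_reference == 0:
        raise ValueError("reference error is zero, so the score is undefined")
    return 1.0 - mae_forecast / mae_reference


# The reference (mean = 5) is off by 5 on average, the forecast by 1.
score = skill_score_sketch(realised=[0, 10], forecast=[1, 9], reference=5)
```

Under this form, a positive score means the forecast beats the reference, zero means they perform equally, and a negative score means the reference is better.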
- openstef.metrics.metrics.skill_score_positive_peaks(realised, forecast, mean)
Calculates skill score on positive peaks.
- Return type:
float
- openstef.metrics.metrics.xgb_quantile_eval(preds, dmatrix, quantile=0.2)
Custom evaluation metric that equals the quantile regression loss (also known as pinball loss).
Quantile regression is regression that estimates a specified quantile of the target's distribution conditional on the given features.
- Parameters:
preds (ndarray) – Predicted values.
dmatrix (DMatrix) – xgboost.DMatrix of the input data.
quantile (float) – Target quantile.
- Return type:
str
- Returns:
Loss information
# See also: https://gist.github.com/Nikolay-Lysenko/06769d701c1d9c9acb9a66f2f9d7a6c7
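For readers unfamiliar with the pinball loss, the sketch below computes its mean over plain Python sequences. It illustrates the loss itself only: the labels are passed in directly rather than read from an xgboost.DMatrix, and `pinball_loss` is a hypothetical name.

```python
def pinball_loss(labels, preds, quantile=0.2):
    """Mean pinball (quantile) loss of the predictions for the target quantile."""
    if len(labels) != len(preds):
        raise ValueError("labels and preds must have the same length")
    total = 0.0
    for y, p in zip(labels, preds):
        diff = y - p
        if diff >= 0:
            # Under-prediction is weighted by the quantile.
            total += quantile * diff
        else:
            # Over-prediction is weighted by (1 - quantile).
            total += (quantile - 1.0) * diff
    return total / len(labels)


# Under-predicting by 2 costs 0.4, over-predicting by 2 costs 1.6.
loss = pinball_loss(labels=[10, 10], preds=[8, 12], quantile=0.2)
```

For a low quantile such as 0.2, over-prediction is penalised more heavily than under-prediction, which pulls the predictions towards the lower part of the target distribution.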
- openstef.metrics.metrics.xgb_quantile_obj(preds, dmatrix, quantile=0.2)
Quantile regression objective function.
Computes the first-order derivative of the quantile regression loss and a non-degenerate substitute for the second-order derivative.
The substitute is returned instead of zeros, because XGBoost requires non-zero second-order derivatives. See dmlc/xgboost#1825 for why it is possible to use this trick. However, be sure that the hyperparameter named max_delta_step is small enough to satisfy:
0.5 * max_delta_step <= min(quantile, 1 - quantile)
- Parameters:
preds (ndarray) – numpy.ndarray
dmatrix (DMatrix) – xgboost.DMatrix
quantile (float) – float between 0 and 1
- Return type:
tuple[ndarray, ndarray]
- Returns:
Gradient and Hessian
# See also: https://gist.github.com/Nikolay-Lysenko/06769d701c1d9c9acb9a66f2f9d7a6c7
Reasoning for the hessian: https://gist.github.com/Nikolay-Lysenko/06769d701c1d9c9acb9a66f2f9d7a6c7#gistcomment-2322558
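The derivative and the max_delta_step condition can be sketched in plain Python. The gradient below follows directly from the pinball loss; using a constant 1.0 as the substitute for the second derivative is an assumption made for illustration, as are the names `quantile_gradient` and `max_delta_step_is_safe` and the choice to pass labels directly instead of an xgboost.DMatrix.

```python
def quantile_gradient(labels, preds, quantile=0.2):
    """Gradient of the pinball loss w.r.t. preds, plus a constant Hessian substitute."""
    grad = []
    for y, p in zip(labels, preds):
        error = p - y
        if error < 0:
            grad.append(-quantile)       # prediction too low
        elif error > 0:
            grad.append(1.0 - quantile)  # prediction too high
        else:
            grad.append(0.0)
    # Assumption: a constant stands in for the true second derivative,
    # which is zero almost everywhere.
    hess = [1.0] * len(grad)
    return grad, hess


def max_delta_step_is_safe(max_delta_step, quantile):
    """Check the condition 0.5 * max_delta_step <= min(quantile, 1 - quantile)."""
    return 0.5 * max_delta_step <= min(quantile, 1.0 - quantile)


grad, hess = quantile_gradient(labels=[10, 10, 10], preds=[8, 12, 10], quantile=0.2)
```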
openstef.metrics.reporter module
Defines reporter class.
- class openstef.metrics.reporter.Report(feature_importance_figure, data_series_figures, metrics, signature)
Bases: object
Dataclass to hold a report describing the training process.
- class openstef.metrics.reporter.Reporter(train_data=None, validation_data=None, test_data=None, quantiles=None)
Bases: object
Reporter class that generates reports describing the training process.
- generate_report(model)
Generate a report on a given model.
- Parameters:
model (OpenstfRegressor) – The model to create a report on.
- Return type:
Report
- Returns:
Reporter object containing info about the model
- static get_fiabilities(quantiles, y_true)
- Return type:
dict
- static get_metrics(y_pred, y_true)
Calculate the metrics for a prediction.
- Parameters:
y_pred (array) – np.array
y_true (array) – np.array
- Return type:
dict
- Returns:
Metrics for the prediction
- static write_report_to_disk(report, report_folder)
Write the report to disk, e.g. for viewing reports of the latest models using Grafana.