openstef.metrics package
Submodules
openstef.metrics.figure module
This module contains all functions for generating figures.
- openstef.metrics.figure.convert_to_base64_data_uri(path_in, path_out, content_type)
Read a file, convert it to a data URI, and write the data URI to a file.
- Parameters:
path_in (str) – Path of the file that will be converted.
path_out (str) – Path of the file that will contain the data URI.
content_type (str) – Content type of the data URI (see https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Type).
- Return type:
None
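The conversion described above can be illustrated with the standard library alone. This is a minimal sketch, not the package's implementation: the helper names `to_data_uri` and `convert_file_to_data_uri` are hypothetical, and it assumes the data URI is base64-encoded and written as plain ASCII text.

```python
import base64
from pathlib import Path


def to_data_uri(raw: bytes, content_type: str) -> str:
    """Encode raw bytes as a base64 data URI (hypothetical helper)."""
    encoded = base64.b64encode(raw).decode("ascii")
    return f"data:{content_type};base64,{encoded}"


def convert_file_to_data_uri(path_in: str, path_out: str, content_type: str) -> None:
    """Read path_in, build its data URI, and write the URI to path_out."""
    raw = Path(path_in).read_bytes()
    Path(path_out).write_text(to_data_uri(raw, content_type), encoding="ascii")


# Example: a five-byte plain-text payload.
uri = to_data_uri(b"hello", "text/plain")
```

A content type such as `image/svg+xml` or `text/html` would be passed the same way when embedding a figure.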
- openstef.metrics.figure.plot_data_series(data, predict_data=None, horizon=47, names=None)
Plots the passed data and, optionally, prediction data for the specified horizon.
- Parameters:
data (Union[list[DataFrame], list[Series]]) – There are two ways to use this function: either pass a list of pandas.DataFrame where each dataframe contains a load column and a horizon column, or pass a list of pandas.Series with unique indexing.
predict_data (Union[list[DataFrame], list[Series], None]) – Similar to data, but for prediction data. When passing a list of pandas.DataFrame, the column forecast should exist. Can be set to None.
horizon (int) – Only data matching this horizon is selected. Defaults to 47.
names (Optional[list[str]]) – The names used in the legend of the plot. If None is passed, the names are built automatically based on the number of series passed.
- Return type:
Figure
- Returns:
A line plot of each passed data series.
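To make the DataFrame input format and the horizon selection concrete, the sketch below builds one frame with a load and a horizon column and keeps only the rows for the requested horizon. The helper `select_horizon` is hypothetical and only mirrors the behaviour described above; it is not part of the package.

```python
import pandas as pd


def select_horizon(frames, column, horizon=47):
    """Keep only the rows of each frame whose horizon column matches (hypothetical helper)."""
    return [frame.loc[frame["horizon"] == horizon, column] for frame in frames]


# One frame in the documented format: a load column and a horizon column.
index = pd.date_range("2024-01-01", periods=4, freq="15min")
frame = pd.DataFrame(
    {"load": [1.0, 2.0, 3.0, 4.0], "horizon": [47, 24, 47, 24]},
    index=index,
)

selected = select_horizon([frame], "load")
```

Each resulting series would correspond to one line in the plot.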
- openstef.metrics.figure.plot_feature_importance(feature_importance)
Create a treemap plot based on feature importance and weights.
- Parameters:
feature_importance (DataFrame) – A DataFrame describing the feature importances and weights of the trained model.
- Return type:
Figure
- Returns:
A treemap of the features.
openstef.metrics.metrics module
This module contains all metrics to assess forecast quality.
- openstef.metrics.metrics.bias(realised, forecast)
Function that calculates the absolute bias in % based on the true values and the prediction.
- Parameters:
realised (Series) – Realised load.
forecast (Series) – Forecasted load.
- Return type:
float
- Returns:
Bias
- openstef.metrics.metrics.frac_in_stdev(realised, forecast, stdev)
Function that calculates the fraction of measurements that are within one stdev of our predictions.
- Return type:
float
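As an illustration, the fraction can be computed by checking each measurement against its forecast and standard deviation. This is a sketch under assumptions: the function name is hypothetical, the bound is taken as inclusive, and stdev is assumed to be given per data point.

```python
import numpy as np


def frac_within_one_stdev(realised, forecast, stdev) -> float:
    """Fraction of measurements within one stdev of the forecast (assumed inclusive bound)."""
    realised = np.asarray(realised, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    stdev = np.asarray(stdev, dtype=float)
    within = np.abs(realised - forecast) <= stdev
    return float(np.mean(within))


# Absolute errors are 1, 0 and 5; the first two fall within their stdev.
fraction = frac_within_one_stdev([10, 12, 20], [11, 12, 15], [2, 1, 3])
```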
- openstef.metrics.metrics.franks_skill_score(realised, forecast, basecase, range_=1.0)
Calculate Franks skill score.
- Return type:
float
- openstef.metrics.metrics.franks_skill_score_peaks(realised, forecast, basecase)
Calculate Franks skill score on positive peaks.
- Return type:
float
- openstef.metrics.metrics.get_eval_metric_function(metric_name)
Gets a metric if it is available.
- Parameters:
metric_name (str) – Name of the metric.
- Return type:
Callable
- Returns:
Function to calculate the metric.
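A name-based lookup like this is commonly built on a dictionary of callables. The sketch below is an assumption about the general pattern, not the package's code: the registry `METRICS`, the function `get_metric`, and the choice to raise `KeyError` for unknown names are all hypothetical.

```python
from typing import Callable


def mean_absolute_error(realised, forecast) -> float:
    """Plain-Python MAE used to populate the example registry."""
    return sum(abs(r - f) for r, f in zip(realised, forecast)) / len(realised)


# Hypothetical registry mapping metric names to callables.
METRICS: dict = {"mae": mean_absolute_error}


def get_metric(metric_name: str) -> Callable:
    """Return the metric registered under metric_name, or raise KeyError."""
    try:
        return METRICS[metric_name]
    except KeyError:
        raise KeyError(
            f"Unknown metric '{metric_name}'. Available: {sorted(METRICS)}"
        ) from None


metric = get_metric("mae")
value = metric([1, 2], [2, 4])
```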
- openstef.metrics.metrics.mae(realised, forecast)
Function that calculates the mean absolute error based on the true values and the prediction.
- Return type:
float
- openstef.metrics.metrics.nsme(realised, forecast)
Function that calculates the Nash-Sutcliffe model efficiency based on the true values and the prediction.
- Parameters:
realised (Series) – Realised load.
forecast (Series) – Forecasted load.
- Return type:
float
- Returns:
Nash-Sutcliffe model efficiency
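For reference, the textbook Nash-Sutcliffe efficiency is one minus the ratio of the squared forecast errors to the squared deviations of the realised values from their mean. The sketch below implements that standard definition with a hypothetical function name; whether the package handles missing values or constant series differently is not covered here.

```python
import numpy as np


def nash_sutcliffe(realised, forecast) -> float:
    """Standard Nash-Sutcliffe efficiency: 1 - SSE / sum of squares around the mean."""
    realised = np.asarray(realised, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    residual = np.sum((realised - forecast) ** 2)
    variance = np.sum((realised - realised.mean()) ** 2)
    return float(1.0 - residual / variance)


realised = [1.0, 2.0, 3.0, 4.0]
perfect = nash_sutcliffe(realised, realised)             # forecast equals the observations
mean_only = nash_sutcliffe(realised, [2.5, 2.5, 2.5, 2.5])  # forecast equals the mean
```

A perfect forecast scores 1, and a forecast that always predicts the mean of the realised values scores 0.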
- openstef.metrics.metrics.r_mae(realised, forecast)
Function that calculates the relative mean absolute error based on the true values and the prediction.
The range is based on the load range of the previous two weeks.
- Return type:
float
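One way to read "relative" here is the MAE divided by the load range. The sketch below assumes the range is the difference between the maximum and minimum of the realised series that is passed in (for example, a two-week window); the function name and the NaN result for a zero range are assumptions.

```python
import numpy as np


def relative_mae(realised, forecast) -> float:
    """MAE divided by the range of the realised values (assumed definition)."""
    realised = np.asarray(realised, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    load_range = realised.max() - realised.min()
    if load_range == 0:
        return float("nan")  # a flat load has no meaningful range
    return float(np.mean(np.abs(realised - forecast)) / load_range)


# Absolute errors are 2, 0 and 4 (MAE 2) on a load range of 20.
score = relative_mae([0.0, 10.0, 20.0], [2.0, 10.0, 16.0])
```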
- openstef.metrics.metrics.r_mae_highest(realised, forecast, percentile=0.95)
Function that calculates the relative mean absolute error for the 5 percent highest realised values.
The range is based on the load range of the previous two weeks.
- Return type:
float
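Restricting the error to the highest realised values can be sketched by thresholding at a quantile. The example assumes the threshold is the 0.95 quantile of the realised values, that points at or above it are kept, and that the range is still taken over all realised values; the function name is hypothetical.

```python
import numpy as np


def relative_mae_highest(realised, forecast, percentile=0.95) -> float:
    """Relative MAE over the realised values at or above the given quantile (assumed definition)."""
    realised = np.asarray(realised, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    load_range = realised.max() - realised.min()
    threshold = np.quantile(realised, percentile)
    mask = realised >= threshold  # selects roughly the top 5 percent
    errors = np.abs(realised[mask] - forecast[mask])
    return float(np.mean(errors) / load_range)


# Loads 1..100; the forecast is only wrong (by 9.9) on the peaks above 95.
realised = np.arange(1.0, 101.0)
forecast = realised.copy()
forecast[realised > 95] -= 9.9
peak_score = relative_mae_highest(realised, forecast)
```

The same pattern with `realised <= np.quantile(realised, 0.05)` would cover the lowest values.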
- openstef.metrics.metrics.r_mae_lowest(realised, forecast, quantile=0.05)
Function that calculates the relative mean absolute error for the 5 percent lowest realised values.
The range is based on the load range of the previous two weeks.
- Return type:
float
- openstef.metrics.metrics.r_mne_highest(realised, forecast)
Function that calculates the relative mean negative error for the 5 percent highest realised values.
The range is based on the load range of the previous two weeks. This measure quantifies how much we underestimate peaks.
- Return type:
float
- openstef.metrics.metrics.r_mpe_highest(realised, forecast)
Function that calculates the relative mean positive error for the 5 percent highest realised values.
The range is based on the load range of the previous two weeks. This measure quantifies how much we overestimate peaks.
- Return type:
float
- openstef.metrics.metrics.rmse(realised, forecast)
Function that calculates the Root Mean Square Error based on the true values and the prediction.
- Parameters:
realised (Series) – Realised load.
forecast (Series) – Forecasted load.
- Return type:
float
- Returns:
Root Mean Square Error
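The standard definition takes the square root of the mean squared difference. A minimal sketch with a hypothetical function name, ignoring any missing-value handling the package may perform:

```python
import numpy as np


def root_mean_square_error(realised, forecast) -> float:
    """Square root of the mean squared difference between realised and forecast."""
    realised = np.asarray(realised, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return float(np.sqrt(np.mean((realised - forecast) ** 2)))


# Squared errors are 0, 0 and 9, so the mean is 3.
error = root_mean_square_error([1.0, 2.0, 3.0], [1.0, 2.0, 6.0])
```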
- openstef.metrics.metrics.skill_score(realised, forecast, mean)
Function that calculates the skill score.
This indicates model performance relative to a reference, in this case the mean of the realised values. The range is based on the load range of the previous two weeks.
- Return type:
float
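A skill score relative to a reference is often written as one minus the ratio of the model's error to the reference's error. The sketch below assumes MAE as the error measure and a constant mean as the reference forecast; both choices, and the function name, are assumptions made for illustration.

```python
import numpy as np


def mae_skill_score(realised, forecast, mean) -> float:
    """1 - MAE(forecast) / MAE(constant mean reference), an assumed definition."""
    realised = np.asarray(realised, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    model_error = np.mean(np.abs(realised - forecast))
    reference_error = np.mean(np.abs(realised - mean))
    return float(1.0 - model_error / reference_error)


realised = [1.0, 2.0, 3.0, 4.0]
# The forecast is off by 0.5 everywhere; the mean reference is off by 1.0 on average.
half_skill = mae_skill_score(realised, [1.5, 2.5, 3.5, 4.5], mean=2.5)
```

Under this definition a score of 1 is a perfect forecast, 0 matches the reference, and negative values are worse than the reference.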
- openstef.metrics.metrics.skill_score_positive_peaks(realised, forecast, mean)
Calculates skill score on positive peaks.
- Return type:
float
- openstef.metrics.metrics.xgb_quantile_eval(preds, dmatrix, quantile=0.2)
Custom evaluation metric equal to the quantile regression loss (also known as pinball loss).
Quantile regression estimates a specified quantile of the target's distribution conditional on the given features.
- Parameters:
preds (ndarray) – Predicted values.
dmatrix (DMatrix) – xgboost.DMatrix of the input data.
quantile (float) – Target quantile.
- Return type:
str
- Returns:
Loss information
# See also: https://gist.github.com/Nikolay-Lysenko/06769d701c1d9c9acb9a66f2f9d7a6c7
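The pinball loss itself can be written in a few lines. The sketch below works on plain arrays rather than an xgboost.DMatrix and returns only the numeric loss, so it illustrates the formula and is not a drop-in replacement for the function above; the name `pinball_loss` is hypothetical.

```python
import numpy as np


def pinball_loss(labels, preds, quantile=0.2) -> float:
    """Mean quantile (pinball) loss for the given target quantile."""
    labels = np.asarray(labels, dtype=float)
    preds = np.asarray(preds, dtype=float)
    errors = labels - preds
    # Under-prediction (errors > 0) costs quantile per unit,
    # over-prediction (errors < 0) costs (1 - quantile) per unit.
    return float(np.mean(np.maximum(quantile * errors, (quantile - 1.0) * errors)))


# One prediction 2 too low (cost 0.4) and one 2 too high (cost 1.6).
loss = pinball_loss([10.0, 10.0], [8.0, 12.0], quantile=0.2)
```

With a quantile of 0.2, over-prediction is penalised four times as heavily as under-prediction, which pulls the fitted values towards the lower part of the distribution.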
- openstef.metrics.metrics.xgb_quantile_obj(preds, dmatrix, quantile=0.2)
Quantile regression objective function.
Computes the first-order derivative of the quantile regression loss and a non-degenerate substitute for the second-order derivative.
The substitute is returned instead of zeros because XGBoost requires non-zero second-order derivatives. See dmlc/xgboost#1825 for why this trick is possible. However, make sure that the hyperparameter max_delta_step is small enough to satisfy 0.5 * max_delta_step <= min(quantile, 1 - quantile).
- Parameters:
preds (ndarray) – numpy.ndarray
dmatrix (DMatrix) – xgboost.DMatrix
quantile (float) – float
- Return type:
tuple[ndarray, ndarray]
- Returns:
Gradient and Hessian
# See also: https://gist.github.com/Nikolay-Lysenko/06769d701c1d9c9acb9a66f2f9d7a6c7
Reasoning for the hessian: https://gist.github.com/Nikolay-Lysenko/06769d701c1d9c9acb9a66f2f9d7a6c7#gistcomment-2322558
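The gradient and the substitute Hessian described above can be sketched on plain arrays. This follows the approach of the referenced gist as far as it is described here: the gradient is the derivative of the pinball loss with respect to the prediction, and the Hessian is replaced by ones. The function name and the use of a labels array instead of an xgboost.DMatrix are assumptions.

```python
import numpy as np


def quantile_grad_hess(labels, preds, quantile=0.2):
    """Gradient of the pinball loss w.r.t. preds, plus a constant substitute Hessian."""
    labels = np.asarray(labels, dtype=float)
    preds = np.asarray(preds, dtype=float)
    errors = preds - labels
    left_mask = errors < 0   # prediction below the label
    right_mask = errors > 0  # prediction above the label
    grad = -quantile * left_mask + (1.0 - quantile) * right_mask
    hess = np.ones_like(preds)  # stands in for the true second derivative, which is zero
    return grad, hess


# One prediction too low, one too high, one exact.
grad, hess = quantile_grad_hess([10.0, 10.0, 10.0], [8.0, 12.0, 10.0])
```

Because the Hessian is fixed at one, the size of each boosting step is governed by the gradient alone, which is why the bound on max_delta_step matters.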
openstef.metrics.reporter module
Defines reporter class.
- class openstef.metrics.reporter.Report(feature_importance_figure, data_series_figures, metrics, signature)
Bases: object
Dataclass to hold a report describing the training process.
- class openstef.metrics.reporter.Reporter(train_data=None, validation_data=None, test_data=None, quantiles=None)
Bases: object
Reporter class that generates reports describing the training process.
- generate_report(model)
Generate a report on a given model.
- Parameters:
model (OpenstfRegressor) – The model to create a report on.
- Return type:
- Returns:
Reporter object containing info about the model
- static get_fiabilities(quantiles, y_true)
- Return type:
dict
- static get_metrics(y_pred, y_true)
Calculate the metrics for a prediction.
- Parameters:
y_pred (array) – np.array
y_true (array) – np.array
- Return type:
dict
- Returns:
Metrics for the prediction
- static write_report_to_disk(report, report_folder)
Write the report to disk, e.g. for viewing reports of the latest models in Grafana.