openstef.data_classes package#

Submodules#

openstef.data_classes.data_prep module#

Specifies the data prep dataclass.

class openstef.data_classes.data_prep.DataPrepDataClass(**data)#

Bases: BaseModel

Class that allows specifying a custom class to prepare the data (feature engineering, etc.).

arguments: str | dict[str, Any]#
klass: str | type[DataPrepClass]#
load(required_arguments=None)#

Load the function and its arguments.

If the function and the arguments are given as strings in the instance attributes, load them as Python objects; otherwise, just return them from the instance attributes.

Parameters:

required_arguments (list[str]) – list of arguments the loaded class must have

Return type:

tuple[type[TypeVar(DataPrepClass)], dict[str, Any]]

Returns:

  • class (type[AbstractDataPreparation])

  • arguments (dict[str, Any])
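The loading behaviour described above can be illustrated with a minimal sketch. This is not the OpenSTEF implementation: the helper name load_from_spec is hypothetical, and the string formats (a dotted import path for the class, a JSON object for the arguments) are assumptions made for the example.

```python
import importlib
import json
from typing import Any, Union


def load_from_spec(
    klass: Union[str, type],
    arguments: Union[str, dict[str, Any]],
) -> tuple[type, dict[str, Any]]:
    """Resolve a class and its arguments, accepting strings or Python objects."""
    if isinstance(klass, str):
        # Assumed string format: "package.module.ClassName".
        module_path, _, class_name = klass.rpartition(".")
        klass = getattr(importlib.import_module(module_path), class_name)
    if isinstance(arguments, str):
        # Assumed string format: a JSON object.
        arguments = json.loads(arguments)
    # Values that were already Python objects are returned unchanged.
    return klass, arguments


# A standard-library class stands in for a custom data prep class.
loaded_class, loaded_arguments = load_from_spec(
    "collections.OrderedDict", '{"a": 1}'
)
```

Passing Python objects directly, for example load_from_spec(dict, {"b": 2}), returns them as given, which matches the "otherwise just return them" branch of the description.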

openstef.data_classes.model_specifications module#

Specifies the dataclass for model specifications.

class openstef.data_classes.model_specifications.ModelSpecificationDataClass(**data)#

Bases: BaseModel

Holds all information regarding the training process of a specific model.

feature_modules: list | None#

Feature modules that should be used during training.

feature_names: list | None#

Features that should be used during training.

hyper_params: dict | None#

Hyperparameters that should be used during training.

id: int | str#

openstef.data_classes.prediction_job module#

Specifies the prediction job dataclass.

class openstef.data_classes.prediction_job.PredictionJobDataClass(**data)#

Bases: BaseModel

Holds all information about the specific forecast that has to be made.

class Config#

Bases: object

Pydantic model configuration.

The following configuration is needed to prevent ids in “depends_on” from being converted from int to str when integer ids are used.

smart_union = True#
alternative_forecast_model_pid: int | str | None#

The pid that references another prediction job from which the model should be used for making forecasts.

backtest_split_func: SplitFuncDataClass | None#

Optional custom splitting function for backtesting.

completeness_threshold: float#

Minimum fraction of data that should be available for making a regular forecast.

data_prep_class: DataPrepDataClass | None#

The import string for the custom data prep class

default_modelspecs: ModelSpecificationDataClass | None#

Default model specifications

depends_on: list[int | str] | None#

Link to another prediction job on which this prediction job might depend.

description: str | None#

Optional description of the prediction job for human reference.

flatliner_threshold_minutes: int#

Number of minutes that the load has to be constant to detect a flatliner.
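As a rough illustration of how a threshold in minutes relates to a series of measurements, the sketch below checks whether the most recent load values have been constant for at least the threshold. The function is_flatliner is hypothetical, and it only inspects the trailing run of values; how OpenSTEF actually detects flatliners is not specified here.

```python
def is_flatliner(
    load_values: list[float],
    resolution_minutes: int,
    flatliner_threshold_minutes: int,
) -> bool:
    """Return True if the latest values are constant for at least the threshold."""
    if not load_values:
        return False
    # Count how many trailing samples equal the most recent value.
    constant_samples = 0
    for value in reversed(load_values):
        if value != load_values[-1]:
            break
        constant_samples += 1
    # n equal samples span (n - 1) intervals of resolution_minutes each.
    constant_minutes = (constant_samples - 1) * resolution_minutes
    return constant_minutes >= flatliner_threshold_minutes
```

With 15-minute data and a 60-minute threshold, five identical trailing samples are needed before the series counts as a flatliner under this sketch.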

forecast_type: str#

The type of forecasts that should be made.

Options are:
  • "demand"

  • "wind"

  • "basecase"

If unsure what to pick, choose "demand".

get(key, default=None)#

Allows attribute access through get, similar to the get method of a Python dict.

Return type:

any
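A dict-style get on an attribute-based object can be sketched as follows. ExampleJob is a hypothetical stand-in built on a plain dataclass, not the OpenSTEF class, and the use of getattr is an assumption about how such a method might be written.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ExampleJob:
    """Hypothetical stand-in for a prediction job."""

    id: int
    name: str
    description: Optional[str] = None

    def get(self, key: str, default: Any = None) -> Any:
        # Return the attribute if it exists, otherwise the default,
        # mirroring dict.get semantics.
        return getattr(self, key, default)


job = ExampleJob(id=307, name="example_location")
name = job.get("name")
fallback = job.get("not_a_field", 5)
```

This lets code that was written against a plain dict of job settings keep calling get without knowing whether it holds a dict or a model instance.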

horizon_minutes: int | None#

The horizon of the desired forecast in minutes used in tasks. Defaults to 2880 minutes (i.e. 2 days).

hub_height: float | None#

Only required for create_wind_forecast task

id: int | str#

The prediction job id (often abbreviated as pid).

lat: float | None#

Latitude of the forecasted location in degrees. Used for fetching weather data in tasks, calculating derived features and component splitting.

lon: float | None#

Longitude of the forecasted location in degrees. Used for fetching weather data in tasks, calculating derived features and component splitting.

minimal_table_length: int#

Minimum length (in rows) of the forecast input for making a regular forecast.

model: str#

The model type that should be used.

Options are:
  • "xgb"

  • "xgb_quantile"

  • "lgb"

  • "linear"

  • "linear_quantile"

If unsure what to pick, choose "xgb".

model_kwargs: dict | None#

The model parameters that should be used.

n_turbines: float | None#

Only required for create_wind_forecast task

name: str#

Name of the forecast, e.g. the location name.

pipelines_to_run: list[PipelineType]#

The pipelines to run for this prediction job.

quantiles: list[float] | None#

Quantiles that have to be forecasted.

resolution_minutes: int#

The resolution of the desired forecast in minutes.

save_train_forecasts: bool#

Indicates whether the forecasts produced during the training process should be saved.

sid: str | None#

Only required for create_solar_forecast task

train_components: bool | None#

Whether splitting the forecasts into wind, solar and rest components is desired.

train_horizons_minutes: list[int] | None#

List of horizons that should be taken into account during training.

train_split_func: SplitFuncDataClass | None#

Optional custom splitting function for the operational process.

turbine_type: str | None#

Only required for create_wind_forecast task

openstef.data_classes.split_function module#

Specifies the split function dataclass.

class openstef.data_classes.split_function.SplitFuncDataClass(**data)#

Bases: BaseModel

Class that allows specifying a custom function to generate a train, test and validation set.

arguments: str | dict[str, Any]#
function: str | Callable#
load(required_arguments=None)#

Load the function and its arguments.

If the function and the arguments are given as strings in the instance attributes, load them as Python objects; otherwise, just return them from the instance attributes.

Parameters:

required_arguments (list[str]) – list of arguments the loaded function must have

Return type:

tuple[Callable, dict[str, Any]]

Returns:

  • function (Callable)

  • arguments (dict[str, Any])
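The required_arguments parameter implies a check that the loaded function accepts certain parameter names. One way such a check could look is sketched below with inspect.signature. The names check_required_arguments and example_split are hypothetical, and raising ValueError is an assumption rather than documented OpenSTEF behaviour.

```python
import inspect
from typing import Callable


def check_required_arguments(
    func: Callable, required_arguments: list[str]
) -> None:
    """Raise ValueError if func lacks any of the required parameter names."""
    parameter_names = set(inspect.signature(func).parameters)
    missing = [name for name in required_arguments if name not in parameter_names]
    if missing:
        raise ValueError(f"Missing required arguments: {missing}")


def example_split(data, test_fraction, validation_fraction):
    """Hypothetical split function; the body is irrelevant for the check."""
    return data, data, data


# Passes silently because both names appear in the signature.
check_required_arguments(example_split, ["data", "test_fraction"])
```

Validating the signature at load time surfaces a misconfigured split function before any training data is processed.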

Module contents#