openstef.data_classes package

Submodules

openstef.data_classes.data_prep module

Specifies the data preparation dataclass.

class openstef.data_classes.data_prep.DataPrepDataClass(**data)

Bases: BaseModel

Class that allows specifying a custom class to prepare the data (feature engineering, etc.).

arguments: str | dict[str, Any]
klass: str | type[DataPrepClass]
load(required_arguments=None)

Load the class and its arguments.

If the class and the arguments are given as strings in the instance attributes, load them as Python objects; otherwise, return them from the instance attributes unchanged.

Parameters:

required_arguments (list[str]) – list of arguments the loaded class must have

Return type:

tuple[type[TypeVar(DataPrepClass)], dict[str, Any]]

Returns:

  • class (type[AbstractDataPreparation])

  • arguments (dict[str, Any])
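As a rough illustration of the string-to-object loading described above, the following sketch resolves a dotted import path to a class, decodes string arguments, and checks the required arguments. The helper name `load_class_and_arguments`, the use of JSON for string arguments, and the `textwrap.TextWrapper` stand-in class are assumptions made for illustration; this is not the openstef implementation.

```python
import importlib
import inspect
import json
from typing import Any


def load_class_and_arguments(
    klass: "str | type",
    arguments: "str | dict[str, Any]",
    required_arguments: "list[str] | None" = None,
) -> "tuple[type, dict[str, Any]]":
    """Hypothetical helper mirroring the behaviour described for load()."""
    if isinstance(klass, str):
        # Split "package.module.ClassName" into module path and class name.
        module_path, _, class_name = klass.rpartition(".")
        klass = getattr(importlib.import_module(module_path), class_name)

    if isinstance(arguments, str):
        # Assumption: string arguments are JSON-encoded.
        arguments = json.loads(arguments)

    if required_arguments is not None:
        # Compare the required names with the constructor's parameters.
        accepted = set(inspect.signature(klass).parameters)
        missing = set(required_arguments) - accepted
        if missing:
            raise ValueError(f"Loaded class is missing arguments: {sorted(missing)}")

    return klass, arguments


# Stand-in class from the standard library, not an openstef class.
klass, arguments = load_class_and_arguments(
    "textwrap.TextWrapper", '{"width": 40}', required_arguments=["width"]
)
wrapper = klass(**arguments)
```

When the class and arguments are already Python objects, the helper returns them unchanged, which matches the "otherwise" branch of the description.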

openstef.data_classes.model_specifications module

Specifies the dataclass for model specifications.

class openstef.data_classes.model_specifications.ModelSpecificationDataClass(**data)

Bases: BaseModel

Holds all information regarding the training process of a specific model.

feature_modules: list | None

Feature modules that should be used during training.

feature_names: list | None

Features that should be used during training.

hyper_params: dict | None

Hyperparameters that should be used during training.

id: int | str

openstef.data_classes.prediction_job module

Specifies the prediction job dataclass.

class openstef.data_classes.prediction_job.PredictionJobDataClass(**data)

Bases: BaseModel

Holds all information about the specific forecast that has to be made.

class Config

Bases: object

Pydantic model configuration.

The following configuration is needed to prevent ids in “depends_on” from being converted from int to str when integer ids are used.

smart_union = True
alternative_forecast_model_pid: int | str | None

The pid that references another prediction job from which the model should be used for making forecasts.

backtest_split_func: SplitFuncDataClass | None

Optional custom splitting function for backtesting.

completeness_threshold: float

Minimum fraction of data that should be available for making a regular forecast.
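One possible reading of this threshold, sketched under assumptions: completeness is taken as the share of non-missing values in the input, and a regular forecast is attempted only when that share reaches the threshold. The function name `completeness` and the example threshold of 0.7 are hypothetical and may differ from how openstef computes and configures this.

```python
from typing import Optional, Sequence


def completeness(values: Sequence[Optional[float]]) -> float:
    """Hypothetical measure: fraction of entries that are not missing."""
    if not values:
        return 0.0
    available = sum(1 for value in values if value is not None)
    return available / len(values)


completeness_threshold = 0.7  # assumed example value
load = [1.0, None, 2.0, 3.0]  # one of four measurements is missing

fraction_available = completeness(load)
can_make_regular_forecast = fraction_available >= completeness_threshold
```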

data_prep_class: DataPrepDataClass | None

The import string for the custom data prep class

default_modelspecs: ModelSpecificationDataClass | None

Default model specifications

depends_on: list[int | str] | None

Link to another prediction job on which this prediction job might depend.

description: str | None

Optional description of the prediction job for human reference.

flatliner_threshold_minutes: int

Number of minutes that the load has to be constant to detect a flatliner.
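To make the threshold concrete, the sketch below checks whether the most recent measurements have been constant for at least the given number of minutes. The function `is_flatliner`, the choice to look only at the trailing run of values, and the 15-minute resolution are assumptions for illustration; the detection logic in openstef may work differently.

```python
def is_flatliner(load, resolution_minutes, threshold_minutes):
    """Hypothetical check: has the latest load been constant long enough?"""
    if not load:
        return False
    # Count how many of the most recent samples share the same value.
    run_length = 1
    for previous, current in zip(reversed(load[:-1]), reversed(load[1:])):
        if previous != current:
            break
        run_length += 1
    # A run of n samples spans (n - 1) intervals of resolution_minutes each.
    constant_minutes = (run_length - 1) * resolution_minutes
    return constant_minutes >= threshold_minutes


load = [5.0, 4.0, 3.0, 3.0, 3.0, 3.0, 3.0]  # last five samples are constant
detected_short = is_flatliner(load, resolution_minutes=15, threshold_minutes=60)
detected_long = is_flatliner(load, resolution_minutes=15, threshold_minutes=120)
```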

forecast_type: str

The type of forecasts that should be made.

Options are:
  • "demand"

  • "wind"

  • "basecase"

If unsure what to pick, choose "demand".

get(key, default=None)

Allows attribute access through a get method, similar to a Python dict.

Return type:

any
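A minimal sketch of the dict-like access described for get, using a hypothetical stand-in class rather than the real prediction job. The class name `JobSketch`, its fields, and the `getattr`-based implementation are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class JobSketch:
    """Hypothetical stand-in for a prediction job; not the openstef class."""

    id: int
    name: str

    def get(self, key: str, default: Any = None) -> Any:
        # Return the default when the attribute does not exist,
        # just as dict.get does for a missing key.
        return getattr(self, key, default)


job = JobSketch(id=307, name="example_location")
job_id = job.get("id")
missing = job.get("unknown_field", "fallback")
```

This lets code written against plain dictionaries read fields from the object without changes.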

horizon_minutes: int | None

The horizon of the desired forecast in minutes used in tasks. Defaults to 2880 minutes (i.e. 2 days).

hub_height: float | None

Only required for create_wind_forecast task

id: int | str

The prediction job id (often abbreviated as pid).

lat: float | None

Latitude of the forecasted location in degrees. Used for fetching weather data in tasks, calculating derived features and component splitting.

lon: float | None

Longitude of the forecasted location in degrees. Used for fetching weather data in tasks, calculating derived features and component splitting.

minimal_table_length: int

Minimum length (in rows) of the forecast input for making a regular forecast.

model: str

The model type that should be used.

Options are:
  • "xgb"

  • "xgb_quantile"

  • "lgb"

  • "linear"

  • "linear_quantile"

  • "xgb_multioutput_quantile"

  • "flatliner"

If unsure what to pick, choose "xgb".

model_kwargs: dict | None

The model parameters that should be used.

n_turbines: float | None

Only required for create_wind_forecast task

name: str

Name of the forecast, e.g. the location name.

pipelines_to_run: list[PipelineType]

The pipelines to run for this prediction job.

quantiles: list[float] | None

Quantiles that have to be forecasted.

resolution_minutes: int

The resolution of the desired forecast in minutes.
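The horizon and the resolution together determine how many forecast points are produced. The sketch below works this out for the default horizon of 2880 minutes; the 15-minute resolution and the start time are assumed example values, not defaults taken from openstef.

```python
from datetime import datetime, timedelta

horizon_minutes = 2880  # default horizon stated for horizon_minutes
resolution_minutes = 15  # assumed example value

# Number of forecast points that fit in the horizon.
n_steps = horizon_minutes // resolution_minutes

start = datetime(2024, 1, 1, 0, 0)  # arbitrary example start time
timestamps = [
    start + timedelta(minutes=resolution_minutes * (step + 1))
    for step in range(n_steps)
]
```

With these values the forecast covers 192 points, the last of which falls exactly two days after the start time.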

save_train_forecasts: bool

Indicates whether the forecasts produced during the training process should be saved.

sid: str | None

Only required for create_solar_forecast task

train_components: bool | None

Whether the forecasts should be split into wind, solar and rest components.

train_horizons_minutes: list[int] | None

List of horizons that should be taken into account during training.

train_split_func: SplitFuncDataClass | None

Optional custom splitting function for the operational process.

turbine_type: str | None

Only required for create_wind_forecast task

openstef.data_classes.split_function module

Specifies the split function dataclass.

class openstef.data_classes.split_function.SplitFuncDataClass(**data)

Bases: BaseModel

Class that allows specifying a custom function to generate train, test and validation sets.

arguments: str | dict[str, Any]
function: str | Callable
load(required_arguments=None)

Load the function and its arguments.

If the function and the arguments are given as strings in the instance attributes, load them as Python objects; otherwise, return them from the instance attributes unchanged.

Parameters:

required_arguments (list[str]) – list of arguments the loaded function must have

Return type:

tuple[Callable, dict[str, Any]]

Returns:

  • function (Callable)

  • arguments (dict[str, Any])
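To show how a returned (function, arguments) pair might be used, the sketch below defines a hypothetical split function and calls it with keyword arguments from a dictionary. The function `chronological_split`, its parameter names, and the order of the returned sets are assumptions for illustration; the signature that openstef expects from a custom split function may differ.

```python
from typing import Any, Callable


def chronological_split(data, test_fraction, validation_fraction):
    """Hypothetical split: oldest rows train, then validation, newest rows test."""
    n_test = int(len(data) * test_fraction)
    n_validation = int(len(data) * validation_fraction)
    n_train = len(data) - n_validation - n_test
    train = list(data[:n_train])
    validation = list(data[n_train:n_train + n_validation])
    test = list(data[n_train + n_validation:])
    return train, validation, test


# Shape of what load() is described as returning: a callable and its arguments.
function: Callable = chronological_split
arguments: dict[str, Any] = {"test_fraction": 0.2, "validation_fraction": 0.2}

train, validation, test = function(list(range(10)), **arguments)
```

Keeping the arguments in a separate dictionary means the same function can be reused with different fractions without changing its code.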

Module contents