openstef.data_classes package¶
Submodules¶
openstef.data_classes.data_prep module¶
Specifies the data prep dataclass.
- class openstef.data_classes.data_prep.DataPrepDataClass(**data)¶
Bases:
BaseModel
Class that allows specifying a custom class to prepare the data (feature engineering, etc.).
- arguments: str | dict[str, Any]¶
- klass: str | type[DataPrepClass]¶
- load(required_arguments=None)¶
Load the class and its arguments.
If the class and the arguments are given as strings in the instance attributes, load them as Python objects; otherwise, return them from the instance attributes unchanged.
- Parameters:
required_arguments (list[str]) – list of arguments the loaded class must have
- Return type:
tuple[type[DataPrepClass], dict[str, Any]]
- Returns:
class (type[AbstractDataPreparation])
arguments (dict[str, Any])
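The string-loading behaviour described for load can be pictured with the following minimal sketch. It is not the openstef implementation: the helper name `load_from_strings` is hypothetical, and it assumes that a string class is a dotted import path and that string arguments are JSON-encoded. A standard-library class stands in for a data prep class.

```python
from __future__ import annotations

import importlib
import json
from typing import Any


def load_from_strings(
    klass: str | type, arguments: str | dict[str, Any]
) -> tuple[type, dict[str, Any]]:
    """Hypothetical helper: resolve a class and its arguments if given as strings."""
    if isinstance(klass, str):
        # Split "package.module.ClassName" into module path and class name.
        module_path, _, class_name = klass.rpartition(".")
        klass = getattr(importlib.import_module(module_path), class_name)
    if isinstance(arguments, str):
        # Assumption: string arguments are JSON-encoded.
        arguments = json.loads(arguments)
    return klass, arguments


# A standard-library class stands in for a custom data prep class.
loaded_class, loaded_arguments = load_from_strings(
    "collections.OrderedDict", '{"a": 1}'
)
```

Objects that are already Python objects pass through untouched, which mirrors the "otherwise return them" branch of the description.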
openstef.data_classes.model_specifications module¶
Specifies the dataclass for model specifications.
- class openstef.data_classes.model_specifications.ModelSpecificationDataClass(**data)¶
Bases:
BaseModel
Holds all information regarding the training process of a specific model.
- feature_modules: list | None¶
Feature modules that should be used during training.
- feature_names: list | None¶
Features that should be used during training.
- hyper_params: dict | None¶
Hyperparameters that should be used during training.
- id: int | str¶
openstef.data_classes.prediction_job module¶
Specifies the prediction job dataclass.
- class openstef.data_classes.prediction_job.PredictionJobDataClass(**data)¶
Bases:
BaseModel
Holds all information about the specific forecast that has to be made.
- class Config¶
Bases:
object
Pydantic model configuration.
The following configuration is needed to prevent ids in “depends_on” from being converted from int to str when integer ids are used.
- smart_union = True¶
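To make the coercion problem concrete, the sketch below simulates the two union-resolution strategies in plain Python. It is an illustration only and does not call pydantic: both function names are hypothetical, and the union order (str, int) is chosen here to make the effect visible.

```python
from typing import Any


def coerce_left_to_right(value: Any, types: tuple[type, ...]) -> Any:
    """Take the first type in the union that can coerce the value."""
    for candidate in types:
        try:
            return candidate(value)
        except (TypeError, ValueError):
            continue
    raise ValueError(f"{value!r} matches none of {types}")


def coerce_smart(value: Any, types: tuple[type, ...]) -> Any:
    """Prefer an exact type match; only coerce when no exact match exists."""
    for candidate in types:
        if type(value) is candidate:
            return value
    return coerce_left_to_right(value, types)


depends_on = [307, "pj-a"]  # hypothetical ids

# Left-to-right coercion turns the integer id into a string.
naive = [coerce_left_to_right(v, (str, int)) for v in depends_on]
# Exact-match-first keeps the integer id as an integer.
smart = [coerce_smart(v, (str, int)) for v in depends_on]
```

With exact matching tried first, integer ids keep their type, which is the outcome the configuration above is meant to guarantee.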
- alternative_forecast_model_pid: int | str | None¶
The pid that references another prediction job from which the model should be used for making forecasts.
- backtest_split_func: SplitFuncDataClass | None¶
Optional custom splitting function for backtesting.
- completeness_threshold: float¶
Minimum fraction of data that should be available for making a regular forecast.
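One way to read this threshold is as a lower bound on the share of non-missing input values. The sketch below follows that reading; the `completeness` helper, the sample values and the threshold of 0.5 are assumptions for illustration, and openstef may compute completeness differently.

```python
from __future__ import annotations

import math


def completeness(values: list[float | None]) -> float:
    """Fraction of entries that are present (neither None nor NaN)."""
    if not values:
        return 0.0
    present = sum(1 for v in values if v is not None and not math.isnan(v))
    return present / len(values)


load = [1.0, 2.0, float("nan"), 4.0, None]  # hypothetical input
completeness_threshold = 0.5  # hypothetical value

# Three of five values are present, so the fraction is 0.6.
is_complete_enough = completeness(load) >= completeness_threshold
```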
- data_balancing_ratio: float | None¶
If data balancing is enabled, the data will be balanced with data from 1 year ago in the future.
- data_prep_class: DataPrepDataClass | None¶
The import string for the custom data prep class
- default_modelspecs: ModelSpecificationDataClass | None¶
Default model specifications
- depends_on: list[int | str] | None¶
Link to another prediction job on which this prediction job might depend.
- description: str | None¶
Optional description of the prediction job for human reference.
- electricity_bidding_zone: BiddingZone | None¶
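Bidding zone is used to determine the electricity price. It is also used to determine the holidays that should be used. Currently only ENTSO-E bidding zones are supported.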
- flatliner_threshold_minutes: int¶
Number of minutes that the load has to be constant to detect a flatliner.
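The idea of a flatliner check can be sketched as follows. This is a simplified illustration, not the openstef detection logic: the function name is hypothetical, it only inspects the most recent samples, and it assumes each sample covers one resolution interval.

```python
def is_flatliner(
    load: list[float], resolution_minutes: int, flatliner_threshold_minutes: int
) -> bool:
    """Return True if the most recent samples are constant for the threshold duration."""
    # Number of consecutive samples needed to cover the threshold.
    n_required = flatliner_threshold_minutes // resolution_minutes
    if n_required < 1 or len(load) < n_required:
        return False
    recent = load[-n_required:]
    return all(value == recent[0] for value in recent)


load = [5.0, 4.8, 3.0, 3.0, 3.0, 3.0]  # hypothetical 15-minute samples

flat_for_60 = is_flatliner(load, resolution_minutes=15, flatliner_threshold_minutes=60)
flat_for_90 = is_flatliner(load, resolution_minutes=15, flatliner_threshold_minutes=90)
```

Here the last four samples are constant, so a 60-minute threshold flags a flatliner while a 90-minute threshold does not.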
- forecast_type: str¶
The type of forecasts that should be made.
- Options are:
"demand"
"wind"
"basecase"
If unsure what to pick, choose "demand".
- get(key, default=None)¶
Allows attributes to be read with get, similar to a Python dict.
- Return type:
Any
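A dict-style get on an object can be sketched with getattr. The `JobSketch` class below is a hypothetical stand-in with two fields; it is not the real PredictionJobDataClass and only mirrors the get(key, default=None) signature documented above.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class JobSketch:
    """Hypothetical stand-in for a prediction job with two fields."""

    id: int
    name: str

    def get(self, key: str, default: Any = None) -> Any:
        # Return the attribute if it exists, otherwise the default, like dict.get.
        return getattr(self, key, default)


pj = JobSketch(id=307, name="Example")
name = pj.get("name")
fallback = pj.get("not_a_field", "n/a")
```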
- horizon_minutes: int | None¶
The horizon of the desired forecast in minutes used in tasks. Defaults to 2880 minutes (i.e. 2 days).
- hub_height: float | None¶
Only required for create_wind_forecast task
- id: int | str¶
The prediction job id (often abbreviated as pid).
- lat: float | None¶
Latitude of the forecasted location in degrees. Used for fetching weather data in tasks, calculating derived features and component splitting.
- lon: float | None¶
Longitude of the forecasted location in degrees. Used for fetching weather data in tasks, calculating derived features and component splitting.
- minimal_table_length: int¶
Minimum length (in rows) of the forecast input for making a regular forecast.
- model: str¶
The model type that should be used.
- Options are:
"xgb"
"xgb_quantile"
"lgb"
"linear"
"linear_quantile"
"xgb_multioutput_quantile"
"flatliner"
If unsure what to pick, choose "xgb".
- model_kwargs: dict | None¶
The model parameters that should be used.
- n_turbines: float | None¶
Only required for create_wind_forecast task
- name: str¶
Name of the forecast, e.g. the location name.
- pipelines_to_run: list[PipelineType]¶
The pipelines to run for this prediction job.
- quantiles: list[float] | None¶
Quantiles that have to be forecasted.
- resolution_minutes: int¶
The resolution of the desired forecast in minutes.
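As a worked example of how resolution and horizon relate, the snippet below derives the number of forecast steps for the default horizon of 2880 minutes. The 15-minute resolution is an assumed value chosen for illustration.

```python
horizon_minutes = 2880   # default horizon documented for horizon_minutes (2 days)
resolution_minutes = 15  # assumed value, for illustration only

# Number of forecast points needed to cover the horizon at this resolution.
n_steps = horizon_minutes // resolution_minutes

# The same horizon expressed in days.
horizon_days = horizon_minutes / (24 * 60)
```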
- save_train_forecasts: bool¶
Indicates whether the forecasts produced during the training process should be saved.
- sid: str | None¶
Only required for create_solar_forecast task
- train_components: bool | None¶
Whether splitting the forecasts into wind, solar and rest components is desired.
- train_horizons_minutes: list[int] | None¶
List of horizons that should be taken into account during training.
- train_split_func: SplitFuncDataClass | None¶
Optional custom splitting function for the operational process.
- turbine_type: str | None¶
Only required for create_wind_forecast task
openstef.data_classes.split_function module¶
Specifies the split function dataclass.
- class openstef.data_classes.split_function.SplitFuncDataClass(**data)¶
Bases:
BaseModel
Class that allows specifying a custom function to generate a train, test and validation set.
- arguments: str | dict[str, Any]¶
- function: str | Callable¶
- load(required_arguments=None)¶
Load the function and its arguments.
If the function and the arguments are given as strings in the instance attributes, load them as Python objects; otherwise, return them from the instance attributes unchanged.
- Parameters:
required_arguments (list[str]) – list of arguments the loaded function must have
- Return type:
tuple[Callable, dict[str, Any]]
- Returns:
function (Callable)
arguments (dict[str, Any])
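The required_arguments check can be pictured as a signature inspection on the loaded function. The sketch below is an assumption about how such a check could work, not the openstef code: `has_required_arguments` and `example_split` are hypothetical names, and the split itself is a simple chronological one using sample counts.

```python
import inspect
from typing import Callable


def has_required_arguments(function: Callable, required_arguments: list[str]) -> bool:
    """Check that every required name is a parameter of the function."""
    parameters = inspect.signature(function).parameters
    return all(name in parameters for name in required_arguments)


def example_split(data, n_test, n_validation=2):
    """Hypothetical split function: chronological train, test and validation sets."""
    n_train = len(data) - n_test - n_validation
    train = data[:n_train]
    validation = data[n_train : n_train + n_validation]
    test = data[n_train + n_validation :]
    return train, test, validation


accepted = has_required_arguments(example_split, ["data", "n_test"])
rejected = has_required_arguments(example_split, ["data", "stratification"])
train, test, validation = example_split(list(range(10)), n_test=2)
```

Checking the signature before the function is used lets a misconfigured prediction job fail early instead of during training.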