Scaler

class openstef_models.transforms.general.Scaler(**data: Any) -> None

Bases: BaseConfig, TimeSeriesTransform

Transform that scales time series data using various scikit-learn scaling methods.

Available methods include:

  • MinMaxScaler: Scales each feature to the range 0 to 1 using the minimum and maximum observed in the training set.

  • MaxAbs: Scales each feature by its maximum absolute value, giving values between -1 and 1.

  • Standard: Standardizes each feature by removing the mean and scaling to unit variance (mean 0, standard deviation 1).

  • Robust: Scales each feature using statistics that are robust to outliers: the median and the interquartile range (IQR).
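
The four methods listed above can be sketched as plain arithmetic on a single feature. This is a minimal illustration using only the Python standard library, not the code Scaler runs internally; the sample values and variable names are assumptions made for the example, and the "inclusive" quantile method is chosen here to interpolate linearly between data points, which may differ from scikit-learn's quantile handling in edge cases.

```python
import statistics

values = [100.0, 200.0, 300.0]  # illustrative training values for one feature

# Min-max: (x - min) / (max - min), giving values in [0, 1]
lo, hi = min(values), max(values)
min_max = [(x - lo) / (hi - lo) for x in values]

# Max-abs: x / max(|x|), giving values in [-1, 1]
peak = max(abs(x) for x in values)
max_abs = [x / peak for x in values]

# Standard: (x - mean) / population std, giving mean 0 and std 1
mean = statistics.fmean(values)
std = statistics.pstdev(values)
standard = [(x - mean) / std for x in values]

# Robust: (x - median) / IQR, less sensitive to outliers
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
robust = [(x - median) / (q3 - q1) for x in values]

print(min_max)  # [0.0, 0.5, 1.0]
print(robust)   # [-1.0, 0.0, 1.0]
```

In every case the statistics (minimum, maximum, mean, median, and so on) come from the data passed to fit, and the same statistics are then reused when transforming other data.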

Example

>>> import pandas as pd
>>> from datetime import timedelta
>>> from openstef_core.datasets import TimeSeriesDataset
>>> from openstef_models.transforms.general import Scaler
>>>
>>> # Create sample data
>>> data = pd.DataFrame({
...     'load': [100, 200, 300],
...     'temperature': [20, 25, 30]
... }, index=pd.date_range('2025-01-01', periods=3, freq='h'))
>>> dataset = TimeSeriesDataset(data, timedelta(hours=1))
>>>
>>> # Initialize and apply transform
>>> scaler = Scaler(method="standard")
>>> scaler.fit(dataset)
>>> transformed_dataset = scaler.transform(dataset)
>>> abs(float(transformed_dataset.data['load'].mean().round(6)))
0.0
>>> # use ddof=0 to get population std (as used by StandardScaler)
>>> float(transformed_dataset.data['load'].std(ddof=0).round(6))
1.0
>>> abs(float(transformed_dataset.data['temperature'].mean().round(6)))
0.0
>>> float(transformed_dataset.data['temperature'].std(ddof=0).round(6))
1.0
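
The ddof=0 argument in the example matters because pandas computes the sample standard deviation (ddof=1) by default, whereas standardization divides by the population standard deviation. The sketch below reproduces the difference with the standard library only; the values mirror the 'load' column above and the variable names are illustrative.

```python
import statistics

values = [100.0, 200.0, 300.0]
mean = statistics.fmean(values)
scaled = [(x - mean) / statistics.pstdev(values) for x in values]

population_std = statistics.pstdev(scaled)  # divides by n, like ddof=0
sample_std = statistics.stdev(scaled)       # divides by n - 1, like ddof=1

print(round(population_std, 6))  # 1.0
print(round(sample_std, 6))      # 1.224745
```

With only three observations the gap is large; it shrinks as the number of rows grows.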
Parameters:

data (Any)

method: TypeAliasType
selection: FeatureSelection
property is_fitted: bool

Check if the transform has been fitted.

model_post_init(context: Any) -> None

Override this method to perform additional initialization after __init__ and model_construct. This is useful if you want to do some validation that requires the entire model to be initialized.

Parameters:

context (Any)

Return type:

None

fit(data: TimeSeriesDataset) -> None

Fit the scaler to the input time series data.

Parameters:

data (TimeSeriesDataset)

Return type:

None

transform(data: TimeSeriesDataset) -> TimeSeriesDataset

Transform the input data.

Applies the fitted scaling to the input data and returns the result as a new instance.

Parameters:

data (TimeSeriesDataset)

Returns:

A new instance of the transformed data.

Raises:

NotFittedError – If the transform has not been fitted yet.

Return type:

TimeSeriesDataset
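
The fit-then-transform contract described above (is_fitted, and an error when transform is called too early) can be sketched with a small stand-in class. Everything in this block is hypothetical: SketchStandardScaler and the local NotFittedError are defined only for illustration, operate on plain lists rather than a TimeSeriesDataset, and are not the library's implementation.

```python
import statistics


class NotFittedError(RuntimeError):
    """Local stand-in for the library's exception, so the sketch is self-contained."""


class SketchStandardScaler:
    """Hypothetical, list-based illustration of the fit/transform contract."""

    def __init__(self):
        self._mean = None
        self._std = None

    @property
    def is_fitted(self):
        return self._mean is not None

    def fit(self, values):
        # Learn the statistics from the training values only.
        # Assumes the values are not all identical (std > 0).
        self._mean = statistics.fmean(values)
        self._std = statistics.pstdev(values)

    def transform(self, values):
        if not self.is_fitted:
            raise NotFittedError("call fit() before transform()")
        # Return a new list; the input is left unchanged.
        return [(v - self._mean) / self._std for v in values]


scaler = SketchStandardScaler()
scaler.fit([100.0, 200.0, 300.0])
new_scaled = scaler.transform([200.0, 400.0])  # reuses the training mean and std
```

Because the statistics are stored at fit time, data seen later is scaled relative to the training set and can fall outside the range the training data occupied.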

features_added() -> list[str]

List of feature names added by this transform.

Return type:

list[str]

Returns:

A list of strings representing the names of features added to the dataset by this transform. Default is an empty list.

model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': False, 'extra': 'ignore', 'protected_namespaces': ()}

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.