Scaler
- class openstef_models.transforms.general.Scaler(**data: Any) -> None
Bases: BaseConfig, TimeSeriesTransform
Transform that scales time series data using various scikit-learn scaling methods.
Available methods include:
MinMaxScaler: Scales features based on min/max of training set (between 0 and 1).
MaxAbs: Scales features by their maximum absolute value (between -1 and 1).
Standard: Standardizes features by removing the mean and scaling to unit variance (0 mean, 1 std).
Robust: Scales features using statistics that are robust to outliers: median and IQR.
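The arithmetic behind the four methods listed above can be sketched in plain Python. This is an illustration only, not the class's implementation (which delegates to scikit-learn): the function names are hypothetical, each function handles a single column given as a list, and none of them guards against a constant column, which would divide by zero here.

```python
import statistics


def min_max_scale(values):
    """Map the column onto [0, 1] using its minimum and maximum."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def max_abs_scale(values):
    """Divide by the largest absolute value, giving a range within [-1, 1]."""
    largest = max(abs(v) for v in values)
    return [v / largest for v in values]


def standard_scale(values):
    """Subtract the mean and divide by the population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population std, i.e. ddof=0
    return [(v - mean) / std for v in values]


def robust_scale(values):
    """Subtract the median and divide by the interquartile range (IQR)."""
    median = statistics.median(values)
    # "inclusive" uses linear interpolation between the observed data points.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return [(v - median) / (q3 - q1) for v in values]


load = [100, 200, 300]
print(min_max_scale(load))
print(max_abs_scale(load))
print(standard_scale(load))
print(robust_scale(load))
```

The robust variant is the only one of the four that does not depend on the extreme values or on the mean of the column, which is why a single outlier affects it less.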
Example
>>> import pandas as pd
>>> from datetime import timedelta
>>> from openstef_core.datasets import TimeSeriesDataset
>>> from openstef_models.transforms.general import Scaler
>>>
>>> # Create sample data
>>> data = pd.DataFrame({
...     'load': [100, 200, 300],
...     'temperature': [20, 25, 30]
... }, index=pd.date_range('2025-01-01', periods=3, freq='h'))
>>> dataset = TimeSeriesDataset(data, timedelta(hours=1))
>>>
>>> # Initialize and apply transform
>>> scaler = Scaler(method="standard")
>>> scaler.fit(dataset)
>>> transformed_dataset = scaler.transform(dataset)
>>> abs(float(transformed_dataset.data['load'].mean().round(6)))
0.0
>>> # use ddof=0 to get population std (as used by StandardScaler)
>>> float(transformed_dataset.data['load'].std(ddof=0).round(6))
1.0
>>> abs(float(transformed_dataset.data['temperature'].mean().round(6)))
0.0
>>> float(transformed_dataset.data['temperature'].std(ddof=0).round(6))
1.0
- Parameters:
data (Any)
- method: TypeAliasType
- selection: FeatureSelection
- property is_fitted: bool
Check if the transform has been fitted.
- model_post_init(context: Any) -> None
Override this method to perform additional initialization after __init__ and model_construct. This is useful if you want to do some validation that requires the entire model to be initialized.
- fit(data: TimeSeriesDataset) -> None
Fit the scaler to the input time series data.
- Parameters:
data (TimeSeriesDataset) – Time series dataset.
- Return type:
None
- transform(data: TimeSeriesDataset) -> TimeSeriesDataset
Transform the input data.
This method should apply a transformation to the input data and return a new instance.
- Parameters:
data (TimeSeriesDataset) – The input data to be transformed.
- Returns:
A new instance of the transformed data.
- Raises:
NotFittedError – If the transform has not been fitted yet.
- Return type:
TimeSeriesDataset
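The fit-then-transform contract described above (is_fitted, a new instance returned by transform, and NotFittedError when transform is called too early) can be sketched with a toy class. Everything in this block is a stand-in written for illustration: MeanCenterer is hypothetical, it works on plain lists rather than TimeSeriesDataset, and the NotFittedError defined here is a local substitute for the library's own exception, whose import path is not given in this page.

```python
class NotFittedError(Exception):
    """Local stand-in for the library's exception (assumption, not the real class)."""


class MeanCenterer:
    """Toy transform: learns a mean in fit() and subtracts it in transform()."""

    def __init__(self):
        self._mean = None  # no statistics learned yet

    @property
    def is_fitted(self):
        # Fitted means that fit() has stored the statistics transform() needs.
        return self._mean is not None

    def fit(self, values):
        self._mean = sum(values) / len(values)

    def transform(self, values):
        if not self.is_fitted:
            raise NotFittedError("call fit() before transform()")
        # Build and return a new list; the input is left untouched.
        return [v - self._mean for v in values]


centerer = MeanCenterer()
try:
    centerer.transform([1, 2, 3])
except NotFittedError as exc:
    print(f"not fitted: {exc}")

centerer.fit([1, 2, 3])           # statistics come from the training data
print(centerer.transform([4, 5])) # and are reused on new data
```

Keeping the learned statistics on the instance is what allows the same fitted transform to be applied later to data it was not fitted on.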
- model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': False, 'extra': 'ignore', 'protected_namespaces': ()}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.