DimensionalityReducer
- class openstef_models.transforms.general.DimensionalityReducer(**data: Any) → None
Bases: BaseConfig, TimeSeriesTransform
Reduce the dimensionality of a given set of features.
Available methods include:
PCA: linear dimensionality reduction into orthogonal components.
Factor analysis: linear dimensionality reduction that models observed variables as latent factors plus Gaussian noise.
FastICA: linear dimensionality reduction that maximizes statistical independence among components.
KernelPCA: non-linear dimensionality reduction using an RBF kernel.
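To make the first method concrete, the following is a minimal pure-Python sketch of what PCA computes for a single component: center the features, build the sample covariance matrix, and find its dominant eigenvector by power iteration. It is an illustration only, not a description of how DimensionalityReducer is implemented. The helper names first_principal_component and project are hypothetical, and the feature values are the sample values also used in the Example below.

```python
import math


def first_principal_component(rows, n_iter=200):
    """Return (column means, unit direction of largest variance).

    Hypothetical helper: uses power iteration on the sample covariance
    matrix instead of the SVD that numerical libraries typically use.
    """
    n, d = len(rows), len(rows[0])
    mean = [sum(row[j] for row in rows) / n for j in range(d)]
    centered = [[row[j] - mean[j] for j in range(d)] for row in rows]
    # Sample covariance matrix (divides by n - 1).
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Power iteration: repeatedly apply cov and renormalize.
    # The sign of the result depends on this starting vector.
    v = [1.0] * d
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v


def project(rows, mean, direction):
    """Project centered rows onto a direction, giving one score per row."""
    return [
        sum((row[j] - mean[j]) * direction[j] for j in range(len(mean)))
        for row in rows
    ]


# Columns: feature1, feature2, feature3.
rows = [
    [1.0, 1.0, 5.0],
    [2.0, 2.0, 11.0],
    [1.5, 1.5, 8.0],
    [2.5, 2.5, 2.0],
    [2.0, 2.0, 11.0],
]
mean, direction = first_principal_component(rows)
scores = project(rows, mean, direction)
print([round(s, 3) for s in scores])
```

The sign of a principal component is arbitrary, so a library may return the same scores with the opposite sign. Further components would be found the same way after removing the variance already explained.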
Example
>>> import pandas as pd
>>> from datetime import timedelta
>>> from openstef_core.datasets import TimeSeriesDataset
>>> from openstef_models.transforms.general import DimensionalityReducer
>>> # Create sample dataset
>>> data = pd.DataFrame({
...     'load': [100, 120, 110, 130, 125],
...     'feature1': [1.0, 2.0, 1.5, 2.5, 2.0],
...     'feature2': [1.0, 2.0, 1.5, 2.5, 2.0],
...     'feature3': [5.0, 11.0, 8.0, 2.0, 11.0]
... }, index=pd.date_range('2025-01-01', periods=5, freq='1h'))
>>> dataset = TimeSeriesDataset(data, timedelta(hours=1))
>>> # Initialize and apply transform
>>> from openstef_models.utils.feature_selection import FeatureSelection
>>> dim_reducer = DimensionalityReducer(
...     selection=FeatureSelection(include={'feature1', 'feature2', 'feature3'}),
...     method="pca",
...     n_components=2,
...     random_state=1234
... )
>>> dim_reducer.fit(dataset)
>>> transformed_dataset = dim_reducer.transform(dataset)
>>> transformed_dataset.data.head().round(3)
                     component_1  component_2  load
timestamp
2025-01-01 00:00:00       -2.383       -1.166   100
2025-01-01 01:00:00        3.596        0.335   120
2025-01-01 02:00:00        0.606       -0.416   110
2025-01-01 03:00:00       -5.414        0.912   130
2025-01-01 04:00:00        3.596        0.335   125
- Parameters:
data (Any)
- selection: FeatureSelection
- method: Literal['pca', 'factor_analysis', 'fastica', 'kernel_pca']
- n_components: int
- max_iter: int
- property is_fitted: bool
Check if the transform has been fitted.
- fit(data: TimeSeriesDataset) → None
Fit the transform to the input data.
This method should be called before applying the transform to the data. It allows the transform to learn any necessary parameters from the data.
- Parameters:
data (TimeSeriesDataset) – The input data to fit the transform on.
- Return type:
None
- transform(data: TimeSeriesDataset) → TimeSeriesDataset
Transform the input data.
This method should apply a transformation to the input data and return a new instance.
- Parameters:
data (TimeSeriesDataset) – The input data to be transformed.
- Returns:
A new instance of the transformed data.
- Raises:
NotFittedError – If the transform has not been fitted yet.
- Return type:
TimeSeriesDataset
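The fit/transform contract described above (fit first, transform returns a new instance, NotFittedError otherwise) can be sketched with a toy transform. This is an illustration under stated assumptions: MeanCenterer and the local NotFittedError class are hypothetical stand-ins that are not part of openstef_models, and plain lists of rows take the place of TimeSeriesDataset.

```python
class NotFittedError(Exception):
    """Hypothetical stand-in for the library's NotFittedError."""


class MeanCenterer:
    """Toy transform: learns column means in fit, subtracts them in transform."""

    def __init__(self):
        self._means = None  # learned state; None until fit() is called

    @property
    def is_fitted(self):
        return self._means is not None

    def fit(self, rows):
        # Learn the parameters (column means) from the data.
        n = len(rows)
        self._means = [sum(col) / n for col in zip(*rows)]

    def transform(self, rows):
        if not self.is_fitted:
            raise NotFittedError("Call fit() before transform().")
        # Build and return a new list; the input is left unchanged.
        return [[x - m for x, m in zip(row, self._means)] for row in rows]


centerer = MeanCenterer()
sample = [[1.0, 10.0], [3.0, 20.0]]

try:
    centerer.transform(sample)
except NotFittedError as exc:
    print(f"Not fitted yet: {exc}")

centerer.fit(sample)
centered = centerer.transform(sample)
print(centerer.is_fitted, centered)
```

Keeping the learned state separate from the input lets the same fitted transform be applied to new data later, for example at prediction time.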
- model_post_init(context: Any) → None
Override this method to perform additional initialization after __init__ and model_construct. This is useful if you want to do some validation that requires the entire model to be initialized.
- model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': False, 'extra': 'ignore', 'protected_namespaces': ()}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.