data_split

Time series dataset splitting utilities for training and evaluation.

Provides strategies for splitting time series datasets into training, validation, and test sets: chronological splits, stratified splits based on extreme values, and custom date-based splits.

The functions respect the temporal order of forecasting data: training data always precedes test data, which prevents information from future observations leaking into training.
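To make the leakage constraint concrete, here is a minimal, standard-library sketch of a chronological split. It is not this module's implementation: the helper name `sketch_chronological_split`, the `test_fraction` parameter, and the `(timestamp, value)` record layout are all assumptions chosen for illustration.

```python
from datetime import date, timedelta


def sketch_chronological_split(records, test_fraction=0.2):
    """Hypothetical helper (not part of data_split).

    Sorts (timestamp, value) pairs by time and holds out the most
    recent fraction of them as the test set.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    if len(records) < 2:
        raise ValueError("need at least two records to split")

    # Sort by timestamp so that the cut point is a point in time.
    ordered = sorted(records, key=lambda r: r[0])

    # Keep at least one record on each side of the cut.
    n_test = min(len(ordered) - 1, max(1, round(len(ordered) * test_fraction)))
    cut = len(ordered) - n_test
    return ordered[:cut], ordered[cut:]


# Ten daily observations starting on 2024-01-01 (made-up data).
start = date(2024, 1, 1)
records = [(start + timedelta(days=i), float(i)) for i in range(10)]

train, test = sketch_chronological_split(records, test_fraction=0.2)

# Leakage check: the latest training timestamp is earlier than
# the earliest test timestamp.
assert max(t for t, _ in train) < min(t for t, _ in test)
```

The final assertion is the property the paragraph above describes. A random shuffle before splitting would generally violate it.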

Functions

chronological_train_test_split(dataset, ...)

Split a dataset into train and test sets chronologically.

split_by_date(dataset, split_date)

Split a dataset into train and test sets based on a specific date.

split_by_dates(dataset, dates_test)

Split a dataset into train and test sets based on specific dates.

stratified_train_test_split(dataset, ...[, ...])

Split a dataset into train and test sets with stratification on extreme values.

train_val_test_split(dataset, split_func, ...)

Split a dataset into train, validation, and test sets chronologically.
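The date-based and three-way splits listed above can be illustrated with a small standard-library sketch. The real signatures are elided in this summary, so everything below is an assumption: the helper names, the `(timestamp, value)` record layout, the `val_boundary` and `test_boundary` parameters, and the convention that a record dated exactly on the boundary falls into the later part. The sketch shows one plausible way a two-way split function can be reused to produce three sets.

```python
from datetime import date, timedelta


def sketch_split_by_date(records, split_date):
    """Hypothetical two-way split (not the module's split_by_date).

    Records dated strictly before split_date form the first part;
    records on or after split_date form the second part (assumed
    boundary convention).
    """
    ordered = sorted(records, key=lambda r: r[0])
    before = [r for r in ordered if r[0] < split_date]
    after = [r for r in ordered if r[0] >= split_date]
    return before, after


def sketch_train_val_test(records, split_func, val_boundary, test_boundary):
    """Hypothetical three-way split built from a two-way split_func.

    First carve off the test set at test_boundary, then split what
    remains into train and validation at val_boundary.
    """
    if not val_boundary < test_boundary:
        raise ValueError("val_boundary must precede test_boundary")
    train_val, test = split_func(records, test_boundary)
    train, val = split_func(train_val, val_boundary)
    return train, val, test


# Ten daily observations starting on 2024-01-01 (made-up data).
start = date(2024, 1, 1)
records = [(start + timedelta(days=i), float(i)) for i in range(10)]

train, val, test = sketch_train_val_test(
    records,
    sketch_split_by_date,
    val_boundary=date(2024, 1, 7),
    test_boundary=date(2024, 1, 9),
)

# The three sets are ordered in time: train, then validation, then test.
assert train[-1][0] < val[0][0] <= val[-1][0] < test[0][0]
```

Passing the two-way split as an argument keeps the three-way logic independent of how the boundary is chosen, so a fraction-based splitter could be substituted as long as it accepts the same two arguments.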

Classes

DataSplitter(**data)

Handles splitting of time series data into train, validation, and test sets.