Preprocessing
Most datasets need to be preprocessed/transformed before they can be passed to
the model. This module includes common transformers that are compatible with
sklearn `Pipeline` or `ColumnTransformer`.
DateTransformer
Bases: `TransformerMixin`, `BaseEstimator`
Transform date features by deriving useful date/time attributes:
- date attributes: `Year, Month, Week, Day, Dayofweek, Dayofyear, Is_month_end, Is_month_start, Is_quarter_end, Is_quarter_start, Is_year_end, Is_year_start`.
- time attributes: `Hour, Minute, Second`.
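As a rough illustration of what these attributes mean, the sketch below derives them for a single column with the pandas `.dt` accessor. It is not the library's implementation: the sample data, the column name `signup`, and the output column names are assumptions made for the example.

```python
import pandas as pd

# Hypothetical sample data; the column name "signup" is illustrative only.
df = pd.DataFrame(
    {"signup": pd.to_datetime(["2021-01-31 08:15:30", "2021-12-31 23:59:59"])}
)

dates = df["signup"].dt  # pandas datetime accessor

derived = pd.DataFrame(
    {
        # Date attributes
        "Year": dates.year,
        "Month": dates.month,
        "Week": dates.isocalendar().week.astype("int64"),  # ISO week number
        "Day": dates.day,
        "Dayofweek": dates.dayofweek,  # Monday=0 ... Sunday=6
        "Dayofyear": dates.dayofyear,
        "Is_month_end": dates.is_month_end,
        "Is_month_start": dates.is_month_start,
        "Is_quarter_end": dates.is_quarter_end,
        "Is_quarter_start": dates.is_quarter_start,
        "Is_year_end": dates.is_year_end,
        "Is_year_start": dates.is_year_start,
        # Time attributes
        "Hour": dates.hour,
        "Minute": dates.minute,
        "Second": dates.second,
    }
)
```

Each key mirrors one attribute name from the list above; how the transformer actually names its output columns is not specified here.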
__init__(date_feats=None, time=False, drop=True)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`date_feats` | `Iterable` | Date features to transform. If None, all features with | `None` |
`time` | `bool` | Whether to add time-related derived features such as Hour/Minute/... | `False` |
`drop` | `bool` | Whether to drop date features used. | `True` |
fit(X, y=None)
Populate date features if not provided at initialization.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`X` | `DataFrame` | Dataframe that has the date features to transform. | required |
`y` | `array | DataFrame | None` | Included for completeness to be compatible with scikit-learn transformers and pipelines but will not be used. | `None` |
Returns:
Name | Type | Description |
---|---|---|
`self` | `DateTransformer` | Fitted date transformer. |
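One plausible way to populate the date features at fit time is to select columns by dtype. The helper below is a sketch under that assumption; `infer_date_feats` is a hypothetical name, and selecting on the `datetime64` dtype is a guess about the selection rule, not confirmed behavior of `DateTransformer.fit`.

```python
import pandas as pd


def infer_date_feats(X, date_feats=None):
    """Hypothetical helper: decide which columns to treat as date features."""
    if date_feats is not None:
        # Features given explicitly take precedence.
        return list(date_feats)
    # Assumption: fall back to every column with a (timezone-naive) datetime64 dtype.
    return X.select_dtypes(include=["datetime"]).columns.tolist()


X = pd.DataFrame(
    {
        "amount": [10.0, 20.0],
        "created": pd.to_datetime(["2021-03-01", "2021-03-31"]),
    }
)
inferred = infer_date_feats(X)
explicit = infer_date_feats(X, date_feats=("created",))
```

Under this assumption, date columns stored as strings would not be picked up and would need `pd.to_datetime` first, or would have to be passed explicitly through `date_feats`.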
transform(X, y=None)
Derive the date/time attributes for all date features.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`X` | `DataFrame` | Dataframe that has the date features to transform. | required |
`y` | `array | DataFrame | None` | Included for completeness to be compatible with scikit-learn transformers and pipelines but will not be used. | `None` |
Returns:
Name | Type | Description |
---|---|---|
`X_tr` | `DataFrame` | Dataframe with derived date/time features and NaN indicators. |
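To show how the `fit`/`transform` contract and the `time` and `drop` parameters fit together inside a scikit-learn `Pipeline`, here is a minimal stand-in. `SketchDateTransformer` is a hypothetical class written for this example, not the library's `DateTransformer`: it derives only a subset of the attributes, uses an assumed `<feature>_<Attribute>` naming scheme, assumes datetime64 dtype detection, and omits the NaN indicators.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import Pipeline


class SketchDateTransformer(TransformerMixin, BaseEstimator):
    """Illustrative stand-in only; not the library's DateTransformer."""

    def __init__(self, date_feats=None, time=False, drop=True):
        self.date_feats = date_feats
        self.time = time
        self.drop = drop

    def fit(self, X, y=None):
        # Assumption: use datetime64 columns when no features are given.
        if self.date_feats is None:
            self.date_feats_ = X.select_dtypes(include=["datetime"]).columns.tolist()
        else:
            self.date_feats_ = list(self.date_feats)
        return self

    def transform(self, X, y=None):
        X_tr = X.copy()
        attrs = ["year", "month", "day", "dayofweek"]  # subset of the date attributes
        if self.time:
            attrs += ["hour", "minute", "second"]
        for feat in self.date_feats_:
            for attr in attrs:
                # Assumed naming scheme: "<feature>_<Attribute>".
                X_tr[f"{feat}_{attr.capitalize()}"] = getattr(X_tr[feat].dt, attr)
        if self.drop:
            # Remove the original date columns once their attributes are derived.
            X_tr = X_tr.drop(columns=self.date_feats_)
        return X_tr


X = pd.DataFrame(
    {
        "amount": [10.0, 20.0],
        "when": pd.to_datetime(["2021-03-01 12:30:00", "2021-03-31 00:00:00"]),
    }
)
pipe = Pipeline([("dates", SketchDateTransformer(time=True))])
out = pipe.fit_transform(X)
```

With `drop=True` the original `when` column is removed from `out`, while non-date columns such as `amount` pass through unchanged.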