Retail sales spike on Black Friday. Traffic peaks at 5 PM. Stock volatility clusters—turbulent days follow turbulent days. Customers who haven't purchased in 90 days are likely churning. Time is the hidden dimension in most datasets, and extracting temporal patterns often provides more predictive power than any other feature engineering technique.
Yet many practitioners treat timestamps as mere IDs, useful for sorting but not for learning. This wastes enormous signal: every timestamp encodes calendar position, cyclical rhythms, and recency relative to other events.
This page covers the full spectrum of temporal feature engineering: extracting calendar components, encoding cyclical features correctly, computing lag features and rolling statistics, capturing trends and seasonality, and building event-based and relative time features. You'll learn when each technique applies and how to avoid common temporal pitfalls like data leakage.
The simplest temporal features extract human-meaningful components from timestamps. These features capture societal and business rhythms.
Core Calendar Extractions:
| Feature | Values | What It Captures |
|---|---|---|
| Hour of day | 0-23 | Daily activity patterns, work vs. sleep hours |
| Day of week | 0-6 or Mon-Sun | Weekday vs. weekend behavior differences |
| Day of month | 1-31 | Pay cycle effects (purchases spike after paydays) |
| Week of year | 1-53 (ISO) | Seasonal patterns at weekly granularity |
| Month | 1-12 | Monthly seasonality (December holidays, summer travel) |
| Quarter | 1-4 | Business quarter effects (end-of-quarter rushes) |
| Year | Integer | Long-term trends and year-over-year comparisons |
| Is weekend | Boolean | Simple weekday/weekend split |
| Is holiday | Boolean | Requires external holiday calendar |
| Days to/from holiday | Integer | Pre-holiday buildup, post-holiday lull |
```python
import pandas as pd
import numpy as np

def extract_calendar_features(df: pd.DataFrame, datetime_col: str) -> pd.DataFrame:
    """
    Extract comprehensive calendar features from a datetime column.
    """
    features = pd.DataFrame(index=df.index)
    dt = pd.to_datetime(df[datetime_col])

    # Basic extractions
    features['hour'] = dt.dt.hour
    features['day_of_week'] = dt.dt.dayofweek  # Monday=0, Sunday=6
    features['day_of_month'] = dt.dt.day
    features['day_of_year'] = dt.dt.dayofyear
    features['week_of_year'] = dt.dt.isocalendar().week.astype(int)
    features['month'] = dt.dt.month
    features['quarter'] = dt.dt.quarter
    features['year'] = dt.dt.year

    # Binary indicators
    features['is_weekend'] = (dt.dt.dayofweek >= 5).astype(int)
    features['is_month_start'] = dt.dt.is_month_start.astype(int)
    features['is_month_end'] = dt.dt.is_month_end.astype(int)
    features['is_quarter_start'] = dt.dt.is_quarter_start.astype(int)
    features['is_quarter_end'] = dt.dt.is_quarter_end.astype(int)

    # Part of day (categorical or ordinal)
    features['part_of_day'] = pd.cut(
        features['hour'],
        bins=[0, 6, 12, 17, 21, 24],
        labels=['night', 'morning', 'afternoon', 'evening', 'night_late'],
        ordered=True,
        include_lowest=True
    )

    # Business hours (9 AM - 5 PM weekdays)
    features['is_business_hours'] = (
        (features['hour'] >= 9) &
        (features['hour'] < 17) &
        (features['is_weekend'] == 0)
    ).astype(int)

    return features

def add_holiday_features(
    df: pd.DataFrame,
    datetime_col: str,
    country: str = 'US'
) -> pd.DataFrame:
    """
    Add holiday-related features using the holidays library.
    """
    try:
        import holidays
    except ImportError:
        print("Install holidays package: pip install holidays")
        return df

    features = pd.DataFrame(index=df.index)
    dt = pd.to_datetime(df[datetime_col])

    # Get holidays for the relevant years
    years = dt.dt.year.unique().tolist()
    country_holidays = holidays.country_holidays(country, years=years)

    # Is it a holiday?
    features['is_holiday'] = dt.dt.date.apply(lambda x: x in country_holidays).astype(int)

    # Days until next holiday (up to 30 days out)
    def days_to_next_holiday(date):
        for i in range(31):
            check_date = date + pd.Timedelta(days=i)
            if check_date.date() in country_holidays:
                return i
        return 30

    features['days_to_holiday'] = dt.apply(days_to_next_holiday)

    # Days since last holiday (up to 30 days back)
    def days_since_holiday(date):
        for i in range(31):
            check_date = date - pd.Timedelta(days=i)
            if check_date.date() in country_holidays:
                return i
        return 30

    features['days_since_holiday'] = dt.apply(days_since_holiday)

    return features

# Example usage
df = pd.DataFrame({
    'timestamp': pd.date_range('2024-01-01', periods=1000, freq='H'),
    'value': np.random.randn(1000)
})

calendar_feats = extract_calendar_features(df, 'timestamp')
print(calendar_feats.head(10))
```

Calendar features depend on the time zone used. An event logged at 23:00 UTC might be 15:00 PST, a completely different behavioral meaning. Always clarify: is the timestamp in UTC, local user time, or server time? Convert to the time zone that reflects the user's actual experience.
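If the raw timestamps arrive in UTC, a minimal sketch of that conversion might look like the following; the Pacific time zone and the `events` frame are assumptions for illustration:

```python
import pandas as pd

# Hypothetical: events logged in UTC, users experience them in Pacific time.
events = pd.DataFrame({
    'timestamp': pd.date_range('2024-01-01 20:00', periods=3, freq='H')
})

utc = pd.to_datetime(events['timestamp']).dt.tz_localize('UTC')   # declare the zone
local = utc.dt.tz_convert('America/Los_Angeles')                  # user-local time

print(utc.dt.hour.tolist())    # [20, 21, 22] -> looks like late evening
print(local.dt.hour.tolist())  # [12, 13, 14] -> actually midday for the user
```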
Calendar features have a fundamental problem: they're cyclical but encoded linearly. Hour 23 is adjacent to hour 0, but as integers they're maximally distant. Sunday (6) is adjacent to Monday (0), but numerically they're far apart.
The Solution: Sine/Cosine Encoding
Map cyclical values to the unit circle using trigonometric functions:
$$x_{\sin} = \sin\!\left(\frac{2\pi x}{T}\right) \qquad x_{\cos} = \cos\!\left(\frac{2\pi x}{T}\right)$$

where $T$ is the length of the cycle (24 for hour of day, 7 for day of week), matching the `max_value` parameter in the code below.
This creates two features that jointly encode position on the cycle. Adjacent positions have similar (sin, cos) values, even across the cycle boundary.
| Feature | Max Value | Before (Linear) | After (sin, cos) |
|---|---|---|---|
| Hour 0 | 24 | 0 | (0.00, 1.00) |
| Hour 6 | 24 | 6 | (1.00, 0.00) |
| Hour 12 | 24 | 12 | (0.00, -1.00) |
| Hour 23 | 24 | 23 | (-0.26, 0.97) |
| Monday (0) | 7 | 0 | (0.00, 1.00) |
| Sunday (6) | 7 | 6 | (-0.78, 0.62) |
```python
import pandas as pd
import numpy as np

def encode_cyclical(
    series: pd.Series,
    max_value: float,
    feature_name: str = None
) -> pd.DataFrame:
    """
    Encode a cyclical feature using sin/cos transformation.

    Parameters:
    -----------
    series: The cyclical values (e.g., hour 0-23, day 0-6)
    max_value: The period of the cycle (24 for hours, 7 for days)
    """
    name = feature_name or series.name or 'cyclical'
    normalized = 2 * np.pi * series / max_value
    return pd.DataFrame({
        f'{name}_sin': np.sin(normalized),
        f'{name}_cos': np.cos(normalized)
    })

def encode_all_cyclical_features(df: pd.DataFrame, datetime_col: str) -> pd.DataFrame:
    """
    Encode all standard cyclical datetime components.
    """
    cyclical = pd.DataFrame(index=df.index)
    dt = pd.to_datetime(df[datetime_col])

    # Hour of day (period = 24)
    hour_enc = encode_cyclical(dt.dt.hour, 24, 'hour')
    cyclical = pd.concat([cyclical, hour_enc], axis=1)

    # Day of week (period = 7)
    dow_enc = encode_cyclical(dt.dt.dayofweek, 7, 'day_of_week')
    cyclical = pd.concat([cyclical, dow_enc], axis=1)

    # Day of month (period = 31, approximate)
    dom_enc = encode_cyclical(dt.dt.day, 31, 'day_of_month')
    cyclical = pd.concat([cyclical, dom_enc], axis=1)

    # Day of year (period = 365)
    doy_enc = encode_cyclical(dt.dt.dayofyear, 365, 'day_of_year')
    cyclical = pd.concat([cyclical, doy_enc], axis=1)

    # Month (period = 12)
    month_enc = encode_cyclical(dt.dt.month, 12, 'month')
    cyclical = pd.concat([cyclical, month_enc], axis=1)

    return cyclical

# Demonstrate why cyclical encoding matters
def demonstrate_cyclical_benefit():
    """
    Show how cyclical encoding preserves adjacency across the cycle boundary.
    """
    # Linear distance: hour 23 to hour 0
    linear_23_to_0 = abs(23 - 0)  # = 23

    # Cyclical (Euclidean) distance between hour 23 and hour 0
    h23_sin, h23_cos = np.sin(2*np.pi*23/24), np.cos(2*np.pi*23/24)
    h0_sin, h0_cos = np.sin(2*np.pi*0/24), np.cos(2*np.pi*0/24)
    cyclical_23_to_0 = np.sqrt((h23_sin - h0_sin)**2 + (h23_cos - h0_cos)**2)

    # Compare to adjacent hours 0 and 1
    h1_sin, h1_cos = np.sin(2*np.pi*1/24), np.cos(2*np.pi*1/24)
    cyclical_0_to_1 = np.sqrt((h0_sin - h1_sin)**2 + (h0_cos - h1_cos)**2)

    print(f"Linear distance 23→0: {linear_23_to_0}")
    print(f"Cyclical distance 23→0: {cyclical_23_to_0:.3f}")
    print(f"Cyclical distance 0→1: {cyclical_0_to_1:.3f}")
    print("Adjacent hours have similar distances under cyclical encoding!")

demonstrate_cyclical_benefit()
```

Cyclical encoding matters most for models that rely on numeric distance (linear models, SVMs, neural networks). Tree-based models can handle the plain integers fine: they learn splits that effectively partition the cycle, so cyclical encoding adds little value for trees but doesn't hurt. When in doubt, include both representations.
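As a minimal sketch of "include both representations", assuming the `df` with a `timestamp` column and the helper functions from the examples above:

```python
# Keep the raw integer calendar features (convenient for tree models) alongside
# the sin/cos encodings (better for distance-based models) in one feature matrix.
calendar_feats = extract_calendar_features(df, 'timestamp')
cyclical_feats = encode_all_cyclical_features(df, 'timestamp')

X = pd.concat([calendar_feats, cyclical_feats], axis=1)
print(X.columns.tolist())
```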
Lag features use past values to predict current/future values. They capture autocorrelation—the tendency for values to depend on previous observations.
Core Lag Concepts:
| Lag Type | Description | Example |
|---|---|---|
| Simple lag | Value at t-k | Yesterday's sales |
| Multiple lags | Values at t-1, t-2, ..., t-k | Past 7 days' sales |
| Difference | Value(t) - Value(t-k) | Change since yesterday |
| Percentage change | (Value(t) - Value(t-k)) / Value(t-k) | Growth rate |
| Lag of target | Previous target values | Time series forecasting |
```python
import pandas as pd
import numpy as np

def create_lag_features(
    df: pd.DataFrame,
    value_col: str,
    lags: list,
    group_col: str = None
) -> pd.DataFrame:
    """
    Create lag features with optional grouping.

    Parameters:
    -----------
    lags: List of lag periods; [1, 7, 30] means 1-day, 7-day, 30-day lags
    group_col: Column to group by (e.g., 'user_id', 'product_id')
    """
    features = pd.DataFrame(index=df.index)

    for lag in lags:
        col_name = f'{value_col}_lag_{lag}'
        if group_col:
            features[col_name] = df.groupby(group_col)[value_col].shift(lag)
        else:
            features[col_name] = df[value_col].shift(lag)

    return features

def create_difference_features(
    df: pd.DataFrame,
    value_col: str,
    periods: list,
    group_col: str = None
) -> pd.DataFrame:
    """
    Create difference and percentage change features.
    """
    features = pd.DataFrame(index=df.index)

    for period in periods:
        if group_col:
            lagged = df.groupby(group_col)[value_col].shift(period)
        else:
            lagged = df[value_col].shift(period)

        # Absolute difference
        features[f'{value_col}_diff_{period}'] = df[value_col] - lagged

        # Percentage change
        features[f'{value_col}_pct_change_{period}'] = (
            (df[value_col] - lagged) / (lagged + 1e-8)
        )

    return features

def create_multi_target_lags(
    df: pd.DataFrame,
    target_col: str,
    feature_lags: dict,
    group_col: str = None
) -> pd.DataFrame:
    """
    Create lags for multiple columns with different lag periods.

    Parameters:
    -----------
    feature_lags: Dict mapping column names to lag periods
                  {'sales': [1, 7, 30], 'clicks': [1, 2, 3]}
    """
    all_features = pd.DataFrame(index=df.index)

    for col, lags in feature_lags.items():
        lag_feats = create_lag_features(df, col, lags, group_col)
        all_features = pd.concat([all_features, lag_feats], axis=1)

    return all_features

# Example: E-commerce user behavior lags
def user_behavior_lags(
    transactions: pd.DataFrame,
    user_col: str = 'user_id',
    date_col: str = 'date'
) -> pd.DataFrame:
    """
    Create user-level behavioral lag features.
    """
    # Sort by user and date
    df = transactions.sort_values([user_col, date_col])
    features = pd.DataFrame(index=df.index)

    # Days since last purchase
    features['days_since_last_purchase'] = df.groupby(user_col)[date_col].diff().dt.days

    # Previous order value
    features['prev_order_value'] = df.groupby(user_col)['order_value'].shift(1)

    # Running count of orders
    features['order_number'] = df.groupby(user_col).cumcount() + 1

    # Compare to previous
    features['order_value_vs_prev'] = (
        df['order_value'] / (features['prev_order_value'] + 1)
    )

    # Was previous order returned?
    features['prev_was_returned'] = df.groupby(user_col)['was_returned'].shift(1)

    return features
```

Using lagged TARGET values is valid in forecasting but dangerous in classification. If you're predicting whether a user will churn this month, using 'churned_last_month' leaks information if it was only recorded AFTER the prediction point. Always verify: "Would this lag value be available at prediction time in production?"
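To make that check concrete, here is a minimal sketch with hypothetical monthly churn labels: shifting the label within each user guarantees the feature for a given month only uses outcomes already observed before it.

```python
import pandas as pd

# Hypothetical table: one row per user per month, label known only at month end.
labels = pd.DataFrame({
    'user_id': ['u1', 'u1', 'u1', 'u2', 'u2'],
    'month': pd.to_datetime(['2024-01-01', '2024-02-01', '2024-03-01',
                             '2024-01-01', '2024-02-01']),
    'churned': [0, 1, 0, 0, 0],
}).sort_values(['user_id', 'month'])

# Safe lag: the March row sees February's outcome, never its own or later months'.
labels['churned_last_month'] = labels.groupby('user_id')['churned'].shift(1)
print(labels)
```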
While lag features capture specific past values, rolling statistics summarize recent behavior patterns. They're more robust than individual lags—smoothing out noise while preserving signal.
Common Rolling Statistics: mean and median (central tendency), standard deviation (dispersion), min/max (range), and sum over the window, plus derived measures such as the ratio of the current value to the rolling mean and its z-score within the window.
```python
import pandas as pd
import numpy as np

def create_rolling_features(
    df: pd.DataFrame,
    value_col: str,
    windows: list,
    group_col: str = None,
    min_periods: int = 1
) -> pd.DataFrame:
    """
    Create comprehensive rolling window features.

    Parameters:
    -----------
    windows: List of window sizes, e.g. [7, 14, 30]
    group_col: Column to group by (for per-entity rolling)
    min_periods: Minimum observations required
    """
    features = pd.DataFrame(index=df.index)

    def _align(result):
        """Drop the group level that groupby().rolling() adds to the index."""
        return result.reset_index(level=0, drop=True) if group_col else result

    for window in windows:
        if group_col:
            rolling = df.groupby(group_col)[value_col].rolling(
                window=window, min_periods=min_periods
            )
        else:
            rolling = df[value_col].rolling(window=window, min_periods=min_periods)

        # Central tendency
        features[f'{value_col}_roll_mean_{window}'] = _align(rolling.mean())
        features[f'{value_col}_roll_median_{window}'] = _align(rolling.median())

        # Dispersion
        features[f'{value_col}_roll_std_{window}'] = _align(rolling.std())

        # Range
        features[f'{value_col}_roll_min_{window}'] = _align(rolling.min())
        features[f'{value_col}_roll_max_{window}'] = _align(rolling.max())

        # Sum (for cumulative metrics)
        features[f'{value_col}_roll_sum_{window}'] = _align(rolling.sum())

        # Current value relative to rolling mean
        roll_mean = features[f'{value_col}_roll_mean_{window}']
        features[f'{value_col}_vs_roll_mean_{window}'] = df[value_col] / (roll_mean + 1e-8)

        # Z-score within rolling window
        roll_std = features[f'{value_col}_roll_std_{window}']
        features[f'{value_col}_roll_zscore_{window}'] = (
            (df[value_col] - roll_mean) / (roll_std + 1e-8)
        )

    return features

def create_ewm_features(
    df: pd.DataFrame,
    value_col: str,
    spans: list,
    group_col: str = None
) -> pd.DataFrame:
    """
    Create Exponentially Weighted Moving Average features.

    EWMA gives more weight to recent observations.
    span=7 means the decay factor is 2/(7+1) = 0.25
    """
    features = pd.DataFrame(index=df.index)

    def _align(result):
        return result.reset_index(level=0, drop=True) if group_col else result

    for span in spans:
        if group_col:
            ewm = df.groupby(group_col)[value_col].ewm(span=span, adjust=False)
        else:
            ewm = df[value_col].ewm(span=span, adjust=False)

        # EWMA mean and std
        features[f'{value_col}_ewm_mean_{span}'] = _align(ewm.mean())
        features[f'{value_col}_ewm_std_{span}'] = _align(ewm.std())

    return features

def create_expanding_features(
    df: pd.DataFrame,
    value_col: str,
    group_col: str = None
) -> pd.DataFrame:
    """
    Create expanding window features (all history up to the current point).
    """
    features = pd.DataFrame(index=df.index)

    def _align(result):
        return result.reset_index(level=0, drop=True) if group_col else result

    if group_col:
        expanding = df.groupby(group_col)[value_col].expanding()
    else:
        expanding = df[value_col].expanding()

    # Historical statistics
    features[f'{value_col}_exp_mean'] = _align(expanding.mean())
    features[f'{value_col}_exp_std'] = _align(expanding.std())
    features[f'{value_col}_exp_min'] = _align(expanding.min())
    features[f'{value_col}_exp_max'] = _align(expanding.max())

    # Percentile rank within the entity's series
    # (computed over the full series, not point-in-time)
    features[f'{value_col}_exp_rank'] = (
        df.groupby(group_col)[value_col].rank(pct=True) if group_col
        else df[value_col].rank(pct=True)
    )

    return features

# Example usage
df = pd.DataFrame({
    'date': pd.date_range('2024-01-01', periods=100, freq='D'),
    'user_id': np.repeat(['A', 'B'], 50),
    'purchase_amount': np.random.exponential(100, 100)
})
df = df.sort_values(['user_id', 'date'])

# Create per-user rolling features
rolling_feats = create_rolling_features(
    df, 'purchase_amount', windows=[7, 14, 30], group_col='user_id'
)
print(rolling_feats.head(20))
```

Window sizes should reflect domain-meaningful periods: 7 days for weekly patterns, 30 days for monthly, 90 days for quarterly. Try multiple windows: short windows (7 days) capture recent changes, while long windows (90 days) capture stable baselines. The ratio of the current value to different window means (short vs. long) can indicate trend direction.
Lag features and rolling statistics capture patterns implicitly. Sometimes we want to explicitly model trend and seasonality components.
Trend Features:
Trend captures the long-term direction of a time series—is it growing, declining, or flat?
```python
import pandas as pd
import numpy as np
from scipy import stats

def compute_trend_features(
    df: pd.DataFrame,
    value_col: str,
    window: int,
    group_col: str = None
) -> pd.DataFrame:
    """
    Compute trend features using linear regression over rolling windows.
    """
    features = pd.DataFrame(index=df.index)

    def rolling_slope(x):
        """Compute the slope of a linear regression over the window."""
        if len(x) < 2 or x.isna().any():
            return np.nan
        try:
            slope, _, _, _, _ = stats.linregress(range(len(x)), x)
            return slope
        except Exception:
            return np.nan

    if group_col:
        features[f'{value_col}_trend_slope_{window}'] = (
            df.groupby(group_col)[value_col]
            .rolling(window=window, min_periods=window // 2)
            .apply(rolling_slope, raw=False)
            .reset_index(level=0, drop=True)
        )
    else:
        features[f'{value_col}_trend_slope_{window}'] = (
            df[value_col]
            .rolling(window=window, min_periods=window // 2)
            .apply(rolling_slope, raw=False)
        )

    # Trend direction as categorical
    features[f'{value_col}_trend_direction_{window}'] = np.where(
        features[f'{value_col}_trend_slope_{window}'] > 0.05, 'up',
        np.where(features[f'{value_col}_trend_slope_{window}'] < -0.05, 'down', 'flat')
    )

    return features

def decompose_time_series(
    df: pd.DataFrame,
    value_col: str,
    datetime_col: str,
    period: int = None
) -> pd.DataFrame:
    """
    Decompose a time series into trend, seasonal, and residual components
    using STL decomposition.
    """
    from statsmodels.tsa.seasonal import STL

    # Prepare time series
    ts = df.set_index(datetime_col)[value_col]
    ts = ts.asfreq('D')      # Assume daily; adjust as needed
    ts = ts.interpolate()    # Fill gaps

    # Determine period if not specified
    if period is None:
        period = 7  # Weekly seasonality

    # STL decomposition
    stl = STL(ts, period=period, robust=True)
    result = stl.fit()

    # Create feature DataFrame
    decomposed = pd.DataFrame({
        f'{value_col}_trend': result.trend,
        f'{value_col}_seasonal': result.seasonal,
        f'{value_col}_residual': result.resid,
        f'{value_col}_trend_strength': 1 - (result.resid.var() / (result.trend + result.resid).var()),
        f'{value_col}_seasonal_strength': 1 - (result.resid.var() / (result.seasonal + result.resid).var()),
    })

    return decomposed

def create_seasonal_features(
    df: pd.DataFrame,
    value_col: str,
    datetime_col: str,
    periods: dict = None
) -> pd.DataFrame:
    """
    Create features capturing seasonal patterns.

    Parameters:
    -----------
    periods: Dict of period names to period lengths
             {'weekly': 7, 'monthly': 30, 'yearly': 365}
    """
    if periods is None:
        periods = {'weekly': 7, 'monthly': 30}

    features = pd.DataFrame(index=df.index)
    dt = pd.to_datetime(df[datetime_col])

    for name, period in periods.items():
        # Position within the seasonal cycle, scaled to [0, 1)
        if name == 'weekly':
            position = dt.dt.dayofweek / 7
        elif name == 'monthly':
            position = dt.dt.day / 31
        elif name == 'yearly':
            position = dt.dt.dayofyear / 365
        else:
            position = (dt - dt.min()).dt.days % period / period

        # Fourier components for smooth seasonal curves
        for k in range(1, 4):  # First 3 harmonics
            features[f'{name}_sin_{k}'] = np.sin(2 * np.pi * k * position)
            features[f'{name}_cos_{k}'] = np.cos(2 * np.pi * k * position)

    return features

# Momentum features (rate-of-change indicators)
def create_momentum_features(
    df: pd.DataFrame,
    value_col: str,
    periods: list = [7, 14, 30]
) -> pd.DataFrame:
    """
    Create momentum/rate-of-change features.
    """
    features = pd.DataFrame(index=df.index)

    for period in periods:
        # Simple rate of change
        features[f'{value_col}_roc_{period}'] = df[value_col].pct_change(periods=period)

        # Rate of change of the rate of change (acceleration)
        features[f'{value_col}_roc_roc_{period}'] = features[f'{value_col}_roc_{period}'].diff()

        # Momentum (current / past)
        features[f'{value_col}_momentum_{period}'] = df[value_col] / df[value_col].shift(period)

    # Short vs. long momentum (trend strength)
    if len(periods) >= 2:
        short, long = periods[0], periods[-1]
        features[f'{value_col}_momentum_diff'] = (
            features[f'{value_col}_momentum_{short}'] -
            features[f'{value_col}_momentum_{long}']
        )

    return features
```

Beyond calendar cycles, events create temporal structure. User behaviors are anchored to reference events such as signup, first purchase, most recent purchase, last login, and subscription renewals.
These relative time features often outperform absolute calendar features because they capture user-specific temporal context.
```python
import pandas as pd
import numpy as np

def create_recency_features(
    df: pd.DataFrame,
    entity_col: str,
    event_col: str,
    datetime_col: str,
    ref_datetime: pd.Timestamp = None
) -> pd.DataFrame:
    """
    Create recency features (time since last event).
    """
    if ref_datetime is None:
        ref_datetime = pd.Timestamp.now()

    features = pd.DataFrame(index=df.index)
    dt = pd.to_datetime(df[datetime_col])

    # Time since event
    days_since = (ref_datetime - dt).dt.days
    features[f'days_since_{event_col}'] = days_since

    # Binned recency
    features[f'recency_bucket_{event_col}'] = pd.cut(
        days_since,
        bins=[0, 7, 30, 90, 180, 365, np.inf],
        labels=['<1w', '1w-1m', '1m-3m', '3m-6m', '6m-1y', '>1y']
    )

    # Log transform (diminishing recency importance)
    features[f'log_days_since_{event_col}'] = np.log1p(days_since)

    # Decay features (exponential decay of event importance)
    for half_life in [7, 30, 90]:
        decay = np.exp(-days_since * np.log(2) / half_life)
        features[f'{event_col}_decay_{half_life}d'] = decay

    return features

def create_inter_event_features(
    df: pd.DataFrame,
    entity_col: str,
    datetime_col: str
) -> pd.DataFrame:
    """
    Create features about the time between events for each entity.
    """
    # Sort by entity and time; ensure the datetime column has datetime dtype
    df_sorted = df.sort_values([entity_col, datetime_col]).copy()
    df_sorted[datetime_col] = pd.to_datetime(df_sorted[datetime_col])
    dt = df_sorted[datetime_col]

    features = pd.DataFrame(index=df_sorted.index)

    # Time since previous event (for this entity)
    features['days_since_prev_event'] = (
        dt - df_sorted.groupby(entity_col)[datetime_col].shift(1)
    ).dt.days

    # Time until next event (uses the FUTURE next event; useful as a label or for
    # offline analysis, but not available as a feature at prediction time)
    features['days_until_next_event'] = (
        df_sorted.groupby(entity_col)[datetime_col].shift(-1) - dt
    ).dt.days

    # Event frequency: number of events in the trailing 30 days per entity
    # (rows are sorted by entity and time above, so positional alignment is safe)
    events_30d = (
        df_sorted.set_index(datetime_col)
        .assign(event_marker=1)
        .groupby(entity_col)['event_marker']
        .rolling('30D')
        .sum()
    )
    features['events_last_30d'] = events_30d.to_numpy()

    # Regularity (std of inter-event gaps over the last 30 events)
    features['inter_event_std_30d'] = (
        features['days_since_prev_event']
        .groupby(df_sorted[entity_col])
        .rolling(window=30, min_periods=3)
        .std()
        .reset_index(level=0, drop=True)
    )

    # Is accelerating? (current gap shorter than the entity's average gap)
    avg_gap = (
        features['days_since_prev_event']
        .groupby(df_sorted[entity_col])
        .transform('mean')
    )
    features['is_accelerating'] = (features['days_since_prev_event'] < avg_gap).astype(int)

    return features

def create_lifecycle_features(
    df: pd.DataFrame,
    entity_col: str,
    event_datetime_col: str,
    signup_datetime_col: str
) -> pd.DataFrame:
    """
    Create lifecycle/tenure features.
    """
    features = pd.DataFrame(index=df.index)
    event_dt = pd.to_datetime(df[event_datetime_col])
    signup_dt = pd.to_datetime(df[signup_datetime_col])

    # Tenure at time of event
    features['tenure_days'] = (event_dt - signup_dt).dt.days
    features['tenure_weeks'] = features['tenure_days'] / 7
    features['tenure_months'] = features['tenure_days'] / 30

    # Lifecycle stage
    features['lifecycle_stage'] = pd.cut(
        features['tenure_days'],
        bins=[0, 7, 30, 90, 365, np.inf],
        labels=['new', 'activated', 'engaged', 'mature', 'veteran']
    )

    # Event position in the user's history
    features['event_number'] = df.groupby(entity_col).cumcount() + 1

    # Event rate (events per tenure day)
    features['event_rate'] = features['event_number'] / (features['tenure_days'] + 1)

    return features
```

Exponential decay features (2^(−t/half_life), implemented above as e^(−t·ln 2/half_life)) encode the intuition that recent events matter more than distant ones. A purchase yesterday with half_life=7 has decay ≈ 0.9; a purchase 30 days ago has decay ≈ 0.05. Different half-lives capture different assumptions about how quickly relevance fades.
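A quick numeric sketch of how the half-life shapes those decay values (same formula as in `create_recency_features` above):

```python
import numpy as np

# decay = 2 ** (-days_since / half_life): the weight halves every `half_life` days.
for half_life in [7, 30, 90]:
    values = {d: np.exp(-d * np.log(2) / half_life) for d in [1, 7, 30, 90]}
    formatted = ", ".join(f"{d}d: {v:.2f}" for d, v in values.items())
    print(f"half_life={half_life:>2}d -> {formatted}")
```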
Time-based features often provide the single largest lift in predictive performance. They capture the rhythms of human behavior, business cycles, and natural phenomena. The key insights:

- Extract calendar components, and encode the cyclical ones (hour, day of week, month) with sin/cos so adjacency survives the cycle boundary.
- Use lags, differences, and rolling/EWM statistics over domain-meaningful windows (7, 30, 90 days) to capture autocorrelation and recent behavior.
- Model trend and seasonality explicitly (rolling slopes, STL decomposition, Fourier terms) when implicit features are not enough.
- Prefer event-relative features (recency, inter-event gaps, tenure) when behavior is anchored to user-specific events rather than the calendar.
- Verify that every temporal feature would be available at prediction time; time is the easiest place to leak the future.
You now have a comprehensive toolkit for temporal feature engineering—from basic calendar extraction to sophisticated trend and event-based features. Next, we'll explore how to engineer features from text and categorical data, completing our tour of feature type-specific techniques.