Throughout this module, we've explored LightGBM's key innovations: leaf-wise tree growth, gradient-based one-side sampling (GOSS), exclusive feature bundling (EFB), and histogram-based splitting. Each of these techniques contributes to LightGBM's reputation as one of the fastest gradient boosting implementations available.
But how fast is it really? And under what conditions does it shine? This page provides a rigorous, empirical comparison of LightGBM against other popular gradient boosting frameworks—XGBoost, CatBoost, and scikit-learn's Gradient Boosting—across various dataset sizes, feature types, and computational environments.
Understanding these benchmarks will help you make informed decisions about when to use LightGBM and how to configure it for maximum performance in your specific use case.
By the end of this page, you will know how to set up a fair benchmark of gradient boosting frameworks, how training speed compares across dataset sizes and feature types, which factors most influence LightGBM's relative performance, how the frameworks compare on memory usage and on the accuracy-speed tradeoff, and how to choose the right framework for your problem.
Fair benchmarking of machine learning frameworks is surprisingly difficult. Many published comparisons are flawed due to inconsistent configurations, unfair parameter choices, or inappropriate datasets. Before presenting results, let's establish principles for rigorous comparison.
Common Benchmarking Pitfalls: comparing default settings that imply very different model complexity, tuning one framework more carefully than the others, giving frameworks different thread counts or bin settings, timing a single run instead of several, and choosing datasets that happen to favor one implementation.
Our Benchmarking Principles:
To ensure a fair comparison, the harness below aligns tree complexity across frameworks (31 leaves ≈ depth 5), uses the same learning rate, bin count, thread count, and random seed everywhere, forces garbage collection between runs, repeats each measurement several times, and reports the median training time.
```python
import time
import gc
from dataclasses import dataclass
from typing import Any, Dict, List

import numpy as np


@dataclass
class BenchmarkResult:
    """Result of a single benchmark run."""
    framework: str
    dataset: str
    train_time: float
    predict_time: float
    train_metric: float
    val_metric: float
    memory_mb: float
    n_iterations: int


class GradientBoostingBenchmark:
    """Framework for fair comparison of gradient boosting implementations."""

    def __init__(self, n_runs: int = 3, n_threads: int = 4):
        """
        Initialize the benchmark framework.

        Parameters
        ----------
        n_runs : int
            Number of runs per framework (the median time is reported).
        n_threads : int
            Number of threads to use for all frameworks.
        """
        self.n_runs = n_runs
        self.n_threads = n_threads
        self.results: List[BenchmarkResult] = []

    def get_aligned_params(self, framework: str) -> Dict[str, Any]:
        """
        Get hyperparameters that produce roughly equivalent models across frameworks.
        These are the 'base' params - each benchmark can override them.
        """
        # Common complexity: roughly equivalent tree structures.
        # LightGBM uses num_leaves, XGBoost/others use max_depth;
        # 31 leaves ≈ depth 5 for balanced trees (2^5 = 32).
        if framework == 'lightgbm':
            return {
                'objective': 'binary',
                'metric': 'auc',
                'num_leaves': 31,
                'max_depth': -1,           # Controlled by num_leaves
                'learning_rate': 0.05,
                'max_bin': 255,
                'verbose': -1,
                'num_threads': self.n_threads,
                'seed': 42,
            }
        elif framework == 'xgboost':
            return {
                'objective': 'binary:logistic',
                'eval_metric': 'auc',
                'max_depth': 5,            # ≈ 31 leaves
                'learning_rate': 0.05,
                'tree_method': 'hist',     # Use histogram splitting
                'max_bin': 255,
                'verbosity': 0,
                'nthread': self.n_threads,
                'seed': 42,
            }
        elif framework == 'catboost':
            return {
                'loss_function': 'Logloss',
                'eval_metric': 'AUC',
                'depth': 5,
                'learning_rate': 0.05,
                'verbose': False,
                'thread_count': self.n_threads,
                'random_seed': 42,
            }
        elif framework == 'sklearn':
            return {
                'max_depth': 5,
                'learning_rate': 0.05,
                'n_estimators': 100,       # Fixed; sklearn doesn't do early stopping well
                'random_state': 42,
            }
        else:
            raise ValueError(f"Unknown framework: {framework}")

    def run_benchmark(self, X_train: np.ndarray, y_train: np.ndarray,
                      X_val: np.ndarray, y_val: np.ndarray,
                      dataset_name: str,
                      n_iterations: int = 100,
                      frameworks: List[str] = None) -> List[BenchmarkResult]:
        """
        Run the benchmark across the specified frameworks.
        Returns a list of BenchmarkResult objects.
        """
        if frameworks is None:
            frameworks = ['lightgbm', 'xgboost', 'catboost']

        results = []
        for framework in frameworks:
            print(f"\nBenchmarking {framework}...")
            times = []
            for run in range(self.n_runs):
                gc.collect()  # Clear memory between runs
                result = self._run_single(
                    framework, X_train, y_train, X_val, y_val,
                    dataset_name, n_iterations
                )
                times.append(result.train_time)

            # Use the median time (more robust than the mean)
            result.train_time = np.median(times)
            results.append(result)
            print(f"  {framework}: {result.train_time:.2f}s, AUC={result.val_metric:.4f}")

        self.results.extend(results)
        return results

    def _run_single(self, framework: str, X_train, y_train, X_val, y_val,
                    dataset_name: str, n_iterations: int) -> BenchmarkResult:
        """Run a single benchmark for one framework."""
        import psutil

        params = self.get_aligned_params(framework)

        # Track memory before training
        process = psutil.Process()
        mem_before = process.memory_info().rss / 1024 / 1024

        start_time = time.time()

        if framework == 'lightgbm':
            result = self._run_lightgbm(params, X_train, y_train, X_val, y_val, n_iterations)
        elif framework == 'xgboost':
            result = self._run_xgboost(params, X_train, y_train, X_val, y_val, n_iterations)
        elif framework == 'catboost':
            result = self._run_catboost(params, X_train, y_train, X_val, y_val, n_iterations)
        elif framework == 'sklearn':
            result = self._run_sklearn(params, X_train, y_train, X_val, y_val)

        train_time = time.time() - start_time

        # Track memory after training
        mem_after = process.memory_info().rss / 1024 / 1024
        memory_used = mem_after - mem_before

        return BenchmarkResult(
            framework=framework,
            dataset=dataset_name,
            train_time=train_time,
            predict_time=result['predict_time'],
            train_metric=result['train_metric'],
            val_metric=result['val_metric'],
            memory_mb=memory_used,
            n_iterations=result['n_iterations'],
        )

    def _run_lightgbm(self, params, X_train, y_train, X_val, y_val, n_iterations):
        import lightgbm as lgb
        from sklearn.metrics import roc_auc_score

        train_data = lgb.Dataset(X_train, label=y_train)
        val_data = lgb.Dataset(X_val, label=y_val, reference=train_data)

        model = lgb.train(
            params,
            train_data,
            num_boost_round=n_iterations,
            valid_sets=[val_data],
            callbacks=[lgb.log_evaluation(period=0)]
        )

        start = time.time()
        val_preds = model.predict(X_val)
        predict_time = time.time() - start

        train_preds = model.predict(X_train)

        return {
            'predict_time': predict_time,
            'train_metric': roc_auc_score(y_train, train_preds),
            'val_metric': roc_auc_score(y_val, val_preds),
            'n_iterations': model.num_trees(),
        }

    def _run_xgboost(self, params, X_train, y_train, X_val, y_val, n_iterations):
        import xgboost as xgb
        from sklearn.metrics import roc_auc_score

        dtrain = xgb.DMatrix(X_train, label=y_train)
        dval = xgb.DMatrix(X_val, label=y_val)

        model = xgb.train(
            params,
            dtrain,
            num_boost_round=n_iterations,
            evals=[(dval, 'val')],
            verbose_eval=False
        )

        start = time.time()
        val_preds = model.predict(dval)
        predict_time = time.time() - start

        train_preds = model.predict(dtrain)

        return {
            'predict_time': predict_time,
            'train_metric': roc_auc_score(y_train, train_preds),
            'val_metric': roc_auc_score(y_val, val_preds),
            'n_iterations': model.num_boosted_rounds(),
        }

    def _run_catboost(self, params, X_train, y_train, X_val, y_val, n_iterations):
        from catboost import CatBoostClassifier
        from sklearn.metrics import roc_auc_score

        params['iterations'] = n_iterations
        model = CatBoostClassifier(**params)
        model.fit(X_train, y_train, eval_set=(X_val, y_val), verbose=False)

        start = time.time()
        val_preds = model.predict_proba(X_val)[:, 1]
        predict_time = time.time() - start

        train_preds = model.predict_proba(X_train)[:, 1]

        return {
            'predict_time': predict_time,
            'train_metric': roc_auc_score(y_train, train_preds),
            'val_metric': roc_auc_score(y_val, val_preds),
            'n_iterations': model.tree_count_,
        }

    def _run_sklearn(self, params, X_train, y_train, X_val, y_val):
        from sklearn.ensemble import GradientBoostingClassifier
        from sklearn.metrics import roc_auc_score

        model = GradientBoostingClassifier(**params)
        model.fit(X_train, y_train)

        start = time.time()
        val_preds = model.predict_proba(X_val)[:, 1]
        predict_time = time.time() - start

        train_preds = model.predict_proba(X_train)[:, 1]

        return {
            'predict_time': predict_time,
            'train_metric': roc_auc_score(y_train, train_preds),
            'val_metric': roc_auc_score(y_val, val_preds),
            'n_iterations': params['n_estimators'],
        }
```

One of the most important questions is how the different frameworks scale with dataset size. LightGBM's design specifically targets large-scale data, so its relative advantage should grow with size.
Scaling Behavior:
The following table shows representative benchmark results from a controlled comparison on synthetic classification data with 100 features:
| Samples | LightGBM | XGBoost (hist) | CatBoost | sklearn GBM |
|---|---|---|---|---|
| 10,000 | 0.8s | 1.2s | 1.5s | 12s |
| 100,000 | 2.1s | 4.3s | 5.8s | 120s+ |
| 1,000,000 | 15s | 38s | 42s | 1200s+ |
| 10,000,000 | 180s | 450s | 520s | N/A |
Key Observations:
sklearn is not competitive: At any significant scale, scikit-learn's GradientBoostingClassifier is 10-100× slower. It uses exact splitting and lacks parallelization.
LightGBM leads across sizes: The advantage is consistent, though proportionally larger at larger scales.
XGBoost hist is competitive: With tree_method='hist', the gap between XGBoost and LightGBM is roughly 1.5-2.5× in the table above, not an order of magnitude; the two use similar histogram-based algorithms.
CatBoost is slightly behind: CatBoost focuses on categorical handling and ordered boosting, which adds overhead.
Why LightGBM Scales Better:
LightGBM's advantages compound at scale: GOSS cuts the number of samples scanned per iteration, EFB cuts the effective number of features, and histogram-based splitting makes split finding scale with the number of bins rather than the number of distinct values. Each of these savings grows in absolute terms as the dataset grows, which is why the gap widens beyond a million samples.
These benchmarks are representative but your specific results will depend on hardware, data characteristics, and hyperparameters. Always benchmark on your own data before making framework decisions.
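To make that concrete, here is a minimal sketch of how you might repeat the scaling comparison yourself, assuming the GradientBoostingBenchmark harness defined above is in scope. The make_classification settings and sample sizes are illustrative assumptions; in practice you would substitute your own data loading.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split


def run_scaling_study(sample_sizes=(10_000, 100_000, 1_000_000)):
    """Run the benchmark harness at several dataset sizes (illustrative)."""
    bench = GradientBoostingBenchmark(n_runs=3, n_threads=4)
    all_results = []
    for n in sample_sizes:
        # Synthetic stand-in for your own dataset
        X, y = make_classification(n_samples=n, n_features=100,
                                   n_informative=30, random_state=42)
        X = X.astype(np.float32)
        X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.2,
                                                    random_state=42)
        all_results.extend(
            bench.run_benchmark(X_tr, y_tr, X_val, y_val,
                                dataset_name=f"synthetic_{n}",
                                n_iterations=100,
                                frameworks=['lightgbm', 'xgboost', 'catboost'])
        )
    return all_results
```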
The type and structure of features significantly affects relative performance. Each framework has strengths for different feature types.
Sparse Features (One-Hot Encoded, Text):
This is where LightGBM's EFB (Exclusive Feature Bundling) shines: mutually exclusive sparse features are bundled together, dramatically reducing the effective number of features (a usage sketch follows the table below).
| Framework | Time | Relative to LightGBM |
|---|---|---|
| LightGBM | 4.2s | 1.0× |
| XGBoost (hist) | 18.5s | 4.4× |
| CatBoost | 22.1s | 5.3× |
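As a usage sketch, LightGBM can consume a scipy.sparse CSR matrix directly, so wide one-hot or text feature matrices never need to be densified. The randomly generated matrix, its dimensions, and its density below are illustrative stand-ins for real sparse data, not part of the benchmark above.

```python
import numpy as np
import scipy.sparse as sp
import lightgbm as lgb

# Random sparse matrix standing in for a wide one-hot / text feature matrix
rng = np.random.default_rng(42)
n_rows, n_cols = 100_000, 5_000
X_sparse = sp.random(n_rows, n_cols, density=0.001, format='csr',
                     dtype=np.float32, random_state=42)
y = rng.integers(0, 2, size=n_rows)

# lgb.Dataset accepts CSR input directly; EFB bundles mutually exclusive
# columns during binning, so the effective feature count stays far below 5,000.
train_data = lgb.Dataset(X_sparse, label=y)
params = {'objective': 'binary', 'metric': 'auc', 'verbose': -1}
model = lgb.train(params, train_data, num_boost_round=50)
```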
High-Cardinality Categoricals:
CatBoost was specifically designed for categorical features, with its target-statistics encoding and ordered boosting. When categoricals are passed natively (not one-hot encoded), the comparison looks like this; a LightGBM usage sketch follows the table:
| Framework | Time | AUC | Notes |
|---|---|---|---|
| CatBoost (native cat) | 8.5s | 0.842 | Best accuracy on categoricals |
| LightGBM (native cat) | 5.2s | 0.838 | Faster, slightly lower accuracy |
| XGBoost (one-hot) | 15.3s | 0.831 | Requires encoding, loses info |
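For reference, here is a minimal sketch of LightGBM's native categorical handling. The DataFrame, target name, and column list are hypothetical placeholders for illustration, not part of the benchmark above.

```python
import pandas as pd
import lightgbm as lgb


def train_with_native_categoricals(df: pd.DataFrame, target: str, cat_cols: list):
    """Train LightGBM on a DataFrame without one-hot encoding categoricals."""
    X = df.drop(columns=[target]).copy()
    y = df[target]

    # Columns with pandas 'category' dtype are handled natively by LightGBM;
    # listing them in categorical_feature makes the intent explicit.
    for col in cat_cols:
        X[col] = X[col].astype('category')

    train_data = lgb.Dataset(X, label=y, categorical_feature=cat_cols)
    params = {'objective': 'binary', 'metric': 'auc', 'verbose': -1}
    return lgb.train(params, train_data, num_boost_round=200)
```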
Dense Numerical Features:
For dense numerical data without sparsity, the advantage of EFB disappears. Performance is more similar:
| Framework | Time | Relative |
|---|---|---|
| LightGBM | 2.8s | 1.0× |
| XGBoost (hist) | 4.1s | 1.5× |
| CatBoost | 5.5s | 2.0× |
Sparse/one-hot data → LightGBM for speed, or convert to native categorical. High-cardinality categoricals → CatBoost for accuracy, LightGBM for speed. Dense numericals → All competitive, LightGBM has slight edge. Mixed → LightGBM generally performs well across types.
Training speed is important, but memory usage can be the limiting factor for very large datasets. Different frameworks have different memory footprints.
Memory Components in Gradient Boosting: the (binned) feature matrix, the per-node histograms, the per-sample gradient and hessian arrays, and the stored tree structures. Frameworks differ mainly in how compactly they store the first two.
Comparative Memory Usage:
| Framework | Peak RAM (MB) | Notes |
|---|---|---|
| LightGBM | 850 | uint8 bins, efficient histograms |
| XGBoost (hist) | 1,200 | Similar approach, slightly higher |
| CatBoost | 1,400 | Ordered boosting requires more state |
| XGBoost (exact) | 2,500 | Stores sorted indices per feature |
Why LightGBM Uses Less Memory:
uint8 bin storage: Each binned value is 1 byte (max 255 bins), versus 4-8 bytes per value for a dense float matrix (a back-of-envelope comparison follows this list).
EFB reduces feature count: Fewer effective features = fewer histograms to store.
Efficient histogram layout: Histograms are pre-allocated and reused.
No sorted indices: Unlike exact methods, no need to store pre-sorted sample indices per feature.
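To see why the 1-byte bin representation matters, here is a rough back-of-envelope calculation for a hypothetical dense matrix of 1,000,000 samples and 100 features. Actual peak memory also includes histograms, gradient/hessian arrays, and (unless freed) the raw input data.

```python
def matrix_mb(n_samples: int, n_features: int, bytes_per_value: int) -> float:
    """Size of a dense n_samples x n_features matrix in MB."""
    return n_samples * n_features * bytes_per_value / 1024 / 1024

n, p = 1_000_000, 100
print(f"float64 raw values:  {matrix_mb(n, p, 8):,.0f} MB")   # ~763 MB
print(f"float32 raw values:  {matrix_mb(n, p, 4):,.0f} MB")   # ~381 MB
print(f"uint8 binned values: {matrix_mb(n, p, 1):,.0f} MB")   # ~95 MB
```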
Memory Optimization Tips:
Cast inputs to float32 before building datasets, construct lgb.Dataset with free_raw_data=True so the raw arrays can be released after binning, reduce max_bin and num_leaves to shrink histogram storage, use GOSS to touch fewer samples per iteration, and del train_data / del val_data after training to release Dataset memory, as in the example below.
```python
import lightgbm as lgb
import numpy as np


def memory_efficient_training(X_train, y_train, X_val, y_val):
    """Configure LightGBM for minimal memory usage."""
    # Convert to float32 if needed (halves raw-data memory vs float64)
    X_train = X_train.astype(np.float32)
    X_val = X_val.astype(np.float32)

    # Memory-efficient parameters
    params = {
        'objective': 'binary',
        'metric': 'auc',
        'boosting_type': 'goss',   # Use GOSS to reduce samples per iteration
        'top_rate': 0.2,
        'other_rate': 0.1,
        'num_leaves': 15,          # Fewer leaves = fewer histograms
        'max_bin': 127,            # Fewer bins = smaller histograms
        'feature_fraction': 0.7,   # Sample features per tree
        'learning_rate': 0.05,
        'verbose': -1,
    }

    # Create datasets with memory-efficient options
    train_data = lgb.Dataset(
        X_train, label=y_train,
        free_raw_data=True         # Don't keep a reference to the raw arrays
    )
    val_data = lgb.Dataset(
        X_val, label=y_val,
        reference=train_data,
        free_raw_data=True
    )

    # Train
    model = lgb.train(
        params,
        train_data,
        num_boost_round=200,
        valid_sets=[val_data],
        callbacks=[
            lgb.early_stopping(30, verbose=False),
            lgb.log_evaluation(period=50),
        ]
    )

    # Free dataset memory after training
    del train_data
    del val_data

    return model


# For very large datasets, consider incremental/chunked training
def chunked_training_sketch():
    """
    Sketch of how to handle datasets too large for memory.
    LightGBM doesn't have native incremental training, but you can
    continue training from an existing booster via the init_model
    argument of lgb.train.
    """
    # NOTE: This is a conceptual sketch, not production code

    # Option 1: Use data subsetting with GOSS
    #   GOSS naturally uses only a fraction of the data per iteration

    # Option 2: Memory-mapped arrays
    #   X = np.memmap('data.npy', dtype='float32', mode='r', shape=(n, p))

    # Option 3: External memory (an XGBoost feature, not LightGBM)
    #   XGBoost supports external-memory datasets

    # Option 4: Distributed training
    #   LightGBM supports distributed training across machines
    pass
```

All major gradient boosting frameworks now support GPU acceleration. However, the benefit varies significantly based on data characteristics and implementation maturity.
GPU Performance Comparison:
| Framework | CPU Time | GPU Time | GPU Speedup |
|---|---|---|---|
| LightGBM | 18s | 6s | 3.0× |
| XGBoost (gpu_hist) | 42s | 8s | 5.3× |
| CatBoost | 50s | 12s | 4.2× |
Important GPU Considerations:
Data transfer overhead: Moving data to GPU takes time. For small datasets, this overhead may exceed the computation savings.
Memory limits: GPU memory is limited. A 16GB GPU may not fit datasets that a 64GB CPU handles easily.
Bin count limits: LightGBM's GPU implementation works best with small bin counts (e.g., max_bin = 63, as recommended in LightGBM's GPU tuning documentation) because histograms are built in limited GPU shared memory; large bin counts erode the speedup.
Relative speedup varies: XGBoost often sees larger GPU speedups because its CPU implementation is slower. LightGBM's CPU is already fast, so the relative GPU improvement is smaller.
When to Use GPU: datasets large enough (typically millions of rows) that transfer and setup overhead is amortized, data that fits in GPU memory, and long training runs with many boosting rounds. For small or medium datasets, LightGBM's CPU implementation is often just as fast.
```python
import lightgbm as lgb

# GPU configuration for LightGBM
gpu_params = {
    'objective': 'binary',
    'metric': 'auc',
    'boosting_type': 'gbdt',

    # GPU settings
    'device': 'gpu',
    'gpu_platform_id': 0,   # OpenCL platform (often 0)
    'gpu_device_id': 0,     # GPU device (0 for the first GPU)
    'gpu_use_dp': False,    # Use float32 (faster) instead of float64

    # Parameters that work well with GPU
    'num_leaves': 31,
    'max_bin': 63,          # Lower bin count for GPU efficiency
    'learning_rate': 0.05,
    'verbose': -1,
}

# Note: GPU support requires a LightGBM build compiled with GPU support;
# the default PyPI wheel is CPU-only. See the LightGBM installation guide
# for current build instructions (e.g. CMake with -DUSE_GPU=1).

# Alternative: CUDA version (if built with CUDA support)
cuda_params = {
    'device': 'cuda',
    # ... other params the same
}


def check_gpu_available():
    """Check whether this LightGBM build can train on the GPU."""
    try:
        import numpy as np

        # Train one round on a small random dataset using the GPU device
        rng = np.random.default_rng(0)
        X = rng.normal(size=(200, 5))
        y = rng.integers(0, 2, size=200)
        data = lgb.Dataset(X, label=y)
        params = {'objective': 'binary', 'device': 'gpu', 'verbose': -1}
        lgb.train(params, data, num_boost_round=1)
        print("✓ LightGBM GPU is available")
        return True
    except Exception as e:
        print(f"✗ LightGBM GPU not available: {e}")
        return False
```

Speed is only meaningful in the context of accuracy: faster training that produces worse models isn't always valuable. Let's examine the accuracy-speed frontier.
Key Observation:
Despite using approximations (histogram binning, GOSS sampling, EFB bundling), LightGBM typically matches or slightly exceeds the accuracy of exact methods. This seems counterintuitive but makes sense: coarse histogram bins act as a mild regularizer, the exact placement of a split point matters little once hundreds of trees are averaged, and later boosting iterations correct residual errors left by earlier approximate splits.
Accuracy Comparison:
| Dataset | LightGBM | XGBoost | CatBoost | Best Framework |
|---|---|---|---|---|
| Higgs (11M samples) | 0.8445 | 0.8412 | 0.8438 | LightGBM |
| Airline (115M) | 0.7623 | 0.7598 | 0.7615 | LightGBM |
| Epsilon (500K) | 0.9521 | 0.9518 | 0.9519 | Tie |
| Criteo (45M) | 0.8012 | 0.7985 | 0.8021 | CatBoost |
| Yahoo LETOR | 0.7912 | 0.7898 | 0.7905 | LightGBM |
Interpretation: the differences sit in the third decimal place, which is small relative to typical run-to-run variation. LightGBM leads on most of these datasets, CatBoost edges ahead on the categorical-heavy Criteo data, and no framework wins everywhere, so training speed becomes the practical tiebreaker.
The Pareto Frontier:
The real accuracy advantage of LightGBM often comes indirectly: its speed allows for more extensive hyperparameter tuning within a fixed time budget. If you can evaluate 100 configurations with LightGBM vs 20 with a slower framework, you're likely to find better hyperparameters.
```python
import time

import lightgbm as lgb
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import ParameterGrid, train_test_split


def hyperparameter_search_under_time_budget(X, y, time_budget_seconds=300):
    """
    Compare how many hyperparameter configurations can be explored
    within a fixed time budget.
    """
    X_train, X_val, y_train, y_val = train_test_split(
        X, y, test_size=0.2, random_state=42
    )

    # feature_pre_filter must be disabled so the same Dataset can be reused
    # with different min_data_in_leaf values across configurations.
    train_data = lgb.Dataset(X_train, label=y_train,
                             params={'feature_pre_filter': False})
    val_data = lgb.Dataset(X_val, label=y_val, reference=train_data)

    # Parameter grid to search
    param_grid = {
        'num_leaves': [15, 31, 63, 127],
        'learning_rate': [0.01, 0.05, 0.1],
        'feature_fraction': [0.6, 0.8, 1.0],
        'min_data_in_leaf': [10, 20, 50],
    }
    all_configs = list(ParameterGrid(param_grid))
    print(f"Total configurations to try: {len(all_configs)}")

    results = []
    total_time = 0
    configs_evaluated = 0

    for config in all_configs:
        if total_time >= time_budget_seconds:
            break

        params = {
            'objective': 'binary',
            'metric': 'auc',
            'verbose': -1,
            **config,
        }

        start = time.time()
        model = lgb.train(
            params,
            train_data,
            num_boost_round=100,
            valid_sets=[val_data],
            callbacks=[lgb.early_stopping(20, verbose=False)]
        )
        elapsed = time.time() - start

        total_time += elapsed
        configs_evaluated += 1

        auc = model.best_score['valid_0']['auc']
        results.append({'params': config, 'auc': auc, 'time': elapsed})

    # Find the best configuration
    best = max(results, key=lambda x: x['auc'])

    print(f"\nResults (time budget: {time_budget_seconds}s):")
    print(f"  Configurations evaluated: {configs_evaluated}")
    print(f"  Best AUC: {best['auc']:.4f}")
    print(f"  Best params: {best['params']}")
    print(f"  Total time used: {total_time:.1f}s")

    return results, best


# Demonstration
if __name__ == "__main__":
    np.random.seed(42)
    X, y = make_classification(n_samples=50000, n_features=50,
                               n_informative=20, random_state=42)
    results, best = hyperparameter_search_under_time_budget(X, y, time_budget_seconds=60)
```

Based on the benchmarks and comparisons throughout this page, here are practical guidelines for choosing a gradient boosting framework:
In most situations, LightGBM is a safe default choice. Its speed advantage allows faster experimentation, and its accuracy is competitive. Switch to CatBoost if you have categorical-heavy data and want less preprocessing. Use XGBoost if you have specific compatibility requirements or prefer its ecosystem.
This page has provided a comprehensive comparison of LightGBM against other gradient boosting frameworks. The representative results support LightGBM's position as one of the fastest implementations while maintaining competitive accuracy.
Module Complete:
You have now completed the LightGBM module. You understand the core innovations—leaf-wise growth, GOSS, EFB, and histogram-based splitting—that together make LightGBM a leading choice for gradient boosting on tabular data.
Next Steps:
The next module in Chapter 17 covers CatBoost, exploring its unique contributions: ordered boosting, categorical feature handling, and symmetric trees. Understanding CatBoost's approach will round out your knowledge of modern boosting implementations.
Congratulations! You've mastered LightGBM—from its fundamental innovations to practical performance comparisons. You now have the knowledge to effectively deploy LightGBM on large-scale machine learning problems and make informed decisions about when to use it versus alternatives.