Algorithm selection asks: Which single algorithm should I use? But there's an alternative philosophy that often proves more robust: Why choose just one?
Portfolio methods run multiple algorithms—a carefully chosen set—and combine their predictions. Instead of betting everything on one algorithm, we diversify. The result is often more robust and competitive with the best single choice.
The Investment Analogy:
In finance, portfolios diversify risk. A diversified stock portfolio rarely beats the single best stock, but it consistently outperforms the average stock and protects against catastrophic losses. The same principle applies to machine learning:
Why Portfolios Work in AutoML:
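The effect is easy to demonstrate with a small simulation. The sketch below uses entirely made-up accuracy numbers, not real benchmark results: each of five hypothetical algorithms wins on a different slice of datasets, so running all of them and keeping the per-dataset best beats committing to any single algorithm in advance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical accuracies: 200 datasets x 5 algorithms. Each dataset
# has a randomly chosen "winner" algorithm that gets a bonus.
perf = rng.uniform(0.6, 0.9, size=(200, 5))
perf[np.arange(200), rng.integers(0, 5, size=200)] += 0.08

single_avg = perf.mean(axis=0)       # average score of each single algorithm
portfolio = perf.max(axis=1).mean()  # run all five, keep the best per dataset

print(f"best single algorithm on average: {single_avg.max():.3f}")
print(f"portfolio (per-dataset best):     {portfolio:.3f}")
```

The per-dataset maximum is an oracle upper bound for the portfolio, but it illustrates the point: no fixed single algorithm can match a set whose members cover each other's weaknesses, and realistic combination schemes (voting, stacking) recover part of this gap.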
This page explores portfolio construction, algorithm scheduling, and ensemble selection—completing our toolkit for automated model selection.
By the end of this page, you will understand the portfolio approach to AutoML, static vs dynamic portfolio construction, algorithm schedule optimization, ensemble selection for combining trained models, and how systems like H2O AutoML and Auto-sklearn implement portfolios.
A static portfolio is a fixed set of algorithms run on every dataset. The philosophy: some algorithms are so generally good that they should always be tried.
Designing Static Portfolios:
A good static portfolio has:
Common Static Portfolios:
Tabular Data Portfolio:
H2O AutoML's Portfolio:
```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.ensemble import ExtraTreesClassifier, AdaBoostClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score
from typing import Dict, List, Tuple, Any, Callable
import time


class StaticPortfolio:
    """
    Static algorithm portfolio for classification.

    Runs a fixed set of algorithms and provides:
    1. Individual predictions from each algorithm
    2. Ensemble prediction (voting or averaging)
    3. Best single algorithm selection
    """

    # Default portfolio with sensible defaults
    DEFAULT_PORTFOLIO = {
        'RandomForest': lambda: RandomForestClassifier(
            n_estimators=100, max_depth=None, random_state=42
        ),
        'GradientBoosting': lambda: GradientBoostingClassifier(
            n_estimators=100, learning_rate=0.1, max_depth=3, random_state=42
        ),
        'ExtraTrees': lambda: ExtraTreesClassifier(
            n_estimators=100, random_state=42
        ),
        'LogisticRegression': lambda: LogisticRegression(
            max_iter=200, random_state=42
        ),
        'KNN': lambda: KNeighborsClassifier(
            n_neighbors=5
        ),
        'MLP': lambda: MLPClassifier(
            hidden_layer_sizes=(100,), max_iter=200, random_state=42
        ),
        'SVM_RBF': lambda: SVC(
            kernel='rbf', probability=True, random_state=42
        ),
    }

    def __init__(self, portfolio: Dict[str, Callable] = None, n_jobs: int = 1):
        """
        Parameters:
            portfolio: Dict mapping names to estimator factories
            n_jobs: Number of parallel jobs for CV
        """
        self.portfolio = portfolio or self.DEFAULT_PORTFOLIO
        self.n_jobs = n_jobs
        self.fitted_models = {}
        self.cv_scores = {}
        self.training_times = {}

    def evaluate(self, X: np.ndarray, y: np.ndarray,
                 cv: int = 5, verbose: bool = True) -> Dict[str, float]:
        """
        Evaluate all algorithms in the portfolio using CV.

        Parameters:
            X: Feature matrix
            y: Target vector
            cv: Number of CV folds
            verbose: Print progress

        Returns:
            Dict mapping algorithm names to CV scores
        """
        results = {}
        for name, estimator_factory in self.portfolio.items():
            start_time = time.time()
            try:
                estimator = estimator_factory()
                scores = cross_val_score(
                    estimator, X, y, cv=cv,
                    scoring='accuracy', n_jobs=self.n_jobs
                )
                score = scores.mean()
                elapsed = time.time() - start_time
                results[name] = score
                self.cv_scores[name] = {'mean': score, 'std': scores.std()}
                self.training_times[name] = elapsed
                if verbose:
                    print(f"  {name}: {score:.4f} (±{scores.std():.4f}) "
                          f"in {elapsed:.1f}s")
            except Exception as e:
                if verbose:
                    print(f"  {name}: FAILED ({e})")
                results[name] = np.nan
        return results

    def fit(self, X: np.ndarray, y: np.ndarray,
            algorithms: List[str] = None, verbose: bool = True):
        """
        Fit models from the portfolio on the full training data.

        Parameters:
            X: Training features
            y: Training labels
            algorithms: Which algorithms to fit (default: all)
            verbose: Print progress
        """
        if algorithms is None:
            algorithms = list(self.portfolio.keys())
        for name in algorithms:
            if name not in self.portfolio:
                continue
            start_time = time.time()
            try:
                estimator = self.portfolio[name]()
                estimator.fit(X, y)
                self.fitted_models[name] = estimator
                if verbose:
                    print(f"  Fitted {name} in {time.time()-start_time:.1f}s")
            except Exception as e:
                if verbose:
                    print(f"  Failed to fit {name}: {e}")

    def predict_individual(self, X: np.ndarray) -> Dict[str, np.ndarray]:
        """Get predictions from each fitted model."""
        predictions = {}
        for name, model in self.fitted_models.items():
            predictions[name] = model.predict(X)
        return predictions

    def predict_proba_individual(self, X: np.ndarray) -> Dict[str, np.ndarray]:
        """Get probability predictions from each fitted model."""
        predictions = {}
        for name, model in self.fitted_models.items():
            if hasattr(model, 'predict_proba'):
                predictions[name] = model.predict_proba(X)
        return predictions

    def predict_voting(self, X: np.ndarray,
                       weights: Dict[str, float] = None) -> np.ndarray:
        """
        Ensemble prediction using (weighted) voting.

        Parameters:
            X: Features to predict
            weights: Optional weights per algorithm (default: equal)

        Returns:
            Predicted class labels
        """
        predictions = self.predict_individual(X)
        if not predictions:
            raise ValueError("No models fitted!")

        # Stack predictions: (n_models, n_samples)
        pred_matrix = np.array(list(predictions.values()))

        if weights is None:
            # Equal (majority) voting; keepdims=True keeps the
            # pre-SciPy-1.9 output shape explicit
            from scipy.stats import mode
            ensemble_pred, _ = mode(pred_matrix, axis=0, keepdims=True)
            return ensemble_pred.flatten()
        else:
            # Weighted voting
            weight_arr = np.array([weights.get(name, 1.0)
                                   for name in predictions.keys()])
            weight_arr = weight_arr / weight_arr.sum()

            # Get unique classes
            all_classes = np.unique(pred_matrix)

            # Weighted vote per class
            weighted_votes = np.zeros((len(X), len(all_classes)))
            for i, (name, pred) in enumerate(predictions.items()):
                for j, c in enumerate(all_classes):
                    weighted_votes[:, j] += weight_arr[i] * (pred == c)
            return all_classes[np.argmax(weighted_votes, axis=1)]

    def predict_averaging(self, X: np.ndarray,
                          weights: Dict[str, float] = None) -> np.ndarray:
        """Ensemble prediction using probability averaging."""
        probas = self.predict_proba_individual(X)
        if not probas:
            raise ValueError("No models with predict_proba!")
        if weights is None:
            weights = {name: 1.0 for name in probas}
        total_weight = sum(weights.get(name, 0) for name in probas)

        # Average probabilities
        avg_proba = None
        for name, proba in probas.items():
            w = weights.get(name, 1.0) / total_weight
            if avg_proba is None:
                avg_proba = w * proba
            else:
                avg_proba += w * proba
        return np.argmax(avg_proba, axis=1)

    def get_best_algorithm(self) -> Tuple[str, float]:
        """Return the best single algorithm based on CV scores."""
        if not self.cv_scores:
            raise ValueError("Run evaluate() first!")
        best_name = max(self.cv_scores,
                        key=lambda k: self.cv_scores[k]['mean'])
        return best_name, self.cv_scores[best_name]['mean']
```

| Principle | Description | Example |
|---|---|---|
| Coverage | Include algorithms for different data types | Linear model for sparse data, tree for interactions |
| Diversity | Different inductive biases | Decision tree vs neural net vs distance-based |
| Efficiency | Computationally tractable algorithms | Random forest > SVM for large datasets |
| Robustness | Good default hyperparameters | XGBoost defaults work well broadly |
While static portfolios are simple, they ignore dataset-specific information. Dynamic portfolios adapt to the dataset at hand, selecting which algorithms to include based on meta-learning predictions.
The Dynamic Portfolio Problem:
Given a computational budget B and a dataset D, select a subset P ⊆ A of algorithms such that:
Approaches:
1. Marginal Contribution Ranking
Rank algorithms by their expected marginal contribution to the portfolio:
2. Set Cover Formulation
Frame as a set cover problem:
3. Meta-Learning Guided Selection
Use meta-learning to predict:
```python
import numpy as np
from typing import List, Dict, Tuple, Callable


class DynamicPortfolioSelector:
    """
    Dynamic portfolio construction based on meta-learning predictions.

    Adapts the portfolio to include algorithms most likely to
    perform well on the given dataset.
    """

    def __init__(self, all_algorithms: Dict[str, Callable],
                 algorithm_costs: Dict[str, float] = None):
        """
        Parameters:
            all_algorithms: Full algorithm portfolio to select from
            algorithm_costs: Training time estimates per algorithm
        """
        self.all_algorithms = all_algorithms
        self.algorithm_costs = algorithm_costs or {
            name: 1.0 for name in all_algorithms
        }
        # Meta-learner for performance prediction
        self.meta_learner = None

    def fit_meta_learner(self, meta_features: np.ndarray,
                         performance_matrix: np.ndarray,
                         algorithm_names: List[str]):
        """
        Train the meta-learner on historical data.

        Parameters:
            meta_features: (n_datasets, n_meta_features) array
            performance_matrix: (n_datasets, n_algorithms) array
            algorithm_names: List of algorithm names
        """
        from sklearn.multioutput import MultiOutputRegressor
        from sklearn.ensemble import RandomForestRegressor

        self.algorithm_names = algorithm_names

        # Handle missing values in performance matrix
        perf_filled = np.nan_to_num(performance_matrix, nan=0.5)

        # Train multi-output regressor
        self.meta_learner = MultiOutputRegressor(
            RandomForestRegressor(n_estimators=50, random_state=42)
        )
        self.meta_learner.fit(meta_features, perf_filled)

    def select_portfolio(self, meta_features: np.ndarray,
                         budget: float,
                         min_algorithms: int = 3,
                         diversity_weight: float = 0.3) -> List[str]:
        """
        Select a portfolio for a new dataset.

        Parameters:
            meta_features: Meta-features of the new dataset
            budget: Computational budget (sum of algorithm costs)
            min_algorithms: Minimum algorithms to include
            diversity_weight: Weight for diversity vs predicted performance

        Returns:
            List of algorithm names to include in portfolio
        """
        if self.meta_learner is None:
            raise ValueError("Fit meta-learner first!")

        # Predict performances
        mf = meta_features.reshape(1, -1)
        predicted_perfs = self.meta_learner.predict(mf)[0]

        # Map to algorithm names
        algo_predictions = dict(zip(self.algorithm_names, predicted_perfs))

        # Greedy selection with budget constraint
        selected = []
        remaining_budget = budget

        # Filter to available algorithms
        available = [name for name in self.algorithm_names
                     if name in self.all_algorithms]

        while remaining_budget > 0 and len(available) > 0:
            # Score each candidate
            scores = {}
            for name in available:
                if self.algorithm_costs[name] > remaining_budget:
                    continue
                # Performance contribution
                perf_score = algo_predictions.get(name, 0.5)
                # Diversity contribution (how different from selected?)
                if selected:
                    diversity_score = self._diversity_score(name, selected)
                else:
                    diversity_score = 1.0
                # Combined score
                scores[name] = ((1 - diversity_weight) * perf_score +
                                diversity_weight * diversity_score)

            if not scores:
                break

            # Select best
            best_algo = max(scores, key=scores.get)
            selected.append(best_algo)
            remaining_budget -= self.algorithm_costs[best_algo]
            available.remove(best_algo)

            # Stop if we have enough and nothing else fits the budget
            # (guard against calling min() on an empty sequence)
            if not available:
                break
            if len(selected) >= min_algorithms and remaining_budget < min(
                self.algorithm_costs.get(a, float('inf')) for a in available
            ):
                break

        return selected

    def _diversity_score(self, candidate: str, selected: List[str]) -> float:
        """
        Compute how different the candidate is from already selected
        algorithms. Uses algorithm type grouping as a proxy for diversity.
        """
        # Group algorithms by type
        algo_types = {
            'RandomForest': 'tree_ensemble',
            'ExtraTrees': 'tree_ensemble',
            'GradientBoosting': 'boosting',
            'XGBoost': 'boosting',
            'LightGBM': 'boosting',
            'LogisticRegression': 'linear',
            'Ridge': 'linear',
            'SVM_RBF': 'kernel',
            'SVM_Linear': 'linear',
            'KNN': 'distance',
            'MLP': 'neural_network',
        }
        candidate_type = algo_types.get(candidate, 'other')
        selected_types = [algo_types.get(s, 'other') for s in selected]

        # High diversity if type not already in selected
        if candidate_type not in selected_types:
            return 1.0
        else:
            # Lower diversity if type already present
            count = selected_types.count(candidate_type)
            return 1.0 / (1 + count)

    def build_portfolio(self, meta_features: np.ndarray,
                        budget: float) -> 'StaticPortfolio':
        """
        Build a portfolio object for the selected algorithms.

        Returns:
            StaticPortfolio with selected algorithms
        """
        selected = self.select_portfolio(meta_features, budget)
        portfolio_algos = {
            name: self.all_algorithms[name] for name in selected
        }
        return StaticPortfolio(portfolio=portfolio_algos)


class PortfolioOptimizer:
    """
    Optimize portfolio composition using historical data.

    Finds the portfolio that maximizes expected performance
    across a distribution of datasets.
    """

    def __init__(self, max_portfolio_size: int = 5):
        self.max_size = max_portfolio_size

    def optimize_greedy(self, performance_matrix: np.ndarray,
                        algorithm_names: List[str]) -> List[str]:
        """
        Greedy portfolio optimization: iteratively add the algorithm
        that most improves portfolio performance.

        Parameters:
            performance_matrix: (n_datasets, n_algorithms) array
            algorithm_names: List of algorithm names

        Returns:
            Selected algorithm names
        """
        n_datasets, n_algos = performance_matrix.shape
        selected_indices = []
        available_indices = list(range(n_algos))

        for _ in range(min(self.max_size, n_algos)):
            best_idx = None
            best_improvement = float('-inf')

            for idx in available_indices:
                # Performance with this algorithm added
                trial_selection = selected_indices + [idx]

                # Portfolio performance = max over selected (per dataset)
                portfolio_perf = np.nanmax(
                    performance_matrix[:, trial_selection], axis=1
                )
                avg_perf = np.nanmean(portfolio_perf)

                # Current performance
                if selected_indices:
                    current_perf = np.nanmax(
                        performance_matrix[:, selected_indices], axis=1
                    )
                    current_avg = np.nanmean(current_perf)
                else:
                    current_avg = 0

                improvement = avg_perf - current_avg
                if improvement > best_improvement:
                    best_improvement = improvement
                    best_idx = idx

            if best_idx is not None and best_improvement > 0:
                selected_indices.append(best_idx)
                available_indices.remove(best_idx)
            else:
                break

        return [algorithm_names[i] for i in selected_indices]
```

Larger portfolios have higher computational cost but lower risk of missing the best algorithm. The optimal size depends on the computational budget and the diversity of datasets in your domain. Studies suggest 3-7 algorithms often suffice for tabular data.
Algorithm scheduling determines not just which algorithms to run, but in what order and for how long. With limited budget, we want to:
The Scheduling Problem:
Given:
Find a schedule S = [(a₁, t₁), (a₂, t₂), ...] such that:
Approaches:
1. Round-Robin Scheduling
Simple baseline: give equal time to each algorithm.
2. Priority-Based Scheduling
Run algorithms in order of expected performance (from meta-learning).
```python
import numpy as np
from typing import Any, List, Dict, Tuple, Callable
from dataclasses import dataclass
import time


@dataclass
class ScheduleEntry:
    """An entry in the algorithm schedule."""
    algorithm: str
    budget: float    # Time or iterations
    priority: float  # Higher = run earlier


class AlgorithmScheduler:
    """
    Algorithm scheduling for portfolio execution.

    Determines the order and budget allocation for running
    multiple algorithms within a time constraint.
    """

    def __init__(self, algorithms: Dict[str, Callable],
                 total_budget: float,
                 strategy: str = 'adaptive'):
        """
        Parameters:
            algorithms: Dict mapping names to estimator factories
            total_budget: Total time budget (seconds)
            strategy: 'round_robin', 'priority', or 'adaptive'
        """
        self.algorithms = algorithms
        self.total_budget = total_budget
        self.strategy = strategy

    def create_schedule(self, priorities: Dict[str, float] = None,
                        time_estimates: Dict[str, float] = None
                        ) -> List[ScheduleEntry]:
        """
        Create an execution schedule.

        Parameters:
            priorities: Higher values = run earlier (from meta-learning)
            time_estimates: Expected time per algorithm

        Returns:
            Ordered list of ScheduleEntry objects
        """
        algo_names = list(self.algorithms.keys())
        n_algos = len(algo_names)

        # Default priorities (equal)
        if priorities is None:
            priorities = {name: 1.0 for name in algo_names}
        # Default time estimates (equal)
        if time_estimates is None:
            time_estimates = {name: self.total_budget / n_algos
                              for name in algo_names}

        if self.strategy == 'round_robin':
            return self._schedule_round_robin(algo_names, time_estimates)
        elif self.strategy == 'priority':
            return self._schedule_priority(algo_names, priorities,
                                           time_estimates)
        elif self.strategy == 'adaptive':
            return self._schedule_adaptive(algo_names, priorities,
                                           time_estimates)
        else:
            raise ValueError(f"Unknown strategy: {self.strategy}")

    def _schedule_round_robin(self, algo_names: List[str],
                              time_estimates: Dict[str, float]
                              ) -> List[ScheduleEntry]:
        """Equal time allocation in arbitrary order."""
        budget_per_algo = self.total_budget / len(algo_names)
        return [
            ScheduleEntry(
                algorithm=name,
                budget=min(budget_per_algo, time_estimates[name]),
                priority=1.0
            )
            for name in algo_names
        ]

    def _schedule_priority(self, algo_names: List[str],
                           priorities: Dict[str, float],
                           time_estimates: Dict[str, float]
                           ) -> List[ScheduleEntry]:
        """Run in order of priority, allocate time proportionally."""
        # Sort by priority
        sorted_names = sorted(algo_names,
                              key=lambda n: priorities.get(n, 0),
                              reverse=True)
        # Allocate budget proportional to priority
        total_priority = sum(priorities.values())
        schedule = []
        for name in sorted_names:
            priority = priorities.get(name, 1.0)
            budget = (priority / total_priority) * self.total_budget
            # Respect time estimates
            budget = min(budget, time_estimates[name])
            schedule.append(ScheduleEntry(
                algorithm=name, budget=budget, priority=priority
            ))
        return schedule

    def _schedule_adaptive(self, algo_names: List[str],
                           priorities: Dict[str, float],
                           time_estimates: Dict[str, float]
                           ) -> List[ScheduleEntry]:
        """
        Adaptive scheduling: ensures all algorithms get a minimum budget,
        then allocates the remaining budget by priority.
        """
        n_algos = len(algo_names)
        # Minimum budget per algorithm (ensures we try everything)
        min_budget = self.total_budget * 0.1 / n_algos
        # Remaining budget for priority allocation
        remaining = self.total_budget * 0.9

        # Sort by priority
        sorted_names = sorted(algo_names,
                              key=lambda n: priorities.get(n, 0),
                              reverse=True)
        schedule = []
        total_priority = sum(priorities.values())
        for name in sorted_names:
            priority = priorities.get(name, 1.0)
            priority_budget = (priority / total_priority) * remaining
            total_budget = min_budget + priority_budget
            # Respect time estimates
            total_budget = min(total_budget, time_estimates[name])
            schedule.append(ScheduleEntry(
                algorithm=name, budget=total_budget, priority=priority
            ))
        return schedule

    def execute(self, schedule: List[ScheduleEntry],
                X_train, y_train, X_val, y_val,
                verbose: bool = True) -> Dict[str, Tuple[Any, float]]:
        """
        Execute the schedule.

        Parameters:
            schedule: Execution schedule
            X_train, y_train: Training data
            X_val, y_val: Validation data
            verbose: Print progress

        Returns:
            Dict mapping algorithm names to (fitted_model, val_score) tuples
        """
        results = {}
        remaining_budget = self.total_budget

        for entry in schedule:
            if remaining_budget <= 0:
                if verbose:
                    print(f"  Budget exhausted, skipping {entry.algorithm}")
                break
            if verbose:
                print(f"  Running {entry.algorithm} "
                      f"(budget: {min(entry.budget, remaining_budget):.1f}s)")
            start_time = time.time()
            try:
                # Get estimator
                estimator = self.algorithms[entry.algorithm]()
                # Fit with timeout (simplified - a real implementation
                # would enforce the budget via multiprocessing)
                estimator.fit(X_train, y_train)
                # Score on validation
                score = estimator.score(X_val, y_val)
                elapsed = time.time() - start_time
                remaining_budget -= elapsed
                results[entry.algorithm] = (estimator, score)
                if verbose:
                    print(f"    Score: {score:.4f} (took {elapsed:.1f}s)")
            except Exception as e:
                if verbose:
                    print(f"    FAILED: {e}")
                elapsed = time.time() - start_time
                remaining_budget -= elapsed

        return results
```

3. Bandit-Based Scheduling
Treat algorithm scheduling as a multi-armed bandit problem:
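One way to make the bandit view concrete is UCB1, where each algorithm is an arm and a "pull" is a short training run that returns a noisy validation score. The sketch below is illustrative only: `ucb1_schedule`, the `quality` table, and the noise model are assumptions for this example, not part of any library.

```python
import math
import random

def ucb1_schedule(arms, pull, n_rounds):
    """Allocate short runs across algorithms ("arms") using UCB1.

    pull(name) -> observed reward in [0, 1], e.g. a validation score
    from a brief training run. Returns pull counts per algorithm.
    """
    counts = {a: 0 for a in arms}
    sums = {a: 0.0 for a in arms}
    for a in arms:  # try every arm once first
        sums[a] += pull(a)
        counts[a] = 1
    for t in range(len(arms), n_rounds):
        # Mean reward plus an exploration bonus that shrinks with pulls
        ucb = {a: sums[a] / counts[a]
               + math.sqrt(2 * math.log(t + 1) / counts[a])
               for a in arms}
        best = max(ucb, key=ucb.get)
        sums[best] += pull(best)
        counts[best] += 1
    return counts

random.seed(0)
quality = {'gbm': 0.85, 'rf': 0.80, 'knn': 0.65}  # invented "true" scores
noisy_pull = lambda a: min(1.0, max(0.0, random.gauss(quality[a], 0.05)))
counts = ucb1_schedule(list(quality), noisy_pull, n_rounds=60)
print(counts)
```

As the horizon grows, the exploration bonus decays and pulls concentrate on the strongest algorithms; on short horizons the allocation stays close to uniform, which is exactly the exploration/exploitation trade-off the bandit framing makes explicit.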
4. Successive Halving for Portfolios
Apply Hyperband-style successive halving across algorithms:
This naturally allocates more resources to better algorithms.
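The mechanism can be sketched compactly (the learning-curve numbers below are invented for illustration): evaluate every algorithm at a small budget, keep the top half, and repeat with a larger budget until one survivor remains.

```python
def successive_halving(algo_names, score_fn, budgets=(1, 3, 9), keep_frac=0.5):
    """Run all algorithms at a small budget, keep the best fraction,
    then repeat with a larger budget.

    score_fn(name, budget) -> validation score (higher is better).
    """
    survivors = list(algo_names)
    for b in budgets:
        scores = {a: score_fn(a, b) for a in survivors}
        survivors.sort(key=lambda a: scores[a], reverse=True)
        k = max(1, int(len(survivors) * keep_frac))
        survivors = survivors[:k]
        if len(survivors) == 1:
            break
    return survivors[0]

# Hypothetical learning curves: score improves with budget at different rates.
curves = {'rf':  lambda b: 0.75 + 0.010 * b,
          'gbm': lambda b: 0.70 + 0.020 * b,
          'knn': lambda b: 0.65,
          'svm': lambda b: 0.60 + 0.015 * b}
winner = successive_halving(curves, lambda a, b: curves[a](b))
print(winner)
```

Note that `'rf'` survives here because it leads at small budgets, even though `'gbm'` would overtake it at the largest budget; eliminating slow starters too early is the standard caveat of successive halving.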
The F-Race algorithm (from irace) applies statistical tests to eliminate algorithms as soon as they are significantly worse than the best. This can dramatically reduce computation by stopping poor algorithms early with statistical confidence.
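The racing mechanism can be sketched as follows. This is a simplified race using a paired t-test between the current leader and each rival, not irace's actual implementation (which uses the Friedman test with post-hoc comparisons); the `race` helper and the score data are assumptions for illustration.

```python
import numpy as np
from scipy.stats import ttest_rel

def race(fold_scores, alpha=0.05, min_folds=4):
    """Simplified F-Race-style elimination.

    fold_scores: dict name -> array of per-fold scores (same fold order).
    After each fold beyond min_folds, eliminate algorithms whose scores
    are significantly below the current leader's (paired t-test).
    """
    n_folds = len(next(iter(fold_scores.values())))
    alive = set(fold_scores)
    for f in range(min_folds, n_folds + 1):
        means = {a: np.mean(fold_scores[a][:f]) for a in alive}
        leader = max(means, key=means.get)
        for a in list(alive):
            if a == leader:
                continue
            t, p = ttest_rel(fold_scores[leader][:f], fold_scores[a][:f])
            if p < alpha and t > 0:  # leader significantly better
                alive.discard(a)
        if len(alive) == 1:
            break
    return alive

rng = np.random.default_rng(1)
scores = {
    'gbm': 0.85 + rng.normal(0, 0.01, 10),
    'rf':  0.84 + rng.normal(0, 0.01, 10),
    'knn': 0.70 + rng.normal(0, 0.01, 10),  # clearly worse
}
survivors = race(scores)
print(survivors)
```

The clearly inferior algorithm is dropped after only a few folds, while near-ties keep racing, so the statistical test converts compute savings into an explicit confidence guarantee.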
After running a portfolio, we have multiple trained models. Ensemble selection chooses which models to include in the final ensemble and how to weight them.
Why Ensemble?
Auto-sklearn's Ensemble Selection (Caruana et al.):
A greedy forward selection algorithm:
This is computationally efficient (O(n × T × S) for T iterations, S models, and n validation samples) and works well in practice.
```python
import numpy as np
from sklearn.metrics import accuracy_score, log_loss
from typing import List, Dict, Tuple, Any


class EnsembleSelection:
    """
    Ensemble Selection algorithm from Caruana et al.

    Greedily builds an ensemble by iteratively adding the model
    that most improves ensemble performance on validation data.
    """

    def __init__(self, ensemble_size: int = 50, metric: str = 'accuracy'):
        """
        Parameters:
            ensemble_size: Maximum models in ensemble (with replacement)
            metric: 'accuracy' or 'log_loss'
        """
        self.ensemble_size = ensemble_size
        self.metric = metric

    def fit(self, predictions: Dict[str, np.ndarray],
            y_true: np.ndarray,
            model_objects: Dict[str, Any] = None) -> 'EnsembleSelection':
        """
        Build the ensemble.

        Parameters:
            predictions: Dict mapping model names to validation predictions.
                Shape: (n_samples,) for class predictions or
                (n_samples, n_classes) for probabilities
            y_true: True validation labels
            model_objects: Actual fitted model objects for prediction

        Returns:
            self
        """
        self.model_names = list(predictions.keys())
        self.model_objects = model_objects or {}

        # Store predictions
        self.predictions = predictions
        self.y_true = y_true

        # Convert to probability format if needed
        self.probas = {}
        n_classes = len(np.unique(y_true))
        for name, pred in predictions.items():
            if pred.ndim == 1:
                # Convert class predictions to one-hot probabilities
                proba = np.zeros((len(pred), n_classes))
                for i, p in enumerate(pred):
                    proba[i, int(p)] = 1.0
                self.probas[name] = proba
            else:
                self.probas[name] = pred

        # Greedy selection
        self.ensemble_members = []  # Model names (with possible repeats)
        self.weights = {}           # Computed after selection

        best_score = (float('-inf') if self.metric == 'accuracy'
                      else float('inf'))

        for iteration in range(self.ensemble_size):
            # Reset per iteration so the best candidate is found even if
            # it does not beat the current ensemble (needed for the
            # "always keep the first few members" rule below)
            best_candidate = None
            best_candidate_score = (float('-inf') if self.metric == 'accuracy'
                                    else float('inf'))

            for name in self.model_names:
                # Tentatively add this model
                trial_members = self.ensemble_members + [name]
                # Compute ensemble prediction
                ensemble_proba = self._compute_ensemble_proba(trial_members)
                score = self._evaluate(ensemble_proba, y_true)

                # Check if this is the best candidate so far
                if self.metric == 'accuracy':
                    is_better = score > best_candidate_score
                else:  # log_loss (lower is better)
                    is_better = score < best_candidate_score
                if is_better:
                    best_candidate = name
                    best_candidate_score = score

            # Add the best candidate if it improves the ensemble
            if best_candidate is not None:
                if self.metric == 'accuracy':
                    is_improvement = best_candidate_score > best_score
                else:
                    is_improvement = best_candidate_score < best_score
                if is_improvement or len(self.ensemble_members) < 3:
                    self.ensemble_members.append(best_candidate)
                    best_score = best_candidate_score
                else:
                    break  # No improvement, stop
            else:
                break

        # Compute final weights
        self._compute_weights()
        return self

    def _compute_ensemble_proba(self, members: List[str]) -> np.ndarray:
        """Average probabilities over ensemble members."""
        if not members:
            # Return uniform if empty
            n_samples = len(self.y_true)
            n_classes = list(self.probas.values())[0].shape[1]
            return np.ones((n_samples, n_classes)) / n_classes
        proba_sum = np.zeros_like(list(self.probas.values())[0])
        for member in members:
            proba_sum += self.probas[member]
        return proba_sum / len(members)

    def _evaluate(self, proba: np.ndarray, y_true: np.ndarray) -> float:
        """Evaluate predictions."""
        if self.metric == 'accuracy':
            predictions = np.argmax(proba, axis=1)
            return accuracy_score(y_true, predictions)
        else:
            return log_loss(y_true, proba)

    def _compute_weights(self):
        """Compute effective weights for each unique model."""
        self.weights = {}
        total = len(self.ensemble_members)
        for name in set(self.ensemble_members):
            count = self.ensemble_members.count(name)
            self.weights[name] = count / total

    def predict(self, X: np.ndarray) -> np.ndarray:
        """
        Make predictions with the ensemble.

        Parameters:
            X: Features to predict

        Returns:
            Predicted class labels
        """
        proba = self.predict_proba(X)
        return np.argmax(proba, axis=1)

    def predict_proba(self, X: np.ndarray) -> np.ndarray:
        """Get probability predictions from the ensemble."""
        if not self.model_objects:
            raise ValueError("No model objects available for prediction!")

        # Weighted average of predictions
        weighted_proba = None
        for name, weight in self.weights.items():
            if name not in self.model_objects:
                continue
            model = self.model_objects[name]
            if hasattr(model, 'predict_proba'):
                proba = model.predict_proba(X)
            else:
                # Convert to one-hot
                pred = model.predict(X)
                n_classes = len(np.unique(self.y_true))
                proba = np.zeros((len(pred), n_classes))
                for i, p in enumerate(pred):
                    proba[i, int(p)] = 1.0
            if weighted_proba is None:
                weighted_proba = weight * proba
            else:
                weighted_proba += weight * proba
        return weighted_proba

    def summary(self) -> str:
        """Return a summary of the ensemble."""
        lines = ["Ensemble Summary:"]
        lines.append(f"  Total members: {len(self.ensemble_members)}")
        lines.append(f"  Unique models: {len(self.weights)}")
        lines.append("  Weights:")
        for name, weight in sorted(self.weights.items(), key=lambda x: -x[1]):
            lines.append(f"    {name}: {weight:.3f}")
        return "\n".join(lines)


def build_stacked_ensemble(base_models: Dict[str, Any],
                           X_train: np.ndarray, y_train: np.ndarray,
                           X_val: np.ndarray, y_val: np.ndarray,
                           meta_model_factory=None):
    """
    Build a stacked ensemble (two-level).

    Level 0: Base models make predictions
    Level 1: Meta-model learns to combine base predictions
    """
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_predict
    from sklearn.base import clone

    if meta_model_factory is None:
        meta_model_factory = lambda: LogisticRegression(max_iter=200)

    # Get out-of-fold predictions for stacking
    base_predictions = {}
    fitted_models = {}
    for name, model in base_models.items():
        model_clone = clone(model)
        try:
            # Out-of-fold probability predictions
            oof_pred = cross_val_predict(
                model_clone, X_train, y_train, cv=5, method='predict_proba'
            )
            base_predictions[name] = oof_pred
            # Fit on the full training data
            model_clone = clone(model)
            model_clone.fit(X_train, y_train)
            fitted_models[name] = model_clone
        except Exception as e:
            print(f"Skipping {name}: {e}")

    # Create meta-features
    meta_X_train = np.hstack(list(base_predictions.values()))

    # Fit meta-model
    meta_model = meta_model_factory()
    meta_model.fit(meta_X_train, y_train)

    return fitted_models, meta_model
```

Let's examine how production AutoML systems implement portfolio methods:
Auto-sklearn's Approach:
Key insight: Auto-sklearn never just returns one model—it always ensembles.
H2O AutoML's Approach:
Key insight: H2O emphasizes robustness—run many models, stack them all.
Google Cloud AutoML (Tables):
Key insight: Combine modern (neural) with robust (boosting) approaches.
| System | Portfolio Approach | Ensemble Method | Key Strength |
|---|---|---|---|
| Auto-sklearn | Meta-learning + SMAC | Caruana selection | Sample efficiency |
| H2O AutoML | Fixed progression | Stacking | Robustness |
| TPOT | Genetic evolution | Pipeline optimization | Flexibility |
| AutoGluon | Multi-layer stacking | Repeated stacking | Performance |
| Google AutoML | NAS + boosting | Weighted ensemble | Scalability |
AutoGluon takes stacking to the extreme: multiple layers of stacking with repeated k-fold cross-validation. Each layer's predictions become features for the next layer. This achieves state-of-the-art performance on tabular data benchmarks but requires significant computation.
We've explored portfolio methods as an alternative to single-algorithm selection. Let's consolidate the key takeaways:
Module Complete:
This concludes Module 3: Automated Model Selection. You've learned:
These concepts form the foundation of modern AutoML systems, enabling them to efficiently find high-performing machine learning solutions with minimal human intervention.
You now have a comprehensive understanding of automated model selection—from theoretical foundations (Rice's framework, CASH) through practical implementations (meta-learning, warm-starting, portfolios). This knowledge equips you to use AutoML systems effectively, understand their decisions, and extend them for specialized applications.