Algorithm selection asks: Which single algorithm should I use? But there's an alternative philosophy that often proves more robust: Why choose just one?
Portfolio methods run multiple algorithms—a carefully chosen set—and combine their predictions. Instead of betting everything on one algorithm, we diversify. The result is often more robust and competitive with the best single choice.
The Investment Analogy:
In finance, portfolios diversify risk. A diversified stock portfolio rarely beats the single best stock, but it consistently outperforms the average stock and protects against catastrophic losses. The same principle applies to machine learning:
Why Portfolios Work in AutoML:
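The effect is easy to demonstrate with a small simulation. The sketch below uses entirely made-up accuracy numbers, not real benchmark results: each of five hypothetical algorithms wins on a different slice of datasets, so running all of them and keeping the per-dataset best beats committing to any single algorithm in advance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical accuracies: 200 datasets x 5 algorithms. Each dataset
# has a randomly chosen "winner" algorithm that gets a bonus.
perf = rng.uniform(0.6, 0.9, size=(200, 5))
perf[np.arange(200), rng.integers(0, 5, size=200)] += 0.08

single_avg = perf.mean(axis=0)       # average score of each single algorithm
portfolio = perf.max(axis=1).mean()  # run all five, keep the best per dataset

print(f"best single algorithm on average: {single_avg.max():.3f}")
print(f"portfolio (per-dataset best):     {portfolio:.3f}")
```

The per-dataset maximum is an oracle upper bound for the portfolio, but it illustrates the point: no fixed single algorithm can match a set whose members cover each other's weaknesses, and realistic combination schemes (voting, stacking) recover part of this gap.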
This page explores portfolio construction, algorithm scheduling, and ensemble selection—completing our toolkit for automated model selection.
By the end of this page, you will understand the portfolio approach to AutoML, static vs dynamic portfolio construction, algorithm schedule optimization, ensemble selection for combining trained models, and how systems like H2O AutoML and Auto-sklearn implement portfolios.
A static portfolio is a fixed set of algorithms run on every dataset. The philosophy: some algorithms are so generally good that they should always be tried.
Designing Static Portfolios:
A good static portfolio has:
Common Static Portfolios:
Tabular Data Portfolio:
H2O AutoML's Portfolio:
```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.ensemble import ExtraTreesClassifier, AdaBoostClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score
from typing import Dict, List, Tuple, Any, Callable
import time


class StaticPortfolio:
    """
    Static algorithm portfolio for classification.

    Runs a fixed set of algorithms and provides:
    1. Individual predictions from each algorithm
    2. Ensemble prediction (voting or averaging)
    3. Best single algorithm selection
    """

    # Default portfolio with sensible defaults
    DEFAULT_PORTFOLIO = {
        'RandomForest': lambda: RandomForestClassifier(
            n_estimators=100, max_depth=None, random_state=42
        ),
        'GradientBoosting': lambda: GradientBoostingClassifier(
            n_estimators=100, learning_rate=0.1, max_depth=3, random_state=42
        ),
        'ExtraTrees': lambda: ExtraTreesClassifier(
            n_estimators=100, random_state=42
        ),
        'LogisticRegression': lambda: LogisticRegression(
            max_iter=200, random_state=42
        ),
        'KNN': lambda: KNeighborsClassifier(
            n_neighbors=5
        ),
        'MLP': lambda: MLPClassifier(
            hidden_layer_sizes=(100,), max_iter=200, random_state=42
        ),
        'SVM_RBF': lambda: SVC(
            kernel='rbf', probability=True, random_state=42
        ),
    }

    def __init__(self, portfolio: Dict[str, Callable] = None, n_jobs: int = 1):
        """
        Parameters:
            portfolio: Dict mapping names to estimator factories
            n_jobs: Number of parallel jobs for CV
        """
        self.portfolio = portfolio or self.DEFAULT_PORTFOLIO
        self.n_jobs = n_jobs
        self.fitted_models = {}
        self.cv_scores = {}
        self.training_times = {}

    def evaluate(self, X: np.ndarray, y: np.ndarray,
                 cv: int = 5, verbose: bool = True) -> Dict[str, float]:
        """
        Evaluate all algorithms in the portfolio using CV.

        Parameters:
            X: Feature matrix
            y: Target vector
            cv: Number of CV folds
            verbose: Print progress

        Returns:
            Dict mapping algorithm names to CV scores
        """
        results = {}
        for name, estimator_factory in self.portfolio.items():
            start_time = time.time()
            try:
                estimator = estimator_factory()
                scores = cross_val_score(
                    estimator, X, y, cv=cv,
                    scoring='accuracy', n_jobs=self.n_jobs
                )
                score = scores.mean()
                elapsed = time.time() - start_time
                results[name] = score
                self.cv_scores[name] = {'mean': score, 'std': scores.std()}
                self.training_times[name] = elapsed
                if verbose:
                    print(f"  {name}: {score:.4f} (±{scores.std():.4f}) "
                          f"in {elapsed:.1f}s")
            except Exception as e:
                if verbose:
                    print(f"  {name}: FAILED ({e})")
                results[name] = np.nan
        return results

    def fit(self, X: np.ndarray, y: np.ndarray,
            algorithms: List[str] = None, verbose: bool = True):
        """
        Fit models from the portfolio on the full training data.

        Parameters:
            X: Training features
            y: Training labels
            algorithms: Which algorithms to fit (default: all)
            verbose: Print progress
        """
        if algorithms is None:
            algorithms = list(self.portfolio.keys())
        for name in algorithms:
            if name not in self.portfolio:
                continue
            start_time = time.time()
            try:
                estimator = self.portfolio[name]()
                estimator.fit(X, y)
                self.fitted_models[name] = estimator
                if verbose:
                    print(f"  Fitted {name} in {time.time()-start_time:.1f}s")
            except Exception as e:
                if verbose:
                    print(f"  Failed to fit {name}: {e}")

    def predict_individual(self, X: np.ndarray) -> Dict[str, np.ndarray]:
        """Get predictions from each fitted model."""
        predictions = {}
        for name, model in self.fitted_models.items():
            predictions[name] = model.predict(X)
        return predictions

    def predict_proba_individual(self, X: np.ndarray) -> Dict[str, np.ndarray]:
        """Get probability predictions from each fitted model."""
        predictions = {}
        for name, model in self.fitted_models.items():
            if hasattr(model, 'predict_proba'):
                predictions[name] = model.predict_proba(X)
        return predictions

    def predict_voting(self, X: np.ndarray,
                       weights: Dict[str, float] = None) -> np.ndarray:
        """
        Ensemble prediction using (weighted) voting.

        Parameters:
            X: Features to predict
            weights: Optional weights per algorithm (default: equal)

        Returns:
            Predicted class labels
        """
        predictions = self.predict_individual(X)
        if not predictions:
            raise ValueError("No models fitted!")

        # Stack predictions: (n_models, n_samples)
        pred_matrix = np.array(list(predictions.values()))

        if weights is None:
            # Equal (majority) voting; keepdims=True keeps the
            # pre-SciPy-1.9 output shape explicit
            from scipy.stats import mode
            ensemble_pred, _ = mode(pred_matrix, axis=0, keepdims=True)
            return ensemble_pred.flatten()
        else:
            # Weighted voting
            weight_arr = np.array([weights.get(name, 1.0)
                                   for name in predictions.keys()])
            weight_arr = weight_arr / weight_arr.sum()

            # Get unique classes
            all_classes = np.unique(pred_matrix)

            # Weighted vote per class
            weighted_votes = np.zeros((len(X), len(all_classes)))
            for i, (name, pred) in enumerate(predictions.items()):
                for j, c in enumerate(all_classes):
                    weighted_votes[:, j] += weight_arr[i] * (pred == c)
            return all_classes[np.argmax(weighted_votes, axis=1)]

    def predict_averaging(self, X: np.ndarray,
                          weights: Dict[str, float] = None) -> np.ndarray:
        """Ensemble prediction using probability averaging."""
        probas = self.predict_proba_individual(X)
        if not probas:
            raise ValueError("No models with predict_proba!")
        if weights is None:
            weights = {name: 1.0 for name in probas}
        total_weight = sum(weights.get(name, 0) for name in probas)

        # Average probabilities
        avg_proba = None
        for name, proba in probas.items():
            w = weights.get(name, 1.0) / total_weight
            if avg_proba is None:
                avg_proba = w * proba
            else:
                avg_proba += w * proba
        return np.argmax(avg_proba, axis=1)

    def get_best_algorithm(self) -> Tuple[str, float]:
        """Return the best single algorithm based on CV scores."""
        if not self.cv_scores:
            raise ValueError("Run evaluate() first!")
        best_name = max(self.cv_scores,
                        key=lambda k: self.cv_scores[k]['mean'])
        return best_name, self.cv_scores[best_name]['mean']
```

| Principle | Description | Example |
|---|---|---|
| Coverage | Include algorithms for different data types | Linear model for sparse data, tree for interactions |
| Diversity | Different inductive biases | Decision tree vs neural net vs distance-based |
| Efficiency | Computationally tractable algorithms | Random forest > SVM for large datasets |
| Robustness | Good default hyperparameters | XGBoost defaults work well broadly |
While static portfolios are simple, they ignore dataset-specific information. Dynamic portfolios adapt to the dataset at hand, selecting which algorithms to include based on meta-learning predictions.
The Dynamic Portfolio Problem:
Given a computational budget B and a dataset D, select a subset P ⊆ A of algorithms such that:
Approaches:
1. Marginal Contribution Ranking
Rank algorithms by their expected marginal contribution to the portfolio:
2. Set Cover Formulation
Frame as a set cover problem:
3. Meta-Learning Guided Selection
Use meta-learning to predict:
```python
import numpy as np
from typing import List, Dict, Tuple, Callable


class DynamicPortfolioSelector:
    """
    Dynamic portfolio construction based on meta-learning predictions.

    Adapts the portfolio to include algorithms most likely to
    perform well on the given dataset.
    """

    def __init__(self, all_algorithms: Dict[str, Callable],
                 algorithm_costs: Dict[str, float] = None):
        """
        Parameters:
            all_algorithms: Full algorithm portfolio to select from
            algorithm_costs: Training time estimates per algorithm
        """
        self.all_algorithms = all_algorithms
        self.algorithm_costs = algorithm_costs or {
            name: 1.0 for name in all_algorithms
        }
        # Meta-learner for performance prediction
        self.meta_learner = None

    def fit_meta_learner(self, meta_features: np.ndarray,
                         performance_matrix: np.ndarray,
                         algorithm_names: List[str]):
        """
        Train the meta-learner on historical data.

        Parameters:
            meta_features: (n_datasets, n_meta_features) array
            performance_matrix: (n_datasets, n_algorithms) array
            algorithm_names: List of algorithm names
        """
        from sklearn.multioutput import MultiOutputRegressor
        from sklearn.ensemble import RandomForestRegressor

        self.algorithm_names = algorithm_names

        # Handle missing values in performance matrix
        perf_filled = np.nan_to_num(performance_matrix, nan=0.5)

        # Train multi-output regressor
        self.meta_learner = MultiOutputRegressor(
            RandomForestRegressor(n_estimators=50, random_state=42)
        )
        self.meta_learner.fit(meta_features, perf_filled)

    def select_portfolio(self, meta_features: np.ndarray,
                         budget: float,
                         min_algorithms: int = 3,
                         diversity_weight: float = 0.3) -> List[str]:
        """
        Select a portfolio for a new dataset.

        Parameters:
            meta_features: Meta-features of the new dataset
            budget: Computational budget (sum of algorithm costs)
            min_algorithms: Minimum algorithms to include
            diversity_weight: Weight for diversity vs predicted performance

        Returns:
            List of algorithm names to include in portfolio
        """
        if self.meta_learner is None:
            raise ValueError("Fit meta-learner first!")

        # Predict performances
        mf = meta_features.reshape(1, -1)
        predicted_perfs = self.meta_learner.predict(mf)[0]

        # Map to algorithm names
        algo_predictions = dict(zip(self.algorithm_names, predicted_perfs))

        # Greedy selection with budget constraint
        selected = []
        remaining_budget = budget

        # Filter to available algorithms
        available = [name for name in self.algorithm_names
                     if name in self.all_algorithms]

        while remaining_budget > 0 and len(available) > 0:
            # Score each candidate
            scores = {}
            for name in available:
                if self.algorithm_costs[name] > remaining_budget:
                    continue
                # Performance contribution
                perf_score = algo_predictions.get(name, 0.5)
                # Diversity contribution (how different from selected?)
                if selected:
                    diversity_score = self._diversity_score(name, selected)
                else:
                    diversity_score = 1.0
                # Combined score
                scores[name] = ((1 - diversity_weight) * perf_score +
                                diversity_weight * diversity_score)

            if not scores:
                break

            # Select best
            best_algo = max(scores, key=scores.get)
            selected.append(best_algo)
            remaining_budget -= self.algorithm_costs[best_algo]
            available.remove(best_algo)

            # Stop if we have enough and nothing else fits the budget
            # (guard against calling min() on an empty sequence)
            if not available:
                break
            if len(selected) >= min_algorithms and remaining_budget < min(
                self.algorithm_costs.get(a, float('inf')) for a in available
            ):
                break

        return selected

    def _diversity_score(self, candidate: str, selected: List[str]) -> float:
        """
        Compute how different the candidate is from already selected
        algorithms. Uses algorithm type grouping as a proxy for diversity.
        """
        # Group algorithms by type
        algo_types = {
            'RandomForest': 'tree_ensemble',
            'ExtraTrees': 'tree_ensemble',
            'GradientBoosting': 'boosting',
            'XGBoost': 'boosting',
            'LightGBM': 'boosting',
            'LogisticRegression': 'linear',
            'Ridge': 'linear',
            'SVM_RBF': 'kernel',
            'SVM_Linear': 'linear',
            'KNN': 'distance',
            'MLP': 'neural_network',
        }
        candidate_type = algo_types.get(candidate, 'other')
        selected_types = [algo_types.get(s, 'other') for s in selected]

        # High diversity if type not already in selected
        if candidate_type not in selected_types:
            return 1.0
        else:
            # Lower diversity if type already present
            count = selected_types.count(candidate_type)
            return 1.0 / (1 + count)

    def build_portfolio(self, meta_features: np.ndarray,
                        budget: float) -> 'StaticPortfolio':
        """
        Build a portfolio object for the selected algorithms.

        Returns:
            StaticPortfolio with selected algorithms
        """
        selected = self.select_portfolio(meta_features, budget)
        portfolio_algos = {
            name: self.all_algorithms[name] for name in selected
        }
        return StaticPortfolio(portfolio=portfolio_algos)


class PortfolioOptimizer:
    """
    Optimize portfolio composition using historical data.

    Finds the portfolio that maximizes expected performance
    across a distribution of datasets.
    """

    def __init__(self, max_portfolio_size: int = 5):
        self.max_size = max_portfolio_size

    def optimize_greedy(self, performance_matrix: np.ndarray,
                        algorithm_names: List[str]) -> List[str]:
        """
        Greedy portfolio optimization: iteratively add the algorithm
        that most improves portfolio performance.

        Parameters:
            performance_matrix: (n_datasets, n_algorithms) array
            algorithm_names: List of algorithm names

        Returns:
            Selected algorithm names
        """
        n_datasets, n_algos = performance_matrix.shape
        selected_indices = []
        available_indices = list(range(n_algos))

        for _ in range(min(self.max_size, n_algos)):
            best_idx = None
            best_improvement = float('-inf')

            for idx in available_indices:
                # Performance with this algorithm added
                trial_selection = selected_indices + [idx]

                # Portfolio performance = max over selected (per dataset)
                portfolio_perf = np.nanmax(
                    performance_matrix[:, trial_selection], axis=1
                )
                avg_perf = np.nanmean(portfolio_perf)

                # Current performance
                if selected_indices:
                    current_perf = np.nanmax(
                        performance_matrix[:, selected_indices], axis=1
                    )
                    current_avg = np.nanmean(current_perf)
                else:
                    current_avg = 0

                improvement = avg_perf - current_avg
                if improvement > best_improvement:
                    best_improvement = improvement
                    best_idx = idx

            if best_idx is not None and best_improvement > 0:
                selected_indices.append(best_idx)
                available_indices.remove(best_idx)
            else:
                break

        return [algorithm_names[i] for i in selected_indices]
```

Larger portfolios have higher computational cost but lower risk of missing the best algorithm. The optimal size depends on the computational budget and the diversity of datasets in your domain. Studies suggest 3-7 algorithms often suffice for tabular data.
Algorithm scheduling determines not just which algorithms to run, but in what order and for how long. With limited budget, we want to:
The Scheduling Problem:
Given:
Find a schedule S = [(a₁, t₁), (a₂, t₂), ...] such that:
Approaches:
1. Round-Robin Scheduling
Simple baseline: give equal time to each algorithm.
2. Priority-Based Scheduling
Run algorithms in order of expected performance (from meta-learning).
```python
import numpy as np
from typing import Any, List, Dict, Tuple, Callable
from dataclasses import dataclass
import time


@dataclass
class ScheduleEntry:
    """An entry in the algorithm schedule."""
    algorithm: str
    budget: float    # Time or iterations
    priority: float  # Higher = run earlier


class AlgorithmScheduler:
    """
    Algorithm scheduling for portfolio execution.

    Determines the order and budget allocation for running
    multiple algorithms within a time constraint.
    """

    def __init__(self, algorithms: Dict[str, Callable],
                 total_budget: float,
                 strategy: str = 'adaptive'):
        """
        Parameters:
            algorithms: Dict mapping names to estimator factories
            total_budget: Total time budget (seconds)
            strategy: 'round_robin', 'priority', or 'adaptive'
        """
        self.algorithms = algorithms
        self.total_budget = total_budget
        self.strategy = strategy

    def create_schedule(self, priorities: Dict[str, float] = None,
                        time_estimates: Dict[str, float] = None
                        ) -> List[ScheduleEntry]:
        """
        Create an execution schedule.

        Parameters:
            priorities: Higher values = run earlier (from meta-learning)
            time_estimates: Expected time per algorithm

        Returns:
            Ordered list of ScheduleEntry objects
        """
        algo_names = list(self.algorithms.keys())
        n_algos = len(algo_names)

        # Default priorities (equal)
        if priorities is None:
            priorities = {name: 1.0 for name in algo_names}
        # Default time estimates (equal)
        if time_estimates is None:
            time_estimates = {name: self.total_budget / n_algos
                              for name in algo_names}

        if self.strategy == 'round_robin':
            return self._schedule_round_robin(algo_names, time_estimates)
        elif self.strategy == 'priority':
            return self._schedule_priority(algo_names, priorities,
                                           time_estimates)
        elif self.strategy == 'adaptive':
            return self._schedule_adaptive(algo_names, priorities,
                                           time_estimates)
        else:
            raise ValueError(f"Unknown strategy: {self.strategy}")

    def _schedule_round_robin(self, algo_names: List[str],
                              time_estimates: Dict[str, float]
                              ) -> List[ScheduleEntry]:
        """Equal time allocation in arbitrary order."""
        budget_per_algo = self.total_budget / len(algo_names)
        return [
            ScheduleEntry(
                algorithm=name,
                budget=min(budget_per_algo, time_estimates[name]),
                priority=1.0
            )
            for name in algo_names
        ]

    def _schedule_priority(self, algo_names: List[str],
                           priorities: Dict[str, float],
                           time_estimates: Dict[str, float]
                           ) -> List[ScheduleEntry]:
        """Run in order of priority, allocate time proportionally."""
        # Sort by priority
        sorted_names = sorted(algo_names,
                              key=lambda n: priorities.get(n, 0),
                              reverse=True)
        # Allocate budget proportional to priority
        total_priority = sum(priorities.values())
        schedule = []
        for name in sorted_names:
            priority = priorities.get(name, 1.0)
            budget = (priority / total_priority) * self.total_budget
            # Respect time estimates
            budget = min(budget, time_estimates[name])
            schedule.append(ScheduleEntry(
                algorithm=name, budget=budget, priority=priority
            ))
        return schedule

    def _schedule_adaptive(self, algo_names: List[str],
                           priorities: Dict[str, float],
                           time_estimates: Dict[str, float]
                           ) -> List[ScheduleEntry]:
        """
        Adaptive scheduling: ensures all algorithms get a minimum budget,
        then allocates the remaining budget by priority.
        """
        n_algos = len(algo_names)
        # Minimum budget per algorithm (ensures we try everything)
        min_budget = self.total_budget * 0.1 / n_algos
        # Remaining budget for priority allocation
        remaining = self.total_budget * 0.9

        # Sort by priority
        sorted_names = sorted(algo_names,
                              key=lambda n: priorities.get(n, 0),
                              reverse=True)
        schedule = []
        total_priority = sum(priorities.values())
        for name in sorted_names:
            priority = priorities.get(name, 1.0)
            priority_budget = (priority / total_priority) * remaining
            total_budget = min_budget + priority_budget
            # Respect time estimates
            total_budget = min(total_budget, time_estimates[name])
            schedule.append(ScheduleEntry(
                algorithm=name, budget=total_budget, priority=priority
            ))
        return schedule

    def execute(self, schedule: List[ScheduleEntry],
                X_train, y_train, X_val, y_val,
                verbose: bool = True) -> Dict[str, Tuple[Any, float]]:
        """
        Execute the schedule.

        Parameters:
            schedule: Execution schedule
            X_train, y_train: Training data
            X_val, y_val: Validation data
            verbose: Print progress

        Returns:
            Dict mapping algorithm names to (fitted_model, val_score) tuples
        """
        results = {}
        remaining_budget = self.total_budget

        for entry in schedule:
            if remaining_budget <= 0:
                if verbose:
                    print(f"  Budget exhausted, skipping {entry.algorithm}")
                break
            if verbose:
                print(f"  Running {entry.algorithm} "
                      f"(budget: {min(entry.budget, remaining_budget):.1f}s)")
            start_time = time.time()
            try:
                # Get estimator
                estimator = self.algorithms[entry.algorithm]()
                # Fit with timeout (simplified - a real implementation
                # would enforce the budget via multiprocessing)
                estimator.fit(X_train, y_train)
                # Score on validation
                score = estimator.score(X_val, y_val)
                elapsed = time.time() - start_time
                remaining_budget -= elapsed
                results[entry.algorithm] = (estimator, score)
                if verbose:
                    print(f"    Score: {score:.4f} (took {elapsed:.1f}s)")
            except Exception as e:
                if verbose:
                    print(f"    FAILED: {e}")
                elapsed = time.time() - start_time
                remaining_budget -= elapsed

        return results
```

3. Bandit-Based Scheduling
Treat algorithm scheduling as a multi-armed bandit problem:
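One way to make the bandit view concrete is UCB1, where each algorithm is an arm and a "pull" is a short training run that returns a noisy validation score. The sketch below is illustrative only: `ucb1_schedule`, the `quality` table, and the noise model are assumptions for this example, not part of any library.

```python
import math
import random

def ucb1_schedule(arms, pull, n_rounds):
    """Allocate short runs across algorithms ("arms") using UCB1.

    pull(name) -> observed reward in [0, 1], e.g. a validation score
    from a brief training run. Returns pull counts per algorithm.
    """
    counts = {a: 0 for a in arms}
    sums = {a: 0.0 for a in arms}
    for a in arms:  # try every arm once first
        sums[a] += pull(a)
        counts[a] = 1
    for t in range(len(arms), n_rounds):
        # Mean reward plus an exploration bonus that shrinks with pulls
        ucb = {a: sums[a] / counts[a]
               + math.sqrt(2 * math.log(t + 1) / counts[a])
               for a in arms}
        best = max(ucb, key=ucb.get)
        sums[best] += pull(best)
        counts[best] += 1
    return counts

random.seed(0)
quality = {'gbm': 0.85, 'rf': 0.80, 'knn': 0.65}  # invented "true" scores
noisy_pull = lambda a: min(1.0, max(0.0, random.gauss(quality[a], 0.05)))
counts = ucb1_schedule(list(quality), noisy_pull, n_rounds=60)
print(counts)
```

As the horizon grows, the exploration bonus decays and pulls concentrate on the strongest algorithms; on short horizons the allocation stays close to uniform, which is exactly the exploration/exploitation trade-off the bandit framing makes explicit.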
4. Successive Halving for Portfolios
Apply Hyperband-style successive halving across algorithms:
This naturally allocates more resources to better algorithms.
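The mechanism can be sketched compactly (the learning-curve numbers below are invented for illustration): evaluate every algorithm at a small budget, keep the top half, and repeat with a larger budget until one survivor remains.

```python
def successive_halving(algo_names, score_fn, budgets=(1, 3, 9), keep_frac=0.5):
    """Run all algorithms at a small budget, keep the best fraction,
    then repeat with a larger budget.

    score_fn(name, budget) -> validation score (higher is better).
    """
    survivors = list(algo_names)
    for b in budgets:
        scores = {a: score_fn(a, b) for a in survivors}
        survivors.sort(key=lambda a: scores[a], reverse=True)
        k = max(1, int(len(survivors) * keep_frac))
        survivors = survivors[:k]
        if len(survivors) == 1:
            break
    return survivors[0]

# Hypothetical learning curves: score improves with budget at different rates.
curves = {'rf':  lambda b: 0.75 + 0.010 * b,
          'gbm': lambda b: 0.70 + 0.020 * b,
          'knn': lambda b: 0.65,
          'svm': lambda b: 0.60 + 0.015 * b}
winner = successive_halving(curves, lambda a, b: curves[a](b))
print(winner)
```

Note that `'rf'` survives here because it leads at small budgets, even though `'gbm'` would overtake it at the largest budget; eliminating slow starters too early is the standard caveat of successive halving.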
The F-Race algorithm (from irace) applies statistical tests to eliminate algorithms as soon as they are significantly worse than the best. This can dramatically reduce computation by stopping poor algorithms early with statistical confidence.
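The racing mechanism can be sketched as follows. This is a simplified race using a paired t-test between the current leader and each rival, not irace's actual implementation (which uses the Friedman test with post-hoc comparisons); the `race` helper and the score data are assumptions for illustration.

```python
import numpy as np
from scipy.stats import ttest_rel

def race(fold_scores, alpha=0.05, min_folds=4):
    """Simplified F-Race-style elimination.

    fold_scores: dict name -> array of per-fold scores (same fold order).
    After each fold beyond min_folds, eliminate algorithms whose scores
    are significantly below the current leader's (paired t-test).
    """
    n_folds = len(next(iter(fold_scores.values())))
    alive = set(fold_scores)
    for f in range(min_folds, n_folds + 1):
        means = {a: np.mean(fold_scores[a][:f]) for a in alive}
        leader = max(means, key=means.get)
        for a in list(alive):
            if a == leader:
                continue
            t, p = ttest_rel(fold_scores[leader][:f], fold_scores[a][:f])
            if p < alpha and t > 0:  # leader significantly better
                alive.discard(a)
        if len(alive) == 1:
            break
    return alive

rng = np.random.default_rng(1)
scores = {
    'gbm': 0.85 + rng.normal(0, 0.01, 10),
    'rf':  0.84 + rng.normal(0, 0.01, 10),
    'knn': 0.70 + rng.normal(0, 0.01, 10),  # clearly worse
}
survivors = race(scores)
print(survivors)
```

The clearly inferior algorithm is dropped after only a few folds, while near-ties keep racing, so the statistical test converts compute savings into an explicit confidence guarantee.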
After running a portfolio, we have multiple trained models. Ensemble selection chooses which models to include in the final ensemble and how to weight them.
Why Ensemble?
Auto-sklearn's Ensemble Selection (Caruana et al.):
A greedy forward selection algorithm:
This is computationally efficient (O(n × T × S) for T iterations, S models, and n validation samples) and works well in practice.
```python
import numpy as np
from sklearn.metrics import accuracy_score, log_loss
from typing import List, Dict, Tuple, Any


class EnsembleSelection:
    """
    Ensemble Selection algorithm from Caruana et al.

    Greedily builds an ensemble by iteratively adding the model
    that most improves ensemble performance on validation data.
    """

    def __init__(self, ensemble_size: int = 50, metric: str = 'accuracy'):
        """
        Parameters:
            ensemble_size: Maximum models in ensemble (with replacement)
            metric: 'accuracy' or 'log_loss'
        """
        self.ensemble_size = ensemble_size
        self.metric = metric

    def fit(self, predictions: Dict[str, np.ndarray],
            y_true: np.ndarray,
            model_objects: Dict[str, Any] = None) -> 'EnsembleSelection':
        """
        Build the ensemble.

        Parameters:
            predictions: Dict mapping model names to validation predictions.
                Shape: (n_samples,) for class predictions or
                (n_samples, n_classes) for probabilities
            y_true: True validation labels
            model_objects: Actual fitted model objects for prediction

        Returns:
            self
        """
        self.model_names = list(predictions.keys())
        self.model_objects = model_objects or {}

        # Store predictions
        self.predictions = predictions
        self.y_true = y_true

        # Convert to probability format if needed
        self.probas = {}
        n_classes = len(np.unique(y_true))
        for name, pred in predictions.items():
            if pred.ndim == 1:
                # Convert class predictions to one-hot probabilities
                proba = np.zeros((len(pred), n_classes))
                for i, p in enumerate(pred):
                    proba[i, int(p)] = 1.0
                self.probas[name] = proba
            else:
                self.probas[name] = pred

        # Greedy selection
        self.ensemble_members = []  # Model names (with possible repeats)
        self.weights = {}           # Computed after selection

        best_score = (float('-inf') if self.metric == 'accuracy'
                      else float('inf'))

        for iteration in range(self.ensemble_size):
            # Reset per iteration so the best candidate is found even if
            # it does not beat the current ensemble (needed for the
            # "always keep the first few members" rule below)
            best_candidate = None
            best_candidate_score = (float('-inf') if self.metric == 'accuracy'
                                    else float('inf'))

            for name in self.model_names:
                # Tentatively add this model
                trial_members = self.ensemble_members + [name]
                # Compute ensemble prediction
                ensemble_proba = self._compute_ensemble_proba(trial_members)
                score = self._evaluate(ensemble_proba, y_true)

                # Check if this is the best candidate so far
                if self.metric == 'accuracy':
                    is_better = score > best_candidate_score
                else:  # log_loss (lower is better)
                    is_better = score < best_candidate_score
                if is_better:
                    best_candidate = name
                    best_candidate_score = score

            # Add the best candidate if it improves the ensemble
            if best_candidate is not None:
                if self.metric == 'accuracy':
                    is_improvement = best_candidate_score > best_score
                else:
                    is_improvement = best_candidate_score < best_score
                if is_improvement or len(self.ensemble_members) < 3:
                    self.ensemble_members.append(best_candidate)
                    best_score = best_candidate_score
                else:
                    break  # No improvement, stop
            else:
                break

        # Compute final weights
        self._compute_weights()
        return self

    def _compute_ensemble_proba(self, members: List[str]) -> np.ndarray:
        """Average probabilities over ensemble members."""
        if not members:
            # Return uniform if empty
            n_samples = len(self.y_true)
            n_classes = list(self.probas.values())[0].shape[1]
            return np.ones((n_samples, n_classes)) / n_classes
        proba_sum = np.zeros_like(list(self.probas.values())[0])
        for member in members:
            proba_sum += self.probas[member]
        return proba_sum / len(members)

    def _evaluate(self, proba: np.ndarray, y_true: np.ndarray) -> float:
        """Evaluate predictions."""
        if self.metric == 'accuracy':
            predictions = np.argmax(proba, axis=1)
            return accuracy_score(y_true, predictions)
        else:
            return log_loss(y_true, proba)

    def _compute_weights(self):
        """Compute effective weights for each unique model."""
        self.weights = {}
        total = len(self.ensemble_members)
        for name in set(self.ensemble_members):
            count = self.ensemble_members.count(name)
            self.weights[name] = count / total

    def predict(self, X: np.ndarray) -> np.ndarray:
        """
        Make predictions with the ensemble.

        Parameters:
            X: Features to predict

        Returns:
            Predicted class labels
        """
        proba = self.predict_proba(X)
        return np.argmax(proba, axis=1)

    def predict_proba(self, X: np.ndarray) -> np.ndarray:
        """Get probability predictions from the ensemble."""
        if not self.model_objects:
            raise ValueError("No model objects available for prediction!")

        # Weighted average of predictions
        weighted_proba = None
        for name, weight in self.weights.items():
            if name not in self.model_objects:
                continue
            model = self.model_objects[name]
            if hasattr(model, 'predict_proba'):
                proba = model.predict_proba(X)
            else:
                # Convert to one-hot
                pred = model.predict(X)
                n_classes = len(np.unique(self.y_true))
                proba = np.zeros((len(pred), n_classes))
                for i, p in enumerate(pred):
                    proba[i, int(p)] = 1.0
            if weighted_proba is None:
                weighted_proba = weight * proba
            else:
                weighted_proba += weight * proba
        return weighted_proba

    def summary(self) -> str:
        """Return a summary of the ensemble."""
        lines = ["Ensemble Summary:"]
        lines.append(f"  Total members: {len(self.ensemble_members)}")
        lines.append(f"  Unique models: {len(self.weights)}")
        lines.append("  Weights:")
        for name, weight in sorted(self.weights.items(), key=lambda x: -x[1]):
            lines.append(f"    {name}: {weight:.3f}")
        return "\n".join(lines)


def build_stacked_ensemble(base_models: Dict[str, Any],
                           X_train: np.ndarray, y_train: np.ndarray,
                           X_val: np.ndarray, y_val: np.ndarray,
                           meta_model_factory=None):
    """
    Build a stacked ensemble (two-level).

    Level 0: Base models make predictions
    Level 1: Meta-model learns to combine base predictions
    """
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_predict
    from sklearn.base import clone

    if meta_model_factory is None:
        meta_model_factory = lambda: LogisticRegression(max_iter=200)

    # Get out-of-fold predictions for stacking
    base_predictions = {}
    fitted_models = {}
    for name, model in base_models.items():
        model_clone = clone(model)
        try:
            # Out-of-fold probability predictions
            oof_pred = cross_val_predict(
                model_clone, X_train, y_train, cv=5, method='predict_proba'
            )
            base_predictions[name] = oof_pred
            # Fit on the full training data
            model_clone = clone(model)
            model_clone.fit(X_train, y_train)
            fitted_models[name] = model_clone
        except Exception as e:
            print(f"Skipping {name}: {e}")

    # Create meta-features
    meta_X_train = np.hstack(list(base_predictions.values()))

    # Fit meta-model
    meta_model = meta_model_factory()
    meta_model.fit(meta_X_train, y_train)

    return fitted_models, meta_model
```

Let's examine how production AutoML systems implement portfolio methods:
Auto-sklearn's Approach:
Key insight: Auto-sklearn never just returns one model—it always ensembles.
H2O AutoML's Approach:
Key insight: H2O emphasizes robustness—run many models, stack them all.
Google Cloud AutoML (Tables):
Key insight: Combine modern (neural) with robust (boosting) approaches.
| System | Portfolio Approach | Ensemble Method | Key Strength |
|---|---|---|---|
| Auto-sklearn | Meta-learning + SMAC | Caruana selection | Sample efficiency |
| H2O AutoML | Fixed progression | Stacking | Robustness |
| TPOT | Genetic evolution | Pipeline optimization | Flexibility |
| AutoGluon | Multi-layer stacking | Repeated stacking | Performance |
| Google AutoML | NAS + boosting | Weighted ensemble | Scalability |
AutoGluon takes stacking to the extreme: multiple layers of stacking with repeated k-fold cross-validation. Each layer's predictions become features for the next layer. This achieves state-of-the-art performance on tabular data benchmarks but requires significant computation.
We've explored portfolio methods as an alternative to single-algorithm selection. Let's consolidate the key takeaways:
Module Complete:
This concludes Module 3: Automated Model Selection. You've learned:
These concepts form the foundation of modern AutoML systems, enabling them to efficiently find high-performing machine learning solutions with minimal human intervention.
You now have a comprehensive understanding of automated model selection—from theoretical foundations (Rice's framework, CASH) through practical implementations (meta-learning, warm-starting, portfolios). This knowledge equips you to use AutoML systems effectively, understand their decisions, and extend them for specialized applications.