Imagine you've solved hundreds of machine learning problems over your career. Each time you face a new dataset, you don't start from scratch—you draw on your accumulated experience. "This looks like a high-dimensional sparse dataset... regularized linear models usually work well here." "The classes are heavily imbalanced... I should try cost-sensitive learning or sampling techniques."
This is meta-learning in action: learning about learning. But humans accumulate this knowledge slowly, over years. Can machines do it systematically and at scale?
Meta-learning for algorithm selection is precisely this: training systems to predict which learning algorithms will perform best on new datasets, based on patterns learned from thousands of prior experiments. It's the key to making CASH practical—instead of searching from scratch, we start with informed guesses from meta-learning.
Meta-learning powers the warm-starting mechanisms in Auto-sklearn, dramatically reduces the optimization budget needed for good results, and embodies the principle that past experience should accelerate future learning.
This page explores how meta-learning works for algorithm selection: the data it needs, the models it builds, and how it's integrated into production AutoML systems.
By the end of this page, you will understand how to build meta-learning systems for algorithm selection: the role of meta-databases, instance-based and model-based approaches, the theoretical foundations of transfer across tasks, and the practical implementations used in state-of-the-art AutoML.
Meta-learning for algorithm selection operates at a level above individual learning tasks. Instead of learning to predict labels from features, we learn to predict algorithm performance from dataset characteristics.
The Meta-Learning Setup:
Base-level (Object-level): the ordinary learning task. Given a dataset D, fit a model that predicts labels from features.

Meta-level: given the meta-features f(D) describing a dataset, predict how well each candidate algorithm will perform on it.
The meta-learner M is trained on historical data: (meta-features, algorithm, performance) triplets from past experiments. Once trained, it predicts which algorithm will work best on a new dataset based only on that dataset's meta-features.
Formal Definition:
Given:

- A portfolio of candidate algorithms A_1, ..., A_k
- A collection of historical datasets D_1, ..., D_n
- A meta-feature extractor f mapping a dataset D to a fixed-length vector f(D)
- A performance measure p(A, D), e.g., cross-validated accuracy of algorithm A on dataset D
Train a meta-model: ĝ(A, f(D)) ≈ p(A, D) = E[performance of A on D]
For a new dataset D', select: A* = argmax_A ĝ(A, f(D'))
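A minimal sketch of this selection rule, using invented prediction values for ĝ:

```python
# Toy illustration of the selection rule A* = argmax_A g_hat(A, f(D')).
# The predicted performances below are made up for demonstration.
def select_algorithm(predictions: dict) -> str:
    """Pick the algorithm with the highest predicted performance."""
    return max(predictions, key=predictions.get)

g_hat = {"RandomForest": 0.87, "SVM_RBF": 0.83, "LogisticRegression": 0.79}
best = select_algorithm(g_hat)
print(best)  # RandomForest
```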
```python
import numpy as np
from typing import Dict, List, Tuple, Any
from sklearn.preprocessing import StandardScaler
from dataclasses import dataclass


@dataclass
class MetaExperiment:
    """A single experiment in the meta-database."""
    dataset_id: str
    meta_features: np.ndarray
    algorithm_name: str
    hyperparameters: Dict[str, Any]
    performance: float
    training_time: float


class MetaDatabase:
    """
    Meta-database storing historical algorithm performance data.

    This is the foundation of meta-learning: a structured repository
    of past experiments that the meta-learner trains on.
    """

    def __init__(self):
        self.experiments: List[MetaExperiment] = []
        self.dataset_meta_features: Dict[str, np.ndarray] = {}
        self.algorithm_names: List[str] = []

    def add_experiment(self, experiment: MetaExperiment):
        """Add a new experiment to the database."""
        self.experiments.append(experiment)
        self.dataset_meta_features[experiment.dataset_id] = experiment.meta_features
        if experiment.algorithm_name not in self.algorithm_names:
            self.algorithm_names.append(experiment.algorithm_name)

    def add_batch(self, dataset_id: str, meta_features: np.ndarray,
                  results: Dict[str, Tuple[float, float]]):
        """
        Add a batch of results for one dataset.

        Parameters:
            dataset_id: Unique dataset identifier
            meta_features: Meta-features of the dataset
            results: Dict mapping algorithm names to (performance, time) tuples
        """
        for algo_name, (performance, train_time) in results.items():
            exp = MetaExperiment(
                dataset_id=dataset_id,
                meta_features=meta_features,
                algorithm_name=algo_name,
                hyperparameters={},  # Could be expanded
                performance=performance,
                training_time=train_time
            )
            self.add_experiment(exp)

    def get_performance_matrix(self) -> Tuple[np.ndarray, List[str], List[str]]:
        """
        Build the dataset × algorithm performance matrix.

        Returns:
            performance_matrix: (n_datasets, n_algorithms) array
            dataset_ids: Ordered list of dataset IDs
            algorithm_names: Ordered list of algorithm names
        """
        dataset_ids = list(self.dataset_meta_features.keys())
        algo_names = self.algorithm_names
        n_datasets = len(dataset_ids)
        n_algos = len(algo_names)

        matrix = np.full((n_datasets, n_algos), np.nan)
        for exp in self.experiments:
            d_idx = dataset_ids.index(exp.dataset_id)
            a_idx = algo_names.index(exp.algorithm_name)
            matrix[d_idx, a_idx] = exp.performance

        return matrix, dataset_ids, algo_names

    def get_meta_feature_matrix(self) -> Tuple[np.ndarray, List[str]]:
        """
        Build the meta-feature matrix.

        Returns:
            meta_features: (n_datasets, n_meta_features) array
            dataset_ids: Ordered list of dataset IDs
        """
        dataset_ids = list(self.dataset_meta_features.keys())
        meta_features = np.array([
            self.dataset_meta_features[did] for did in dataset_ids
        ])
        return meta_features, dataset_ids

    def summary(self):
        """Print summary statistics."""
        perf_matrix, _, _ = self.get_performance_matrix()
        print("=== Meta-Database Summary ===")
        print(f"Datasets: {len(self.dataset_meta_features)}")
        print(f"Algorithms: {len(self.algorithm_names)}")
        print(f"Total experiments: {len(self.experiments)}")
        print(f"Coverage: {(~np.isnan(perf_matrix)).mean()*100:.1f}%")
        print(f"Algorithms: {self.algorithm_names}")
```

OpenML (openml.org) provides a massive public meta-database of experiments on thousands of datasets with dozens of algorithms. Auto-sklearn ships with meta-knowledge from 140+ OpenML datasets, enabling immediate warm-starting without building your own meta-database.
The simplest and often most effective meta-learning approach is instance-based or similarity-based: find datasets similar to the new one and recommend algorithms that worked well on those similar datasets.
The Algorithm:

1. Extract meta-features f(D') from the new dataset.
2. Normalize them with the same scaler used on the historical meta-features.
3. Find the k historical datasets whose meta-features are closest to f(D').
4. Recommend the algorithms (or configurations) that performed best on those neighbors.
Why Instance-Based Works:

Datasets with similar statistical structure tend to favor the same algorithms, so local evidence from neighbors is often predictive. There is no global model to misspecify, and every recommendation is traceable to concrete similar datasets, which makes the method both robust and interpretable.
Distance Metric Choices:

The distance metric significantly impacts performance:

- Euclidean (L2): standard choice, but dominated by the meta-features with the largest deviations.
- Manhattan (L1): more robust to outlying meta-feature values; this is the metric Auto-sklearn uses.
- Cosine: compares the relative profile of meta-features, ignoring overall magnitude.

Whatever the metric, normalizing meta-features first is essential, since they live on wildly different scales.
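To see why the metric matters, here is a small sketch with invented, already-standardized meta-feature vectors where L1 and L2 disagree about which historical dataset is nearest:

```python
import numpy as np

# Invented, standardized meta-features: two historical datasets and a query.
query = np.array([0.0, 0.0])
candidates = {
    "dataset_B": np.array([3.0, 0.0]),  # far along one meta-feature only
    "dataset_C": np.array([1.8, 1.8]),  # moderately far along both
}

def nearest(metric: str) -> str:
    """Return the candidate closest to the query under the given metric."""
    dists = {
        name: np.sum(np.abs(v - query)) if metric == "manhattan"
        else np.sqrt(np.sum((v - query) ** 2))
        for name, v in candidates.items()
    }
    return min(dists, key=dists.get)

print(nearest("manhattan"))  # dataset_B (L1 distances: 3.0 vs 3.6)
print(nearest("euclidean"))  # dataset_C (L2 distances: 3.0 vs ~2.55)
```

The same pair of neighbors flips order under the two metrics, so the "most similar dataset" (and hence the warm-start recommendation) depends on this choice.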
```python
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler
from typing import List, Dict, Tuple


class InstanceBasedMetaLearner:
    """
    Instance-based (k-NN) meta-learning for algorithm selection.

    This is the approach used by Auto-sklearn for warm-starting.
    Given a new dataset, find similar past datasets and recommend
    algorithms that worked well on them.
    """

    def __init__(self, k: int = 25, metric: str = 'manhattan'):
        """
        Parameters:
            k: Number of similar datasets to consider
            metric: Distance metric ('euclidean', 'manhattan', 'cosine')
        """
        self.k = k
        self.metric = metric
        self.scaler = StandardScaler()
        self.knn = NearestNeighbors(n_neighbors=k, metric=metric)
        self.meta_features = None
        self.dataset_ids = None
        self.best_configs = None  # Best config per dataset

    def fit(self, meta_database: 'MetaDatabase'):
        """
        Build the meta-learner from historical data.

        Parameters:
            meta_database: MetaDatabase containing past experiments
        """
        # Get meta-features
        meta_features, dataset_ids = meta_database.get_meta_feature_matrix()

        # Normalize meta-features (critical for distance-based methods)
        self.meta_features = self.scaler.fit_transform(meta_features)
        self.dataset_ids = dataset_ids

        # Fit k-NN
        self.knn.fit(self.meta_features)

        # Store best configuration per dataset
        perf_matrix, _, algo_names = meta_database.get_performance_matrix()
        self.algo_names = algo_names

        # For each dataset, find best algorithm
        self.best_configs = {}
        for i, did in enumerate(dataset_ids):
            if not np.all(np.isnan(perf_matrix[i])):
                best_algo_idx = np.nanargmax(perf_matrix[i])
                self.best_configs[did] = {
                    'algorithm': algo_names[best_algo_idx],
                    'performance': perf_matrix[i, best_algo_idx]
                }

        # Also store full performance matrix for advanced queries
        self.perf_matrix = perf_matrix

    def recommend(self, new_meta_features: np.ndarray,
                  n_recommendations: int = 5) -> List[Dict]:
        """
        Recommend algorithms for a new dataset.

        Parameters:
            new_meta_features: Meta-features of the new dataset
            n_recommendations: Number of configurations to recommend

        Returns:
            List of recommended configurations with expected performance
        """
        # Normalize
        mf_scaled = self.scaler.transform(new_meta_features.reshape(1, -1))

        # Find k most similar datasets
        distances, indices = self.knn.kneighbors(mf_scaled)

        # Collect recommendations from similar datasets
        recommendations = []
        seen_algorithms = set()
        for idx, dist in zip(indices[0], distances[0]):
            dataset_id = self.dataset_ids[idx]
            if dataset_id in self.best_configs:
                config = self.best_configs[dataset_id]
                algo = config['algorithm']
                if algo not in seen_algorithms:
                    recommendations.append({
                        'algorithm': algo,
                        'expected_performance': config['performance'],
                        'source_dataset': dataset_id,
                        'distance': dist
                    })
                    seen_algorithms.add(algo)

        # Sort by expected performance
        recommendations.sort(key=lambda x: x['expected_performance'], reverse=True)
        return recommendations[:n_recommendations]

    def get_algorithm_ranking(self, new_meta_features: np.ndarray) -> List[Tuple[str, float]]:
        """
        Get a complete ranking of algorithms for the new dataset.

        Uses weighted voting from k nearest neighbors.
        """
        # Normalize
        mf_scaled = self.scaler.transform(new_meta_features.reshape(1, -1))

        # Find k most similar datasets
        distances, indices = self.knn.kneighbors(mf_scaled)

        # Compute distance-weighted performance per algorithm
        weights = 1.0 / (distances[0] + 1e-10)  # Inverse distance weighting
        weights /= weights.sum()

        algo_scores = {}
        for algo_idx, algo_name in enumerate(self.algo_names):
            weighted_sum = 0
            weight_sum = 0
            for neighbor_idx, weight in zip(indices[0], weights):
                perf = self.perf_matrix[neighbor_idx, algo_idx]
                if not np.isnan(perf):
                    weighted_sum += weight * perf
                    weight_sum += weight
            if weight_sum > 0:
                algo_scores[algo_name] = weighted_sum / weight_sum

        # Sort by score
        ranking = sorted(algo_scores.items(), key=lambda x: x[1], reverse=True)
        return ranking
```

Auto-sklearn's Meta-Learning Implementation:
Auto-sklearn uses the following approach:

1. Offline, compute meta-features for ~140 curated OpenML datasets and, for each dataset, store the best-performing configuration found by extensive optimization.
2. At runtime, compute the new dataset's meta-features and normalize them.
3. Find the k = 25 nearest historical datasets under L1 (Manhattan) distance.
4. Seed Bayesian optimization with the stored best configurations of those neighbors.
This warm-starting typically finds near-optimal performance in the first few configurations, rather than requiring hundreds of random trials.
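This claim can be illustrated with a purely synthetic simulation (every number below is invented): evaluating 100 hypothetical configurations in an order derived from a noisy meta-learned estimate tends to reach a near-best score within a handful of trials, whereas a random order has no such head start.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "true" scores of 100 candidate configurations on a new dataset.
true_scores = rng.uniform(0.6, 0.9, size=100)

# A meta-learned ranking that is noisily correlated with the truth.
meta_estimate = true_scores + rng.normal(0, 0.03, size=100)
warm_order = np.argsort(-meta_estimate)   # best-predicted first
random_order = rng.permutation(100)       # no prior knowledge

def best_after(order: np.ndarray, n_trials: int) -> float:
    """Best true score found within the first n_trials evaluations."""
    return true_scores[order[:n_trials]].max()

print("warm-start after 5 trials: ", best_after(warm_order, 5))
print("random search after 5 trials:", best_after(random_order, 5))
```

With a reasonably accurate meta-estimate, the warm-started order typically lands within a whisker of the global optimum in the first few trials; random search needs many more evaluations to catch up.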
Instance-based meta-learning is sensitive to the quality of meta-features. If a meta-feature is noisy or irrelevant, it adds noise to distances. Feature selection on meta-features themselves (meta-feature selection!) can improve performance.
Model-based meta-learning trains explicit models to predict algorithm performance from meta-features. Instead of just comparing distances, we learn a function that generalizes patterns across the meta-database.
Approaches:
1. Algorithm Performance Regression
Train a regression model to predict: ĝ(f(D), A) ≈ p(A, D), the expected performance of algorithm A on a dataset with meta-features f(D).
This allows predicting performance for all algorithms on a new dataset, then selecting the best.
2. Algorithm Ranking
Train a model to predict pairwise preferences: h(f(D), A_i, A_j) ≈ P(A_i outperforms A_j on D).
Aggregate pairwise predictions into a complete ranking.
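One simple way to do this aggregation is Borda-style counting: score each algorithm by its expected number of pairwise wins. A sketch with an invented preference matrix:

```python
import numpy as np

algos = ["RF", "GBM", "SVM", "kNN"]

# P[i, j] = predicted probability that algos[i] outperforms algos[j]
# on the new dataset (numbers invented for illustration; diagonal unused).
P = np.array([
    [0.5, 0.6, 0.8, 0.9],
    [0.4, 0.5, 0.7, 0.8],
    [0.2, 0.3, 0.5, 0.6],
    [0.1, 0.2, 0.4, 0.5],
])

# Borda score: expected pairwise wins, excluding the self-comparison.
borda = P.sum(axis=1) - 0.5  # subtract the diagonal entry (always 0.5)
ranking = [algos[i] for i in np.argsort(-borda)]
print(ranking)  # ['RF', 'GBM', 'SVM', 'kNN']
```

Borda counting is robust to a few inconsistent pairwise predictions, since each algorithm's score averages over all of its comparisons.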
3. Best Algorithm Classification
Train a classifier: h(f(D)) → the label of the algorithm that performs best on D.
Simplest approach, but ignores performance margins and may have class imbalance.
```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier
from sklearn.preprocessing import StandardScaler, LabelEncoder
from typing import List, Dict, Tuple


class ModelBasedMetaLearner:
    """
    Model-based meta-learning for algorithm selection.

    Trains a model to predict algorithm performance from meta-features,
    enabling generalization beyond the k-NN similarity approach.
    """

    def __init__(self, model_type: str = 'regression'):
        """
        Parameters:
            model_type: 'regression' (predict performance),
                        'classification' (predict best algorithm),
                        'ranking' (predict pairwise preferences)
        """
        self.model_type = model_type
        self.scaler = StandardScaler()
        if model_type == 'regression':
            self.model = RandomForestRegressor(n_estimators=100, random_state=42)
        elif model_type == 'classification':
            self.model = RandomForestClassifier(n_estimators=100, random_state=42)
        else:
            raise ValueError(f"Unknown model type: {model_type}")
        self.algo_encoder = LabelEncoder()

    def fit(self, meta_database: 'MetaDatabase'):
        """Train the meta-learner on historical experiments."""
        meta_features, dataset_ids = meta_database.get_meta_feature_matrix()
        perf_matrix, _, algo_names = meta_database.get_performance_matrix()
        self.algo_names = algo_names
        self.algo_encoder.fit(algo_names)

        if self.model_type == 'regression':
            self._fit_regression(meta_features, perf_matrix)
        elif self.model_type == 'classification':
            self._fit_classification(meta_features, perf_matrix)

    def _fit_regression(self, meta_features: np.ndarray, perf_matrix: np.ndarray):
        """Fit regression model: predict performance for (dataset, algorithm) pairs."""
        X_list = []
        y_list = []
        n_datasets, n_algos = perf_matrix.shape
        for d_idx in range(n_datasets):
            for a_idx in range(n_algos):
                if not np.isnan(perf_matrix[d_idx, a_idx]):
                    # Concatenate meta-features with algorithm encoding
                    algo_one_hot = np.zeros(n_algos)
                    algo_one_hot[a_idx] = 1
                    x = np.concatenate([meta_features[d_idx], algo_one_hot])
                    y = perf_matrix[d_idx, a_idx]
                    X_list.append(x)
                    y_list.append(y)

        X = np.array(X_list)
        y = np.array(y_list)

        # Scale meta-features (not algorithm encoding)
        n_mf = meta_features.shape[1]
        X[:, :n_mf] = self.scaler.fit_transform(X[:, :n_mf])
        self.n_meta_features = n_mf
        self.model.fit(X, y)

    def _fit_classification(self, meta_features: np.ndarray, perf_matrix: np.ndarray):
        """Fit classification model: predict best algorithm per dataset."""
        # For each dataset, find best algorithm
        best_algos = []
        valid_indices = []
        for d_idx in range(len(perf_matrix)):
            if not np.all(np.isnan(perf_matrix[d_idx])):
                best_algo_idx = np.nanargmax(perf_matrix[d_idx])
                best_algos.append(self.algo_names[best_algo_idx])
                valid_indices.append(d_idx)

        X = self.scaler.fit_transform(meta_features[valid_indices])
        y = self.algo_encoder.transform(best_algos)
        self.n_meta_features = meta_features.shape[1]
        self.model.fit(X, y)

    def predict(self, new_meta_features: np.ndarray) -> Dict[str, float]:
        """
        Predict algorithm performances for a new dataset.

        Returns:
            Dict mapping algorithm names to predicted performances
        """
        if self.model_type == 'regression':
            return self._predict_regression(new_meta_features)
        elif self.model_type == 'classification':
            return self._predict_classification(new_meta_features)

    def _predict_regression(self, mf: np.ndarray) -> Dict[str, float]:
        """Predict performance for each algorithm."""
        mf_scaled = self.scaler.transform(mf.reshape(1, -1))
        predictions = {}
        n_algos = len(self.algo_names)
        for a_idx, algo_name in enumerate(self.algo_names):
            algo_one_hot = np.zeros(n_algos)
            algo_one_hot[a_idx] = 1
            x = np.concatenate([mf_scaled[0], algo_one_hot]).reshape(1, -1)
            predictions[algo_name] = self.model.predict(x)[0]
        return predictions

    def _predict_classification(self, mf: np.ndarray) -> Dict[str, float]:
        """Predict probability of each algorithm being best."""
        mf_scaled = self.scaler.transform(mf.reshape(1, -1))
        probas = self.model.predict_proba(mf_scaled)[0]
        predictions = {}
        for prob, algo_name in zip(probas, self.algo_encoder.classes_):
            predictions[algo_name] = prob
        return predictions

    def recommend(self, new_meta_features: np.ndarray,
                  n_recommendations: int = 5) -> List[Tuple[str, float]]:
        """Get top-n algorithm recommendations."""
        predictions = self.predict(new_meta_features)
        ranking = sorted(predictions.items(), key=lambda x: x[1], reverse=True)
        return ranking[:n_recommendations]

    def get_feature_importance(self) -> np.ndarray:
        """Get importance of meta-features in predicting algorithm performance."""
        importances = self.model.feature_importances_
        return importances[:self.n_meta_features]  # Exclude algorithm encoding
```

Comparison: Instance-Based vs Model-Based
| Aspect | Instance-Based (k-NN) | Model-Based |
|---|---|---|
| Generalization | Local only; fails for distant datasets | Can extrapolate if patterns generalize |
| Interpretability | High; show similar datasets | Medium; show feature importances |
| Robustness | Robust to model misspecification | Can overfit with limited meta-data |
| Scalability | O(n) per query without indexing | O(1) per query after training |
| New algorithms | Requires new experiments | Requires retraining (or few-shot transfer) |
Production systems often combine both: use instance-based for warm-starting (fast, robust), then use model-based predictions to guide exploration in regions where similar datasets haven't been explored.
Meta-learning for algorithm selection is fundamentally about transfer learning across datasets. Knowledge gained from one dataset should accelerate learning on another. Let's formalize this.
The Transfer Learning Perspective:
Define a distribution P(D) over datasets. Each dataset D induces a performance function: f_D(A) = p(A, D), the performance of algorithm A on D.
If datasets are drawn from the same distribution P(D), their performance functions f_D share structure. For example: tree ensembles tend to win on mid-sized tabular datasets, while regularized linear models tend to win on high-dimensional sparse ones; such regularities recur across many draws from P(D).
Meta-learning captures this shared structure through: a common meta-feature space that every dataset maps into, model parameters shared across tasks, and similarity relations between datasets that let evidence from one dataset inform predictions on another.
Multi-Task Learning Formulation:
Treat algorithm performance prediction on each dataset as a separate task. Use multi-task learning to share information: shared layers learn patterns common to all algorithms, while per-algorithm output heads capture each algorithm's idiosyncrasies.
```python
import numpy as np
import torch
import torch.nn as nn
from typing import Tuple, Dict, List


class MultiTaskMetaLearner(nn.Module):
    """
    Multi-task neural network for algorithm performance prediction.

    Uses shared layers to capture common patterns in how meta-features
    relate to algorithm performance, with task-specific output heads.
    """

    def __init__(self, n_meta_features: int, n_algorithms: int,
                 hidden_dims: List[int] = [128, 64]):
        super().__init__()
        self.n_algorithms = n_algorithms

        # Shared layers (capture common patterns)
        layers = []
        prev_dim = n_meta_features
        for hidden_dim in hidden_dims:
            layers.extend([
                nn.Linear(prev_dim, hidden_dim),
                nn.ReLU(),
                nn.Dropout(0.2)
            ])
            prev_dim = hidden_dim
        self.shared = nn.Sequential(*layers)

        # Algorithm-specific heads (capture algorithm-specific patterns)
        self.heads = nn.ModuleList([
            nn.Sequential(
                nn.Linear(hidden_dims[-1], 32),
                nn.ReLU(),
                nn.Linear(32, 1)
            )
            for _ in range(n_algorithms)
        ])

    def forward(self, meta_features: torch.Tensor) -> torch.Tensor:
        """
        Forward pass: predict performance for all algorithms.

        Parameters:
            meta_features: (batch_size, n_meta_features) tensor

        Returns:
            (batch_size, n_algorithms) performance predictions
        """
        shared_repr = self.shared(meta_features)
        predictions = []
        for head in self.heads:
            pred = head(shared_repr)
            predictions.append(pred)
        return torch.cat(predictions, dim=1)

    def predict_single(self, meta_features: torch.Tensor, algo_idx: int) -> torch.Tensor:
        """Predict for a specific algorithm."""
        shared_repr = self.shared(meta_features)
        return self.heads[algo_idx](shared_repr)


def train_multi_task_metalearner(meta_features: np.ndarray,
                                 performance_matrix: np.ndarray,
                                 n_epochs: int = 100,
                                 lr: float = 1e-3) -> MultiTaskMetaLearner:
    """
    Train the multi-task meta-learner.

    Parameters:
        meta_features: (n_datasets, n_meta_features) array
        performance_matrix: (n_datasets, n_algorithms) array with NaNs for missing
        n_epochs: Training epochs
        lr: Learning rate

    Returns:
        Trained model
    """
    n_datasets, n_mf = meta_features.shape
    n_algos = performance_matrix.shape[1]

    # Create model
    model = MultiTaskMetaLearner(n_mf, n_algos)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)

    # Convert to tensors
    X = torch.tensor(meta_features, dtype=torch.float32)
    Y = torch.tensor(performance_matrix, dtype=torch.float32)
    mask = ~torch.isnan(Y)  # Which entries are valid

    # Replace NaN with 0 for computation (masked out anyway)
    Y = torch.nan_to_num(Y, nan=0.0)

    model.train()
    for epoch in range(n_epochs):
        optimizer.zero_grad()
        predictions = model(X)

        # Masked MSE loss - only compute loss for observed entries
        loss = ((predictions - Y) ** 2 * mask).sum() / mask.sum()
        loss.backward()
        optimizer.step()

        if (epoch + 1) % 20 == 0:
            print(f"Epoch {epoch+1}/{n_epochs}, Loss: {loss.item():.4f}")

    model.eval()
    return model


def recommend_with_uncertainty(model: MultiTaskMetaLearner,
                               meta_features: np.ndarray,
                               n_samples: int = 50) -> Dict[int, Tuple[float, float]]:
    """
    Get recommendations with uncertainty estimates using MC dropout.

    Returns:
        Dict mapping algorithm index to (mean prediction, std) tuples
    """
    model.train()  # Enable dropout
    X = torch.tensor(meta_features, dtype=torch.float32).unsqueeze(0)

    predictions = []
    for _ in range(n_samples):
        with torch.no_grad():
            pred = model(X)
            predictions.append(pred.numpy())
    predictions = np.stack(predictions)

    results = {}
    for algo_idx in range(model.n_algorithms):
        algo_preds = predictions[:, 0, algo_idx]
        results[algo_idx] = (algo_preds.mean(), algo_preds.std())

    model.eval()
    return results
```

Theoretical Foundations:
When Does Meta-Learning Help?
Meta-learning provides benefit when:

- New datasets resemble those in the meta-database, i.e., are drawn from a similar P(D)
- Meta-features actually capture the properties that drive performance differences
- The meta-database is large and diverse enough to reveal stable patterns
Negative Transfer Risk:
Meta-learning can hurt when:

- The new dataset lies far outside the historical distribution (negative transfer)
- Meta-features miss the properties that matter, so "similar" datasets behave differently
- The meta-database is small or biased toward particular domains
Theoretical Bounds:
Informally, meta-learning sample complexity is related to:

- The number and diversity of historical datasets (tasks) available
- The dimensionality of the meta-feature space
- How strongly the performance functions f_D are shared across tasks
Meta-learning for new algorithms is challenging—if a new algorithm isn't in the meta-database, we can't recommend it. Solutions include: algorithm meta-features (characterize algorithms themselves), multi-fidelity evaluation (cheap trials of new algorithms), and active meta-learning (strategically evaluate new algorithms on diverse datasets).
A meta-learning system is only as good as its meta-database. Let's discuss how to build and maintain effective meta-databases.
Data Collection Strategies:
1. Exhaustive Evaluation
Run all algorithms on all datasets:

- Produces a complete performance matrix with no missing entries
- Cost grows as n_datasets × n_algorithms model evaluations, which quickly becomes prohibitive
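A rough back-of-envelope calculation (all numbers assumed for illustration) shows how quickly exhaustive evaluation multiplies:

```python
# Assumed, illustrative numbers: not from any real benchmark.
n_datasets, n_algorithms, cv_folds = 200, 15, 5
avg_fit_minutes = 2.0  # assumed average time per single model fit

total_fits = n_datasets * n_algorithms * cv_folds
total_hours = total_fits * avg_fit_minutes / 60
print(total_fits, "fits,", round(total_hours), "compute-hours")
# 15000 fits, 500 compute-hours
```

Even at two minutes per fit, a modest portfolio demands hundreds of compute-hours, which is why the strategies below trade completeness for cost.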
2. Active Meta-Learning
Strategically select which (dataset, algorithm) pairs to evaluate:

- Prioritize cells where a current meta-model is most uncertain, or that are most informative for ranking algorithms
- Achieves much of the value of exhaustive evaluation at a fraction of the cost
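One simple selection heuristic, sketched below as my own illustration rather than a specific published method, uses per-algorithm score variance as an uncertainty proxy: evaluate the missing cell belonging to the algorithm whose observed results vary the most.

```python
import numpy as np

# Partially observed dataset × algorithm performance matrix (NaN = not run).
# Values are invented for illustration.
perf = np.array([
    [0.80, 0.75, np.nan],
    [0.82, np.nan, 0.60],
    [np.nan, 0.74, 0.91],
])

def next_experiment(perf: np.ndarray) -> tuple:
    """Return the (dataset, algorithm) index of the unevaluated cell
    belonging to the algorithm with the highest observed score variance."""
    col_var = np.nanvar(perf, axis=0)        # uncertainty proxy per algorithm
    missing = np.argwhere(np.isnan(perf))    # candidate cells to evaluate
    best = max(missing, key=lambda ij: col_var[ij[1]])
    return tuple(best)

print(next_experiment(perf))  # (0, 2): algorithm 2's scores vary most
```

Real active meta-learning would use a proper predictive model's uncertainty rather than raw column variance, but the loop is the same: score candidate cells, evaluate the most informative one, update, repeat.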
3. Incremental Collection
Add experiments opportunistically:

- Log the results of every production AutoML run into the meta-database
- Coverage grows for free, but is biased toward configurations the optimizer already favored
```python
import numpy as np
import time
from sklearn.model_selection import cross_val_score
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from typing import Dict, Callable, List


class MetaDatabaseBuilder:
    """
    Systematic meta-database construction.

    Evaluates multiple algorithms on multiple datasets to build
    a comprehensive meta-database for meta-learning.
    """

    def __init__(self, algorithms: Dict[str, Callable] = None):
        """
        Parameters:
            algorithms: Dict mapping names to (estimator_factory, param_grid)
        """
        if algorithms is None:
            # Default algorithm portfolio
            self.algorithms = {
                'RandomForest': (lambda: RandomForestClassifier(n_estimators=100, random_state=42), {}),
                'GradientBoosting': (lambda: GradientBoostingClassifier(random_state=42), {}),
                'SVM_RBF': (lambda: SVC(kernel='rbf', random_state=42), {}),
                'SVM_Linear': (lambda: SVC(kernel='linear', random_state=42), {}),
                'LogisticRegression': (lambda: LogisticRegression(max_iter=200, random_state=42), {}),
                'KNN': (lambda: KNeighborsClassifier(), {}),
                'NeuralNetwork': (lambda: MLPClassifier(max_iter=500, random_state=42), {}),
            }
        else:
            self.algorithms = algorithms

        self.meta_database = MetaDatabase()
        self.meta_feature_extractor = None  # Set this

    def set_meta_feature_extractor(self, extractor: Callable):
        """Set the function that extracts meta-features from datasets."""
        self.meta_feature_extractor = extractor

    def evaluate_dataset(self, X: np.ndarray, y: np.ndarray,
                         dataset_id: str, cv: int = 5,
                         verbose: bool = True) -> Dict[str, float]:
        """
        Evaluate all algorithms on a single dataset.

        Parameters:
            X: Feature matrix
            y: Target vector
            dataset_id: Unique identifier for this dataset
            cv: Cross-validation folds
            verbose: Print progress

        Returns:
            Dict mapping algorithm names to CV accuracy
        """
        if self.meta_feature_extractor is None:
            raise ValueError("Set meta_feature_extractor first!")

        # Extract meta-features
        meta_features = self.meta_feature_extractor(X, y)

        # Evaluate each algorithm
        results = {}
        for algo_name, (estimator_factory, _) in self.algorithms.items():
            try:
                start_time = time.time()
                estimator = estimator_factory()
                scores = cross_val_score(estimator, X, y, cv=cv, scoring='accuracy')
                elapsed = time.time() - start_time
                results[algo_name] = (scores.mean(), elapsed)
                if verbose:
                    print(f"  {algo_name}: {scores.mean():.4f} (±{scores.std():.4f}) "
                          f"in {elapsed:.1f}s")
            except Exception as e:
                if verbose:
                    print(f"  {algo_name}: FAILED ({e})")
                results[algo_name] = (np.nan, np.nan)

        # Add to meta-database
        self.meta_database.add_batch(dataset_id, meta_features, results)
        return {k: v[0] for k, v in results.items()}

    def evaluate_multiple_datasets(self, datasets: List[Dict], verbose: bool = True):
        """
        Evaluate algorithms across multiple datasets.

        Parameters:
            datasets: List of {'id': str, 'X': array, 'y': array} dicts
            verbose: Print progress
        """
        for i, dataset in enumerate(datasets):
            if verbose:
                print(f"Dataset {i+1}/{len(datasets)}: {dataset['id']}")
            self.evaluate_dataset(
                X=dataset['X'],
                y=dataset['y'],
                dataset_id=dataset['id'],
                verbose=verbose
            )
        if verbose:
            print("=== Evaluation Complete ===")
            self.meta_database.summary()


# Example: Build meta-database from OpenML datasets
def build_from_openml(n_datasets: int = 50):
    """Build a meta-database from OpenML benchmark datasets."""
    import openml
    from openml.tasks import TaskType

    # Get classification tasks
    tasks = openml.tasks.list_tasks(
        task_type=TaskType.SUPERVISED_CLASSIFICATION,
        output_format='dataframe'
    )

    # Filter sensible datasets
    tasks = tasks[
        (tasks['NumberOfInstances'] >= 100) &
        (tasks['NumberOfInstances'] <= 10000) &
        (tasks['NumberOfFeatures'] <= 100)
    ].head(n_datasets)

    builder = MetaDatabaseBuilder()

    # Define meta-feature extractor (simplified)
    def extract_mf(X, y):
        return np.array([
            X.shape[0],                # n_samples
            X.shape[1],                # n_features
            len(np.unique(y)),         # n_classes
            X.shape[1] / X.shape[0],   # dimensionality ratio
            np.std(X),                 # feature std
        ])

    builder.set_meta_feature_extractor(extract_mf)

    datasets = []
    for task_id in tasks['tid']:
        try:
            task = openml.tasks.get_task(task_id)
            X, y = task.get_X_and_y()
            # Handle missing values
            X = np.nan_to_num(X)
            datasets.append({
                'id': f"openml_{task_id}",
                'X': X,
                'y': y
            })
        except Exception as e:
            print(f"Skipping task {task_id}: {e}")

    builder.evaluate_multiple_datasets(datasets)
    return builder.meta_database
```

Meta-Database Quality Metrics:

- Coverage: the fraction of the dataset × algorithm matrix that has been evaluated
- Diversity: how widely the stored datasets spread across the meta-feature space
- Consistency: whether all experiments used the same CV protocol and performance metric, so values are comparable
Maintenance Challenges:

- Library and algorithm versions change, so stored performances can silently go stale
- New algorithms enter with no history, the cold-start problem noted above
- Changing the meta-feature extractor invalidates stored meta-features and forces recomputation
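Coverage, the simplest of the quality metrics, reduces to a few lines over the performance matrix (values below are invented for illustration):

```python
import numpy as np

# Toy dataset × algorithm performance matrix; NaN marks missing experiments.
perf = np.array([
    [0.81, 0.79, np.nan, 0.70],
    [0.77, np.nan, np.nan, 0.72],
    [0.85, 0.80, 0.83, np.nan],
])

observed = ~np.isnan(perf)
coverage = observed.mean()       # fraction of cells evaluated
per_algo = observed.sum(axis=0)  # experiments recorded per algorithm

print(f"coverage = {coverage:.2f}")           # 8 of 12 cells -> 0.67
print(f"per-algorithm counts = {per_algo}")   # [3 2 1 2]
```

The per-algorithm counts expose imbalance that overall coverage hides: here algorithm 2 has only one recorded experiment, so any meta-model's predictions for it rest on very thin evidence.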
We've explored how meta-learning enables intelligent algorithm selection by learning from past experiments. Let's consolidate the key takeaways:

- Meta-learning predicts algorithm performance from dataset meta-features, trained on a meta-database of (meta-features, algorithm, performance) triplets.
- Instance-based (k-NN) methods are simple, robust, and power Auto-sklearn's warm-starting; model-based methods generalize further but can overfit sparse meta-data.
- Transfer helps only when new datasets resemble the historical distribution; negative transfer is a real risk.
- The meta-database's coverage, diversity, and consistency bound everything built on top of it.
What's Next:
Meta-learning provides intelligent initialization for CASH optimization. The next page explores warm starting—how to leverage meta-learning predictions to dramatically accelerate hyperparameter optimization by starting from promising configurations rather than random initialization.
You now understand how meta-learning enables efficient algorithm selection by transferring knowledge from past experiments. This capability is what allows AutoML systems to find good configurations in minutes rather than hours—by starting from informed rather than random positions.