The search space is the universe of configurations that an AutoML system can explore. It defines the boundaries of what's achievable—configurations outside the search space can never be discovered, no matter how effective the search strategy.
Search space design is a critical skill that separates effective AutoML from wasted computation. Too narrow a space may exclude the optimal solution; too broad a space wastes resources exploring irrelevant configurations.
This page covers the theory and practice of search space design: parameter types, space structure, conditioning, and the art of balancing expressiveness with tractability.
You'll master continuous, discrete, categorical, and conditional parameter spaces. You'll understand hierarchical structures, learn to handle constraints, and develop intuition for designing spaces that are both expressive and searchable.
A search space $\mathcal{X}$ is a set of possible configurations. Each configuration $x \in \mathcal{X}$ specifies values for all hyperparameters:
$$x = (x_1, x_2, ..., x_n)$$
where each $x_i$ is drawn from its corresponding domain $\mathcal{X}_i$.
Parameter Types:
Different parameters have fundamentally different characteristics that affect how they should be searched:
| Type | Domain | Examples | Search Considerations |
|---|---|---|---|
| Continuous | Real interval [a, b] | Learning rate, regularization strength | Gradient-based or model-based optimization |
| Integer | Discrete range {a, ..., b} | Hidden layer size, n_estimators | Can quantize continuous or use integer-aware methods |
| Categorical | Unordered set {a, b, c} | Optimizer type, kernel type | No metric; requires enumeration or encoding |
| Ordinal | Ordered set {low, med, high} | Model complexity levels | Has order but not necessarily metric |
| Boolean | Binary {True, False} | Use dropout?, early stopping? | Special case of categorical with 2 values |
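As a concrete sketch, the parameter types above can be declared with the ConfigSpace library (mentioned again below). The class names follow the commonly used ConfigSpace API; the parameter names and ranges here are purely illustrative.

```python
from ConfigSpace import ConfigurationSpace
from ConfigSpace.hyperparameters import (
    UniformFloatHyperparameter, UniformIntegerHyperparameter,
    CategoricalHyperparameter, OrdinalHyperparameter)

cs = ConfigurationSpace(seed=0)
cs.add_hyperparameters([
    # Continuous, searched on a log scale (see "Scale Matters" below)
    UniformFloatHyperparameter("learning_rate", 1e-5, 1e-1, log=True),
    # Integer
    UniformIntegerHyperparameter("hidden_size", 16, 512, log=True),
    # Categorical (unordered)
    CategoricalHyperparameter("optimizer", ["sgd", "adam", "rmsprop"]),
    # Ordinal (ordered, but no metric)
    OrdinalHyperparameter("complexity", ["low", "medium", "high"]),
    # Boolean as a two-valued categorical (encoded as strings here)
    CategoricalHyperparameter("use_dropout", ["true", "false"]),
])
print(cs.sample_configuration())
```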
Scale Matters:
Many parameters are best searched on a logarithmic scale. Learning rates, regularization strengths, and other multiplicative factors vary over orders of magnitude.
Searching uniformly on a linear scale concentrates most samples in the upper decade of the range; log-scale sampling distributes samples evenly across magnitudes.
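A quick numeric sketch of the difference in plain NumPy, assuming a learning-rate range of 1e-5 to 1e-1:

```python
import numpy as np

rng = np.random.default_rng(0)
low, high = 1e-5, 1e-1

linear = rng.uniform(low, high, size=1000)                       # linear scale
log = 10 ** rng.uniform(np.log10(low), np.log10(high), 1000)     # log scale

# Roughly 90% of the linear samples land in the top decade [1e-2, 1e-1],
# while the log-uniform samples spread evenly across all four decades.
print((linear > 1e-2).mean(), (log > 1e-2).mean())
```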
Use log scale when: (1) the parameter spans multiple orders of magnitude, (2) the effect is multiplicative rather than additive, (3) small values need fine granularity. Common log-scale parameters: learning rate, regularization, dropout, weight decay, kernel bandwidth.
Real ML configurations have conditional structure: some parameters only matter when others take specific values. This creates a hierarchical, tree-structured search space.
Example: Neural Network Configuration
Consider configuring a neural network optimizer: the top-level choice optimizer ∈ {SGD, Adam} determines which other hyperparameters are active. Momentum applies only when SGD is selected, while the β₁ and β₂ decay rates apply only to Adam.
Naively treating all parameters as independent wastes search effort on meaningless combinations, such as tuning momentum for a configuration that uses Adam.
Why Conditional Spaces Matter:
Reduced Effective Dimensionality: A 20-parameter space with conditions may have only 8-10 active parameters for any given configuration.
Meaningful Configurations Only: Prevents wasting evaluations on nonsensical combinations.
Better Surrogate Models: Optimization algorithms can learn structure rather than treating inactive parameters as noise.
Hardware/Memory Constraints: Can encode constraints like "if model_size = large, then batch_size ≤ 32".
When a conditional parameter is inactive, search algorithms must handle it carefully. Common approaches: (1) treat as missing/imputed value, (2) use special 'inactive' marker, (3) use separate surrogate models per parent value. ConfigSpace and similar libraries handle this automatically.
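Continuing the optimizer example, here is a minimal ConfigSpace sketch of a conditional space (names and ranges are illustrative; the exact API may vary slightly across ConfigSpace versions):

```python
from ConfigSpace import ConfigurationSpace
from ConfigSpace.hyperparameters import (
    CategoricalHyperparameter, UniformFloatHyperparameter)
from ConfigSpace.conditions import EqualsCondition

cs = ConfigurationSpace(seed=0)
optimizer = CategoricalHyperparameter("optimizer", ["sgd", "adam"])
lr = UniformFloatHyperparameter("learning_rate", 1e-5, 1e-1, log=True)
momentum = UniformFloatHyperparameter("momentum", 0.0, 0.99)   # SGD only
beta1 = UniformFloatHyperparameter("beta1", 0.85, 0.999)       # Adam only
cs.add_hyperparameters([optimizer, lr, momentum, beta1])

# momentum is active only when optimizer == "sgd"; beta1 only for "adam"
cs.add_condition(EqualsCondition(momentum, optimizer, "sgd"))
cs.add_condition(EqualsCondition(beta1, optimizer, "adam"))

# Sampled configurations contain only the active parameters
print(cs.sample_configuration())
```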
Combined Algorithm Selection and Hyperparameter optimization (CASH) unifies algorithm choice with parameter tuning into a single hierarchical search space.
Structure:
Root: algorithm ∈ {RandomForest, GradientBoosting, SVM, NeuralNet, ...}
│
├── If RandomForest:
│   ├── n_estimators ∈ [50, 500]
│   ├── max_depth ∈ {None, 5, 10, 20, 50}
│   └── min_samples_split ∈ [2, 20]
│
├── If GradientBoosting:
│   ├── n_estimators ∈ [50, 500]
│   ├── learning_rate ∈ [0.001, 0.3] (log)
│   └── max_depth ∈ [3, 10]
│
├── If SVM:
│   ├── C ∈ [0.001, 1000] (log)
│   ├── kernel ∈ {linear, rbf, poly}
│   └── If kernel = rbf: gamma ∈ [1e-5, 10] (log)
│
└── If NeuralNet:
    ├── n_layers ∈ [1, 5]
    ├── hidden_size ∈ [16, 512]
    └── activation ∈ {relu, tanh, elu}
This hierarchical structure naturally encodes that different algorithms have different hyperparameters.
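To make this concrete, here is a hedged sketch of how a sampled configuration from such a space could be turned into a scikit-learn estimator. The dictionary keys mirror the tree above, and only the active branch's keys are assumed to be present:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

def build_model(config):
    """Instantiate the estimator selected by a sampled CASH configuration."""
    if config["algorithm"] == "RandomForest":
        return RandomForestClassifier(
            n_estimators=config["n_estimators"],
            max_depth=config["max_depth"],
            min_samples_split=config["min_samples_split"])
    if config["algorithm"] == "SVM":
        # gamma is only present when kernel == "rbf"
        return SVC(C=config["C"], kernel=config["kernel"],
                   gamma=config.get("gamma", "scale"))
    raise ValueError(f"unknown algorithm: {config['algorithm']}")

model = build_model({"algorithm": "SVM", "C": 10.0,
                     "kernel": "rbf", "gamma": 0.01})
```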
CASH spaces can be enormous. With 5 algorithms averaging 100,000 hyperparameter combinations each, plus preprocessing options (120 combinations), the total space exceeds 50 million configurations. Efficient search strategies are essential—exhaustive evaluation is impossible.
NAS search spaces define the universe of possible neural network architectures. The design of this space profoundly affects both the quality of discovered architectures and the computational cost of search.
Cell-Based Search Spaces:
Modern NAS typically searches for cells—small repeating units—rather than entire networks. A cell contains nodes connected by operations. The network is constructed by stacking cells.
Search Space Size Calculation:
For a cell with B nodes, each choosing 2 distinct inputs from its available predecessors (the 2 cell inputs plus all earlier nodes) and one of K operations per incoming edge:
$$|\mathcal{X}| = \prod_{i=2}^{B+1} \binom{i}{2}\, K^2 = K^{2B} \cdot \frac{B!\,(B+1)!}{2^B}$$
With B=4 nodes and K=8 operations: $$|\mathcal{X}| = 8^8 \cdot 180 \approx 3 \times 10^9 \text{ architectures}$$
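The arithmetic is easy to verify with a few lines of Python implementing the count above:

```python
from math import comb, prod

def cell_space_size(B, K):
    """Cells with B nodes; node i chooses 2 distinct inputs from its i
    available predecessors (2 cell inputs + earlier nodes) and one of K
    operations per incoming edge."""
    return prod(K**2 * comb(i, 2) for i in range(2, B + 2))

print(f"{cell_space_size(4, 8):.2e}")  # B=4, K=8 -> ~3.02e+09 architectures
```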
Exhaustive search is clearly impossible.
Constraining the Space:
Effective NAS requires balancing expressiveness with tractability.
The search space encodes architectural priors. Cell-based spaces assume repeating structure. Operation sets encode inductive biases (conv for vision, attention for sequences). Good search space design incorporates domain knowledge to narrow search to promising regions.
Real-world AutoML must optimize not just accuracy but multiple objectives subject to constraints.
Common Objectives: predictive accuracy, inference latency, model size, memory footprint, energy consumption.
Common Constraints: maximum latency or memory on the deployment target, a training-time budget, a minimum acceptable accuracy.
Several strategies exist for folding multiple objectives and constraints into the search:
| Approach | Mechanism | Pros | Cons |
|---|---|---|---|
| Constraint as Penalty | Add constraint violation to objective | Simple, works with single-objective solvers | Requires tuning penalty weight |
| Feasibility Filtering | Reject infeasible configurations | Clean search space | May reject good approximate solutions |
| Pareto Optimization | Find Pareto front of non-dominated solutions | Returns full tradeoff surface | More complex, harder to automate selection |
| Scalarization | Weighted sum of objectives | Reduces to single-objective | May miss concave Pareto regions |
| Lexicographic | Optimize objectives in priority order | Clear priorities | Ignores tradeoffs between lower-priority objectives |
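As an illustration of the first two rows, here is a sketch of a scalarized objective with a latency constraint folded in as a penalty; all weights are hypothetical and would need tuning for a real task:

```python
def scalarized_objective(accuracy, latency_ms,
                         latency_budget_ms=50.0,
                         tradeoff=0.001, penalty_weight=10.0):
    """Combine accuracy (maximize) and latency (minimize) into one score.

    Hypothetical weights: `tradeoff` trades accuracy points against
    milliseconds, and `penalty_weight` punishes budget violations.
    """
    score = accuracy - tradeoff * latency_ms            # scalarization
    if latency_ms > latency_budget_ms:                  # constraint as penalty
        score -= penalty_weight * (latency_ms - latency_budget_ms)
    return score

print(scalarized_objective(accuracy=0.91, latency_ms=40.0))  # within budget
print(scalarized_objective(accuracy=0.93, latency_ms=80.0))  # heavily penalized
```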
Modern NAS increasingly incorporates hardware constraints directly. Systems like Once-For-All train a supernet once, then extract subnets meeting specific constraints (latency on iPhone, memory on IoT device). This amortizes search cost across many deployment targets.
Effective search space design balances competing concerns. These principles guide the process:
Match Scale to Parameter: sample multiplicative parameters (learning rate, regularization) log-uniformly.
Model Conditional Structure: encode parameter dependencies explicitly instead of flattening the space.
Encode Domain Knowledge: restrict ranges and operation sets to regions and choices known to be plausible.
Control Dimensionality: include only parameters that measurably affect performance.
Iterate: start broad, analyze which regions perform well, then narrow.
Common Mistakes:
| Mistake | Consequence | Fix |
|---|---|---|
| Too narrow ranges | Miss optimal regions | Analyze sensitivity, expand |
| Linear scale for log-scale params | Waste samples in one region | Use log-uniform |
| Ignoring conditionals | Evaluate meaningless configs | Model hierarchical structure |
| Including irrelevant params | Curse of dimensionality | Remove parameters with no effect |
| Too many categorical choices | Combinatorial explosion | Group similar, limit options |
Search space design is iterative. Run initial search with broad ranges, analyze which regions perform well, then define focused space for deeper search. This 'zoom-in' pattern efficiently allocates budget.
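Here is a sketch of the zoom-in step, using a hypothetical helper `zoomed_bounds` that narrows one parameter's range around the best results of the broad initial search:

```python
import numpy as np

def zoomed_bounds(results, param, top_frac=0.2, pad=0.1):
    """Narrow [low, high] for `param` around the top-performing configs.

    `results` is a list of (config_dict, score) pairs from the broad search;
    the top fraction and padding are arbitrary illustrative choices.
    """
    ranked = sorted(results, key=lambda r: r[1], reverse=True)
    top = ranked[: max(1, int(len(ranked) * top_frac))]
    values = np.array([cfg[param] for cfg, _ in top])
    span = values.max() - values.min()
    return float(values.min() - pad * span), float(values.max() + pad * span)
```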
What's Next:
With the search space defined, we need search strategies to explore it efficiently. The next page covers methods from random search through Bayesian optimization to evolutionary algorithms—the engines that power AutoML exploration.
You now understand how to define AutoML search spaces: parameter types, scaling, conditional structure, CASH formulation, NAS spaces, and design principles. Next, we'll explore the search strategies that navigate these spaces efficiently.