In deep learning and optimization, the learning rate is one of the most critical hyperparameters that determines how quickly or slowly a model learns. A constant learning rate throughout training is often suboptimal—starting with a high learning rate enables rapid initial progress, but as the model approaches convergence, smaller steps are needed for fine-tuning.
Exponential decay is a widely-used technique for dynamically adjusting the learning rate during training. With this strategy, the learning rate decreases by a constant multiplicative factor after each epoch, creating a smooth exponential curve that naturally balances exploration and exploitation throughout the training process.
The learning rate at any epoch t is computed using the formula:
$$lr_t = lr_0 \times \gamma^t$$
Where:
- lr_t is the learning rate at epoch t
- lr_0 is the initial learning rate
- γ (gamma) is the decay factor, with 0 < γ < 1
- t is the epoch number, starting at 0
Your Task: Implement a Python class that serves as an exponential decay learning rate scheduler. The class should:
- Accept an initial learning rate and a decay factor when constructed
- Provide a method that returns the learning rate for a given epoch using the formula above
Key Insight: When γ = 0.9 and the initial learning rate is 0.1, the learning rate shrinks to 0.09 after one epoch, 0.081 after two, and 0.0729 after three.
This gradual decay helps the optimizer take large steps initially to escape local minima and smaller steps later for precise convergence.
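One way to sketch the scheduler described above is shown below; the class and method names (`ExponentialDecayScheduler`, `get_lr`) are assumptions, since the problem statement does not fix an exact interface:

```python
class ExponentialDecayScheduler:
    """Exponential decay schedule: lr_t = lr_0 * gamma**t."""

    def __init__(self, initial_lr: float, decay_factor: float):
        self.initial_lr = initial_lr      # lr_0, the starting learning rate
        self.decay_factor = decay_factor  # gamma, typically 0 < gamma < 1

    def get_lr(self, epoch: int) -> float:
        # Apply the multiplicative decay `epoch` times
        return self.initial_lr * self.decay_factor ** epoch
```

For example, `ExponentialDecayScheduler(0.1, 0.9).get_lr(1)` gives 0.1 × 0.9 = 0.09 (up to floating-point rounding).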
Input: initial_lr = 0.1, decay_factor = 0.9, epoch = 0
Output: 0.1
At epoch 0, the learning rate equals the initial learning rate.
Calculation: lr = 0.1 × 0.9⁰ = 0.1 × 1 = 0.1
The decay factor has no effect at epoch 0 since any number raised to the power of 0 equals 1.
Input: initial_lr = 0.1, decay_factor = 0.9, epoch = 1
Output: 0.09
After the first epoch, the learning rate is reduced by the decay factor.
Calculation: lr = 0.1 × 0.9¹ = 0.1 × 0.9 = 0.09
The learning rate has decreased by 10% from its initial value.
Input: initial_lr = 0.1, decay_factor = 0.9, epoch = 3
Output: 0.0729
After three epochs, the learning rate has been reduced three times by the decay factor.
Calculation: lr = 0.1 × 0.9³ = 0.1 × 0.729 = 0.0729
The compounding effect of exponential decay is evident: the rate is now 72.9% of the original value.
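The three worked examples above can be checked directly against the formula; this standalone function is a sketch, independent of any particular class interface:

```python
def exponential_decay_lr(initial_lr: float, decay_factor: float, epoch: int) -> float:
    # lr_t = lr_0 * gamma**t
    return initial_lr * decay_factor ** epoch

# Reproduce the worked examples; round() hides floating-point noise
for epoch in (0, 1, 3):
    print(epoch, round(exponential_decay_lr(0.1, 0.9, epoch), 6))
# → 0 0.1
# → 1 0.09
# → 3 0.0729
```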
Constraints