One of the most effective strategies for training deep neural networks is dynamic learning rate adjustment. Rather than using a fixed learning rate throughout training, practitioners often employ learning rate schedulers that systematically reduce the learning rate as training progresses, allowing for larger initial steps during early exploration and finer adjustments near convergence.
The Epoch-Based Decay Scheduler implements a stepped decay strategy where the learning rate remains constant for a specified number of epochs, then drops by a multiplicative factor. This pattern creates a "staircase" effect in the learning rate schedule, providing predictable decay intervals that align naturally with training epochs.
Mathematical Formulation:
Given an initial learning rate η₀, a step size s (in epochs), and a decay factor γ (gamma, where 0 < γ < 1), the learning rate at epoch e is computed as:
$$\eta_e = \eta_0 \times \gamma^{\lfloor e / s \rfloor}$$
Where ⌊e / s⌋ represents the floor division of the current epoch by the step size, determining how many decay steps have occurred.
Understanding the Decay Pattern:
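With η₀ = 0.1, step size s = 5, and γ = 0.5 (the values used in the examples below), the formula produces a staircase: epochs 0–4 use 0.1, epochs 5–9 use 0.05, epochs 10–14 use 0.025, and so on. A minimal sketch of the formula with these illustrative values:

```python
# Staircase decay: the lr stays flat for step_size epochs, then drops by gamma.
initial_lr = 0.1  # illustrative values matching the examples below
step_size = 5
gamma = 0.5

for epoch in range(12):
    # Floor division picks which "stair" the current epoch sits on.
    lr = initial_lr * gamma ** (epoch // step_size)
    print(f"epoch {epoch:2d}: lr = {lr}")
```

Each block of `step_size` consecutive epochs shares one learning rate, which is why the schedule is often described as stepped or piecewise-constant.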
Your Task:
Implement a Python class EpochDecayScheduler with:
- an __init__ method that accepts initial_lr (float), step_size (int), and gamma (float)
- a get_lr(epoch) method that returns the learning rate for the given epoch, rounded to 4 decimal places

Use only standard Python libraries for this implementation.
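One possible implementation, a minimal sketch that applies the formula directly and uses the built-in round for the 4-decimal-place requirement:

```python
class EpochDecayScheduler:
    """Stepped learning-rate decay: lr = initial_lr * gamma ** (epoch // step_size)."""

    def __init__(self, initial_lr: float, step_size: int, gamma: float):
        self.initial_lr = initial_lr
        self.step_size = step_size
        self.gamma = gamma

    def get_lr(self, epoch: int) -> float:
        # Floor division counts how many full decay intervals have elapsed.
        decay_steps = epoch // self.step_size
        return round(self.initial_lr * self.gamma ** decay_steps, 4)
```

With the example configuration (initial_lr=0.1, step_size=5, gamma=0.5), this reproduces the outputs shown in the worked examples below.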
scheduler = EpochDecayScheduler(initial_lr=0.1, step_size=5, gamma=0.5)
scheduler.get_lr(epoch=0)  # returns 0.1

At epoch 0, no decay has occurred yet. The floor division ⌊0/5⌋ = 0, so the learning rate is 0.1 × 0.5⁰ = 0.1 × 1 = 0.1. The initial learning rate is returned unchanged.
scheduler = EpochDecayScheduler(initial_lr=0.1, step_size=5, gamma=0.5)
scheduler.get_lr(epoch=5)  # returns 0.05

At epoch 5, the first decay step triggers. The floor division ⌊5/5⌋ = 1, so the learning rate is 0.1 × 0.5¹ = 0.1 × 0.5 = 0.05. The learning rate has been halved once.
scheduler = EpochDecayScheduler(initial_lr=0.1, step_size=5, gamma=0.5)
scheduler.get_lr(epoch=10)  # returns 0.025

At epoch 10, two decay steps have occurred. The floor division ⌊10/5⌋ = 2, so the learning rate is 0.1 × 0.5² = 0.1 × 0.25 = 0.025. The learning rate has been halved twice from the initial value.
Constraints