The Exponential Linear Unit (ELU) is an advanced activation function designed to address several limitations found in traditional rectified activation functions. While standard rectified activations completely zero out negative inputs, ELU provides a smooth, non-zero response for negative values, which helps neural networks learn more effectively.
The ELU activation function is defined piecewise as follows:
$$f(x) = \begin{cases} x & \text{if } x > 0 \\ \alpha \cdot (e^x - 1) & \text{if } x \leq 0 \end{cases}$$
where α > 0 is a hyperparameter controlling the saturation value for negative inputs. Key properties:
1. Smooth Negative Response: Unlike rectified linear activations that produce hard zeros for negative inputs, ELU outputs a smooth exponential curve that asymptotically approaches -α as x goes to negative infinity. This provides gradient flow even for negative inputs.
2. Zero-Centered Outputs: For negative inputs, ELU produces negative outputs, making the activations more mean-centered around zero. This property can accelerate the learning process by reducing the bias shift effect.
3. Continuous and Differentiable: The function is continuous everywhere and differentiable everywhere except possibly at x = 0 (for α = 1, the one-sided derivatives at x = 0 both equal 1, so it is differentiable there as well). The derivative transitions smoothly between the positive and negative regions.
4. The α Hyperparameter: The alpha parameter determines the minimum value the function can output (approaching -α for very negative inputs). Higher α values allow for larger negative outputs, which can help with the zero-centering effect.
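To make the gradient-flow and smoothness claims above concrete, here is a minimal Python sketch of ELU and its derivative (the names `elu` and `elu_derivative` and the default α = 1.0 are illustrative assumptions, not fixed by the problem):

```python
import math

def elu(x: float, alpha: float = 1.0) -> float:
    # Piecewise definition: x for x > 0, alpha * (e^x - 1) otherwise.
    return x if x > 0 else alpha * (math.exp(x) - 1.0)

def elu_derivative(x: float, alpha: float = 1.0) -> float:
    # d/dx [x] = 1 for x > 0; d/dx [alpha * (e^x - 1)] = alpha * e^x for x <= 0.
    # The derivative is strictly positive, so gradients flow for negative inputs too.
    return 1.0 if x > 0 else alpha * math.exp(x)

# With alpha = 1, both one-sided derivatives at x = 0 equal 1,
# so the two branches meet smoothly.
left_slope = elu_derivative(-1e-9)
right_slope = elu_derivative(1e-9)
```

Note that the derivative never reaches zero on the negative side; it only decays toward zero as x becomes very negative, which is why ELU avoids the hard-zero gradients of standard rectified activations.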
Implement a function that computes the ELU activation for a given input value x and hyperparameter α.
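A direct translation of the piecewise definition might look like the following sketch (the function name and signature are assumptions; the problem does not fix them):

```python
import math

def elu(x: float, alpha: float = 1.0) -> float:
    """Exponential Linear Unit: x for x > 0, alpha * (e^x - 1) for x <= 0."""
    if x > 0:
        return x
    return alpha * (math.exp(x) - 1.0)
```

The single branch on the sign of x mirrors the two cases of the formula exactly, so continuity at x = 0 follows from e^0 - 1 = 0.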
Example 1: Input x = -1.0 (α = 1.0). Expected output: -0.6321.
Since x = -1.0 is less than or equal to 0, we use the exponential formula:
f(-1) = α × (e^(-1) - 1) = 1.0 × (0.3679 - 1) = 1.0 × (-0.6321) = -0.6321
The exponential term e^(-1) ≈ 0.3679, and subtracting 1 gives us the negative activation value.
Example 2: Input x = 2.0. Expected output: 2.0.
Since x = 2.0 is greater than 0, the function simply returns x unchanged:
f(2.0) = 2.0
For positive inputs, ELU behaves identically to the identity function, preserving the linear gradient flow.
Example 3: Input x = 0.0. Expected output: 0.0.
At the boundary x = 0, we apply the negative branch formula:
f(0) = α × (e^0 - 1) = 1.0 × (1 - 1) = 1.0 × 0 = 0.0
This demonstrates the continuity of ELU at x = 0, where both branches meet seamlessly.
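The three worked examples above can be checked numerically with a short script (variable names here are illustrative):

```python
import math

alpha = 1.0

# Example 1: x = -1.0 falls on the exponential branch
out1 = alpha * (math.exp(-1.0) - 1.0)   # approximately -0.6321

# Example 2: x = 2.0 is positive, so the output is x itself
out2 = 2.0

# Example 3: x = 0.0 on the negative branch gives exactly 0
out3 = alpha * (math.exp(0.0) - 1.0)
```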
Constraints