The Parametric Rectified Linear Unit (PReLU) is an advanced activation function used extensively in deep learning architectures. It builds upon the standard Rectified Linear Unit (ReLU) by introducing a learnable slope parameter α (alpha) that controls the behavior for negative input values, addressing the "dying ReLU" problem where neurons can become permanently inactive.
The PReLU activation function is defined piecewise as:
$$\text{PReLU}(x) = \begin{cases} x & \text{if } x > 0 \\ \alpha \cdot x & \text{if } x \leq 0 \end{cases}$$
where x is the input value and α (alpha) is a learnable parameter that controls the slope for negative inputs.
This can also be expressed compactly as:
$$\text{PReLU}(x) = \max(0, x) + \alpha \cdot \min(0, x)$$
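The two formulations are equivalent, which a quick sketch can confirm (function names here are illustrative, not part of the problem statement):

```python
def prelu_piecewise(x, alpha):
    """Piecewise definition: x if x > 0, else alpha * x."""
    return x if x > 0 else alpha * x

def prelu_compact(x, alpha):
    """Compact form: max(0, x) + alpha * min(0, x)."""
    return max(0.0, x) + alpha * min(0.0, x)

# Both forms agree on positive, negative, and zero inputs
for x in (-2.0, -0.5, 0.0, 1.5):
    assert prelu_piecewise(x, 0.25) == prelu_compact(x, 0.25)
```

The compact form is convenient in vectorized frameworks, since max and min apply elementwise without branching.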
For positive inputs (x > 0): The function behaves identically to ReLU, returning the input unchanged. This preserves the gradient flow for positive activations.
For negative inputs (x ≤ 0): Instead of outputting zero (as ReLU does), PReLU multiplies the input by the parameter α. This creates a scaled linear response in the negative domain, allowing gradients to propagate through the network even for negative activations.
The alpha parameter is typically initialized to a small positive value (0.25 in the original PReLU paper by He et al.) and learned jointly with the other network weights during backpropagation. It can be shared across all channels or learned per-channel.
When α = 0, PReLU reduces to standard ReLU. When α = 1, it becomes the identity function. When α < 0, it creates a non-monotonic function (rarely used).
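These special cases can be checked directly (a minimal sketch using the compact form):

```python
def prelu(x, alpha):
    """PReLU via the compact form: max(0, x) + alpha * min(0, x)."""
    return max(0.0, x) + alpha * min(0.0, x)

# alpha = 0 reduces PReLU to standard ReLU: negatives map to 0
assert prelu(-3.0, 0.0) == 0.0

# alpha = 1 makes PReLU the identity function
assert prelu(-3.0, 1.0) == -3.0

# Positive inputs pass through unchanged regardless of alpha
assert prelu(2.0, 0.0) == prelu(2.0, 1.0) == 2.0
```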
Implement a function that computes the PReLU activation value for a given input value x and alpha parameter. The function should correctly handle the piecewise nature of the activation, applying the identity function for positive inputs and the scaled linear function for non-positive inputs.
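A minimal implementation sketch following the piecewise definition (the function name and default alpha are illustrative choices, not requirements of the problem):

```python
def prelu(x: float, alpha: float = 0.25) -> float:
    """Compute the PReLU activation for a single input.

    Positive inputs are returned unchanged (identity branch);
    non-positive inputs are scaled by alpha (leaky branch).
    """
    if x > 0:
        return x
    return alpha * x
```

The single comparison `x > 0` handles both branches; x = 0 falls through to the scaled branch, where alpha * 0 = 0 matches the identity branch anyway.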
Input: x = -2.0, alpha = 0.25. Output: -0.5.
Since x = -2.0 is negative (x ≤ 0), we apply the negative branch of the PReLU function:
PReLU(x) = α × x = 0.25 × (-2.0) = -0.5
The negative input is scaled by alpha, allowing a small gradient to flow through during backpropagation, which helps prevent dead neurons.
Input: x = 3.0, alpha = 0.25. Output: 3.0.
Since x = 3.0 is positive (x > 0), we apply the positive branch of the PReLU function:
PReLU(x) = x = 3.0
For positive inputs, PReLU behaves identically to ReLU, returning the input unchanged. The alpha parameter has no effect when x > 0.
Input: x = 0.0, alpha = 0.25. Output: 0.0.
At the boundary when x = 0, we apply the non-positive branch:
PReLU(x) = α × x = 0.25 × 0.0 = 0.0
Note that both branches technically give the same result at x = 0, making the function continuous. This ensures smooth gradient flow at the origin.
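The three worked examples above can be verified with a few lines (a sketch, not a reference solution):

```python
def prelu(x, alpha):
    """PReLU: identity for x > 0, alpha-scaled line for x <= 0."""
    return x if x > 0 else alpha * x

# Example 1: negative input is scaled by alpha
assert prelu(-2.0, 0.25) == -0.5

# Example 2: positive input passes through unchanged
assert prelu(3.0, 0.25) == 3.0

# Example 3: both branches agree at the origin
assert prelu(0.0, 0.25) == 0.0
```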
Constraints