In machine learning and deep learning, transforming raw model outputs (known as logits) into interpretable probabilities is a fundamental operation. The softmax function is a widely used technique that converts a vector of arbitrary real values into a probability distribution where all values are positive and sum to one.
The log-softmax function takes this a step further by computing the natural logarithm of these softmax probabilities. This transformation is particularly valuable because it is more numerically stable than computing softmax and then taking its logarithm in two separate steps, and because log-probabilities are exactly what loss functions such as negative log-likelihood (cross-entropy) consume.
Mathematical Definition:
For an input vector z of length n, the log-softmax of each element is defined as:
$$\text{LogSoftmax}(z_i) = \ln\left(\frac{e^{z_i}}{\sum_{j=1}^{n} e^{z_j}}\right) = z_i - \ln\left(\sum_{j=1}^{n} e^{z_j}\right)$$
This formulation shows that log-softmax equals the original value minus the log of the sum of all exponentials, which is often called the log-sum-exp term.
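Translated literally into code, this definition reads as follows (a sketch in pure Python; the function name is illustrative):

```python
import math

def log_softmax_naive(scores):
    # LogSoftmax(z_i) = z_i - ln(sum_j e^{z_j}), computed directly.
    log_sum_exp = math.log(sum(math.exp(z) for z in scores))
    return [z - log_sum_exp for z in scores]
```

Note that `math.exp(z)` overflows for z near 710, so this literal form fails on large inputs, which is exactly the problem the next section addresses.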
Numerical Stability Consideration:
When input values are large, computing $e^{z_i}$ directly can cause numerical overflow. A common stabilization technique involves subtracting the maximum value from all inputs before applying the exponential:
$$\text{LogSoftmax}(z_i) = (z_i - z_{\max}) - \ln\left(\sum_{j=1}^{n} e^{z_j - z_{\max}}\right)$$
Your Task: Write a Python function that computes the log-softmax transformation for a given 1D array of scores. Return the results rounded to 4 decimal places.
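One way to meet this specification, using the max-subtraction trick from the previous section (a sketch; the function name is an assumption):

```python
import math

def log_softmax(scores):
    """Stable log-softmax of a 1D list, rounded to 4 decimal places."""
    z_max = max(scores)                       # shift so the largest exponent is 0
    shifted = [z - z_max for z in scores]
    log_sum_exp = math.log(sum(math.exp(z) for z in shifted))
    return [round(z - log_sum_exp, 4) for z in shifted]
```

Because of the shift, inputs like [1000, 1001] run without overflow and give the same result as [0, 1].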
Example 1:
Input: scores = [1, 2, 3]
Output: [-2.4076, -1.4076, -0.4076]
Explanation: For input [1, 2, 3], we first compute the exponentials: e¹ ≈ 2.718, e² ≈ 7.389, e³ ≈ 20.086.
The sum of exponentials is: 2.718 + 7.389 + 20.086 ≈ 30.193
The log of this sum is: ln(30.193) ≈ 3.4076
For each element, log-softmax = value - log(sum):
• Element 0: 1 - 3.4076 = -2.4076
• Element 1: 2 - 3.4076 = -1.4076
• Element 2: 3 - 3.4076 = -0.4076
Notice that the highest score (3) produces the largest log-softmax value (-0.4076), indicating the highest probability.
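The arithmetic in this walkthrough can be reproduced in a few lines:

```python
import math

scores = [1, 2, 3]
exps = [math.exp(z) for z in scores]           # ≈ [2.718, 7.389, 20.086]
log_sum = math.log(sum(exps))                  # ≈ 3.4076
result = [round(z - log_sum, 4) for z in scores]
print(result)                                  # → [-2.4076, -1.4076, -0.4076]
```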
Example 2:
Input: scores = [0, 1]
Output: [-1.3133, -0.3133]
Explanation: For input [0, 1], we compute:
• e⁰ = 1.0
• e¹ ≈ 2.718
The sum of exponentials is ≈ 3.718, and the log of this sum is ln(3.718) ≈ 1.3133.
Log-softmax values:
• Element 0: 0 - 1.3133 = -1.3133
• Element 1: 1 - 1.3133 = -0.3133
The difference between the values (1.0) is preserved in the log-softmax output, which is a key property of this transformation.
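This preservation of differences can be demonstrated numerically (the helper below just restates the stable formula from earlier; the name is illustrative):

```python
import math

def log_softmax(scores):
    # Stable form: (z_i - z_max) - ln(sum_j e^{z_j - z_max})
    z_max = max(scores)
    lse = math.log(sum(math.exp(z - z_max) for z in scores))
    return [z - z_max - lse for z in scores]

a = log_softmax([0, 1])
b = log_softmax([10, 11])   # same gap of 1.0, shifted inputs
# The element-wise gap of 1.0 survives the transformation in both cases.
print(round(a[1] - a[0], 4), round(b[1] - b[0], 4))  # → 1.0 1.0
```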
Example 3:
Input: scores = [1, 1, 1, 1]
Output: [-1.3863, -1.3863, -1.3863, -1.3863]
Explanation: When all input scores are identical, the softmax assigns equal probability to each class (1/4 = 0.25 for four elements).
The log of 0.25 is ln(0.25) = ln(1/4) = -ln(4) ≈ -1.3863
Alternatively, using the formula:
• Sum of exponentials: 4 × e¹ ≈ 10.873
• Log of sum ≈ ln(10.873) ≈ 2.3863
• Log-softmax: 1 - 2.3863 = -1.3863 for each element
This demonstrates that uniform inputs produce uniform log-probabilities, each equal to -ln(n) where n is the number of elements.
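The -ln(n) property is easy to verify for a few sizes (again, the helper restates the stable formula and its name is illustrative):

```python
import math

def log_softmax(scores):
    z_max = max(scores)
    lse = math.log(sum(math.exp(z - z_max) for z in scores))
    return [z - z_max - lse for z in scores]

for n in (2, 4, 10):
    out = log_softmax([1.0] * n)
    # Uniform inputs of length n all map to -ln(n).
    assert all(round(v, 4) == round(-math.log(n), 4) for v in out)
```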
Constraints: