In deep neural networks, residual learning with skip connections has revolutionized how we train very deep architectures. The core insight behind residual networks (ResNets) is that instead of learning a direct mapping from input to output, the network learns a residual function that represents the difference between the desired output and the input.
A residual unit consists of:
1. A first linear transformation (weight matrix w1)
2. A ReLU activation
3. A second linear transformation (weight matrix w2)
4. A skip connection that adds the original input back in
5. A final ReLU activation
The mathematical formulation is:
$$\mathbf{y} = \text{ReLU}(F(\mathbf{x}, \{W_i\}) + \mathbf{x})$$
Where:
- $\mathbf{x}$ is the input to the residual unit
- $F(\mathbf{x}, \{W_i\})$ is the residual function with weights $\{W_i\}$ (here, two linear transformations with a ReLU between them)
- $\mathbf{y}$ is the output after adding the skip connection and applying the final ReLU
Why Skip Connections Matter: The identity path lets gradients flow directly to earlier layers, which mitigates the vanishing-gradient problem in very deep networks. It also makes it easy for a unit to learn an identity mapping: if the residual function outputs zero, the input simply passes through unchanged.
Your Task: Implement a function that computes a simple residual unit using NumPy. The unit should:
1. Apply a first linear transformation: h1 = w1 @ x
2. Apply a ReLU activation: a1 = ReLU(h1)
3. Apply a second linear transformation: h2 = w2 @ a1
4. Add the skip connection: sum = h2 + x
5. Apply a final ReLU to produce the output
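The steps above can be sketched in NumPy as follows (the function name `residual_unit` is illustrative, not a required signature):

```python
import numpy as np

def residual_unit(x, w1, w2):
    """Compute ReLU(w2 @ ReLU(w1 @ x) + x) for a 1-D input vector x."""
    x = np.asarray(x, dtype=float)
    h1 = np.asarray(w1, dtype=float) @ x   # step 1: first linear transformation
    a1 = np.maximum(h1, 0.0)               # step 2: first ReLU
    h2 = np.asarray(w2, dtype=float) @ a1  # step 3: second linear transformation
    s = h2 + x                             # step 4: skip connection adds the input
    return np.maximum(s, 0.0)              # step 5: final ReLU

print(residual_unit([1.0, 2.0],
                    [[1.0, 0.0], [0.0, 1.0]],
                    [[0.5, 0.0], [0.0, 0.5]]))  # [1.5 3. ]
```

Note that the skip connection requires w2 @ a1 to have the same shape as x, which holds here because both weight matrices are square.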
x = [1.0, 2.0]
w1 = [[1.0, 0.0], [0.0, 1.0]]
w2 = [[0.5, 0.0], [0.0, 0.5]]

Expected output: [1.5, 3.0]

Let's trace through the residual unit step by step:
Step 1: First linear transformation h₁ = w1 @ x = [[1.0, 0.0], [0.0, 1.0]] @ [1.0, 2.0] = [1.0, 2.0]
Step 2: First ReLU activation a₁ = ReLU(h₁) = ReLU([1.0, 2.0]) = [1.0, 2.0] (All values are positive, so ReLU keeps them unchanged)
Step 3: Second linear transformation h₂ = w2 @ a₁ = [[0.5, 0.0], [0.0, 0.5]] @ [1.0, 2.0] = [0.5, 1.0]
Step 4: Skip connection (add original input) sum = h₂ + x = [0.5, 1.0] + [1.0, 2.0] = [1.5, 3.0]
Step 5: Final ReLU activation output = ReLU([1.5, 3.0]) = [1.5, 3.0]
The final output is [1.5, 3.0].
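The trace above can be checked directly with NumPy:

```python
import numpy as np

x = np.array([1.0, 2.0])
w1 = np.array([[1.0, 0.0], [0.0, 1.0]])
w2 = np.array([[0.5, 0.0], [0.0, 0.5]])

h1 = w1 @ x                   # step 1: [1.0, 2.0]
a1 = np.maximum(h1, 0.0)      # step 2: [1.0, 2.0] (all positive, unchanged)
h2 = w2 @ a1                  # step 3: [0.5, 1.0]
s = h2 + x                    # step 4: [1.5, 3.0]
out = np.maximum(s, 0.0)      # step 5: [1.5, 3.0]
print(out)                    # [1.5 3. ]
```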
x = [1.0, 0.0, -1.0]
w1 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
w2 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

Expected output: [2.0, 0.0, 0.0]

With identity weight matrices and a mixed-sign input:
Step 1: First linear transformation h₁ = w1 @ x = I₃ @ [1.0, 0.0, -1.0] = [1.0, 0.0, -1.0] (Identity matrix preserves the input)
Step 2: First ReLU activation a₁ = ReLU([1.0, 0.0, -1.0]) = [1.0, 0.0, 0.0] (ReLU zeros out the negative value -1.0)
Step 3: Second linear transformation h₂ = w2 @ a₁ = I₃ @ [1.0, 0.0, 0.0] = [1.0, 0.0, 0.0]
Step 4: Skip connection sum = h₂ + x = [1.0, 0.0, 0.0] + [1.0, 0.0, -1.0] = [2.0, 0.0, -1.0]
Step 5: Final ReLU activation output = ReLU([2.0, 0.0, -1.0]) = [2.0, 0.0, 0.0]
The final output is [2.0, 0.0, 0.0], demonstrating how ReLU clips negative values at multiple stages.
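This example reduces to a compact NumPy check, since both weight matrices are the identity:

```python
import numpy as np

x = np.array([1.0, 0.0, -1.0])
I3 = np.eye(3)                      # w1 and w2 are both identity matrices

a1 = np.maximum(I3 @ x, 0.0)        # [1.0, 0.0, 0.0] -- the -1.0 is clipped
h2 = I3 @ a1                        # [1.0, 0.0, 0.0]
out = np.maximum(h2 + x, 0.0)       # ReLU([2.0, 0.0, -1.0]) = [2.0, 0.0, 0.0]
print(out)                          # [2. 0. 0.]
```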
x = [2.0, -3.0]
w1 = [[1.0, 0.0], [0.0, 1.0]]
w2 = [[0.5, 0.0], [0.0, 0.5]]

Expected output: [3.0, 0.0]

Starting with a partially negative input:
Step 1: First linear transformation h₁ = w1 @ x = [[1.0, 0.0], [0.0, 1.0]] @ [2.0, -3.0] = [2.0, -3.0]
Step 2: First ReLU activation a₁ = ReLU([2.0, -3.0]) = [2.0, 0.0] (The negative value -3.0 is clipped to 0)
Step 3: Second linear transformation h₂ = w2 @ a₁ = [[0.5, 0.0], [0.0, 0.5]] @ [2.0, 0.0] = [1.0, 0.0]
Step 4: Skip connection sum = h₂ + x = [1.0, 0.0] + [2.0, -3.0] = [3.0, -3.0]
Step 5: Final ReLU activation output = ReLU([3.0, -3.0]) = [3.0, 0.0]
The final output is [3.0, 0.0]. The second element becomes 0 because, although the skip connection adds back the original -3.0, the final ReLU clips the negative sum to 0.
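The same trace in NumPy, writing the diagonal weight matrices via `np.eye` for brevity:

```python
import numpy as np

x = np.array([2.0, -3.0])
w1 = np.eye(2)                       # [[1.0, 0.0], [0.0, 1.0]]
w2 = 0.5 * np.eye(2)                 # [[0.5, 0.0], [0.0, 0.5]]

a1 = np.maximum(w1 @ x, 0.0)         # [2.0, 0.0] -- the -3.0 is clipped
s = w2 @ a1 + x                      # [1.0, 0.0] + [2.0, -3.0] = [3.0, -3.0]
out = np.maximum(s, 0.0)             # [3.0, 0.0]
print(out)                           # [3. 0.]
```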
Constraints