In deep learning, the initial values assigned to neural network weights play a crucial role in determining whether a model trains successfully or fails to converge. The Glorot initialization scheme (also known as Xavier initialization) is a widely adopted weight initialization strategy designed to maintain consistent variance of activations and gradients across network layers.
When weights are initialized with values that are too large, activations can explode as they propagate through the network, leading to numerical overflow. Conversely, when weights are too small, gradients become vanishingly tiny during backpropagation, effectively preventing any learning from occurring. Both scenarios result in training failure.
The Glorot initialization method addresses this by scaling the initial weights based on the fan-in (number of input connections) and fan-out (number of output connections) of each layer. This scaling ensures that the variance of activations remains approximately constant as signals flow forward through the network, and that gradient magnitudes remain stable during backpropagation.
Your implementation should support two distribution modes:
1. Uniform: initialize weights by sampling from a uniform distribution U(-limit, limit), where:
$$\text{limit} = \sqrt{\frac{6}{\text{fan\_in} + \text{fan\_out}}}$$
2. Normal: initialize weights by sampling from a normal (Gaussian) distribution N(0, σ²), where the standard deviation is:
$$\sigma = \sqrt{\frac{2}{\text{fan\_in} + \text{fan\_out}}}$$
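Both scale factors depend only on the layer dimensions, so they can be computed directly. A minimal check for a hypothetical 128×64 layer (the dimensions are illustrative, not from the examples):

```python
import math

fan_in, fan_out = 128, 64  # hypothetical layer dimensions
limit = math.sqrt(6 / (fan_in + fan_out))  # uniform-mode bound
sigma = math.sqrt(2 / (fan_in + fan_out))  # normal-mode standard deviation
print(round(limit, 4), round(sigma, 4))    # → 0.1768 0.1021
```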
Write a Python function that implements the Glorot weight initialization scheme. The function should:
• Accept a shape tuple (fan_in, fan_out) specifying the weight matrix dimensions
• Accept a mode argument selecting the 'uniform' or 'normal' distribution
• Accept a seed so that results are reproducible

This initialization technique is fundamental to modern deep learning and is the default weight initialization method in many popular frameworks.
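One possible implementation is sketched below (the name `glorot_init` is a placeholder). Seeding NumPy's legacy global generator with `np.random.seed` reproduces the sample outputs in the examples:

```python
import numpy as np

def glorot_init(shape, mode='uniform', seed=None):
    """Glorot/Xavier initialization for a (fan_in, fan_out) weight matrix."""
    fan_in, fan_out = shape
    if seed is not None:
        np.random.seed(seed)  # legacy global seeding, for reproducibility
    if mode == 'uniform':
        # Sample from U(-limit, limit) with limit = sqrt(6 / (fan_in + fan_out))
        limit = np.sqrt(6.0 / (fan_in + fan_out))
        return np.random.uniform(-limit, limit, size=shape)
    if mode == 'normal':
        # Sample from N(0, sigma^2) with sigma = sqrt(2 / (fan_in + fan_out))
        sigma = np.sqrt(2.0 / (fan_in + fan_out))
        return np.random.normal(0.0, sigma, size=shape)
    raise ValueError("mode must be 'uniform' or 'normal'")
```

With this sketch, `glorot_init((3, 3), 'uniform', seed=42)` yields the first example's matrix after rounding to 4 decimal places.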
Example 1:
shape = (3, 3)
mode = 'uniform'
seed = 42

Output: [[-0.2509, 0.9014, 0.464], [0.1973, -0.688, -0.688], [-0.8838, 0.7324, 0.2022]]

For a 3×3 weight matrix with uniform initialization:
• fan_in = 3, fan_out = 3
• limit = √(6 / (3 + 3)) = √(6 / 6) = √1 = 1.0
• Weights are sampled from U(-1.0, 1.0)
Using seed=42 for reproducibility, NumPy's random generator produces values within this range. After rounding to 4 decimal places, the resulting weight matrix preserves the variance properties needed for stable training.
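The variance property can be spot-checked empirically: a uniform distribution U(-a, a) has variance a²/3, so Glorot-uniform weights have variance (6 / (fan_in + fan_out)) / 3 = 2 / (fan_in + fan_out). A quick sketch (sample size and seed here are arbitrary):

```python
import numpy as np

np.random.seed(0)  # arbitrary seed for this check
limit = np.sqrt(6 / (3 + 3))  # 1.0, as in the example
sample = np.random.uniform(-limit, limit, size=100_000)
# Var[U(-a, a)] = a**2 / 3, so the empirical variance should land near 1/3
print(sample.var())
```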
Example 2:
shape = (2, 2)
mode = 'normal'
seed = 42

Output: [[0.3512, -0.0978], [0.458, 1.0769]]

For a 2×2 weight matrix with normal initialization:
• fan_in = 2, fan_out = 2
• σ = √(2 / (2 + 2)) = √(2 / 4) = √0.5 ≈ 0.7071
• Weights are sampled from N(0, 0.7071²)
Using seed=42, the random generator produces normally distributed values with the calculated standard deviation. The resulting matrix maintains appropriate variance for gradient flow during training.
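The same kind of empirical check works for the normal mode: the sample standard deviation should land near the computed σ (sample size and seed here are arbitrary):

```python
import numpy as np

np.random.seed(0)  # arbitrary seed for this check
sigma = np.sqrt(2 / (2 + 2))  # ≈ 0.7071, as in the example
sample = np.random.normal(0.0, sigma, size=100_000)
print(sample.std())  # should be close to 0.7071
```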
Example 3:
shape = (3, 4)
mode = 'uniform'
seed = 42

Output: [[-0.2323, 0.8346, 0.4296, 0.1827], [-0.6369, -0.637, -0.8183, 0.678], [0.1872, 0.3853, -0.8877, 0.8701]]

For a 3×4 weight matrix (asymmetric dimensions) with uniform initialization:
• fan_in = 3, fan_out = 4
• limit = √(6 / (3 + 4)) = √(6 / 7) ≈ 0.9258
• Weights are sampled from U(-0.9258, 0.9258)
This example demonstrates how asymmetric layer sizes affect the initialization range. The limit adapts to account for the different numbers of input and output connections, ensuring balanced variance regardless of layer shape.
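A minimal check of the adapted limit and of the sampling bound, using the example's seed:

```python
import numpy as np

fan_in, fan_out = 3, 4
limit = np.sqrt(6 / (fan_in + fan_out))
print(round(limit, 4))  # → 0.9258
np.random.seed(42)
W = np.random.uniform(-limit, limit, size=(fan_in, fan_out))
print(np.all(np.abs(W) <= limit))  # every sampled weight stays within the bound
```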
Constraints