At the heart of every modern deep learning framework lies an automatic differentiation engine (commonly called autograd). This powerful mechanism automatically computes gradients of complex mathematical expressions—enabling neural networks to learn through backpropagation without requiring engineers to manually derive and implement gradient formulas.
In this problem, you will build a simplified version of such an engine by implementing a Value class that wraps scalar values and tracks the operations performed on them. Your implementation must support three core operations:
When two Value objects are added, the result is a new Value containing their sum. During backpropagation, the gradient flows equally to both operands:
$$\frac{\partial (a + b)}{\partial a} = 1, \quad \frac{\partial (a + b)}{\partial b} = 1$$
When two Value objects are multiplied, the result is their product. The gradients follow the product rule:
$$\frac{\partial (a \cdot b)}{\partial a} = b, \quad \frac{\partial (a \cdot b)}{\partial b} = a$$
The Rectified Linear Unit activation function returns the input if it's positive, or zero otherwise:
$$\text{ReLU}(x) = \max(0, x)$$
The gradient of ReLU is:
$$\frac{\partial \text{ReLU}(x)}{\partial x} = \begin{cases} 1 & \text{if } x > 0 \\ 0 & \text{if } x \leq 0 \end{cases}$$
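The forward and backward rules for ReLU can be sketched as plain helper functions (a minimal illustration; the function names are ours, not part of the required interface):

```python
def relu(x):
    """Forward pass: max(0, x)."""
    return x if x > 0 else 0.0

def relu_grad(x):
    """Local derivative of ReLU with respect to its input."""
    return 1.0 if x > 0 else 0.0

print(relu(-28))       # 0.0
print(relu_grad(-28))  # 0.0
print(relu_grad(3))    # 1.0
```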
Your Value class must implement a backward() method that performs backpropagation from the output node back through the entire computational graph, visiting nodes in reverse topological order and applying the chain rule at each one.
The Chain Rule:
If $y = f(g(x))$, then
$$\frac{dy}{dx} = \frac{dy}{dg} \cdot \frac{dg}{dx}$$
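A concrete numeric check of the chain rule (the example values here are ours): take $y = \text{ReLU}(a \cdot b)$ at $a = 3$, $b = 4$, and multiply the local derivatives:

```python
# Chain rule for y = ReLU(g) with g = a * b, at a = 3, b = 4.
a, b = 3.0, 4.0
g = a * b                      # inner function: g = a*b = 12
y = max(0.0, g)                # outer function: y = ReLU(g)

dy_dg = 1.0 if g > 0 else 0.0  # local derivative of ReLU (g = 12 > 0, so 1)
dg_da = b                      # product rule: d(a*b)/da = b
dy_da = dy_dg * dg_da          # chain rule: dy/da = dy/dg * dg/da
print(dy_da)                   # 4.0
```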
Your Task:
Implement the Value class with __init__, __add__, __mul__, relu, and backward methods. After calling backward() on the final output, each Value object's grad attribute should contain the gradient of the output with respect to that value.
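One possible shape for such a class is sketched below. This is not the only valid implementation, and internal names like `_backward` and `_prev` are our own choices rather than part of the required interface:

```python
class Value:
    """Scalar value that records the operations that produced it."""

    def __init__(self, data, _children=()):
        self.data = data
        self.grad = 0.0
        self._backward = lambda: None  # closes over the local gradient rule
        self._prev = set(_children)    # parent nodes in the graph

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))

        def _backward():
            self.grad += out.grad   # d(a+b)/da = 1
            other.grad += out.grad  # d(a+b)/db = 1
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))

        def _backward():
            self.grad += other.data * out.grad  # d(a*b)/da = b
            other.grad += self.data * out.grad  # d(a*b)/db = a
        out._backward = _backward
        return out

    def relu(self):
        out = Value(self.data if self.data > 0 else 0.0, (self,))

        def _backward():
            self.grad += (1.0 if self.data > 0 else 0.0) * out.grad
        out._backward = _backward
        return out

    def backward(self):
        # Topologically sort the graph, then apply local rules in reverse.
        topo, visited = [], set()

        def build(v):
            if v not in visited:
                visited.add(v)
                for child in v._prev:
                    build(child)
                topo.append(v)
        build(self)

        self.grad = 1.0  # seed: d(output)/d(output) = 1
        for v in reversed(topo):
            v._backward()

a, b = Value(2.0), Value(5.0)
out = a * b
out.backward()
print(a.grad, b.grad)  # 5.0 2.0
```

Note that gradients are accumulated with `+=` rather than assigned, so a value used in several places in the graph collects contributions from every path.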
values = {'a': 2, 'b': -3, 'c': 10}
expression = "(a + b * c).relu()"

Expected output: {'a': {'data': 2, 'grad': 0.0}, 'b': {'data': -3, 'grad': 0.0}, 'c': {'data': 10, 'grad': 0.0}}

Let's trace through the computation:
Forward Pass:
• b * c = (-3) × 10 = -30
• a + (b * c) = 2 + (-30) = -28
• ReLU(-28) = max(0, -28) = 0
Backward Pass: Starting with output gradient = 1.0:
• The ReLU input is -28 ≤ 0, so the local ReLU gradient is 0
• This means zero gradient flows backward to all inputs
• Therefore: grad(a) = 0, grad(b) = 0, grad(c) = 0
The ReLU "kills" the gradient because the input was negative, blocking all gradient flow to the upstream variables.
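The zero gradients can be verified numerically with a finite-difference check, with no Value class needed (`eps` is an arbitrary small step we chose):

```python
def f(a, b, c):
    # The example expression (a + b*c).relu(), as a plain function.
    return max(0.0, a + b * c)

eps = 1e-6
base = f(2, -3, 10)  # 0.0, since a + b*c = -28
grad_a = (f(2 + eps, -3, 10) - base) / eps
grad_b = (f(2, -3 + eps, 10) - base) / eps
grad_c = (f(2, -3, 10 + eps) - base) / eps
print(grad_a, grad_b, grad_c)  # 0.0 0.0 0.0
```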
values = {'a': 3, 'b': 4}
expression = "a + b"

Expected output: {'a': {'data': 3, 'grad': 1.0}, 'b': {'data': 4, 'grad': 1.0}}

Forward Pass:
• a + b = 3 + 4 = 7
Backward Pass: Starting with output gradient = 1.0:
• For addition: ∂(a+b)/∂a = 1 and ∂(a+b)/∂b = 1
• Applying the chain rule: grad(a) = 1.0 × 1 = 1.0, grad(b) = 1.0 × 1 = 1.0
Both inputs contribute equally to the sum, so they receive equal gradients.
values = {'a': 2, 'b': 5}
expression = "a * b"

Expected output: {'a': {'data': 2, 'grad': 5.0}, 'b': {'data': 5, 'grad': 2.0}}

Forward Pass:
• a * b = 2 × 5 = 10
Backward Pass: Starting with output gradient = 1.0:
• For multiplication: ∂(ab)/∂a = b = 5 and ∂(ab)/∂b = a = 2
• Applying the chain rule: grad(a) = 1.0 × 5 = 5.0, grad(b) = 1.0 × 2 = 2.0
The gradient of each variable in a product equals the value of the other variable. This reflects how much the output changes when each input changes by a small amount.
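This "small change" interpretation can be confirmed with a finite-difference approximation (again with an `eps` step of our choosing):

```python
def f(a, b):
    return a * b

eps = 1e-6
da = (f(2 + eps, 5) - f(2, 5)) / eps  # ≈ 5, matching grad(a) = b
db = (f(2, 5 + eps) - f(2, 5)) / eps  # ≈ 2, matching grad(b) = a
print(round(da, 3), round(db, 3))  # 5.0 2.0
```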
Constraints