In deep learning, activation functions introduce non-linearity into neural networks, enabling them to learn complex patterns and representations. Among the family of modern activations, the self-gated activation function has emerged as a particularly effective alternative to traditional functions like ReLU.
The self-gated activation function works by multiplying the input by its own sigmoid-transformed value, creating a self-regulating gate that controls the information flow:
$$f(x) = x \cdot \sigma(x) = \frac{x}{1 + e^{-x}}$$
where σ(x) is the sigmoid function: $\sigma(x) = \frac{1}{1 + e^{-x}}$
Key Characteristics:
Smooth and Non-Monotonic: Unlike ReLU, which has a sharp corner at zero, this function is infinitely differentiable, providing smoother gradients during backpropagation. It is also non-monotonic: it dips slightly below zero for moderately negative inputs before rising.
Self-Gating Mechanism: The output is bounded below (with a global minimum of about −0.278, and approaching 0 for large negative inputs) but unbounded above, with the sigmoid component acting as a soft gate that modulates the input.
Non-Zero for Negative Inputs: Unlike ReLU, which outputs zero for all negative inputs (potentially causing "dying neurons"), this function can output small negative values, maintaining gradient flow.
Asymptotic Behavior: As x → +∞, σ(x) → 1, so f(x) → x and the function approaches the identity; as x → −∞, σ(x) → 0 faster than x grows, so f(x) → 0.
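These limits are easy to check numerically; a quick sketch (the helper name `f` is just for illustration):

```python
import math

def f(x):
    # f(x) = x * sigmoid(x) = x / (1 + e^(-x))
    return x / (1.0 + math.exp(-x))

print(f(10.0))    # close to 10: the gate is nearly fully open
print(f(-10.0))   # close to 0: the gate is nearly closed
```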
Your Task: Write a Python function that computes the self-gated activation value for a given input. The function should return the result rounded to 4 decimal places for numerical precision.
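A minimal implementation sketch, using only the standard `math` module and the 4-decimal rounding described above (the function name `self_gated` is illustrative, not prescribed by the problem):

```python
import math

def self_gated(x: float) -> float:
    """Compute f(x) = x * sigmoid(x), rounded to 4 decimal places."""
    sigmoid = 1.0 / (1.0 + math.exp(-x))
    return round(x * sigmoid, 4)
```

Note that `math.exp(-x)` can overflow for very large negative x; a production version might clamp the input or branch on its sign.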
Input: x = 1.0, Output: 0.7311
For x = 1.0:
First, compute the sigmoid: σ(1.0) = 1 / (1 + e⁻¹) = 1 / (1 + 0.3679) ≈ 0.7311
Then, multiply by the input: f(1.0) = 1.0 × 0.7311 = 0.7311
Notice that for positive inputs, the self-gated activation returns a value close to but slightly less than the input itself, due to the gating mechanism (sigmoid is < 1 for finite values).
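The two steps above can be traced directly in Python (the intermediate values shown in the comments are rounded):

```python
import math

x = 1.0
sig = 1.0 / (1.0 + math.exp(-x))   # step 1: sigmoid, about 0.7311
out = x * sig                       # step 2: gate the input, about 0.7311
print(round(sig, 4), round(out, 4))
```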
Input: x = 2.0, Output: 1.7616
For x = 2.0:
Compute the sigmoid: σ(2.0) = 1 / (1 + e⁻²) = 1 / (1 + 0.1353) ≈ 0.8808
Multiply by input: f(2.0) = 2.0 × 0.8808 = 1.7616
As the input increases, the sigmoid approaches 1, so the output gets closer to the identity function. Here, f(2.0) ≈ 0.88 × 2.0, showing the gate is nearly fully open.
Input: x = -1.0, Output: -0.2689
For x = -1.0:
Compute the sigmoid: σ(-1.0) = 1 / (1 + e¹) = 1 / (1 + 2.7183) ≈ 0.2689
Multiply by input: f(-1.0) = -1.0 × 0.2689 = -0.2689
This demonstrates a key advantage over ReLU: negative inputs produce small negative outputs rather than being completely zeroed out. This preserves gradient flow and prevents the "dying neuron" problem.
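The contrast with ReLU can be seen side by side in a small sketch (here `relu` is simply `max(0, x)`; the helper names are illustrative):

```python
import math

def self_gated(x):
    # x * sigmoid(x)
    return x / (1.0 + math.exp(-x))

def relu(x):
    return max(0.0, x)

# ReLU zeroes every negative input; the self-gated function
# keeps a small negative value, so gradients are not killed.
for x in (-2.0, -1.0, -0.5):
    print(x, round(self_gated(x), 4), relu(x))
```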
Constraints