Understanding the number of trainable parameters in a neural network is crucial for deep learning practitioners: the parameter count directly impacts memory requirements, computational cost, training time, and the model's capacity to learn complex patterns, making it essential for architecture design, deployment optimization, and resource planning.
Given a neural network's architectural specification as a list of layer configurations, your task is to compute the total count of trainable parameters across all layers.
Supported Layer Types:
A dense layer connects every input neuron to every output neuron through learnable weights.
Configuration Keys:
"type": "dense""input_size": Number of input features (integer)"output_size": Number of output neurons (integer)"use_bias": Whether to include a bias term (boolean)Parameter Calculation: $$\text{Parameters} = (\text{input_size} \times \text{output_size}) + \begin{cases} \text{output_size} & \text{if use_bias is True} \ 0 & \text{otherwise} \end{cases}$$
The weight matrix has dimensions (input_size × output_size), and the optional bias vector has length output_size.
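The dense-layer formula above can be sketched as a small helper (the function name `dense_params` is illustrative, not part of the problem's required interface):

```python
def dense_params(cfg):
    """Count trainable parameters for a dense layer config."""
    # Weight matrix: input_size x output_size
    params = cfg["input_size"] * cfg["output_size"]
    # Optional bias vector: one term per output neuron
    if cfg["use_bias"]:
        params += cfg["output_size"]
    return params
```

For a 784→128 layer with bias this yields 784 × 128 + 128 = 100,480, matching the worked example below.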
A convolutional layer applies learnable filters across spatial dimensions of the input.
Configuration Keys:
"type": "conv2d""in_channels": Number of input channels/feature maps (integer)"out_channels": Number of filters/output channels (integer)"kernel_size": Spatial dimensions of each filter (integer for square kernels, or tuple for non-square)"use_bias": Whether to include a bias term per output channel (boolean)Parameter Calculation: $$\text{Parameters} = (\text{kernel_height} \times \text{kernel_width} \times \text{in_channels} \times \text{out_channels}) + \begin{cases} \text{out_channels} & \text{if use_bias is True} \ 0 & \text{otherwise} \end{cases}$$
Each filter has spatial dimensions (kernel_height × kernel_width) with depth equal to in_channels.
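One subtlety here is that "kernel_size" may arrive as an integer (square kernel) or a tuple. A minimal sketch that normalizes both forms (the name `conv2d_params` is illustrative):

```python
def conv2d_params(cfg):
    """Count trainable parameters for a conv2d layer config."""
    k = cfg["kernel_size"]
    # An int means a square kernel; otherwise expect a (height, width) pair.
    kh, kw = (k, k) if isinstance(k, int) else k
    # Each of the out_channels filters spans kh x kw x in_channels weights.
    params = kh * kw * cfg["in_channels"] * cfg["out_channels"]
    # Optional bias: one term per output channel
    if cfg["use_bias"]:
        params += cfg["out_channels"]
    return params
```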
An embedding layer maps discrete indices to dense vector representations through a learnable lookup table.
Configuration Keys:
"type": "embedding""num_embeddings": Size of the vocabulary/dictionary (integer)"embedding_dim": Dimension of each embedding vector (integer)Parameter Calculation: $$\text{Parameters} = \text{num_embeddings} \times \text{embedding_dim}$$
The embedding matrix has dimensions (num_embeddings × embedding_dim).
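The embedding case reduces to a single product over the lookup table's dimensions; note that an embedding config carries no "use_bias" key (the name `embedding_params` is illustrative):

```python
def embedding_params(cfg):
    """Count trainable parameters for an embedding layer config."""
    # Lookup table: one embedding_dim-length vector per vocabulary entry.
    return cfg["num_embeddings"] * cfg["embedding_dim"]
```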
Your Task: Implement a function that iterates through the layer specifications and computes the cumulative parameter count for the entire network architecture.
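A self-contained sketch of one possible solution, dispatching on each layer's "type" key (the function name `count_parameters` is an assumption; the problem does not fix a signature):

```python
def count_parameters(layers):
    """Sum trainable parameters over a list of layer configs."""
    total = 0
    for cfg in layers:
        t = cfg["type"]
        if t == "dense":
            # Weights plus optional bias vector
            n = cfg["input_size"] * cfg["output_size"]
            if cfg["use_bias"]:
                n += cfg["output_size"]
        elif t == "conv2d":
            # Normalize int (square) vs. tuple kernel sizes
            k = cfg["kernel_size"]
            kh, kw = (k, k) if isinstance(k, int) else k
            n = kh * kw * cfg["in_channels"] * cfg["out_channels"]
            if cfg["use_bias"]:
                n += cfg["out_channels"]
        elif t == "embedding":
            # Lookup table: vocabulary size x embedding dimension
            n = cfg["num_embeddings"] * cfg["embedding_dim"]
        else:
            raise ValueError(f"Unknown layer type: {t}")
        total += n
    return total
```

Running it against the examples below reproduces their expected totals.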
layers = [{"type": "dense", "input_size": 784, "output_size": 128, "use_bias": true}, {"type": "dense", "input_size": 128, "output_size": 10, "use_bias": true}]101770This is a classic MNIST classifier architecture with two fully-connected layers.
Layer 1 (Dense 784→128):
• Weight matrix: 784 × 128 = 100,352 parameters
• Bias vector: 128 parameters
• Layer total: 100,352 + 128 = 100,480 parameters

Layer 2 (Dense 128→10):
• Weight matrix: 128 × 10 = 1,280 parameters
• Bias vector: 10 parameters
• Layer total: 1,280 + 10 = 1,290 parameters

Network Total: 100,480 + 1,290 = 101,770 parameters
layers = [{"type": "conv2d", "in_channels": 3, "out_channels": 32, "kernel_size": 3, "use_bias": true}]896A single convolutional layer processing RGB images (3 channels) with 32 output filters using 3×3 kernels.
Conv2D Layer (3→32, kernel 3×3): • Each filter has dimensions: 3 × 3 × 3 = 27 parameters • Number of filters: 32 • Total filter weights: 27 × 32 = 864 parameters • Bias terms (one per output channel): 32 parameters • Layer total: 864 + 32 = 896 parameters
layers = [{"type": "embedding", "num_embeddings": 1000, "embedding_dim": 64}, {"type": "dense", "input_size": 64, "output_size": 10, "use_bias": true}]64650A text classification model with an embedding layer for vocabulary lookup followed by a classification dense layer.
Embedding Layer (vocab=1000, dim=64): • Embedding matrix: 1,000 × 64 = 64,000 parameters
Dense Layer (64→10): • Weight matrix: 64 × 10 = 640 parameters • Bias vector: 10 parameters • Layer total: 640 + 10 = 650 parameters
Network Total: 64,000 + 650 = 64,650 parameters
Constraints