The human brain is perhaps the most sophisticated information processing system in the known universe. With approximately 86 billion neurons connected through an estimated 100 trillion synapses, it performs feats of pattern recognition, reasoning, and learning that still surpass the most advanced artificial systems in many domains.
Understanding how the brain computes is not merely an academic curiosity—it is the foundational inspiration that launched the entire field of neural networks and deep learning. To truly understand artificial neural networks, we must first appreciate the biological machinery they attempt to emulate.
By the end of this page, you will understand the fundamental architecture and operation of biological neurons—from their anatomical structure to their electrochemical signaling mechanisms. This knowledge provides essential context for understanding why artificial neurons are designed the way they are, and what biological features they capture or ignore.
The modern understanding of neural computation begins with the Neuron Doctrine, established in the late 19th century primarily through the work of Santiago Ramón y Cajal and Camillo Golgi. Before their discoveries, scientists debated whether the nervous system was a continuous mesh (the reticular theory) or composed of discrete units.
Ramón y Cajal, using Golgi's staining technique, demonstrated definitively that the nervous system consists of individual, structurally distinct cells—neurons—that communicate with each other at specialized junctions. This insight was revolutionary: it meant that neural computation could be understood as the collective behavior of discrete computational units, each processing and transmitting information.
Ramón y Cajal and Golgi shared the 1906 Nobel Prize in Physiology or Medicine for their work on the structure of the nervous system—despite Golgi never fully accepting the neuron doctrine. This foundation underlies every neural network we build today: the idea that intelligence emerges from networks of simple, discrete computational units.
The key principles of the Neuron Doctrine:
Structural Independence: Neurons are discrete anatomical units with distinct boundaries, not fused into a continuous network
Functional Independence: Each neuron operates as an independent information-processing unit
Connectivity Through Synapses: Neurons communicate through specialized junctions called synapses, where information passes from one neuron to another
Directional Signal Flow: In most cases, information flows in one direction—from dendrites to axon terminals (though we now know there are exceptions)
These principles directly inform the design of artificial neural networks, where we model neurons as discrete units connected through weighted edges that transmit signals in a specified direction.
A typical neuron consists of three main anatomical regions, each playing a distinct role in neural computation:
The soma is the metabolic center of the neuron, containing the nucleus and the molecular machinery required for the cell's survival. But it also serves a critical computational function: it integrates incoming signals from all connected neurons.
The soma is typically 10-100 micrometers in diameter. Its membrane maintains a resting potential of approximately -70 millivolts (mV) relative to the extracellular fluid. This voltage difference—created by ion pumps that maintain unequal concentrations of sodium (Na⁺), potassium (K⁺), and other ions across the membrane—is the foundation of neural signaling.
Dendrites are tree-like branching structures that extend from the soma and serve as the primary input structures of the neuron. The word 'dendrite' comes from the Greek word for 'tree,' reflecting their branching morphology.
Key properties of dendrites:
| Structure | Primary Function | Signal Type | Typical Size |
|---|---|---|---|
| Dendrites | Receive input signals | Graded potentials (passive) | Up to 2 mm total length |
| Soma (Cell Body) | Integrate signals, cell maintenance | Integration zone | 10-100 μm diameter |
| Axon Hillock | Action potential initiation | Threshold detection | ~1 μm |
| Axon | Transmit output signal | Action potentials (digital) | ~1 μm diameter; up to 1+ m in length |
| Axon Terminals | Release neurotransmitters | Chemical transmission | 1-5 μm |
The axon is a single, long projection that carries the neuron's output signal away from the soma to other neurons. While each neuron has only one axon, that axon may branch extensively near its target region.
Critical axon properties:
At its target, the axon branches into many axon terminals (also called synaptic boutons). These specialized structures contain vesicles filled with neurotransmitters—chemical messengers that carry the signal across the synaptic cleft to the next neuron.
When an action potential reaches an axon terminal, it triggers the release of these neurotransmitters into the synaptic cleft, passing the signal to the receiving neuron; the step-by-step sequence is detailed in the synapse section below.
Notice how the biological neuron's architecture suggests a computational model: multiple inputs (dendrites) are integrated (soma), and if the combined input exceeds a threshold (axon hillock), an output is generated (axon). This forms the conceptual basis for the artificial neuron model we'll explore in the next page.
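To make this mapping concrete, here is a minimal sketch in Python of the integrate-and-threshold idea (the numbers are invented for illustration; this is the conceptual model, not a biophysical one): inputs stand in for dendritic signals, signed weights for excitatory and inhibitory synapses, and a threshold for the axon hillock's firing decision.

```python
def threshold_neuron(inputs, weights, threshold):
    """Toy integrate-and-threshold unit mirroring the dendrite/soma/axon-hillock story."""
    # Soma: integrate all weighted inputs (the analog of summing EPSPs and IPSPs)
    integrated = sum(w * x for w, x in zip(weights, inputs))
    # Axon hillock: all-or-nothing decision; the axon carries the result away
    return 1 if integrated > threshold else 0

# Illustrative values: two excitatory synapses (positive weights), one inhibitory (negative)
print(threshold_neuron(inputs=[1.0, 0.5, 1.0],
                       weights=[0.8, 0.6, -0.7],
                       threshold=0.3))  # -> 1, the unit "fires"
```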
Neurons are fundamentally electrochemical devices. They use both electrical signaling (within the neuron) and chemical signaling (between neurons) to process and transmit information. Understanding this dual nature is crucial for appreciating what artificial neurons simplify or abstract away.
At rest, a neuron maintains a voltage difference of approximately -70 mV across its membrane (inside negative relative to outside). This resting potential is established and maintained by:
The sodium-potassium pump (Na⁺/K⁺-ATPase): This active transport protein uses ATP to pump 3 Na⁺ ions out and 2 K⁺ ions in, creating concentration gradients
Ion channel selectivity: The membrane at rest is more permeable to K⁺ than Na⁺, so potassium tends to leak out, making the inside more negative
Electrostatic forces: The resulting charge separation creates an electrical gradient that eventually balances the concentration gradient
The resting potential represents a state of dynamic equilibrium—ions are constantly moving, but the net voltage remains stable. This potential energy, like a cocked spring, is what allows rapid neural signaling.
When neurotransmitters bind to receptors on dendrites, they cause graded potentials—changes in membrane voltage that vary in amplitude based on the strength of the input. These are the neuron's analog input signals.
Properties of graded potentials:
Excitatory postsynaptic potentials (EPSPs): Depolarize the membrane toward the threshold, increasing the probability of firing
Inhibitory postsynaptic potentials (IPSPs): Hyperpolarize the membrane away from threshold, decreasing firing probability
The soma integrates all incoming EPSPs and IPSPs—essentially performing a weighted sum of all inputs, where the weights depend on synapse strength, location, and timing. This is precisely what artificial neurons model with their weighted sum operation.
The biological neuron's integration of EPSPs and IPSPs directly inspired the weighted sum operation in artificial neurons: Σᵢ wᵢxᵢ. Excitatory inputs correspond to positive weights, inhibitory inputs to negative weights. The strength of a synapse maps to the weight magnitude.
If the integrated graded potentials at the axon hillock (the junction between soma and axon) exceed approximately -55 mV (the threshold potential), an action potential is triggered. The action potential is the neuron's digital output signal—it either fires completely or not at all.
The action potential sequence (a simplified simulation sketch follows this list):
Threshold reached: Membrane potential at axon hillock reaches ~-55 mV
Rapid depolarization: Voltage-gated Na⁺ channels open → Na⁺ rushes in → membrane potential shoots to ~+30 mV (within 1 ms)
Repolarization: Na⁺ channels inactivate; voltage-gated K⁺ channels open → K⁺ rushes out → membrane potential returns toward rest
Hyperpolarization: K⁺ channels close slowly → membrane briefly overshoots to ~-80 mV (refractory period)
Return to rest: Na⁺/K⁺ pumps restore resting potential; neuron ready to fire again
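Below is a deliberately crude simulation of this sequence, in the spirit of a leaky integrate-and-fire model (a standard simplification, not the real ion-channel dynamics). The voltage landmarks come from the text; the input drive, membrane time constant, and refractory length are invented for illustration.

```python
V_REST, V_THRESHOLD, V_PEAK, V_HYPER = -70.0, -55.0, 30.0, -80.0  # mV, from the text
TAU_MS, DT_MS = 10.0, 1.0   # membrane time constant and time step (assumed values)

def simulate(drive_mv_per_ms, n_steps=100):
    """Return a membrane-potential trace (mV) and spike times (ms) for a constant drive."""
    v, trace, spikes, refractory = V_REST, [], [], 0
    for t in range(n_steps):
        if refractory > 0:
            # Repolarization and brief hyperpolarization, then return to rest
            refractory -= 1
            v = V_HYPER if refractory > 0 else V_REST
        else:
            # Graded integration: leak toward rest plus the input drive
            v += (-(v - V_REST) / TAU_MS + drive_mv_per_ms) * DT_MS
            if v >= V_THRESHOLD:      # threshold reached at the "axon hillock"
                spikes.append(t)
                v = V_PEAK            # all-or-nothing depolarization
                refractory = 3        # enforced recovery before the next spike
        trace.append(v)
    return trace, spikes

_, spike_times = simulate(drive_mv_per_ms=2.0)
print("spike times (ms):", spike_times)
```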
Key properties of action potentials:
| Property | Graded Potentials | Action Potentials |
|---|---|---|
| Amplitude | Variable (proportional to input) | Fixed (~100 mV total swing) |
| Propagation | Passive, decremental | Active, regenerative |
| Distance | Short (millimeters) | Long (up to meters) |
| Summation | Yes (temporal and spatial) | No (all-or-nothing) |
| Direction | Bidirectional | Unidirectional (axon → terminals) |
| Computational role | Weighted sum of inputs | Thresholded output |
| Artificial analog | Σ wᵢxᵢ | Activation function output |
The synapse is where the computation really happens. It's the interface between neurons—the point where one neuron's output becomes another neuron's input. Understanding synapses is crucial because synaptic weights are what artificial neural networks adjust during learning.
Most synapses in the brain are chemical synapses, where information is transmitted via neurotransmitter molecules. The synapse consists of three parts:
Presynaptic terminal: The axon terminal of the sending neuron, containing vesicles filled with neurotransmitters
Synaptic cleft: A 20-40 nanometer gap between neurons filled with extracellular fluid
Postsynaptic membrane: The receiving neuron's membrane (usually a dendritic spine), containing neurotransmitter receptors
The synaptic transmission process:
Arrival: An action potential reaches the presynaptic terminal
Calcium influx: Voltage-gated Ca²⁺ channels open and Ca²⁺ flows into the terminal
Vesicle fusion: Ca²⁺ triggers synaptic vesicles to fuse with the membrane and release neurotransmitter into the cleft
Diffusion and binding: Neurotransmitter crosses the synaptic cleft and binds to postsynaptic receptors
Postsynaptic response: Receptor activation produces an EPSP or IPSP in the receiving neuron
Major neurotransmitters in the brain:
Glutamate: The primary excitatory neurotransmitter. Binds to AMPA and NMDA receptors. Responsible for most fast excitatory transmission.
GABA (γ-aminobutyric acid): The primary inhibitory neurotransmitter. Binds to GABA receptors. Critical for preventing runaway excitation.
Dopamine: Involved in reward, motivation, and learning. Central to reinforcement learning circuits.
Acetylcholine: Important for attention and memory. Used at neuromuscular junctions.
Serotonin: Modulates mood, sleep, and various cognitive functions.
Receptor types:
Ionotropic receptors: Fast-acting. Neurotransmitter binding directly opens an ion channel. Response in milliseconds.
Metabotropic receptors: Slower but longer-lasting. Neurotransmitter binding triggers intracellular signaling cascades. Response over seconds to minutes.
A minority of synapses are electrical synapses, where neurons are connected by gap junctions—protein channels that directly link the cytoplasm of adjacent neurons and allow direct electrical coupling between them.
The strength of a synapse—determined by factors like the number of neurotransmitter receptors, vesicle release probability, and receptor sensitivity—is what we model as 'weights' in artificial neural networks. When we train a neural network by adjusting weights, we're mimicking the biological process of synaptic strengthening and weakening.
How does the brain learn? The answer lies in synaptic plasticity—the ability of synapses to change their strength based on activity patterns. This is the biological foundation for all learning in neural networks, both biological and artificial.
In 1949, psychologist Donald Hebb proposed a theory of learning that remains central to both neuroscience and machine learning:
"When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A's efficiency, as one of the cells firing B, is increased."
More succinctly: "Neurons that fire together, wire together."
Hebb's rule suggests that if a presynaptic neuron repeatedly contributes to firing a postsynaptic neuron, the connection between them should strengthen. This provides a mechanism for associative learning—for example, why repeatedly seeing a face and hearing a name together causes you to associate them.
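In code, a bare-bones Hebbian update looks like the sketch below, using the common textbook form Δw = η · x_pre · y_post (made-up activity values; this abstracts away the biological growth process Hebb described).

```python
def hebbian_update(w, pre_activity, post_activity, learning_rate=0.1):
    """One Hebbian step: strengthen the connection when pre and post are active together."""
    return w + learning_rate * pre_activity * post_activity

# Illustrative values: the weight only grows when both neurons are active
w = 0.5
print(hebbian_update(w, pre_activity=1.0, post_activity=1.0))  # 0.6  ("fire together, wire together")
print(hebbian_update(w, pre_activity=1.0, post_activity=0.0))  # 0.5  (no postsynaptic firing, no change)
```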
LTP is the primary experimental paradigm for studying synaptic strengthening. Discovered in 1973 by Bliss and Lømo in the hippocampus, LTP demonstrates that synapses can undergo long-lasting increases in transmission efficacy.
Key properties of LTP:
Molecular mechanism:
The NMDA receptor acts as a coincidence detector: it opens only when glutamate is bound (signaling presynaptic activity) AND the postsynaptic membrane is depolarized (which expels the Mg²⁺ ion that otherwise blocks the channel). This implements Hebb's rule at the molecular level.
While backpropagation in artificial neural networks differs mechanistically from LTP, both implement the same core principle: connection strengths change based on the correlation between connected neurons' activities. Backpropagation uses gradient information to determine which direction to change weights; LTP uses local coincidence detection.
LTD is the opposite of LTP—a long-lasting decrease in synaptic strength. It's equally important for learning, as it allows the brain to:
Induction of LTD:
A more refined view of Hebbian learning emerged in the 1990s from experiments on spike-timing-dependent plasticity (STDP): if a presynaptic spike arrives shortly before the postsynaptic neuron fires, the synapse is potentiated; if it arrives shortly after, the synapse is depressed.
This timing dependence implements a form of causality detection—the synapse asks "Did the presynaptic neuron help cause the postsynaptic neuron to fire?" If yes, strengthen; if no, weaken.
The STDP learning rule can be approximated as:
Δw = η × A₊ × e^(-Δt/τ₊)   if Δt > 0 (pre before post)
Δw = -η × A₋ × e^(Δt/τ₋)   if Δt < 0 (pre after post)
Where Δt = t_post - t_pre, and A₊, A₋, τ₊, τ₋ are parameters controlling the magnitude and time constants of potentiation and depression.
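The rule above translates directly into code. The sketch below uses the formula as written; η, A₊, A₋, τ₊, and τ₋ are not specified in the text, so the numeric values are arbitrary placeholders.

```python
import math

def stdp_delta_w(t_pre, t_post, eta=0.01, a_plus=1.0, a_minus=1.0,
                 tau_plus=20.0, tau_minus=20.0):
    """Weight change for one pre/post spike pair under the STDP rule above.

    Parameter values are illustrative; real fits differ by synapse type.
    """
    dt = t_post - t_pre                      # Δt = t_post - t_pre (ms)
    if dt > 0:                               # pre before post -> potentiation
        return eta * a_plus * math.exp(-dt / tau_plus)
    elif dt < 0:                             # pre after post -> depression
        return -eta * a_minus * math.exp(dt / tau_minus)
    return 0.0                               # simultaneous spikes: no change in this sketch

print(stdp_delta_w(t_pre=10.0, t_post=15.0))   # small positive change (causal pairing)
print(stdp_delta_w(t_pre=15.0, t_post=10.0))   # small negative change (anti-causal pairing)
```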
Given that action potentials are all-or-nothing events with stereotyped waveforms, how does the brain encode information? This question of neural coding is fundamental to understanding biological computation and has implications for how we design artificial networks.
The most straightforward coding scheme is rate coding, where information is encoded in the firing rate of neurons—the number of action potentials per unit time.
Evidence for rate coding:
Mathematical representation:
r = f(I)
Where r is the firing rate and f(I) is some function of the input I. This directly corresponds to an artificial neuron's output activation.
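As a tiny worked example (with an invented spike train), a rate-coded output is just a spike count per unit time, which is the quantity an artificial neuron's continuous activation is taken to represent:

```python
def firing_rate(spike_times_ms, window_ms):
    """Rate code: spikes per second within an observation window."""
    return 1000.0 * len(spike_times_ms) / window_ms

# Invented spike train: 8 spikes observed over 200 ms -> 40 spikes/s
spikes = [5, 31, 58, 84, 110, 139, 161, 190]
print(firing_rate(spikes, window_ms=200.0))  # 40.0
```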
Limitations of rate coding:
Temporal coding hypothesizes that precise spike timing carries information beyond just the firing rate.
Forms of temporal coding:
Latency coding: Information in the time to first spike after stimulus onset (observed in visual cortex, olfactory system)
Phase coding: Spike timing relative to ongoing oscillations (observed in hippocampus during navigation)
Synchrony coding: Information in which neurons fire together (observed in sensory binding)
Temporal patterns: Specific sequences of interspike intervals (observed in songbird communication)
Evidence for temporal coding:
Population coding recognizes that single neurons are noisy and limited; information is more reliably represented by populations of neurons.
Examples:
Place cells in hippocampus: Each neuron has a preferred location; the animal's position is encoded by which neurons are active and how strongly
Motor cortex population vectors: Direction of arm movement is encoded by a weighted sum of many neurons' preferred directions (see the sketch below)
Distributed representations: Concepts, objects, and categories are represented by patterns of activity across many neurons, not single 'grandmother cells'
This perspective directly informs artificial neural network design, where we use layers of many neurons to create distributed representations.
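The motor cortex example above can be sketched numerically: estimate the movement direction as the firing-rate-weighted vector sum of each neuron's preferred direction. The neurons, tunings, and rates below are invented for illustration.

```python
import math

def population_vector(preferred_dirs_deg, firing_rates):
    """Decode a direction as the rate-weighted sum of preferred-direction unit vectors."""
    x = sum(r * math.cos(math.radians(d)) for d, r in zip(preferred_dirs_deg, firing_rates))
    y = sum(r * math.sin(math.radians(d)) for d, r in zip(preferred_dirs_deg, firing_rates))
    return math.degrees(math.atan2(y, x)) % 360.0

# Four hypothetical neurons tuned to 0°, 90°, 180°, 270°; firing rates in spikes/s
rates = [20.0, 45.0, 5.0, 10.0]          # strongest response from the 90°-preferring neuron
print(population_vector([0, 90, 180, 270], rates))  # ≈ 66.8°, between 0° and 90°
```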
Standard artificial neurons abstract away temporal dynamics entirely—they compute continuous-valued activations rather than spike trains. Rate coding provides the primary justification: if information is encoded in firing rates, we can model a neuron's output as proportional to its rate, avoiding the complexity of spiking dynamics. Spiking neural networks (SNNs), which model spike timing explicitly, are an active research area that may offer computational and efficiency advantages.
The brain contains a remarkable diversity of neuron types, each specialized for particular computational roles. Understanding this diversity helps us appreciate what artificial neural networks simplify and what future architectures might incorporate.
Sensory neurons (afferent neurons):
Motor neurons (efferent neurons):
Interneurons:
| Type | Structure | Location | Function |
|---|---|---|---|
| Pyramidal cells | Large, pyramid-shaped soma; long apical dendrite | Cerebral cortex, hippocampus | Primary excitatory neurons; cortical computation |
| Purkinje cells | Very large; elaborate dendritic tree in single plane | Cerebellar cortex | Motor learning and coordination |
| Granule cells | Very small; few dendrites | Cerebellum, hippocampus | Pattern separation; most numerous neuron type |
| Stellate cells | Star-shaped dendritic tree | Cortex, cerebellum | Local inhibition |
| Basket cells | Axons form 'baskets' around other cell bodies | Cortex, cerebellum, hippocampus | Powerful inhibition of nearby neurons |
| Chandelier cells | Axon terminals resemble candelabra | Cerebral cortex | Inhibition at axon initial segment |
Excitatory neurons:
Inhibitory neurons:
The balance between excitation and inhibition (E/I balance) is crucial for proper brain function. Disrupted E/I balance is implicated in disorders from epilepsy to autism.
Modulatory neurons release neuromodulators (dopamine, serotonin, norepinephrine, acetylcholine) that don't directly trigger action potentials but alter how circuits respond to other inputs.
These modulatory systems provide global signals that adjust learning rates, attention, and arousal—suggesting biological precedents for concepts like learning rate schedules and attention mechanisms in artificial networks.
With this understanding of biological neurons, we can now appreciate what artificial neural networks capture and what they abstract away. This mapping is crucial for understanding both the power and the limitations of current deep learning approaches.
Preserved biological features:
Weighted summation of inputs: The graded potential integration across dendrites and soma → Σ wᵢxᵢ
Threshold-based activation: The action potential threshold → activation function
Modifiable connection strengths: Synaptic plasticity → weight updates during training
Distributed representations: Population coding of information → hidden layer activations
Hierarchical processing: Multi-level neural pathways → deep architectures with multiple layers
Specialization through learning: Neurons develop selectivity for particular inputs → learned feature detectors
Simplified or ignored biological features:
Temporal dynamics: Real neurons are continuous-time dynamical systems with rich temporal structure; artificial neurons are typically computed instantaneously
Spiking behavior: Action potentials encode information in spike timing; artificial neurons produce continuous activations (rate-code assumption)
Dendritic computation: Biological dendrites perform local nonlinear computations; artificial neurons have a single integration point
Diverse neuron types: The brain has hundreds of distinct cell types; artificial networks typically use uniform units
Neuromodulation: Global signals like dopamine and serotonin modulate computation; artificial networks lack equivalent mechanisms (though attention approximates some functions)
Energy constraints: Biological neurons are energy-efficient, sparse, and event-driven; most artificial networks are dense and energy-intensive
Local learning rules: Synaptic plasticity uses only locally available information; backpropagation requires non-local gradient information
Despite these simplifications, artificial neural networks have achieved remarkable success. This suggests that the core computational principles—weighted summation, nonlinear activation, learned representations, and hierarchical processing—may be more important than the biological details.
However, the biological features we've abstracted away may yet hold keys to further advances, and researchers continue to draw inspiration from neuroscience to improve artificial networks.
You now have a thorough understanding of biological neurons—their structure, electrochemical signaling, synaptic transmission, plasticity, and information coding. This foundation is essential for understanding why artificial neurons are designed the way they are. In the next page, we'll see how these biological insights were distilled into mathematical models of artificial neurons.