In our journey through probabilistic graphical models, we've explored Bayesian Networks—directed acyclic graphs that elegantly capture cause-and-effect relationships. But many real-world phenomena exhibit symmetric dependencies that resist directional interpretation. Consider the pixels in an image: neighboring pixels influence each other, but there's no inherent 'cause' flowing from one pixel to another. Similarly, atoms in a crystal lattice interact symmetrically, and words in a document exhibit mutual associations rather than causal chains.
These scenarios demand a different representational framework: Undirected Graphical Models, also known as Markov Random Fields (MRFs) or Markov Networks. These models define probability distributions through symmetric relationships between variables, opening doors to applications ranging from computer vision to statistical physics to natural language processing.
By the end of this page, you will understand the fundamental structure of undirected graphical models, master the concepts of cliques and neighborhoods, learn how undirected graphs encode conditional independence, and appreciate the key differences between directed and undirected models. This foundation is essential for understanding potential functions, the partition function, and the celebrated Hammersley-Clifford theorem.
An undirected graphical model represents a probability distribution using an undirected graph $G = (V, E)$, where:

- the nodes $V$ represent random variables, and
- the edges $E$ represent direct probabilistic interactions between pairs of variables.
Unlike directed graphs where edges have arrows indicating direction, undirected edges are symmetric: if node $A$ is connected to node $B$, then $B$ is equally connected to $A$. This symmetry fundamentally changes how we interpret relationships and factorize probability distributions.
Undirected graphical models go by many names in the literature: Markov Random Fields (MRFs), Markov Networks, and Undirected Graphical Models. In physics, they're often called Gibbs Random Fields due to their connection to statistical mechanics. All these terms refer to the same mathematical framework.
Formal Definition:
Let $\mathbf{X} = (X_1, X_2, \ldots, X_n)$ be a set of random variables. An undirected graphical model over $\mathbf{X}$ consists of:

- an undirected graph $G = (V, E)$ whose nodes correspond to the variables $X_1, \ldots, X_n$, and
- a set of potential functions defined over the cliques of $G$ (introduced in detail on the next page).
The graph structure encodes which variables directly interact, while the potential functions quantify the strength and nature of those interactions.
```python
import numpy as np
import networkx as nx
import matplotlib.pyplot as plt
from typing import Dict, List, Set, Tuple


class UndirectedGraphicalModel:
    """
    A basic implementation of an undirected graphical model structure.
    This class focuses on the graph topology and neighborhood relations.
    """

    def __init__(self, num_nodes: int):
        """
        Initialize an undirected graphical model with isolated nodes.

        Args:
            num_nodes: Number of random variables (nodes) in the model
        """
        self.num_nodes = num_nodes
        self.graph = nx.Graph()
        self.graph.add_nodes_from(range(num_nodes))
        self.node_labels: Dict[int, str] = {i: f"X_{i}" for i in range(num_nodes)}

    def add_edge(self, node_i: int, node_j: int):
        """
        Add an undirected edge between two nodes.
        This represents a direct dependency between the variables.

        Args:
            node_i: First node index
            node_j: Second node index
        """
        if node_i == node_j:
            raise ValueError("Self-loops are not allowed in standard MRFs")
        self.graph.add_edge(node_i, node_j)

    def neighbors(self, node: int) -> Set[int]:
        """
        Get the neighbors of a node (its Markov blanket in undirected graphs).

        In undirected graphs, the Markov blanket is simply the set of directly
        connected nodes—much simpler than in Bayesian networks!

        Args:
            node: Node index

        Returns:
            Set of neighbor node indices
        """
        return set(self.graph.neighbors(node))

    def is_neighbor(self, node_i: int, node_j: int) -> bool:
        """Check if two nodes are directly connected."""
        return self.graph.has_edge(node_i, node_j)

    def degree(self, node: int) -> int:
        """Get the degree (number of neighbors) of a node."""
        return self.graph.degree(node)

    def get_all_edges(self) -> List[Tuple[int, int]]:
        """Return all edges in the graph."""
        return list(self.graph.edges())

    def visualize(self, title: str = "Undirected Graphical Model"):
        """Visualize the graph structure."""
        plt.figure(figsize=(10, 8))
        pos = nx.spring_layout(self.graph, seed=42)
        nx.draw_networkx_nodes(self.graph, pos, node_color='lightblue',
                               node_size=700, alpha=0.9)
        nx.draw_networkx_edges(self.graph, pos, edge_color='gray',
                               width=2, alpha=0.7)
        nx.draw_networkx_labels(self.graph, pos, labels=self.node_labels,
                                font_size=12, font_weight='bold')
        plt.title(title, fontsize=14)
        plt.axis('off')
        plt.tight_layout()
        return plt.gcf()


# Example: Create a simple 4-node undirected graphical model
# This represents the classic "diamond" or "square" structure
model = UndirectedGraphicalModel(num_nodes=4)
model.node_labels = {0: "A", 1: "B", 2: "C", 3: "D"}

# Add edges to form a square with one diagonal
model.add_edge(0, 1)  # A -- B
model.add_edge(1, 2)  # B -- C
model.add_edge(2, 3)  # C -- D
model.add_edge(3, 0)  # D -- A
model.add_edge(0, 2)  # A -- C (diagonal)

print("Graph Structure Analysis:")
print("=" * 40)
for node in range(4):
    name = model.node_labels[node]
    neighbors = [model.node_labels[n] for n in model.neighbors(node)]
    print(f"Node {name}: neighbors = {neighbors}, degree = {model.degree(node)}")

print(f"\nTotal edges: {len(model.get_all_edges())}")
print(f"Edges: {[(model.node_labels[i], model.node_labels[j]) for i, j in model.get_all_edges()]}")
```

In undirected graphical models, the concept of neighborhood is fundamental. The neighborhood of a node $X_i$, denoted $\text{ne}(i)$ or $\mathcal{N}(i)$, is the set of all nodes directly connected to $X_i$ by an edge:
$$\text{ne}(i) = \{j : (i, j) \in E\}$$
This simple definition has profound implications for probabilistic reasoning.
In Bayesian networks, the Markov blanket includes parents, children, and co-parents (spouses). In undirected models, the Markov blanket is simply the neighborhood—the directly connected nodes. This simplicity is one of the elegant properties of undirected models.
The Local Markov Property:
The neighborhood defines the Markov blanket in undirected models. Given its neighbors, a variable is conditionally independent of all other variables in the graph:
$$X_i \perp\!\!\!\perp \{X_j : j \notin \text{ne}(i) \cup \{i\}\} \mid \{X_k : k \in \text{ne}(i)\}$$
In plain language: Once you know a node's neighbors, knowing any other nodes in the graph provides no additional information about that node.
This property is called the Local Markov Property and is fundamental to inference in MRFs.
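To see the property numerically, here is a minimal sketch (an assumed toy example with made-up potential values, separate from the code used elsewhere on this page): we build an explicit joint distribution for a three-node chain A - B - C from pairwise potentials and check that $P(A \mid B, C)$ does not depend on $C$, which is exactly the independence $A \perp\!\!\!\perp C \mid B$ predicted by the local Markov property.

```python
import numpy as np
from itertools import product

# Shared pairwise "agreement" potential (assumed values for illustration)
psi = np.array([[2.0, 0.5],
                [0.5, 2.0]])

# Unnormalized joint p(a, b, c) = psi(a, b) * psi(b, c) on the chain A - B - C
joint = np.zeros((2, 2, 2))
for a, b, c in product([0, 1], repeat=3):
    joint[a, b, c] = psi[a, b] * psi[b, c]
joint /= joint.sum()  # normalize to a proper distribution

# Local Markov check: P(A | B=b, C=c) should be the same for both values of c
for b in (0, 1):
    for c in (0, 1):
        p_a = joint[:, b, c] / joint[:, b, c].sum()
        print(f"P(A | B={b}, C={c}) = {np.round(p_a, 4)}")
```

For each value of $B$, both values of $C$ print the same conditional distribution over $A$: conditioning on the single neighbor $B$ already tells us everything the rest of the chain could.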
| Aspect | Bayesian Network (Directed) | Markov Random Field (Undirected) |
|---|---|---|
| Markov Blanket Components | Parents + Children + Spouses | Neighbors only |
| Blanket Size Calculation | More complex, depends on graph structure | Simply the degree of the node |
| Intuition | Shield from causal influences | Shield from local interactions |
| Finding Blanket | Must trace parent-child relationships | Just look at adjacent edges |
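To make the table concrete, here is a short sketch of both blanket computations. The dictionary-based graph representations are hypothetical, chosen only for illustration.

```python
from typing import Dict, List, Set

def bn_markov_blanket(parents: Dict[str, List[str]], node: str) -> Set[str]:
    """Directed case: parents + children + spouses (co-parents)."""
    children = {c for c, pa in parents.items() if node in pa}
    spouses = {p for c in children for p in parents[c]} - {node}
    return set(parents[node]) | children | spouses

def mrf_markov_blanket(neighbors: Dict[str, List[str]], node: str) -> Set[str]:
    """Undirected case: the blanket is just the neighborhood."""
    return set(neighbors[node])

# Directed v-structure A -> C <- B with C -> D: A's blanket picks up spouse B.
bn = {"A": [], "B": [], "C": ["A", "B"], "D": ["C"]}
print(bn_markov_blanket(bn, "A"))    # {'B', 'C'} (set order may vary)

# An undirected graph over the same variables (adjacency lists):
mrf = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B", "D"], "D": ["C"]}
print(mrf_markov_blanket(mrf, "A"))  # {'B', 'C'}: just read off the neighbors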
Boundary and Separator Sets:
Beyond individual neighborhoods, we can define important set-theoretic concepts:
Boundary of a set $A \subseteq V$: The set of nodes outside $A$ that are neighbors to at least one node in $A$: $$\text{bd}(A) = \{j \notin A : \exists i \in A,\ (i,j) \in E\}$$
Separator set: A set $S$ that separates sets $A$ and $B$ if every path from $A$ to $B$ passes through $S$.
These concepts generalize the local Markov property to sets of variables and are crucial for understanding conditional independence in complex graphs.
```python
from typing import Set, FrozenSet


class MarkovBlanketAnalyzer:
    """
    Analyzes Markov blankets and separation properties in undirected graphs.
    """

    def __init__(self, model: UndirectedGraphicalModel):
        self.model = model
        self.graph = model.graph

    def markov_blanket(self, node: int) -> Set[int]:
        """
        Compute the Markov blanket of a node.
        In undirected graphs, this is simply the neighborhood.

        Args:
            node: Node index

        Returns:
            Set of nodes in the Markov blanket
        """
        return self.model.neighbors(node)

    def boundary(self, node_set: Set[int]) -> Set[int]:
        """
        Compute the boundary of a set of nodes.

        The boundary consists of all nodes NOT in the set that are
        adjacent to at least one node IN the set.

        Args:
            node_set: Set of node indices

        Returns:
            Set of boundary nodes
        """
        boundary_nodes = set()
        for node in node_set:
            for neighbor in self.model.neighbors(node):
                if neighbor not in node_set:
                    boundary_nodes.add(neighbor)
        return boundary_nodes

    def is_separator(self, separator: Set[int], set_a: Set[int],
                     set_b: Set[int]) -> bool:
        """
        Check if a separator set separates two sets of nodes.

        Uses graph-theoretic path analysis: S separates A from B if
        removing S disconnects all paths between A and B.

        Args:
            separator: Proposed separator set
            set_a: First set of nodes
            set_b: Second set of nodes

        Returns:
            True if separator properly separates set_a from set_b
        """
        # Create subgraph without separator nodes
        remaining_nodes = set(range(self.model.num_nodes)) - separator
        subgraph = self.graph.subgraph(remaining_nodes)

        # Check if any node in set_a can reach any node in set_b
        for a in set_a:
            if a in separator:
                continue
            for b in set_b:
                if b in separator:
                    continue
                if nx.has_path(subgraph, a, b):
                    return False
        return True

    def conditional_independence_test(self, x: int, y: int,
                                      conditioning_set: Set[int]) -> bool:
        """
        Test if X_x ⊥⊥ X_y | X_conditioning_set based on graph structure.

        In undirected graphs, conditional independence holds if the
        conditioning set separates the two nodes (blocks all paths).

        Args:
            x: First node
            y: Second node
            conditioning_set: Conditioning set

        Returns:
            True if the conditional independence holds graphically
        """
        return self.is_separator(conditioning_set, {x}, {y})


# Demonstrate Markov blanket and separation analysis
model = UndirectedGraphicalModel(num_nodes=6)
model.node_labels = {0: "A", 1: "B", 2: "C", 3: "D", 4: "E", 5: "F"}

# Create a chain with branches:
#     E
#     |
# A - B - C - D
#     |
#     F
model.add_edge(0, 1)  # A - B
model.add_edge(1, 2)  # B - C
model.add_edge(2, 3)  # C - D
model.add_edge(1, 4)  # B - E
model.add_edge(1, 5)  # B - F

analyzer = MarkovBlanketAnalyzer(model)

print("Markov Blanket Analysis:")
print("=" * 50)
for node in range(6):
    name = model.node_labels[node]
    blanket = [model.node_labels[n] for n in analyzer.markov_blanket(node)]
    print(f"Markov blanket of {name}: {blanket}")

print("\nConditional Independence Analysis:")
print("=" * 50)

# Test: Is A ⊥⊥ D | {B, C}?
ci_test = analyzer.conditional_independence_test(0, 3, {1, 2})
print(f"A ⊥⊥ D | {{B, C}}: {ci_test}")  # Should be True

# Test: Is A ⊥⊥ D | {B}?
ci_test = analyzer.conditional_independence_test(0, 3, {1})
print(f"A ⊥⊥ D | {{B}}: {ci_test}")  # Should be True (path A-B-C-D blocked at B)

# Test: Is E ⊥⊥ F | {B}?
ci_test = analyzer.conditional_independence_test(4, 5, {1})
print(f"E ⊥⊥ F | {{B}}: {ci_test}")  # Should be True

# Test: Is A ⊥⊥ E | {C}?
ci_test = analyzer.conditional_independence_test(0, 4, {2})
print(f"A ⊥⊥ E | {{C}}: {ci_test}")  # Should be False (path through B)
```

Cliques are the fundamental structural units in undirected graphical models.
Understanding cliques is essential because they determine how we factorize probability distributions in MRFs.
A clique in an undirected graph is a subset of nodes where every pair of nodes is connected by an edge. In other words, a clique is a complete subgraph—every node in the clique is a neighbor of every other node in the clique.
Types of Cliques:

- Clique: any set of nodes in which every pair is connected (single nodes and single edges count as trivial cliques).
- Maximal clique: a clique that cannot be extended by adding any other node while remaining fully connected.

For example, in a triangle graph (A-B, B-C, A-C):

- each singleton {A}, {B}, {C} and each edge {A, B}, {B, C}, {A, C} is a clique, and
- the full set {A, B, C} is the only maximal clique, since every smaller clique can still be extended.
Why Maximal Cliques Matter:
In MRF representation, we define potential functions over cliques. Although we could define potentials over any clique, the Hammersley-Clifford theorem (covered in a later page) tells us that the maximal cliques are sufficient for representing the joint distribution completely.
```python
from itertools import combinations
from typing import FrozenSet, List, Set, Tuple


class CliqueAnalyzer:
    """
    Analyzes clique structure in undirected graphs.
    Cliques are fundamental to understanding MRF factorization.
    """

    def __init__(self, model: UndirectedGraphicalModel):
        self.model = model
        self.graph = model.graph

    def is_clique(self, nodes: Set[int]) -> bool:
        """
        Check if a set of nodes forms a clique.
        A clique requires every pair of nodes to be connected.

        Args:
            nodes: Set of node indices

        Returns:
            True if nodes form a clique
        """
        node_list = list(nodes)
        for i in range(len(node_list)):
            for j in range(i + 1, len(node_list)):
                if not self.model.is_neighbor(node_list[i], node_list[j]):
                    return False
        return True

    def find_all_maximal_cliques(self) -> List[FrozenSet[int]]:
        """
        Find all maximal cliques using the Bron-Kerbosch algorithm.

        A maximal clique cannot be extended by adding any other vertex.
        This is the standard algorithm for enumerating all maximal cliques.

        Returns:
            List of all maximal cliques (as frozen sets)
        """
        return [frozenset(c) for c in nx.find_cliques(self.graph)]

    def clique_graph(self) -> nx.Graph:
        """
        Build the clique graph (intersection graph of maximal cliques).

        In the clique graph:
        - Each maximal clique becomes a node
        - Two clique-nodes are connected if they share variables
        - Edge weights can represent the size of intersection

        This is useful for understanding the junction tree structure.

        Returns:
            NetworkX graph representing the clique graph
        """
        maximal_cliques = self.find_all_maximal_cliques()
        clique_graph = nx.Graph()

        # Add clique nodes
        for i, clique in enumerate(maximal_cliques):
            clique_graph.add_node(i, members=clique)

        # Add edges between cliques that share variables
        for i in range(len(maximal_cliques)):
            for j in range(i + 1, len(maximal_cliques)):
                intersection = maximal_cliques[i] & maximal_cliques[j]
                if intersection:
                    clique_graph.add_edge(i, j, separator=intersection,
                                          weight=len(intersection))

        return clique_graph

    def analyze_clique_structure(self) -> dict:
        """
        Provide comprehensive analysis of clique structure.

        Returns:
            Dictionary with clique analysis results
        """
        maximal_cliques = self.find_all_maximal_cliques()

        analysis = {
            'num_maximal_cliques': len(maximal_cliques),
            'maximal_cliques': maximal_cliques,
            'largest_clique_size': max(len(c) for c in maximal_cliques) if maximal_cliques else 0,
            'clique_sizes': [len(c) for c in maximal_cliques],
            'treewidth_upper_bound': max(len(c) for c in maximal_cliques) - 1 if maximal_cliques else 0
        }
        return analysis


# Example: Analyze cliques in a more complex graph
model = UndirectedGraphicalModel(num_nodes=6)
model.node_labels = {0: "A", 1: "B", 2: "C", 3: "D", 4: "E", 5: "F"}

# Create a graph with interesting clique structure:
# A - B - C
# | X |   |
# D - E - F
# Where "X" means A-B-D-E form a clique

model.add_edge(0, 1)  # A - B
model.add_edge(1, 2)  # B - C
model.add_edge(0, 3)  # A - D
model.add_edge(1, 4)  # B - E
model.add_edge(2, 5)  # C - F
model.add_edge(3, 4)  # D - E
model.add_edge(4, 5)  # E - F
# Add edges to form cliques
model.add_edge(0, 4)  # A - E (creates larger clique)
model.add_edge(1, 3)  # B - D (creates larger clique)

analyzer = CliqueAnalyzer(model)
analysis = analyzer.analyze_clique_structure()

print("Clique Structure Analysis:")
print("=" * 50)
print(f"Number of maximal cliques: {analysis['num_maximal_cliques']}")
print(f"Largest clique size: {analysis['largest_clique_size']}")
print(f"Treewidth upper bound: {analysis['treewidth_upper_bound']}")
print("\nMaximal cliques:")
for i, clique in enumerate(analysis['maximal_cliques']):
    clique_names = {model.node_labels[n] for n in clique}
    print(f"  Clique {i+1}: {clique_names}")

# Check specific sets
print("\nClique verification:")
print(f"Is {{A, B, D, E}} a clique? {analyzer.is_clique({0, 1, 3, 4})}")
print(f"Is {{A, B, C}} a clique? {analyzer.is_clique({0, 1, 2})}")
print(f"Is {{C, E, F}} a clique? {analyzer.is_clique({2, 4, 5})}")
```

Finding all maximal cliques is computationally expensive. In the worst case, a graph can have exponentially many maximal cliques (e.g., the Moon-Moser graphs). For dense graphs, clique enumeration can become intractable. This has direct implications for inference in MRFs with complex structure.
While the Local Markov Property describes independence for individual nodes, the Global Markov Property provides a powerful characterization of conditional independence between arbitrary sets of variables.
For disjoint sets of nodes A, B, and S: if S separates A from B in the graph (i.e., removing S disconnects all paths between A and B), then A and B are conditionally independent given S:
$$A \perp\!\!\!\perp B \mid S$$
The Three Markov Properties:
Undirected graphical models can be characterized by three Markov properties, which are equivalent under mild positivity conditions:
Pairwise Markov Property: Non-adjacent nodes are conditionally independent given all other nodes: $$X_i \perp\!\!\!\perp X_j \mid \mathbf{X}_{V \setminus \{i,j\}} \text{ whenever } (i,j) \notin E$$

Local Markov Property: Each node is conditionally independent of non-neighbors given its neighbors: $$X_i \perp\!\!\!\perp \mathbf{X}_{V \setminus \text{ne}(i) \setminus \{i\}} \mid \mathbf{X}_{\text{ne}(i)}$$

Global Markov Property: Separation in the graph implies conditional independence: $$A \perp\!\!\!\perp B \mid S \text{ whenever } S \text{ separates } A \text{ from } B$$
These properties form a hierarchy: Global ⟹ Local ⟹ Pairwise. Under positivity (the distribution assigns positive probability to all configurations), all three are equivalent.
```python
class MarkovPropertyAnalyzer:
    """
    Analyzes and verifies Markov properties in undirected graphs.
    """

    def __init__(self, model: UndirectedGraphicalModel):
        self.model = model
        self.graph = model.graph

    def check_pairwise_independence(self, i: int, j: int) -> bool:
        """
        Check the pairwise Markov property for nodes i and j.

        Non-adjacent nodes should be conditionally independent
        given all other nodes.

        Args:
            i, j: Node indices

        Returns:
            True if pairwise property suggests independence
        """
        # If adjacent, pairwise property doesn't apply
        if self.model.is_neighbor(i, j):
            return False  # Not independent (directly connected)
        # Non-adjacent implies conditionally independent given rest
        return True

    def find_minimal_separator(self, source: int, target: int) -> Set[int]:
        """
        Find a minimal separator between two nodes.

        A minimal separator is a smallest set S such that removing S
        from the graph disconnects source from target.

        Args:
            source: Source node
            target: Target node

        Returns:
            Minimal separator set (empty if nodes are adjacent)
        """
        if self.model.is_neighbor(source, target):
            return set()  # No separator exists for adjacent nodes

        # Use minimum vertex cut
        try:
            separator = nx.minimum_node_cut(self.graph, source, target)
            return set(separator)
        except nx.NetworkXError:
            # Nodes might be disconnected
            return set()

    def all_independence_statements(self) -> List[Tuple[int, int, Set[int]]]:
        """
        Enumerate all pairwise conditional independence statements
        implied by the graph structure.

        Returns:
            List of (node_i, node_j, separator_set) tuples indicating
            X_i ⊥⊥ X_j | X_separator_set
        """
        statements = []
        n = self.model.num_nodes
        for i in range(n):
            for j in range(i + 1, n):
                if not self.model.is_neighbor(i, j):
                    # Find the minimal separator
                    separator = self.find_minimal_separator(i, j)
                    statements.append((i, j, separator))
        return statements

    def verify_separation(self, set_a: Set[int], set_b: Set[int],
                          separator: Set[int]) -> dict:
        """
        Verify if separator properly separates set_a from set_b.
        Provides detailed analysis of why separation holds or fails.

        Args:
            set_a: First set of nodes
            set_b: Second set of nodes
            separator: Proposed separator

        Returns:
            Dictionary with verification results
        """
        # Build subgraph without separator
        remaining = set(range(self.model.num_nodes)) - separator
        subgraph = self.graph.subgraph(list(remaining))

        # Find connected components
        components = list(nx.connected_components(subgraph))

        # Check which components contain nodes from A and B
        a_in_remaining = set_a - separator
        b_in_remaining = set_b - separator
        a_components = set()
        b_components = set()
        for idx, component in enumerate(components):
            if a_in_remaining & component:
                a_components.add(idx)
            if b_in_remaining & component:
                b_components.add(idx)

        is_valid = len(a_components & b_components) == 0

        return {
            'separates': is_valid,
            'num_components_after_removal': len(components),
            'a_components': a_components,
            'b_components': b_components,
            'a_nodes_remaining': a_in_remaining,
            'b_nodes_remaining': b_in_remaining,
            'reason': ('A and B in different components' if is_valid
                       else 'A and B share a component')
        }


# Demonstration
model = UndirectedGraphicalModel(num_nodes=5)
model.node_labels = {0: "A", 1: "B", 2: "C", 3: "D", 4: "E"}

# Create a path: A - B - C - D - E
model.add_edge(0, 1)
model.add_edge(1, 2)
model.add_edge(2, 3)
model.add_edge(3, 4)

analyzer = MarkovPropertyAnalyzer(model)

print("Markov Property Analysis for Path Graph A-B-C-D-E:")
print("=" * 55)

# Find all independence statements
statements = analyzer.all_independence_statements()
print("\nConditional Independence Statements from Graph Structure:")
for i, j, sep in statements:
    sep_names = {model.node_labels[s] for s in sep}
    print(f"  {model.node_labels[i]} ⊥⊥ {model.node_labels[j]} | {sep_names}")

# Verify specific separation
print("\nDetailed Separation Analysis:")
result = analyzer.verify_separation({0}, {4}, {2})  # A ⊥⊥ E | C
print(f"Does {{C}} separate {{A}} from {{E}}? {result['separates']}")
print(f"Reason: {result['reason']}")

result = analyzer.verify_separation({0}, {4}, {1})  # A ⊥⊥ E | B
print(f"Does {{B}} separate {{A}} from {{E}}? {result['separates']}")
```

Understanding when to use undirected versus directed graphical models is crucial for effective probabilistic modeling. Each framework has distinct strengths and represents different aspects of probabilistic relationships.
| Aspect | Bayesian Networks (Directed) | Markov Random Fields (Undirected) |
|---|---|---|
| Edge Interpretation | Causal or generative influence (parent → child) | Symmetric correlation or constraint |
| Factorization | Product of conditional probabilities: $\prod_i P(X_i \mid \text{Parents}(X_i))$ | Product of potential functions: $\frac{1}{Z}\prod_C \psi_C(X_C)$ |
| Normalization | Automatic (conditionals sum to 1) | Requires explicit partition function $Z$ |
| Cycles | Not allowed (must be DAG) | Cycles are permitted |
| Independence Test | D-separation (complex rules) | Graph separation (remove nodes, check connectivity) |
| Markov Blanket | Parents + Children + Spouses | Neighbors only |
| Parameter Interpretation | Directly as probabilities | As compatibility scores (not probabilities) |
| Typical Applications | Causal reasoning, generative models | Image segmentation, constraint satisfaction, physics |
Key Philosophical Difference:
The deepest distinction lies in how these models represent relationships:
Directed models are fundamentally generative: they describe a process by which data could be generated. Each variable is generated from its parents according to a conditional probability distribution.
Undirected models are fundamentally discriminative in structure: they describe compatibility constraints between variables. They don't specify which variable 'causes' which; instead, they specify which configurations of variables are more or less likely to co-occur.
Any Bayesian network can be converted to an undirected model through moralization: connect all parents of each node (marry the parents) and then drop edge directions. However, this conversion often loses some independence information—the undirected model may encode fewer independencies than the original directed model. The converse (MRF to Bayesian network) is also possible but may similarly lose some structure.
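A minimal sketch of moralization, assuming the Bayesian network is given as a child-to-parents dictionary (a hypothetical input format chosen for illustration):

```python
import networkx as nx
from itertools import combinations
from typing import Dict, List

def moralize(parents: Dict[str, List[str]]) -> nx.Graph:
    """Moralize a DAG: marry each node's parents, then drop edge directions."""
    moral = nx.Graph()
    moral.add_nodes_from(parents)
    for child, pa in parents.items():
        for p in pa:
            moral.add_edge(p, child)   # drop direction on each parent -> child edge
        for p, q in combinations(pa, 2):
            moral.add_edge(p, q)       # "marry" every pair of co-parents
    return moral

# V-structure A -> C <- B: moralization adds the edge A - B.
print(sorted(moralize({"A": [], "B": [], "C": ["A", "B"]}).edges()))
# [('A', 'B'), ('A', 'C'), ('B', 'C')]
```

The v-structure example shows exactly how independence information is lost: the directed graph encodes that $A$ and $B$ are marginally independent, but the moral graph connects them directly, so that independence is no longer readable from the undirected structure.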
Undirected graphical models appear naturally in many domains. Understanding these applications helps build intuition for when MRF structure is appropriate.
```python
def create_image_grid_mrf(height: int, width: int) -> UndirectedGraphicalModel:
    """
    Create a grid-structured MRF typical in image processing.

    Each pixel connects to its 4-connected neighbors (up, down, left, right).
    This is the fundamental structure for image segmentation, denoising, etc.

    Args:
        height: Image height in pixels
        width: Image width in pixels

    Returns:
        UndirectedGraphicalModel with grid structure
    """
    num_pixels = height * width
    model = UndirectedGraphicalModel(num_pixels)

    # Create labels
    model.node_labels = {
        i: f"({i // width}, {i % width})" for i in range(num_pixels)
    }

    # Add 4-connectivity edges
    for row in range(height):
        for col in range(width):
            current = row * width + col
            # Right neighbor
            if col < width - 1:
                model.add_edge(current, current + 1)
            # Down neighbor
            if row < height - 1:
                model.add_edge(current, current + width)

    return model


def analyze_grid_structure(model: UndirectedGraphicalModel, height: int, width: int):
    """Analyze properties of a grid MRF."""
    analyzer = CliqueAnalyzer(model)

    print(f"Grid MRF Analysis ({height}x{width} = {height*width} nodes):")
    print("=" * 50)

    # In a 4-connected grid, maximal cliques are just edges!
    cliques = analyzer.find_all_maximal_cliques()
    print(f"Number of maximal cliques: {len(cliques)}")
    print(f"Clique sizes: {set(len(c) for c in cliques)}")

    # Count edge types
    horizontal_edges = (width - 1) * height
    vertical_edges = width * (height - 1)
    print(f"Horizontal edges: {horizontal_edges}")
    print(f"Vertical edges: {vertical_edges}")
    print(f"Total edges: {len(model.get_all_edges())}")

    # Degree analysis
    degrees = [model.degree(i) for i in range(height * width)]
    print(f"\nDegree distribution:")
    print(f"  Corners (degree 2): {degrees.count(2)}")
    print(f"  Edges (degree 3): {degrees.count(3)}")
    print(f"  Interior (degree 4): {degrees.count(4)}")


# Create and analyze a small grid MRF
grid_model = create_image_grid_mrf(4, 4)
analyze_grid_structure(grid_model, 4, 4)

# Demonstrate separation in grid
print("\nSeparation in Grid MRF:")
mb_analyzer = MarkovBlanketAnalyzer(grid_model)

# In a 4x4 grid, labeled 0-15:
#  0  1  2  3
#  4  5  6  7
#  8  9 10 11
# 12 13 14 15

# The middle columns (1,5,9,13) and (2,6,10,14) should separate left from right
left_side = {0, 4, 8, 12}
right_side = {3, 7, 11, 15}
separator = {1, 5, 9, 13, 2, 6, 10, 14}

is_sep = mb_analyzer.is_separator(separator, left_side, right_side)
print(f"Middle columns separate left edge from right edge: {is_sep}")
```

The structure of an undirected graph has profound implications for how we represent probability distributions. While Bayesian networks factor distributions into products of conditional probabilities, MRFs require a different approach.
In directed models, the joint distribution factors as $P(X_1, \ldots, X_n) = \prod_i P(X_i \mid \text{Pa}(X_i))$. But in undirected models, we have no 'parents'—only symmetric neighbors. How do we factor the joint distribution?
The Potential Function Solution:
Instead of conditional probabilities, MRFs use potential functions (also called factors or compatibility functions) defined over cliques:
$$P(\mathbf{X}) = \frac{1}{Z} \prod_{C \in \mathcal{C}} \psi_C(\mathbf{X}_C)$$
where:

- $\mathcal{C}$ is the set of cliques of the graph (the maximal cliques suffice),
- $\psi_C(\mathbf{X}_C)$ is a non-negative potential function over the variables $\mathbf{X}_C$ in clique $C$, and
- $Z = \sum_{\mathbf{x}} \prod_{C \in \mathcal{C}} \psi_C(\mathbf{x}_C)$ is the partition function, the normalizing constant that turns the product into a valid probability distribution.
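As a toy numeric illustration (assumed values, not from any real application), here is the recipe end to end for a two-node binary model with a single pairwise potential that favors agreement:

```python
import numpy as np

# psi[a, b]: compatibility of configuration (A=a, B=b); favors agreement
psi = np.array([[3.0, 1.0],
                [1.0, 3.0]])

Z = psi.sum()      # partition function: sum over all 2^2 = 4 configurations
P = psi / Z        # joint distribution P(A, B) = psi(A, B) / Z

print(f"Z = {Z}")  # Z = 8.0
print(P)           # agreeing states get probability 3/8, disagreeing 1/8
```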
Key Properties:

- Potentials must be non-negative, but unlike conditional probabilities they need not sum to 1; they are compatibility scores, not probabilities.
- Larger potential values mark configurations of the clique's variables as more likely to co-occur.
- Because the potentials are unnormalized, the partition function $Z$ must supply the normalization globally.
Preview: The Hammersley-Clifford Theorem
A beautiful mathematical result—the Hammersley-Clifford theorem—establishes that any positive distribution satisfying the Markov properties with respect to a graph $G$ can be factorized using potential functions over the maximal cliques of $G$. Conversely, any distribution defined this way satisfies the Markov properties.
This theorem, which we'll explore in detail later, provides the theoretical foundation for MRF representations and justifies why clique structure is so important.
The partition function $Z$ requires summing over all possible configurations of all variables. For $n$ binary variables, this means $2^n$ terms. This exponential blow-up makes exact computation of $Z$ intractable for most interesting models, and is a central challenge in MRF inference. We'll address this through approximate inference methods.
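The following sketch (an assumed four-node chain, chosen for illustration) makes the blow-up concrete by computing $Z$ the naive way, enumerating all $2^n$ configurations; note that doubling $n$ squares the number of terms.

```python
import numpy as np
from itertools import product

edges = [(0, 1), (1, 2), (2, 3)]      # a 4-node chain (assumed example)
psi = np.array([[2.0, 0.5],
                [0.5, 2.0]])          # one shared pairwise potential

n = 4
Z = 0.0
for x in product([0, 1], repeat=n):   # 2^n terms: 16 here, 2^100 for n = 100
    weight = 1.0
    for i, j in edges:
        weight *= psi[x[i], x[j]]
    Z += weight

print(f"Z = {Z}")
```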
We've established the foundational concepts of undirected graphical models. Let's consolidate what we've learned:

- Structure: an MRF is an undirected graph whose nodes are random variables and whose symmetric edges encode direct dependencies.
- Neighborhoods: a node's Markov blanket is simply its set of neighbors.
- Cliques: fully connected subsets of nodes; the maximal cliques determine how the joint distribution factorizes.
- Markov properties: the pairwise, local, and global properties link graph separation to conditional independence, and coincide for positive distributions.
- Directed vs. undirected: Bayesian networks factor into conditional probabilities and encode generative, directional relationships; MRFs factor into clique potentials normalized by a partition function $Z$.
What's Next:
With the graph structure in place, we now turn to the mathematical machinery that gives MRFs their expressive power. The next page explores Potential Functions—the compatibility scores that quantify how favorable different variable configurations are. We'll see how potentials relate to probabilities, how they're parameterized in practice, and how they connect to energy-based formulations from physics.
You now understand the structural foundations of undirected graphical models. You can identify neighborhoods and Markov blankets, find cliques in graphs, determine conditional independence through separation, and contrast undirected with directed models. This foundation prepares you for the mathematical formalism of potential functions and the partition function.