Loading content...
When working with large matrices in computational applications, it's common to encounter sparse matrices—matrices where the vast majority of elements are zero. Storing these matrices in their full, dense form wastes significant memory. The Compressed Sparse Row (CSR) format is a highly efficient representation that stores only the non-zero elements along with their positional metadata.
The CSR format represents a sparse matrix using three separate arrays:
Values Array: Contains all non-zero elements of the matrix, collected in row-major order (reading left-to-right, top-to-bottom).
Column Indices Array: For each non-zero element in the values array, stores the column index where that element originally appeared.
Row Pointer Array: An array of length (number of rows + 1) where each entry indicates the starting position in the values array for that row. The difference between consecutive entries gives the count of non-zero elements in each row.
For example, if the row pointer array is [0, 1, 2, 4, 6], it means:
Your Task:
Write a Python function that takes a dense 2D matrix as input and converts it into CSR format, returning the three component arrays as a tuple: (values, column_indices, row_pointer).
dense_matrix = [
[1, 0, 0, 0],
[0, 2, 0, 0],
[3, 0, 4, 0],
[1, 0, 0, 5]
]values = [1, 2, 3, 4, 1, 5]
col_indices = [0, 1, 0, 2, 0, 3]
row_ptr = [0, 1, 2, 4, 6]Scanning row by row: • Row 0: Element 1 at column 0 → values=[1], col_indices=[0] • Row 1: Element 2 at column 1 → values=[1,2], col_indices=[0,1] • Row 2: Element 3 at column 0, element 4 at column 2 → values=[1,2,3,4], col_indices=[0,1,0,2] • Row 3: Element 1 at column 0, element 5 at column 3 → values=[1,2,3,4,1,5], col_indices=[0,1,0,2,0,3]
Row pointer tracks starting indices: [0, 1, 2, 4, 6], where 6 is the total count of non-zeros.
dense_matrix = [
[1, 0],
[0, 2]
]values = [1, 2]
col_indices = [0, 1]
row_ptr = [0, 1, 2]This is a 2×2 diagonal matrix with only diagonal elements non-zero: • Row 0: Value 1 at column 0 • Row 1: Value 2 at column 1
The row pointer [0, 1, 2] indicates row 0 starts at index 0, row 1 at index 1, and total elements is 2.
dense_matrix = [
[1, 2, 0],
[0, 3, 4],
[5, 0, 6]
]values = [1, 2, 3, 4, 5, 6]
col_indices = [0, 1, 1, 2, 0, 2]
row_ptr = [0, 2, 4, 6]Each row has exactly 2 non-zero elements: • Row 0: Values 1, 2 at columns 0, 1 • Row 1: Values 3, 4 at columns 1, 2 • Row 2: Values 5, 6 at columns 0, 2
Row pointer [0, 2, 4, 6] reflects 2 elements per row with uniform distribution.
Constraints