In multivariable calculus and optimization theory, the second-order curvature matrix (commonly known as the Hessian matrix) is a square matrix that encapsulates all second-order partial derivatives of a scalar-valued function. This mathematical object provides profound insights into the local geometry of a function's surface, revealing critical information about curvature, concavity, and optimization landscapes.
For a scalar function f(x₁, x₂, ..., xₙ) of n variables, the curvature matrix H is an n × n symmetric matrix where each element Hᵢⱼ is the second partial derivative:
$$H_{ij} = \frac{\partial^2 f}{\partial x_i \partial x_j}$$
The elements along the main diagonal represent the pure second derivatives (∂²f/∂xᵢ²), while off-diagonal elements capture how the rate of change in one direction varies with respect to another direction.
Numerical Approximation via Finite Differences:
When analytical derivatives are unavailable or impractical to compute, we employ numerical differentiation using finite difference methods. The second partial derivative can be approximated using the central difference formula:
$$\frac{\partial^2 f}{\partial x_i^2} \approx \frac{f(x + h\cdot e_i) - 2f(x) + f(x - h\cdot e_i)}{h^2}$$
For mixed partial derivatives (where i ≠ j):
$$\frac{\partial^2 f}{\partial x_i \partial x_j} \approx \frac{f(x + h\cdot e_i + h\cdot e_j) - f(x + h\cdot e_i - h\cdot e_j) - f(x - h\cdot e_i + h\cdot e_j) + f(x - h\cdot e_i - h\cdot e_j)}{4h^2}$$
where eᵢ is the unit vector in the i-th direction and h is a small step size (typically around 1e-5).
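As a sketch of how these two formulas translate into code (the function name `numerical_hessian` and its signature are illustrative choices, not part of the task statement):

```python
def numerical_hessian(f, x, h=1e-5):
    """Approximate the matrix of second partial derivatives of f at point x.

    f takes a list of floats and returns a float; x is a list of floats.
    """
    n = len(x)
    H = [[0.0] * n for _ in range(n)]
    fx = f(x)
    for i in range(n):
        for j in range(n):
            if i == j:
                # central difference: (f(x + h·e_i) - 2f(x) + f(x - h·e_i)) / h²
                xp, xm = list(x), list(x)
                xp[i] += h
                xm[i] -= h
                H[i][i] = (f(xp) - 2.0 * fx + f(xm)) / h**2
            else:
                # four-point formula for the mixed partial ∂²f/∂xᵢ∂xⱼ
                xpp, xpm, xmp, xmm = list(x), list(x), list(x), list(x)
                xpp[i] += h; xpp[j] += h
                xpm[i] += h; xpm[j] -= h
                xmp[i] -= h; xmp[j] += h
                xmm[i] -= h; xmm[j] -= h
                H[i][j] = (f(xpp) - f(xpm) - f(xmp) + f(xmm)) / (4.0 * h**2)
    return H
```

For example, `numerical_hessian(lambda v: v[0] * v[1], [1.0, 1.0])` returns a matrix close to [[0, 1], [1, 0]], up to finite-difference rounding error.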
Significance in Optimization:
The curvature matrix is indispensable for:
• Classifying critical points: all-positive eigenvalues indicate a local minimum, all-negative eigenvalues a local maximum, and mixed signs a saddle point
• Second-order optimization methods such as Newton's method, which use curvature information to take better-informed steps than gradient descent alone
• Convexity analysis: a positive semidefinite curvature matrix throughout a region implies the function is convex on that region
Your Task:
Write a Python function that numerically computes the second-order curvature matrix for a given function at a specified point. Your implementation should:
• Accept a function f, a point (a list of coordinates), and a step size h (around 1e-5)
• Use the central difference formulas above for both the pure and the mixed second partial derivatives
• Return the n × n curvature matrix, which should be numerically symmetric
function_type = "sum_of_squares_2d"
point = [0.0, 0.0]
Expected output: [[2.0, 0.0], [0.0, 2.0]]
For the function f(x, y) = x² + y² (sum of squares):
• ∂²f/∂x² = 2 (constant curvature in x-direction)
• ∂²f/∂y² = 2 (constant curvature in y-direction)
• ∂²f/∂x∂y = 0 (no cross-dependence between variables)
The resulting curvature matrix H = [[2, 0], [0, 2]] is a scaled identity matrix. The positive eigenvalues (both equal to 2) indicate this is a convex paraboloid with a global minimum at the origin. The equal eigenvalues show the curvature is identical in all directions (isotropic curvature).
function_type = "product_2d"
point = [1.0, 1.0]
Expected output: [[0.0, 1.0], [1.0, 0.0]]
For the function f(x, y) = x·y (simple product):
• ∂f/∂x = y, so ∂²f/∂x² = 0
• ∂f/∂y = x, so ∂²f/∂y² = 0
• ∂²f/∂x∂y = ∂(y)/∂y = 1
The curvature matrix H = [[0, 1], [1, 0]] has eigenvalues +1 and -1, indicating this is a saddle surface. The indefinite nature (mixed positive/negative eigenvalues) reveals that the point (1, 1) is neither a local minimum nor maximum, but a saddle point where curvature is positive in one direction and negative in another.
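The eigenvalue-based classification described above can be checked directly. This is a minimal sketch (the helper name `classify_critical_point` and the tolerance are my own choices); `numpy.linalg.eigvalsh` is appropriate here because a curvature matrix of a twice continuously differentiable function is symmetric:

```python
import numpy as np

def classify_critical_point(H, tol=1e-8):
    """Classify a critical point from the eigenvalues of its curvature matrix H."""
    vals = np.linalg.eigvalsh(np.asarray(H, dtype=float))  # sorted, real
    if np.all(vals > tol):
        return "local minimum"
    if np.all(vals < -tol):
        return "local maximum"
    if vals.min() < -tol and vals.max() > tol:
        return "saddle point"
    return "inconclusive (semidefinite)"
```

Applied to the two matrices above, `[[2, 0], [0, 2]]` classifies as a local minimum and `[[0, 1], [1, 0]]` (eigenvalues −1 and +1) as a saddle point.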
function_type = "cubic_2d"
point = [1.0, 2.0]
Expected output: [[6.0, 0.0], [0.0, 12.0]]
For the function f(x, y) = x³ + y³ (sum of cubes):
• ∂f/∂x = 3x², so ∂²f/∂x² = 6x → at x=1: value is 6
• ∂f/∂y = 3y², so ∂²f/∂y² = 6y → at y=2: value is 12
• ∂²f/∂x∂y = 0 (variables are independent)
The curvature matrix H = [[6, 0], [0, 12]] at point (1, 2). Unlike the sum of squares function, the curvature here depends on the evaluation point. Both eigenvalues are positive, indicating local convexity, but the different magnitudes (6 vs 12) show anisotropic curvature — the surface curves more steeply in the y-direction than the x-direction.
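The point-dependence of the curvature can be confirmed numerically by applying the central difference formulas from earlier directly at (1, 2). This is a quick sanity check, not a required implementation; the step size follows the h ≈ 1e-5 suggestion above:

```python
# Finite-difference check of the cubic example f(x, y) = x³ + y³ at (1, 2).
f = lambda x, y: x**3 + y**3
h = 1e-5
x0, y0 = 1.0, 2.0

fxx = (f(x0 + h, y0) - 2 * f(x0, y0) + f(x0 - h, y0)) / h**2
fyy = (f(x0, y0 + h) - 2 * f(x0, y0) + f(x0, y0 - h)) / h**2
fxy = (f(x0 + h, y0 + h) - f(x0 + h, y0 - h)
       - f(x0 - h, y0 + h) + f(x0 - h, y0 - h)) / (4 * h**2)

# Analytic values: f_xx = 6x = 6, f_yy = 6y = 12, f_xy = 0.
```

Evaluated at a different point, say (2, 1), the same code yields roughly [[12, 0], [0, 6]], confirming that the curvature of a cubic varies with the evaluation point.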
Constraints