Mathematical Expression Equivalence Validator

Medium · 20 pts

In the era of Large Language Models (LLMs), evaluating the correctness of mathematical reasoning has become a critical challenge. When an LLM generates a mathematical answer, it may express the solution in a form that differs from the expected ground truth, yet the two answers can still be mathematically equivalent. For example, 0.5, 1/2, and 2/4 all represent the same value, and a robust evaluation system must recognize this equivalence.

Your Task: Implement a function that determines whether two mathematical answer strings are semantically equivalent, accounting for various representational differences while maintaining numerical precision.

The Core Challenge: Building a reliable math answer validator requires handling multiple expression formats and parsing them into comparable numerical values. Your function must be both precise (correctly identifying equivalences) and safe (gracefully handling unparseable or invalid expressions).

Expression Types to Handle:

Direct String Equality: If both strings are identical, they are trivially equivalent.
Numeric Values: Parse and compare integers (e.g., 42) and floating-point numbers (e.g., 3.14159) with tolerance-based comparison.
Fraction Expressions: Evaluate fractions like 1/2, -3/4, or 22/7 by performing the division and comparing the result.
Square Root Expressions: Parse expressions containing sqrt(n) and evaluate them mathematically. For example, sqrt(4) should equal 2.
Pi Expressions: Recognize pi as the mathematical constant π (approximately 3.14159265...) and handle expressions like 2*pi or pi/2.
Arithmetic Combinations: Handle basic arithmetic operations combining the above, such as sqrt(2)/2 or 3*pi/4.
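One possible way to cover all of the expression types above with a single parser is to walk Python's own syntax tree and evaluate only a whitelist of node types. The sketch below is an illustration under that assumption, not a reference solution; the names `parse_expression` and `_eval_node` are hypothetical. It returns None for anything it cannot evaluate, including division by zero and sqrt of a negative number.

```python
import ast
import math
import operator
from typing import Optional

# Whitelisted operators; anything else (e.g. ** or //) is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def _eval_node(node: ast.AST) -> float:
    """Recursively evaluate a whitelisted AST node; raise ValueError otherwise."""
    if isinstance(node, ast.Constant):
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(node.value, bool) or not isinstance(node.value, (int, float)):
            raise ValueError("unsupported literal")
        return float(node.value)
    if isinstance(node, ast.Name) and node.id == "pi":
        # Case-sensitive: 'PI' and 'Pi' fall through to the ValueError below.
        return math.pi
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        return _BINARY_OPS[type(node.op)](_eval_node(node.left), _eval_node(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_eval_node(node.operand))
    if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
            and node.func.id == "sqrt" and len(node.args) == 1 and not node.keywords):
        return math.sqrt(_eval_node(node.args[0]))
    raise ValueError("unsupported expression")


def parse_expression(expr: str) -> Optional[float]:
    """Return the numeric value of expr, or None if it cannot be evaluated."""
    try:
        tree = ast.parse(expr.strip(), mode="eval")
        value = _eval_node(tree.body)
    except (SyntaxError, ValueError, ZeroDivisionError, OverflowError, RecursionError):
        return None
    # Treat inf and nan as unparseable so they never compare as equal.
    return value if math.isfinite(value) else None
```

Because only the listed node types are evaluated, inputs such as `2**3` or `__import__('os')` come back as None rather than being executed.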

Comparison Logic: Two parsed numerical values should be considered equivalent if their absolute difference is less than or equal to the specified tolerance (default: 1e-6).
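The comparison rule can be sketched in a few lines. The helper name `values_close` is hypothetical; the check is a plain absolute-difference test, which differs from `math.isclose`, whose default is a relative tolerance.

```python
import math


def values_close(a: float, b: float, tolerance: float = 1e-6) -> bool:
    """Absolute-difference check: equivalent when |a - b| <= tolerance."""
    return abs(a - b) <= tolerance


# 0.3333333 differs from 1/3 by about 3.3e-8, well inside the default tolerance.
close_third = values_close(1 / 3, 0.3333333)

# 22/7 differs from pi by about 1.3e-3, so the two are not equivalent.
close_pi = values_close(22 / 7, math.pi)
```

With the default tolerance, an approximation such as 22/7 is therefore rejected as an answer for pi, while a decimal rounded to seven places is accepted for 1/3.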

Edge Case Handling:

If both strings match exactly, return True immediately.
If either expression cannot be parsed to a numerical value and the strings don't match exactly, return False.
Handle division by zero, invalid sqrt arguments, and other mathematical errors gracefully, without raising an exception.
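The control flow implied by these rules might look like the sketch below. To stay short, it plugs in a deliberately limited stand-in parser, `parse_simple` (a hypothetical helper that only understands plain numbers and a single a/b division); a full solution would substitute a parser that also handles sqrt, pi, and arithmetic combinations. The function name `is_equivalent` and the `parse` parameter are likewise assumptions, not part of the problem statement.

```python
import math
from typing import Callable, Optional


def parse_simple(expr: str) -> Optional[float]:
    """Stand-in parser: plain numbers and one a/b division only."""
    text = expr.strip()
    try:
        if "/" in text:
            numerator, denominator = text.split("/", 1)
            value = float(numerator) / float(denominator)
        else:
            value = float(text)
    except (ValueError, ZeroDivisionError):
        # Malformed input or division by zero: report as unparseable.
        return None
    return value if math.isfinite(value) else None


def is_equivalent(
    predicted: str,
    ground_truth: str,
    tolerance: float = 1e-6,
    parse: Callable[[str], Optional[float]] = parse_simple,
) -> bool:
    # Rule 1: identical strings are equivalent, even if unparseable.
    if predicted == ground_truth:
        return True
    # Rule 2: if either side fails to parse, the answers do not match.
    left, right = parse(predicted), parse(ground_truth)
    if left is None or right is None:
        return False
    # Otherwise compare numerically within the tolerance.
    return abs(left - right) <= tolerance
```

Checking string equality first means that a pair such as 'x+1' and 'x+1' is accepted even though neither side parses, while '1/0' compared against '0' returns False instead of raising ZeroDivisionError.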

Example

Input

predicted = '1/2'
ground_truth = '0.5'

Output

True

Explanation

The function first checks for string equality (they differ). It then parses '1/2' as a fraction, computing 1 ÷ 2 = 0.5. The ground truth '0.5' is parsed directly as a float. Since |0.5 - 0.5| = 0 ≤ 1e-6 (the default tolerance), the function returns True, confirming mathematical equivalence.

Example

Input

predicted = '42'
ground_truth = '42'

Output

True

Explanation

Both strings are identical, so the function immediately returns True via the direct string equality check. No numerical parsing is required for this trivial case.

Example

Input

predicted = 'sqrt(4)'
ground_truth = '2'

Output

True

Explanation

The function parses 'sqrt(4)' by identifying the sqrt() function and computing √4 = 2.0. The ground truth '2' is parsed as the integer 2. Since |2.0 - 2| = 0 ≤ 1e-6, the answers are considered equivalent.


Constraints

1 ≤ length of predicted ≤ 100 characters
1 ≤ length of ground_truth ≤ 100 characters
Expressions may contain digits 0-9, decimal points, arithmetic operators (+, -, *, /), parentheses, 'sqrt', and 'pi'
All numerical values will fit within the range of a float64
The default tolerance is 1e-6 for floating-point comparison
Expressions are case-sensitive ('pi' is valid, 'PI' or 'Pi' are treated as unparseable)
Whitespace handling is implementation-defined but should be consistent
