In Natural Language Processing (NLP) and machine learning evaluation pipelines, measuring how accurately a model's predictions match expected outputs is fundamental. One widely used approach is Normalized String Matching, which compares predicted text against ground truth references after applying standardization transformations.
This technique is especially valuable when evaluating question-answering systems, text generation models, and information extraction pipelines, where superficial differences in formatting, capitalization, or punctuation should not penalize otherwise correct answers.
The Normalization Process: Before comparison, both predictions and references pass through the same normalization pipeline: each string is converted to lowercase and stripped of punctuation characters (commas, apostrophes, question marks, and so on).
After normalization, strings are compared for exact equality. The accuracy score is computed as the fraction of predictions that perfectly match their corresponding normalized references.
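Based on the transformations demonstrated in the worked examples below (lowercasing and punctuation removal), the normalization step can be sketched as follows. The whitespace collapse via split/join is an assumption; the examples never exercise runs of spaces:

```python
import string

def normalize(text: str) -> str:
    """Lowercase a string and strip punctuation (pipeline inferred from the examples)."""
    text = text.lower()
    # Remove all ASCII punctuation: commas, apostrophes, question marks, etc.
    text = text.translate(str.maketrans('', '', string.punctuation))
    # Collapse runs of whitespace -- an assumption, not shown in the examples.
    return ' '.join(text.split())
```

With this sketch, normalize('Hello, World!') yields 'hello world', matching the first worked example.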
Your Task: Implement a function that takes two parallel lists—predictions and references—and returns a float representing the accuracy score. The score should be the proportion of normalized predictions that exactly match their normalized counterparts, yielding a value between 0.0 (no matches) and 1.0 (all match).
Edge Case: If both input lists are empty, your function should return 0.0.
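One possible implementation of the full scoring function is sketched below; the normalize helper is reconstructed from the worked examples (lowercase plus punctuation removal), not a prescribed API:

```python
import string

def normalize(text: str) -> str:
    # Lowercase and drop ASCII punctuation, per the worked examples.
    text = text.lower().translate(str.maketrans('', '', string.punctuation))
    return ' '.join(text.split())

def accuracy_score(predictions: list[str], references: list[str]) -> float:
    # Edge case from the statement: two empty lists score 0.0.
    if not predictions and not references:
        return 0.0
    matches = sum(normalize(p) == normalize(r)
                  for p, r in zip(predictions, references))
    return matches / len(predictions)
```

For instance, accuracy_score(['Hello, World!'], ['hello world']) evaluates to 1.0, since the lists match exactly after normalization.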
Example 1:
predictions = ['Hello, World!', 'The answer is 42']
references = ['hello world', 'the answer is 42']
Expected output: 1.0
Both predictions perfectly match their references after normalization:
• 'Hello, World!' → 'hello world' (lowercase + comma removed) ✓ matches 'hello world'
• 'The answer is 42' → 'the answer is 42' (lowercase) ✓ matches 'the answer is 42'
With 2 out of 2 matches, the accuracy score is 2/2 = 1.0.
Example 2:
predictions = ['Hello World', 'Good Morning', 'Test Case']
references = ['hello world', 'good evening', 'test case']
Expected output: 0.6666666667
After normalization:
• 'Hello World' → 'hello world' ✓ matches 'hello world'
• 'Good Morning' → 'good morning' ✗ does NOT match 'good evening'
• 'Test Case' → 'test case' ✓ matches 'test case'
With 2 out of 3 matches, the accuracy score is 2/3 ≈ 0.6667.
Example 3:
predictions = ["What's the answer?"]
references = ['whats the answer']
Expected output: 1.0
The apostrophe in "What's" is punctuation and gets removed during normalization:
• "What's the answer?" → 'whats the answer' (lowercase + apostrophe + question mark removed) ✓ matches 'whats the answer'
With 1 out of 1 match, the accuracy score is 1.0.
Constraints