Conditional Probability Estimation from Categorical Observations

EASY · 10 pts

Conditional probability is a foundational concept in probability theory and statistics that quantifies the likelihood of an event occurring given that another event has already occurred. It forms the backbone of Bayesian inference, classification algorithms, and countless real-world decision-making systems.

Given a collection of paired categorical observations, where each observation consists of two variables X (the conditioning variable) and Y (the target variable), we want to estimate the conditional probability that Y takes a specific value y given that X takes a specific value x.

Mathematically, this is expressed as:

$$P(Y = y \mid X = x) = \frac{\text{Number of observations where } X = x \text{ AND } Y = y}{\text{Total number of observations where } X = x}$$

Understanding the Formula:

• The numerator counts how many times both conditions are satisfied simultaneously (X equals the conditioning value AND Y equals the target value).
• The denominator counts the total occurrences where the conditioning variable X equals the specified value.
• The ratio gives us the proportion of times Y equals the target value among all cases where X equals the conditioning value.
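
The same estimate can be written with indicator functions. The notation here is an assumption introduced for illustration: the dataset is taken to be $n$ pairs $(x_i, y_i)$, and $\mathbb{1}[\cdot]$ equals 1 when its condition holds and 0 otherwise.

$$\hat{P}(Y = y \mid X = x) = \frac{\sum_{i=1}^{n} \mathbb{1}[x_i = x \text{ and } y_i = y]}{\sum_{i=1}^{n} \mathbb{1}[x_i = x]}$$

Written this way, the numerator can never exceed the denominator, so the estimate always lies between 0 and 1 whenever the denominator is positive.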

Edge Case Handling: When no observations exist with the conditioning value (i.e., when X never equals x in the dataset), the denominator is zero and the conditional probability is undefined in a strict mathematical sense. In this problem, the function returns 0.0 in that case rather than raising an error.
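
One way the counting formula and the zero-count convention might be implemented is sketched below. The function name `conditional_probability` and its parameter names are assumptions chosen to match the example inputs; the problem statement does not fix a signature.

```python
from typing import Iterable, Tuple


def conditional_probability(
    observations: Iterable[Tuple[str, str]],
    condition_value: str,
    target_value: str,
) -> float:
    """Estimate P(Y = target_value | X = condition_value) by counting."""
    condition_count = 0  # observations with X == condition_value
    joint_count = 0      # observations with X == condition_value and Y == target_value

    for x, y in observations:
        if x == condition_value:
            condition_count += 1
            if y == target_value:
                joint_count += 1

    # Convention from the text: return 0.0 when X never equals condition_value.
    if condition_count == 0:
        return 0.0
    return joint_count / condition_count
```

The loop makes a single pass over the data and keeps only two integer counters, so its memory use does not grow with the number of observations.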

Your Task: Implement a function that takes a dataset of categorical observation pairs and computes the conditional probability of a target outcome given a specific condition. A single pass over the observations is enough to count the relevant occurrences and compute the ratio.

Example

Input

observations = [("red", "cat"), ("blue", "dog"), ("red", "dog"), ("red", "cat"), ("blue", "cat"), ("red", "dog")]
condition_value = "red"
target_value = "cat"

Output

0.5

Explanation

To compute P(Y = "cat" | X = "red"), we analyze the observations:

Step 1: Filter observations where X = "red":
• ("red", "cat") ✓
• ("red", "dog") ✓
• ("red", "cat") ✓
• ("red", "dog") ✓

Total observations with X = "red": 4

Step 2: Count observations where X = "red" AND Y = "cat":
• ("red", "cat") ✓
• ("red", "cat") ✓

Count where both conditions are met: 2

Step 3: Calculate conditional probability: P(Y = "cat" | X = "red") = 2 / 4 = 0.5

This means that among all observations with attribute "red", exactly half have the outcome "cat".
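
Steps 1 to 3 above can be reproduced directly with list comprehensions. This is only an illustrative walk-through of the example; the variable names `matching_condition` and `matching_both` are hypothetical, and the division is safe here only because "red" is known to occur in the data.

```python
observations = [
    ("red", "cat"), ("blue", "dog"), ("red", "dog"),
    ("red", "cat"), ("blue", "cat"), ("red", "dog"),
]
condition_value = "red"
target_value = "cat"

# Step 1: keep only observations where X equals the conditioning value.
matching_condition = [obs for obs in observations if obs[0] == condition_value]

# Step 2: among those, keep observations where Y equals the target value.
matching_both = [obs for obs in matching_condition if obs[1] == target_value]

# Step 3: divide the joint count by the condition count.
probability = len(matching_both) / len(matching_condition)

print(len(matching_condition), len(matching_both), probability)  # 4 2 0.5
```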

Example

Input

observations = [("red", "cat"), ("blue", "dog"), ("red", "dog"), ("red", "cat"), ("blue", "cat"), ("red", "dog")]
condition_value = "blue"
target_value = "cat"

Output

0.5

Explanation

To compute P(Y = "cat" | X = "blue"), we analyze the observations:

Step 1: Filter observations where X = "blue":
• ("blue", "dog") ✓
• ("blue", "cat") ✓

Total observations with X = "blue": 2

Step 2: Count observations where X = "blue" AND Y = "cat":
• ("blue", "cat") ✓

Count where both conditions are met: 1

Step 3: Calculate conditional probability: P(Y = "cat" | X = "blue") = 1 / 2 = 0.5

Among observations with attribute "blue", half have the outcome "cat".

Example

Input

observations = [("red", "cat"), ("blue", "dog"), ("red", "dog"), ("red", "cat"), ("blue", "cat"), ("red", "dog")]
condition_value = "green"
target_value = "cat"

Output

0.0

Explanation

To compute P(Y = "cat" | X = "green"), we analyze the observations:

Step 1: Filter observations where X = "green":
• (No observations match)

Total observations with X = "green": 0

Step 2: Handle edge case: Since there are no observations with X = "green", computing the conditional probability would result in division by zero. Per our convention, we return 0.0 in this case.

This edge case commonly occurs in practice when querying for unseen attribute values or when dealing with sparse categorical data.
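
When many queries are run against the same dataset, one option is to tally the counts once with `collections.Counter`. A `Counter` returns 0 for keys it has never seen, so an unseen conditioning value such as "green" yields a zero denominator and falls through to the 0.0 convention. The helper names `build_counts` and `query` are hypothetical and not part of the problem.

```python
from collections import Counter
from typing import Iterable, Tuple


def build_counts(observations: Iterable[Tuple[str, str]]) -> Tuple[Counter, Counter]:
    """Tally how often each X value and each (X, Y) pair occurs."""
    condition_counts: Counter = Counter()
    joint_counts: Counter = Counter()
    for x, y in observations:
        condition_counts[x] += 1
        joint_counts[(x, y)] += 1
    return condition_counts, joint_counts


def query(
    condition_counts: Counter,
    joint_counts: Counter,
    condition_value: str,
    target_value: str,
) -> float:
    """Look up P(Y = target_value | X = condition_value) from the tallies."""
    denominator = condition_counts[condition_value]  # 0 for unseen values
    if denominator == 0:
        return 0.0  # convention from the text for unseen conditioning values
    return joint_counts[(condition_value, target_value)] / denominator
```

With the example dataset, `query(*build_counts(observations), "green", "cat")` returns 0.0 and `query(*build_counts(observations), "blue", "cat")` returns 0.5. The tallies take extra memory proportional to the number of distinct values and pairs, which only pays off when several queries share one dataset.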


Constraints

1 ≤ len(observations) ≤ 10,000
Each observation is a tuple (X, Y) of two non-empty strings
String lengths: 1 ≤ len(condition_value), len(target_value) ≤ 100
All string values are case-sensitive
The dataset may contain duplicate observations
condition_value and target_value may or may not appear in the observations

