State Top Fraud Band Claims

MEDIUM · 20 pts

A national insurance operations team runs a risk model that scores every submitted claim. Senior adjusters can review only a limited share of claims, so each state forwards only its highest-risk tier for manual investigation.

Table: Fraud

policy_id: unique policy/claim identifier
state: state name
fraud_score: model-assigned risk score (higher means riskier)

Task: For every state independently, return the top 5% highest-risk policies.

Formal selection rule per state:

Let n be the number of policies in that state.
Compute top_k = max(1, ceil(0.05 * n)).
Order rows by fraud_score DESC, then policy_id ASC.
Keep exactly the first top_k rows.
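
The selection rule above can be sketched in Pandas as follows. This is a minimal sketch, not a reference solution: the function name top_fraud_band is hypothetical, and it assumes the input DataFrame has exactly the columns policy_id, state, and fraud_score. It replaces ceil(0.05 * n) with the integer expression (n + 19) // 20, which gives the same value for positive integers n.

```python
import pandas as pd


def top_fraud_band(fraud: pd.DataFrame) -> pd.DataFrame:
    """Return the top 5% highest-risk rows per state (at least one row each).

    Hypothetical helper name; assumes columns policy_id, state, fraud_score.
    """
    # Rank rows inside each state: highest score first, smaller policy_id on ties.
    ranked = fraud.sort_values(
        ["state", "fraud_score", "policy_id"],
        ascending=[True, False, True],
    )
    # 0-based position of each row within its state, in ranked order.
    position = ranked.groupby("state").cumcount()
    # Per-row state size n, then top_k = max(1, ceil(n / 20)) in integer arithmetic.
    n = ranked.groupby("state")["policy_id"].transform("size")
    top_k = ((n + 19) // 20).clip(lower=1)
    # Keep the first top_k rows of each state; the sort above already matches
    # the required final ordering (state ASC, fraud_score DESC, policy_id ASC).
    selected = ranked[position < top_k]
    return selected[["policy_id", "state", "fraud_score"]].reset_index(drop=True)
```

Because the single sort already orders rows by state, then score, then policy_id, no second sort is needed before returning.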

Output requirements:

Return exactly these columns: policy_id, state, fraud_score
Sort final output by state ASC, fraud_score DESC, policy_id ASC

Important notes:

Each state contributes at least one row to the output.
The rounding rule is always upward when computing the 5% cutoff.
Ties on fraud_score are resolved by smaller policy_id first.
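
The rounding note can be checked with a few lines of arithmetic. The sketch below (function names are hypothetical) writes the cutoff twice: once literally as max(1, ceil(0.05 * n)), and once with integer arithmetic, which avoids floating-point values altogether and is handy in environments that lack a ceil function.

```python
import math


def top_k_literal(n: int) -> int:
    """Cutoff exactly as written in the statement: max(1, ceil(0.05 * n))."""
    return max(1, math.ceil(0.05 * n))


def top_k(n: int) -> int:
    """Same cutoff in integer arithmetic: ceil(n / 20) == (n + 19) // 20."""
    return max(1, (n + 19) // 20)
```

For example, top_k(20) is 1, top_k(21) is 2, and top_k(24) is 2, so a state needs at least 21 rows before it contributes a second policy.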

Supported submission environments:

SQL (SQLite 3.27.2)
Pandas (Python)
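
For the SQL environment, one possible shape of the query is sketched below, run here through Python's built-in sqlite3 module so it can be tried locally. It is an illustration under assumptions, not the platform's reference solution: the helper run_query and the table definition are hypothetical, and the local SQLite build must support window functions (available since SQLite 3.25, so also in 3.27.2). The query uses (n + 19) / 20 with integer division rather than ceil(), because SQLite's built-in math functions only arrived in later releases.

```python
import sqlite3

# Rank rows per state, count rows per state, then keep the first top_k rows.
QUERY = """
WITH ranked AS (
    SELECT policy_id,
           state,
           fraud_score,
           ROW_NUMBER() OVER (
               PARTITION BY state
               ORDER BY fraud_score DESC, policy_id ASC
           ) AS rn,
           COUNT(*) OVER (PARTITION BY state) AS n
    FROM Fraud
)
SELECT policy_id, state, fraud_score
FROM ranked
WHERE rn <= MAX(1, (n + 19) / 20)  -- integer form of max(1, ceil(0.05 * n))
ORDER BY state ASC, fraud_score DESC, policy_id ASC;
"""


def run_query(rows):
    """Load (policy_id, state, fraud_score) tuples into an in-memory table
    and return the selected rows. Hypothetical local test helper."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Fraud ("
        "policy_id INTEGER PRIMARY KEY, state TEXT, fraud_score REAL)"
    )
    conn.executemany("INSERT INTO Fraud VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result
```

With two arguments, SQLite's MAX is a scalar function, so MAX(1, ...) mirrors the max(1, ...) in the selection rule.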

Example

Input

Fraud:
| policy_id | state      | fraud_score |
|-----------|------------|-------------|
| 1         | California | 0.92        |
| 2         | California | 0.68        |
| 3         | California | 0.17        |
| 4         | New York   | 0.94        |
| 5         | New York   | 0.81        |
| 6         | New York   | 0.77        |
| 7         | Texas      | 0.98        |
| 8         | Texas      | 0.97        |
| 9         | Texas      | 0.96        |
| 10        | Florida    | 0.97        |
| 11        | Florida    | 0.98        |
| 12        | Florida    | 0.78        |
| 13        | Florida    | 0.88        |
| 14        | Florida    | 0.66        |

Output

[
  {"policy_id":1,"state":"California","fraud_score":0.92},
  {"policy_id":11,"state":"Florida","fraud_score":0.98},
  {"policy_id":4,"state":"New York","fraud_score":0.94},
  {"policy_id":7,"state":"Texas","fraud_score":0.98}
]

Explanation

Every state has between 3 and 5 rows, so ceil(0.05 * n) = 1 and top_k = 1 everywhere. We keep the single highest-risk policy per state.

Example

Input

Fraud:
| policy_id | state | fraud_score |
|-----------|-------|-------------|
| 101       | Ohio  | 0.95        |
| 102       | Ohio  | 0.95        |
| 103       | Ohio  | 0.90        |
| ...       | ...   | ...         |
| 124       | Ohio  | 0.21        |

Output

[
  {"policy_id":101,"state":"Ohio","fraud_score":0.95},
  {"policy_id":102,"state":"Ohio","fraud_score":0.95}
]

Explanation

Ohio has n = 24, so top_k = ceil(0.05 * 24) = ceil(1.2) = 2. Two rows are selected. Both rows tied at 0.95 make the cut, and they appear in ascending policy_id order.

Example

Input

Fraud:
| policy_id | state   | fraud_score |
|-----------|---------|-------------|
| 2001      | Arizona | 88.7        |
| 2002      | Arizona | 92.5        |
| 2003      | Arizona | 92.5        |
| 2004      | Arizona | 91.1        |
| 3001      | Nevada  | 70.0        |

Output

[
  {"policy_id":2002,"state":"Arizona","fraud_score":92.5},
  {"policy_id":3001,"state":"Nevada","fraud_score":70.0}
]

Explanation

Arizona has 4 rows, so top_k = 1; of the two rows tied at 92.5, policy_id 2002 wins because it is smaller. Nevada has a single row, which is always selected.
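
The tie-break in this example comes down to a composite sort key. The sketch below reuses the Arizona rows from the input, deliberately listed out of order to show that input order does not affect the result; the names rank_rows, ranked, and winner are illustrative only.

```python
# Arizona rows from the example, shuffled so 2003 appears before 2002.
arizona = [
    {"policy_id": 2001, "fraud_score": 88.7},
    {"policy_id": 2003, "fraud_score": 92.5},
    {"policy_id": 2002, "fraud_score": 92.5},
    {"policy_id": 2004, "fraud_score": 91.1},
]


def rank_rows(rows):
    """Order rows by fraud_score DESC, then policy_id ASC."""
    # Negating the score reverses its direction; policy_id stays ascending.
    return sorted(rows, key=lambda r: (-r["fraud_score"], r["policy_id"]))


ranked = rank_rows(arizona)
winner = ranked[0]["policy_id"]  # top_k = 1, so only the first row is kept
```

Here winner is 2002: both 2002 and 2003 score 92.5, and the smaller policy_id sorts first.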

Constraints

1 <= number of rows in Fraud <= 300000
policy_id values are unique
1 <= policy_id <= 2000000000
state is a non-empty string identifier
0 <= fraud_score <= 1000000
Every state has at least one row
Per-state selection size is top_k = max(1, ceil(0.05 * state_row_count))
Ranking inside each state is fraud_score DESC, then policy_id ASC
Output must include exactly: policy_id, state, fraud_score
Final output ordering must be state ASC, fraud_score DESC, policy_id ASC
Use SQLite 3.27.2 semantics for SQL submissions
For Pandas submissions, return a DataFrame with columns [policy_id, state, fraud_score]
