DRY Principle - Learning Module

DRY Violations and Fixes

From Theory to Practice

Understanding the DRY principle and the distinction between knowledge and code duplication is essential—but knowing how to identify and fix violations in real codebases is where the principle becomes actionable.

DRY violations manifest in many forms, from obvious copy-pasted blocks to subtle structural repetitions that span multiple files or even systems. Fixing them requires choosing the right abstraction mechanism and applying it at the right time. This page provides a systematic approach to both identification and remediation.

What You Will Learn

By the end of this page, you will be able to identify the major categories of DRY violations, apply appropriate remediation strategies for each category, and refactor duplication without creating the wrong abstraction. You'll learn both the techniques and the judgment to apply them wisely.

Categories of DRY Violations

DRY violations can be classified by their scope and nature. Understanding the category helps select the appropriate fix.

By Scope:

Intra-method duplication — The same logic repeated within a single function
Intra-class duplication — The same patterns across methods of one class
Intra-module duplication — Similar code in different classes within a module
Cross-module duplication — The same knowledge in different modules or packages
Cross-system duplication — Knowledge repeated across applications, services, or layers

By Nature:

Literal duplication — Exact copies of code (easiest to detect)
Parametric duplication — Similar code differing only in values
Structural duplication — Same patterns with different names/types
Semantic duplication — Same knowledge in different representations

Violation Categories and Detection Difficulty
Category	Example	Detection	Fix Complexity
Literal (exact copy)	Copy-pasted function	Trivial (tools)	Low
Parametric	Same logic, different constants	Easy (tools)	Low-Medium
Structural	Same pattern, different types	Medium (review)	Medium
Semantic	Same rule, different expressions	Hard (requires understanding)	High
Cross-system	Logic in frontend and backend	Very Hard	High

Prioritize by Impact

Not all violations deserve equal attention. Prioritize fixing duplication that: (1) represents frequently-changing knowledge, (2) has caused bugs from missed updates, (3) appears in critical business logic, or (4) spans system boundaries where inconsistency hurts users.

Detecting DRY Violations

Detection strategies range from automated tools to human insight. Different types of duplication require different detection approaches.

Automated Detection:

Static analysis tools can identify literal and parametric duplication:

Clone detectors (PMD's Copy/Paste Detector (CPD), SonarQube, IntelliJ IDEA's duplicate-code detection) find exact or near-exact code copies
Metrics tools flag methods with similar structure or size
Custom linters can check for specific patterns (e.g., magic numbers, repeated strings)

Limitations: Automated tools only catch syntactic duplication. They miss semantic duplication entirely and may flag coincidental similarity as false positives.
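
To make the clone-detector idea concrete, here is a deliberately naive sketch. It assumes "duplication" means identical windows of consecutive, whitespace-trimmed lines; `findDuplicateWindows`, `CloneReport`, and `windowSize` are hypothetical names, and real tools typically compare token streams rather than raw lines.

```typescript
// Naive literal-clone detection: report every window of `windowSize`
// consecutive non-blank lines (trimmed) that occurs more than once.
interface CloneReport {
  text: string;          // the normalized window, lines joined by "\n"
  startLines: number[];  // 1-based line numbers where each copy begins
}

function findDuplicateWindows(source: string, windowSize = 3): CloneReport[] {
  // Keep the original line numbers while discarding blank lines.
  const lines = source
    .split("\n")
    .map((text, index) => ({ text: text.trim(), line: index + 1 }))
    .filter((entry) => entry.text.length > 0);

  const seen = new Map<string, number[]>();
  lines.forEach((first, i) => {
    if (i + windowSize > lines.length) return; // not enough lines left for a full window
    const key = lines
      .slice(i, i + windowSize)
      .map((entry) => entry.text)
      .join("\n");
    const starts = seen.get(key) ?? [];
    starts.push(first.line);
    seen.set(key, starts);
  });

  // Only windows seen at least twice count as clones.
  return Array.from(seen.entries())
    .filter(([, starts]) => starts.length > 1)
    .map(([text, startLines]) => ({ text, startLines }));
}
```

A detector like this finds only literal copies. Rename one variable and the match disappears, which is exactly the limitation described above.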

detection-patterns
TypeScript
// DETECTION PATTERN 1: Magic Numbers
// Tools can flag repeated numeric literals
 
// ❌ Violation: "30" appears in multiple places
const sessionTimeout = 30 * 60 * 1000;  // 30 minutes
const cacheExpiry = 30 * 60 * 1000;     // Also 30 minutes?
const maxRetryWait = 30 * 1000;         // 30 seconds?
 
// Are these the SAME "30" or different concepts?
// - Session timeout: security policy
// - Cache expiry: performance tuning
// - Retry wait: infrastructure config
// Detection found them, but analysis is needed!
 
// ✅ Fix: Named constants make knowledge explicit
const SESSION_TIMEOUT_MINUTES = 30;
const CACHE_EXPIRY_MINUTES = 30;  // Same value, different concept!
const MAX_RETRY_WAIT_SECONDS = 30;
 
// If session policy changes to 60 min, only SESSION changes
// If cache tuning adjusts, only CACHE changes
 
 
// DETECTION PATTERN 2: Repeated String Literals
// Especially dangerous with URLs, keys, messages
 
// ❌ Violation: Same error message in multiple places
throw new Error("Invalid user ID format");
// ... elsewhere in codebase ...
throw new Error("Invalid user ID format");
 
// When message copy changes, will both be updated?
 
// ✅ Fix: Centralized error messages
const ErrorMessages = {
  INVALID_USER_ID: "Invalid user ID format",
  // ... other messages
} as const;
 
throw new Error(ErrorMessages.INVALID_USER_ID);
 
 
// DETECTION PATTERN 3: Structural Similarity
// These have the same SHAPE even with different contents
 
// ❌ Violation: Same pattern repeated
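class UserRepository {
  // `db` is assumed to be an injected query client; its exact API is not shown here
  constructor(private readonly db: { query(sql: string, params: unknown[]): Promise<User | null> }) {}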
  async findById(id: string): Promise<User | null> {
    return this.db.query("SELECT * FROM users WHERE id = ?", [id]);
  }
  async findByEmail(email: string): Promise<User | null> {
    return this.db.query("SELECT * FROM users WHERE email = ?", [email]);
  }
  async findByUsername(username: string): Promise<User | null> {
    return this.db.query("SELECT * FROM users WHERE username = ?", [username]);
  }
}
 
// Same pattern: only the field differs; the query structure is identical.
// Is this essential duplication? It depends on whether the pattern
// represents shared knowledge or the operations are only
// coincidentally similar, which is a judgment call.
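
If a team decides that the three finders above do encode shared knowledge, one possible parameterization is sketched below. The `UserDatabase` interface, `UserLookupField`, and `findOneBy` are illustrative assumptions rather than part of the original listing. The union type restricts which column names can be interpolated at compile time, which matters because column names cannot be bound as query parameters.

```typescript
interface User {
  id: string;
  email: string;
  username: string;
}

// Assumed minimal database interface; a real driver's API will differ.
interface UserDatabase {
  query(sql: string, params: unknown[]): Promise<User[]>;
}

// Only these columns may be used for lookups.
type UserLookupField = "id" | "email" | "username";

class UserRepository {
  constructor(private readonly db: UserDatabase) {}

  // The query shape now exists once; the public finders only name the field.
  private async findOneBy(field: UserLookupField, value: string): Promise<User | null> {
    const rows = await this.db.query(`SELECT * FROM users WHERE ${field} = ?`, [value]);
    return rows[0] ?? null;
  }

  findById(id: string): Promise<User | null> {
    return this.findOneBy("id", id);
  }

  findByEmail(email: string): Promise<User | null> {
    return this.findOneBy("email", email);
  }

  findByUsername(username: string): Promise<User | null> {
    return this.findOneBy("username", username);
  }
}
```

The public API is unchanged, so callers do not need to know whether the finders share an implementation.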

Human Detection:

For semantic duplication, human insight is essential:

Code reviews — Reviewers familiar with the codebase spot duplicated knowledge
Team wikis — Documentation of existing solutions prevents reinvention
Pair programming — Shared context reduces inadvertent duplication
Onboarding shadowing — New developers asking "Is there already a way to do X?" surfaces hidden duplication
Bug archaeology — Finding that the same bug was fixed in multiple places reveals duplication

A key human detection signal: "I've seen this before." If you have a vague feeling that similar code exists elsewhere, investigate. That intuition often reveals knowledge duplication.

The Bug Fix Test

When fixing a bug, search the codebase for similar code. If you find related logic that might have the same bug, you've discovered duplication. This "shotgun bug" pattern—where one conceptual bug requires fixes in multiple places—is a reliable duplication indicator.

Remediation Strategies

Once a DRY violation is identified and confirmed as essential (not coincidental), the next step is choosing the right abstraction to eliminate it. The choice depends on the nature of the duplication.

Strategy Menu:

Abstraction Mechanisms

• Extract Function/Method: Pull duplicated logic into a named function. The simplest fix for procedural duplication within a scope.
• Extract Constant/Configuration: For repeated values. Move magic numbers and strings to named constants or config files.
• Parameterization: When duplicates differ only in values, make those values parameters. The shared logic becomes a function with arguments.
• Extract Class/Module: When the duplication represents a coherent concept deserving its own type. The new abstraction encapsulates the knowledge.
• Inheritance: When duplicates share behavior across a type hierarchy. Use sparingly; prefer composition unless a true "is-a" relationship exists.
• Composition: Extract shared behavior into a component that can be composed into different classes. More flexible than inheritance.
• Template Method/Strategy: When the duplicated code has a fixed structure with varying parts. The invariant structure is the template; the variant parts are strategies or hooks.
• Code Generation: When duplication crosses boundaries (languages, files). Generate from a single schema or definition.
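
Composition is easier to describe than to picture, so here is a small sketch. `RetryPolicy`, `InventoryClient`, and `PaymentClient` are hypothetical names. The point is only that the retry knowledge lives in one component that both classes hold rather than inherit.

```typescript
// Shared behavior extracted into a component: run an operation,
// retrying on failure up to `maxAttempts` times in total.
class RetryPolicy {
  constructor(private readonly maxAttempts: number) {
    if (maxAttempts < 1) throw new Error("maxAttempts must be at least 1");
  }

  run<T>(operation: () => T): T {
    let lastError: unknown;
    for (let attempt = 1; attempt <= this.maxAttempts; attempt++) {
      try {
        return operation();
      } catch (error) {
        lastError = error; // remember the failure and try again
      }
    }
    throw lastError; // every attempt failed
  }
}

// Two unrelated classes reuse the policy by holding it, not by extending it.
class InventoryClient {
  constructor(private readonly retry: RetryPolicy) {}

  reserve(readStock: () => number): number {
    return this.retry.run(readStock);
  }
}

class PaymentClient {
  constructor(private readonly retry: RetryPolicy) {}

  charge(submit: () => string): string {
    return this.retry.run(submit);
  }
}
```

Because each client receives its own `RetryPolicy`, payments can retry twice while inventory retries five times, without either class knowing how retrying works.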

strategy-selection
TypeScript
// STRATEGY SELECTION BASED ON DUPLICATION TYPE
 
// ────────────────────────────────────────────────────────────
// SCENARIO 1: Repeated Calculation (Extract Function)
// ────────────────────────────────────────────────────────────
 
// Before:
function processOrder(order: Order) {
  const subtotal = order.items.reduce((sum, i) => sum + i.price * i.qty, 0);
  const tax = subtotal * 0.08;
  const total = subtotal + tax;
  // ... use total
}
 
function displayCart(items: CartItem[]) {
  const subtotal = items.reduce((sum, i) => sum + i.price * i.qty, 0);
  const tax = subtotal * 0.08;
  const total = subtotal + tax;
  // ... display total
}
 
// After: Extract function (and give the tax rate a name)
const TAX_RATE = 0.08;
function calculateOrderTotal(items: { price: number; qty: number }[]): {
  subtotal: number;
  tax: number;
  total: number;
} {
  const subtotal = items.reduce((sum, i) => sum + i.price * i.qty, 0);
  const tax = subtotal * TAX_RATE;  // And extract constant!
  return { subtotal, tax, total: subtotal + tax };
}
 
 
// ────────────────────────────────────────────────────────────
// SCENARIO 2: Structural Pattern (Extract Class)
// ────────────────────────────────────────────────────────────
 
// Before: HTTP call pattern repeated across services
async function fetchUsers(): Promise<User[]> {
  try {
    const response = await fetch('/api/users');
    if (!response.ok) throw new Error('User fetch failed');
    return response.json();
  } catch (error) {
    logger.error('Fetch users error', error);
    throw error;
  }
}
 
async function fetchOrders(): Promise<Order[]> {
  try {
    const response = await fetch('/api/orders');
    if (!response.ok) throw new Error('Order fetch failed');
    return response.json();
  } catch (error) {
    logger.error('Fetch orders error', error);
    throw error;
  }
}
 
// After: Extract class with parameterization
class ApiClient {
  async fetch<T>(endpoint: string, entityName: string): Promise<T> {
    try {
      const response = await fetch(endpoint);
      if (!response.ok) throw new Error(`${entityName} fetch failed`);
      return response.json();
    } catch (error) {
      logger.error(`Fetch ${entityName} error`, error);
      throw error;
    }
  }
}
 
 
// ────────────────────────────────────────────────────────────
// SCENARIO 3: Varying Logic (Strategy Pattern)
// ────────────────────────────────────────────────────────────
 
// Before: Discount calculation with variations
function applyDiscount(order: Order, type: string): number {
  // Shared: get base price
  const base = calculateSubtotal(order);
  
  // Varying: discount logic
  if (type === 'percentage') {
    return base * (1 - order.discountValue / 100);
  } else if (type === 'fixed') {
    return Math.max(0, base - order.discountValue);
  } else if (type === 'buyOneGetOne') {
    // Complex BOGO logic
    return calculateBOGOPrice(order);
  }
  return base;
}
 
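// After: Strategy pattern
// DiscountParams is assumed to carry the discount amount or percentage
interface DiscountParams {
  value: number;
}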
interface DiscountStrategy {
  apply(basePrice: number, params: DiscountParams): number;
}
 
class PercentageDiscount implements DiscountStrategy {
  apply(base: number, params: DiscountParams): number {
    return base * (1 - params.value / 100);
  }
}
 
class FixedDiscount implements DiscountStrategy {
  apply(base: number, params: DiscountParams): number {
    return Math.max(0, base - params.value);
  }
}
 
// New discounts are added by creating new classes, not by modifying existing code
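
The listing above stops before showing how a strategy gets chosen at runtime. One possible wiring is a lookup table keyed by discount type, sketched here with plain objects instead of classes for brevity. The `DiscountParams` shape, `discountStrategies`, and the `noDiscount` fallback are assumptions made for illustration.

```typescript
interface DiscountParams {
  value: number; // assumed: a percentage or a fixed amount, depending on the strategy
}

interface DiscountStrategy {
  apply(basePrice: number, params: DiscountParams): number;
}

const percentageDiscount: DiscountStrategy = {
  apply: (base, params) => base * (1 - params.value / 100),
};

const fixedDiscount: DiscountStrategy = {
  apply: (base, params) => Math.max(0, base - params.value),
};

// Fallback that mirrors the original `return base` branch.
const noDiscount: DiscountStrategy = {
  apply: (base) => base,
};

// Registering a new discount type means adding one entry here.
const discountStrategies = new Map<string, DiscountStrategy>([
  ["percentage", percentageDiscount],
  ["fixed", fixedDiscount],
]);

function applyDiscount(base: number, type: string, params: DiscountParams): number {
  const strategy = discountStrategies.get(type) ?? noDiscount;
  return strategy.apply(base, params);
}
```

The `if`/`else` chain from the "before" version is replaced by a lookup, so the selection logic no longer changes when a discount type is added.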

Common DRY Violations and Their Fixes

Let's examine the most frequently encountered DRY violations in professional codebases and their standard remedies.

1. Validation Duplication:

Validation logic is one of the most commonly duplicated types of knowledge. The same business rules are often enforced in frontend, backend, API layer, and database.

validation-dry
TypeScript
// ❌ VIOLATION: Validation rules in multiple places
 
// Frontend (React)
function validateEmail(email: string): boolean {
  const pattern = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
  return pattern.test(email) && email.length <= 254;
}
 
// Backend (Express)
function validateEmailRequest(email: string): boolean {
  const pattern = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
  return pattern.test(email) && email.length <= 254;
}
 
// Database (Check constraint duplicates the 254 limit)
// API Docs (Describes the same rules in prose)
 
 
// ✅ FIX: Single source of truth for validation schemas
 
// Shared validation schema (could be JSON Schema, Zod, Yup, etc.; Zod is shown here)
import { z } from "zod";
const emailSchema = z.string()
  .email("Invalid email format")
  .max(254, "Email too long");
 
// Generated/derived everywhere else:
// - Frontend: import { emailSchema } from '@shared/schemas'
// - Backend: same schema, same library
// - API Docs: generated from schema (OpenAPI integration)
// - Database: could generate CHECK constraint from schema
 
// Now the rule exists in ONE place
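
As a dependency-free illustration of the same idea (no schema library assumed), the rule can live in one shared module that both sides import. `EMAIL_RULES`, `validateEmail`, and the two call sites are hypothetical names; in a real project the first three declarations would be exported from a shared package.

```typescript
// shared/email-rules.ts (hypothetical shared module)
const EMAIL_RULES = {
  pattern: /^[^\s@]+@[^\s@]+\.[^\s@]+$/,
  maxLength: 254,
} as const;

type ValidationResult = { ok: true } | { ok: false; reason: string };

function validateEmail(email: string): ValidationResult {
  if (email.length > EMAIL_RULES.maxLength) {
    return { ok: false, reason: "Email too long" };
  }
  if (!EMAIL_RULES.pattern.test(email)) {
    return { ok: false, reason: "Invalid email format" };
  }
  return { ok: true };
}

// Frontend call site: turn the result into a form error message.
function emailFieldError(input: string): string | null {
  const result = validateEmail(input);
  return result.ok ? null : result.reason;
}

// Backend call site: turn the same result into an HTTP status code.
function emailRequestStatus(input: string): number {
  return validateEmail(input).ok ? 200 : 400;
}
```

Each layer decides how to present a failure, but neither restates the pattern or the length limit.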

2. Configuration Duplication:

The same configuration values are often scattered across files, environments, and services.

config-dry
TypeScript
// ❌ VIOLATION: Same API URL in multiple places
 
// service-a/config.ts
const API_URL = "https://api.example.com/v2";
 
// service-b/config.ts  
const API_BASE = "https://api.example.com/v2";
 
// frontend/.env
REACT_APP_API=https://api.example.com/v2
 
// tests/fixtures.ts
const testApi = "https://api.example.com/v2";
 
 
// ✅ FIX: Centralized configuration with derivation
 
// Option 1: Environment variables from single source
// .env.template (source of truth)
API_URL=https://api.example.com/v2
 
// All services read from environment
const apiUrl = process.env.API_URL;
 
 
// Option 2: Configuration service
// config-service exposes typed configuration
const config = await ConfigService.get();
const apiUrl = config.api.baseUrl;  // Single source
 
 
// Option 3: Infrastructure as Code
// Terraform/Pulumi defines URL once, injects to services
resource "aws_ssm_parameter" "api_url" {
  name  = "/app/api-url"
  type  = "String"
  value = "https://api.example.com/v2"
}
// All services read from SSM, configuration is infrastructure
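
Whichever option supplies the value, it helps to read it in exactly one place per service. Below is a minimal sketch of such a loader; `AppConfig`, `loadConfig`, and `endpoint` are hypothetical names, and the environment is passed in as a plain object (in Node it would typically be `process.env`) so the function stays easy to test.

```typescript
interface AppConfig {
  apiUrl: string;
}

// Reads and validates configuration once; the rest of the code
// depends on AppConfig, never on raw environment variables.
function loadConfig(env: Record<string, string | undefined>): AppConfig {
  const apiUrl = env["API_URL"];
  if (apiUrl === undefined || apiUrl.trim() === "") {
    throw new Error("API_URL is not set");
  }
  // Normalize so callers never have to think about trailing slashes.
  return { apiUrl: apiUrl.trim().replace(/\/+$/, "") };
}

// Builds a full URL from the single configured base.
function endpoint(config: AppConfig, path: string): string {
  return `${config.apiUrl}/${path.replace(/^\/+/, "")}`;
}
```

Failing fast on a missing variable turns a silent misconfiguration into a startup error, which is usually easier to diagnose than a request to `undefined/users`.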

3. Data Transformation Duplication:

The same data transformation logic often appears in different parts of the system.

transformation-dry
TypeScript
// ❌ VIOLATION: Same transformation in multiple places
 
// In API handler
app.get('/users/:id', async (req, res) => {
  const user = await db.findUser(req.params.id);
  res.json({
    id: user.id,
    name: `${user.firstName} ${user.lastName}`,
    email: user.email,
    avatarUrl: user.avatarId 
      ? `https://cdn.example.com/${user.avatarId}`
      : null,
  });
});
 
// In GraphQL resolver
const resolvers = {
  User: {
    fullName: (user) => `${user.firstName} ${user.lastName}`,
    avatarUrl: (user) => user.avatarId 
      ? `https://cdn.example.com/${user.avatarId}`
      : null,
  }
};
 
// In event handler
function onUserCreated(user: User) {
  sendWelcomeEmail({
    name: `${user.firstName} ${user.lastName}`,
    // ... 
  });
}
 
 
// ✅ FIX: Domain model with transformation methods
const CDN_BASE_URL = "https://cdn.example.com";  // the CDN host now lives in one place
 
class UserPresenter {
  constructor(private user: User) {}
 
  get fullName(): string {
    return `${this.user.firstName} ${this.user.lastName}`;
  }
 
  get avatarUrl(): string | null {
    return this.user.avatarId 
      ? `${CDN_BASE_URL}/${this.user.avatarId}`
      : null;
  }
 
  toApiResponse(): UserApiResponse {
    return {
      id: this.user.id,
      name: this.fullName,
      email: this.user.email,
      avatarUrl: this.avatarUrl,
    };
  }
}
 
// Or value object approach
class UserDisplayInfo {
  // Private constructor: instances are created only through fromUser()
  private constructor(
    readonly fullName: string,
    readonly avatarUrl: string | null,
  ) {}

  static fromUser(user: User): UserDisplayInfo {
    return new UserDisplayInfo(
      `${user.firstName} ${user.lastName}`,
      user.avatarId ? `${CDN_BASE_URL}/${user.avatarId}` : null
    );
  }
}
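
To show how the three original call sites might consume a single definition, here is a self-contained sketch using plain functions instead of a class. The `User` shape, the `userView` helper, and `welcomeEmailPayload` are assumptions for illustration; the CDN host is the one that appears in the listing above.

```typescript
interface User {
  id: string;
  firstName: string;
  lastName: string;
  email: string;
  avatarId?: string;
}

const CDN_BASE_URL = "https://cdn.example.com"; // would normally come from configuration

// The display rules, defined once.
const userView = {
  fullName: (user: User): string => `${user.firstName} ${user.lastName}`,
  avatarUrl: (user: User): string | null =>
    user.avatarId ? `${CDN_BASE_URL}/${user.avatarId}` : null,
};

// REST handler body: builds the JSON response.
function toApiResponse(user: User) {
  return {
    id: user.id,
    name: userView.fullName(user),
    email: user.email,
    avatarUrl: userView.avatarUrl(user),
  };
}

// GraphQL resolvers: reference the same functions directly.
const resolvers = {
  User: {
    fullName: userView.fullName,
    avatarUrl: userView.avatarUrl,
  },
};

// Event handler: builds the welcome email payload.
function welcomeEmailPayload(user: User) {
  return { to: user.email, name: userView.fullName(user) };
}
```

If the name format changes (say, to "Last, First"), only `userView.fullName` needs to be edited.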

Cross-System Duplication

The most challenging DRY violations span system boundaries: frontend and backend, different microservices, different programming languages, or code and documentation. These require specialized solutions.

Client-Server Schema Duplication:

When clients and servers need to agree on data shapes, the schema is often defined twice—once in each codebase. This leads to runtime errors when they drift.

cross-system-dry
TypeScript
// ❌ VIOLATION: Schema defined in both client and server
 
// Server: TypeScript types
interface User {
  id: string;
  email: string;
  name: string;
  role: 'admin' | 'user' | 'guest';
}
 
// Client: Separate TypeScript types (duplicated!)
interface User {
  id: string;
  email: string;
  name: string;
  role: 'admin' | 'user' | 'guest';
}
 
// When server adds a field, client might not know
// When server changes 'role' values, client breaks at runtime
 
 
// ✅ FIX: Shared schema with code generation
 
// Approach 1: OpenAPI / Swagger
// Define schema once in openapi.yaml
```yaml
components:
  schemas:
    User:
      type: object
      properties:
        id: { type: string }
        email: { type: string }
        name: { type: string }
        role: { type: string, enum: [admin, user, guest] }
```
 
// Generate server types: npx openapi-typescript
// Generate client types: same command on client
 
 
// Approach 2: Protocol Buffers / gRPC
// Define schema once in .proto file
message User {
  string id = 1;
  string email = 2;
  string name = 3;
  Role role = 4;
}
enum Role {
  ADMIN = 0;
  USER = 1;
  GUEST = 2;
}
// Generate for all languages: protoc --ts_out=... --go_out=...
 
 
// Approach 3: GraphQL
// Schema is the contract, shared by definition
type User {
  id: ID!
  email: String!
  name: String!
  role: Role!
}
enum Role { ADMIN USER GUEST }
// Codegen generates types for both: npx graphql-codegen
 
 
// Approach 4: Shared Package (Monorepo)
// packages/shared-types/User.ts
export interface User { ... }
 
// Server imports: import { User } from '@acme/shared-types'
// Client imports: import { User } from '@acme/shared-types'
// One definition, multiple consumers

Code-Documentation Duplication:

When documentation describes what code does, it becomes stale as code changes. The knowledge is duplicated between the code and the docs.

Documentation DRY Strategies

• Generate docs from code: Tools like TypeDoc, Javadoc, and Sphinx extract documentation from code comments and docstrings. The code is the source of truth.
• Generate docs from schemas: OpenAPI and GraphQL schemas can be used to generate API documentation. The schema defines the contract; the docs are derived.
• Executable documentation: Tests that serve as examples. The test suite becomes living documentation that fails visibly when it drifts from the code.
• Literate programming: Code embedded in documentation (Jupyter, R Markdown). The documentation is the code.
• Design documents for why, not what: Keep docs focused on rationale. The code itself documents what; docs explain why.

The Single Source Principle

For cross-system duplication, always ask: "What is the single source of truth?" Everything else should be generated from, derived from, or a reference to that source. If you must duplicate manually, create processes to detect and alert on drift.
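
A drift check does not have to be elaborate. The sketch below assumes each side can export its schema as a flat map of field name to type description; `FieldMap`, `DriftReport`, and `detectDrift` are hypothetical names, and a real check would more likely compare generated artifacts in CI.

```typescript
type FieldMap = Record<string, string>; // field name -> type description

interface DriftReport {
  missingInClient: string[];
  missingInServer: string[];
  typeMismatches: string[];
}

// Own-property check, so inherited keys such as "toString" are not treated as fields.
function hasField(map: FieldMap, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(map, key);
}

function detectDrift(server: FieldMap, client: FieldMap): DriftReport {
  const serverKeys = Object.keys(server);
  const clientKeys = Object.keys(client);
  return {
    missingInClient: serverKeys.filter((key) => !hasField(client, key)),
    missingInServer: clientKeys.filter((key) => !hasField(server, key)),
    typeMismatches: serverKeys.filter(
      (key) => hasField(client, key) && server[key] !== client[key],
    ),
  };
}
```

Run as a scheduled job or a CI step, a non-empty report becomes the alert that the manual copies have diverged.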

Refactoring to DRY

Refactoring to eliminate DRY violations requires care. Aggressive refactoring can introduce bugs or create the wrong abstraction. Here's a systematic approach:

Step 1: Confirm Essential Duplication

Before refactoring, verify that the duplication is essential (same knowledge), not coincidental (similar code, different concepts). Apply the tests from the previous page.

Step 2: Identify the Boundaries

Find all instances of the duplicated knowledge. Search for:

Literal matches (exact code)
Semantic matches (same meaning, different code)
Related code that might have the same bug or need the same change

Step 3: Design the Abstraction

Design the abstraction that will hold the single source of truth. Ask:

What is the minimal interface needed?
Where should it live? (What module "owns" this knowledge?)
How will consumers access it?

refactoring-process
TypeScript
// REFACTORING PROCESS: Step by Step
 
// BEFORE: Duplication across order service and invoice service
 
// order-service/pricing.ts
function calculateOrderTotal(items: OrderItem[]): OrderPricing {
  const subtotal = items.reduce((sum, i) => sum + i.price * i.quantity, 0);
  const discount = subtotal >= 100 ? subtotal * 0.10 :
                   subtotal >= 50  ? subtotal * 0.05 : 0;
  const afterDiscount = subtotal - discount;
  const tax = afterDiscount * 0.08;
  const shipping = afterDiscount >= 100 ? 0 : 9.99;
  return { subtotal, discount, tax, shipping, total: afterDiscount + tax + shipping };
}
 
// invoice-service/invoice-generator.ts
function generateInvoice(items: InvoiceItem[]): Invoice {
  const subtotal = items.reduce((sum, i) => sum + i.price * i.quantity, 0);
  const discount = subtotal >= 100 ? subtotal * 0.10 :
                   subtotal >= 50  ? subtotal * 0.05 : 0;
  const afterDiscount = subtotal - discount;
  const tax = afterDiscount * 0.08;
  const shipping = afterDiscount >= 100 ? 0 : 9.99;
  // ... generate invoice with these values
}
 
 
// STEP 1: Confirm Essential Duplication
// ✓ Same discount tiers (business rule)
// ✓ Same tax rate (tax law)
// ✓ Same free shipping threshold (business policy)
// Verdict: Essential - this is ONE piece of knowledge
 
 
// STEP 2: Identify Boundaries
// Found in: order-service, invoice-service
// Potentially also in: cart-preview, admin-pricing-tool
 
 
// STEP 3: Design Abstraction
// Create a Pricing module that owns these rules
 
// AFTER: Shared pricing domain
 
// @acme/pricing/src/calculator.ts
export const PRICING_CONFIG = {
  DISCOUNT_TIERS: [
    { threshold: 100, percent: 10 },
    { threshold: 50, percent: 5 },
  ],
  TAX_RATE: 0.08,
  FREE_SHIPPING_THRESHOLD: 100,
  STANDARD_SHIPPING: 9.99,
} as const;
 
export interface LineItem {
  price: number;
  quantity: number;
}
 
export interface PricingBreakdown {
  subtotal: number;
  discount: number;
  afterDiscount: number;
  tax: number;
  shipping: number;
  total: number;
}
 
export function calculatePricing(items: LineItem[]): PricingBreakdown {
  const subtotal = items.reduce((sum, i) => sum + i.price * i.quantity, 0);
  
  const discountTier = PRICING_CONFIG.DISCOUNT_TIERS
    .find(tier => subtotal >= tier.threshold);
  const discount = discountTier 
    ? subtotal * (discountTier.percent / 100) 
    : 0;
    
  const afterDiscount = subtotal - discount;
  const tax = afterDiscount * PRICING_CONFIG.TAX_RATE;
  
  const shipping = afterDiscount >= PRICING_CONFIG.FREE_SHIPPING_THRESHOLD 
    ? 0 
    : PRICING_CONFIG.STANDARD_SHIPPING;
    
  return {
    subtotal,
    discount,
    afterDiscount,
    tax,
    shipping,
    total: afterDiscount + tax + shipping,
  };
}
 
 
// STEP 4: Migrate Consumers
// order-service/pricing.ts
import { calculatePricing, PricingBreakdown } from '@acme/pricing';
 
function calculateOrderTotal(items: OrderItem[]): PricingBreakdown {
  return calculatePricing(items.map(i => ({ 
    price: i.price, 
    quantity: i.quantity 
  })));
}
 
// invoice-service/invoice-generator.ts
import { calculatePricing } from '@acme/pricing';
 
function generateInvoice(items: InvoiceItem[]): Invoice {
  const pricing = calculatePricing(items);
  // ... generate invoice with pricing
}

Step 4: Migrate Consumers

Replace duplicated code with calls to the new abstraction. Do this incrementally:

Migrate one consumer at a time
Run tests after each migration
Look for subtle differences that might indicate the duplication was coincidental
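
One way to look for those subtle differences is a characterization check: run the old and new implementations side by side on boundary inputs before deleting anything. The sketch below is an assumption-laden illustration. `legacyTotal` stands in for a consumer's duplicated logic, `sharedTotal` for the new abstraction, and `findMismatches` is a hypothetical helper.

```typescript
interface LineItem {
  price: number;
  quantity: number;
}

// Stand-in for the duplicated logic still living in one consumer.
function legacyTotal(items: LineItem[]): number {
  const subtotal = items.reduce((sum, i) => sum + i.price * i.quantity, 0);
  const discount = subtotal >= 100 ? subtotal * 0.10 :
                   subtotal >= 50  ? subtotal * 0.05 : 0;
  const afterDiscount = subtotal - discount;
  const shipping = afterDiscount >= 100 ? 0 : 9.99;
  return afterDiscount + afterDiscount * 0.08 + shipping;
}

// Stand-in for the new shared abstraction (tiers ordered highest first).
const DISCOUNT_TIERS = [
  { threshold: 100, percent: 10 },
  { threshold: 50, percent: 5 },
];

function sharedTotal(items: LineItem[]): number {
  const subtotal = items.reduce((sum, i) => sum + i.price * i.quantity, 0);
  const tier = DISCOUNT_TIERS.find((t) => subtotal >= t.threshold);
  const discount = tier ? subtotal * (tier.percent / 100) : 0;
  const afterDiscount = subtotal - discount;
  const shipping = afterDiscount >= 100 ? 0 : 9.99;
  return afterDiscount + afterDiscount * 0.08 + shipping;
}

// Returns every sample on which the two implementations disagree.
function findMismatches(samples: LineItem[][], tolerance = 1e-9): LineItem[][] {
  return samples.filter(
    (items) => Math.abs(legacyTotal(items) - sharedTotal(items)) > tolerance,
  );
}

// Inputs chosen around the discount and free-shipping thresholds.
const boundarySamples: LineItem[][] = [
  [],
  [{ price: 49.99, quantity: 1 }],
  [{ price: 25, quantity: 2 }],    // subtotal exactly 50
  [{ price: 99.99, quantity: 1 }],
  [{ price: 50, quantity: 2 }],    // subtotal exactly 100
  [{ price: 56, quantity: 2 }],    // subtotal 112, above 100 even after the discount
];
```

An empty result from `findMismatches(boundarySamples)` supports the migration; a non-empty one is a prompt to ask whether the two copies were really the same knowledge.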

Step 5: Delete the Duplicates

Once all consumers use the new abstraction, delete the original duplicated code. The abstraction is now the single source of truth.

Step 6: Document Ownership

Make clear who "owns" the new abstraction. When the knowledge needs to change, where should that change be made?

The Rule of Three

A practical heuristic for DRY: wait for three occurrences before abstracting. This "Rule of Three" guards against premature abstraction.

Why Three?

Once — You write code to solve a problem. No duplication yet.
Twice — You write similar code for a similar problem. Is it the same knowledge? Maybe, but maybe not. You don't have enough data points. Copy the code (tolerate duplication temporarily).
Thrice — Similar code appears a third time. Now you have a pattern. Three data points give confidence that:
- The duplication is real, not coincidental
- The abstraction you create will likely be right
- The investment in abstraction will pay off

Abstract Too Early

•Creates abstractions based on one or two examples
•Guesses at what the pattern might be
•Builds flexibility you may never need
•Often creates the "wrong abstraction"
•Results in tortured abstractions when cases diverge

Wait for Three

•Has evidence of a real pattern
•Understands the actual variations
•Creates focused, accurate abstractions
•Avoids premature generalization
•Easier to get the abstraction right

Exceptions to the Rule:

Like all heuristics, the Rule of Three has exceptions:

Obviously shared knowledge — If the knowledge is clearly the same (a business rule, a calculation, a validation), you might abstract immediately. The "three" is about uncertainty; if you're certain, proceed.
Cross-system duplication — When duplication spans boundaries (frontend/backend), abstract sooner. The cost of inconsistency is higher.
High-change areas — If the duplicated code changes frequently, the synchronization cost is high. Abstract sooner.
Security/compliance code — Critical code should have a single authoritative implementation. Don't wait for three vulnerabilities.

The Second Time Is the Hardest

When you encounter similar code for the second time, resist the urge to abstract immediately unless one of the exceptions above applies. You're seeing a possible pattern, but you don't yet know whether it's real. Mark the duplication (for example, with a TODO comment) and move on. The third occurrence will confirm whether to abstract.

Summary: DRY Violations and Fixes

We've covered practical techniques for identifying and fixing DRY violations. Let's consolidate the key takeaways:

Key Takeaways

•Classify violations by scope and nature — Understanding whether duplication is literal, parametric, structural, or semantic guides the choice of fix.
•Use automated tools for detection, but apply human judgment — Tools find syntactic duplication; humans must evaluate whether it represents the same knowledge.
•Choose the right abstraction mechanism — Extract function, parameterize, extract class, use composition/inheritance, or generate code—each has its place.
•Address common patterns systematically — Validation, configuration, data transformation, and cross-system duplication have established solutions.
•For cross-system duplication, establish a single source — Use schemas (OpenAPI, GraphQL, Protobuf) or shared packages to avoid duplicate definitions.
•Refactor systematically — Confirm essential duplication, identify boundaries, design the abstraction, migrate incrementally, then delete duplicates.
•Apply the Rule of Three — Wait for three occurrences before abstracting to avoid premature generalization, unless the knowledge is obviously shared.

What's next:

Now that we know how to fix DRY violations, we must address the flip side: when DRY goes too far. Overzealous application of DRY creates its own problems—inappropriate coupling, complex indirection, and the dreaded "wrong abstraction." The next page explores these pitfalls and how to balance DRY with other engineering concerns.

Page Complete

You now have practical skills for detecting DRY violations and systematically fixing them. You can classify violations, select appropriate remediation strategies, and refactor without creating harmful abstractions. Next, we'll explore the dangers of DRY taken too far.