Understanding the DRY principle and the distinction between knowledge and code duplication is essential—but knowing how to identify and fix violations in real codebases is where the principle becomes actionable.
DRY violations manifest in many forms, from obvious copy-pasted blocks to subtle structural repetitions that span multiple files or even systems. Fixing them requires choosing the right abstraction mechanism and applying it at the right time. This page provides a systematic approach to both identification and remediation.
By the end of this page, you will be able to identify the major categories of DRY violations, apply appropriate remediation strategies for each category, and refactor duplication without creating the wrong abstraction. You'll learn both the techniques and the judgment to apply them wisely.
DRY violations can be classified by their scope and nature. Understanding the category helps select the appropriate fix.
By Scope:
By Nature:
| Category | Example | Detection | Fix Complexity |
|---|---|---|---|
| Literal (exact copy) | Copy-pasted function | Trivial (tools) | Low |
| Parametric | Same logic, different constants | Easy (tools) | Low-Medium |
| Structural | Same pattern, different types | Medium (review) | Medium |
| Semantic | Same rule, different expressions | Hard (requires understanding) | High |
| Cross-system | Logic in frontend and backend | Very Hard | High |
Not all violations deserve equal attention. Prioritize fixing duplication that: (1) represents frequently-changing knowledge, (2) has caused bugs from missed updates, (3) appears in critical business logic, or (4) spans system boundaries where inconsistency hurts users.
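The four prioritization criteria above can be turned into a rough ranking. The sketch below is only an illustration: the `DuplicationFinding` shape, the function names, and the weights are all hypothetical, and any real team would tune them to its own history.

```typescript
// Hypothetical record describing one confirmed piece of duplicated knowledge.
interface DuplicationFinding {
  name: string;
  changesPerQuarter: number;        // (1) how often the knowledge changes
  bugsFromMissedUpdates: number;    // (2) bugs caused by a copy not being updated
  isCriticalBusinessLogic: boolean; // (3) lives in critical business logic
  crossesSystemBoundary: boolean;   // (4) spans frontend/backend, services, etc.
}

// Arbitrary illustrative weights; counts are capped so one factor cannot dominate.
function priorityScore(f: DuplicationFinding): number {
  return (
    Math.min(f.changesPerQuarter, 5) +
    2 * Math.min(f.bugsFromMissedUpdates, 5) +
    (f.isCriticalBusinessLogic ? 3 : 0) +
    (f.crossesSystemBoundary ? 3 : 0)
  );
}

// Returns a new array sorted from highest to lowest priority.
function rankFindings(findings: DuplicationFinding[]): DuplicationFinding[] {
  return findings.slice().sort((a, b) => priorityScore(b) - priorityScore(a));
}

const ranked = rankFindings([
  {
    name: "date formatting helper",
    changesPerQuarter: 0,
    bugsFromMissedUpdates: 0,
    isCriticalBusinessLogic: false,
    crossesSystemBoundary: false,
  },
  {
    name: "email validation (frontend + backend)",
    changesPerQuarter: 2,
    bugsFromMissedUpdates: 1,
    isCriticalBusinessLogic: false,
    crossesSystemBoundary: true,
  },
  {
    name: "tax rate (order + invoice services)",
    changesPerQuarter: 1,
    bugsFromMissedUpdates: 2,
    isCriticalBusinessLogic: true,
    crossesSystemBoundary: true,
  },
]);

console.log(ranked.map((f) => f.name));
```

With these made-up numbers, the tax rate duplication ranks first and the formatting helper last, which matches the intuition behind the criteria.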
Detection strategies range from automated tools to human insight. Different types of duplication require different detection approaches.
Automated Detection:
Static analysis tools can identify literal and parametric duplication:
Limitations: Automated tools only catch syntactic duplication. They miss semantic duplication entirely and may flag coincidental similarity as false positives.
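To make the limitation concrete, here is a minimal sketch of how a purely syntactic detector might work. It is not how any particular tool is implemented; `findDuplicateWindows` and `DuplicateWindow` are hypothetical names, and the approach (comparing whitespace-normalized windows of consecutive lines) is a deliberately naive assumption.

```typescript
// A repeated run of lines and where each copy begins.
interface DuplicateWindow {
  text: string;         // normalized text of the repeated window
  startLines: number[]; // 1-based line numbers where each copy starts
}

// Reports every window of `windowSize` consecutive non-blank lines that
// occurs more than once after trimming and collapsing whitespace.
function findDuplicateWindows(source: string, windowSize = 3): DuplicateWindow[] {
  const lines = source
    .split("\n")
    .map((line, index) => ({
      text: line.trim().replace(/\s+/g, " "),
      lineNumber: index + 1,
    }))
    .filter((line) => line.text.length > 0);

  const seen = new Map<string, number[]>();
  for (let i = 0; i + windowSize <= lines.length; i++) {
    const key = lines
      .slice(i, i + windowSize)
      .map((line) => line.text)
      .join("\n");
    const starts = seen.get(key) ?? [];
    starts.push(lines[i].lineNumber);
    seen.set(key, starts);
  }

  return Array.from(seen.entries())
    .filter(([, starts]) => starts.length > 1)
    .map(([text, startLines]) => ({ text, startLines }));
}

const sample = [
  "const a = 1;",
  "const b = 2;",
  "log(a + b);",
  "",
  "const a   = 1;",
  "const b = 2;",
  "log(a + b);",
].join("\n");

const duplicates = findDuplicateWindows(sample, 3);
console.log(duplicates.map((d) => d.startLines)); // copies start at lines 1 and 5
```

A detector like this finds the copy-pasted block despite the spacing difference, but it would treat `price * 0.08` and `price * 8 / 100` as unrelated text, which is exactly the semantic duplication that needs human review.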
```typescript
// DETECTION PATTERN 1: Magic Numbers
// Tools can flag repeated numeric literals

// ❌ Violation: "30" appears in multiple places
const sessionTimeout = 30 * 60 * 1000; // 30 minutes
const cacheExpiry = 30 * 60 * 1000;    // Also 30 minutes?
const maxRetryWait = 30 * 1000;        // 30 seconds?

// Are these the SAME "30" or different concepts?
// - Session timeout: security policy
// - Cache expiry: performance tuning
// - Retry wait: infrastructure config
// Detection found them, but analysis is needed!

// ✅ Fix: Named constants make knowledge explicit
const SESSION_TIMEOUT_MINUTES = 30;
const CACHE_EXPIRY_MINUTES = 30; // Same value, different concept!
const MAX_RETRY_WAIT_SECONDS = 30;

// If session policy changes to 60 min, only SESSION changes
// If cache tuning adjusts, only CACHE changes

// DETECTION PATTERN 2: Repeated String Literals
// Especially dangerous with URLs, keys, messages

// ❌ Violation: Same error message in multiple places
throw new Error("Invalid user ID format");
// ... elsewhere in codebase ...
throw new Error("Invalid user ID format");

// When message copy changes, will both be updated?

// ✅ Fix: Centralized error messages
const ErrorMessages = {
  INVALID_USER_ID: "Invalid user ID format",
  // ... other messages
} as const;

throw new Error(ErrorMessages.INVALID_USER_ID);

// DETECTION PATTERN 3: Structural Similarity
// These have the same SHAPE even with different contents

// ❌ Violation: Same pattern repeated
class UserRepository {
  async findById(id: string): Promise<User | null> {
    return this.db.query("SELECT * FROM users WHERE id = ?", [id]);
  }

  async findByEmail(email: string): Promise<User | null> {
    return this.db.query("SELECT * FROM users WHERE email = ?", [email]);
  }

  async findByUsername(username: string): Promise<User | null> {
    return this.db.query("SELECT * FROM users WHERE username = ?", [username]);
  }
}

// Same pattern: field differs, query structure identical
// Is this essential duplication? Depends on whether the
// pattern represents shared knowledge (yes) or just
// coincidentally similar operations (debatable)
```

Human Detection:
For semantic duplication, human insight is essential:
A key human detection signal: "I've seen this before." If you have a vague feeling that similar code exists elsewhere, investigate. That intuition often reveals knowledge duplication.
When fixing a bug, search the codebase for similar code. If you find related logic that might have the same bug, you've discovered duplication. This "shotgun bug" pattern—where one conceptual bug requires fixes in multiple places—is a reliable duplication indicator.
Once a DRY violation is identified and confirmed as essential (not coincidental), the next step is choosing the right abstraction to eliminate it. The choice depends on the nature of the duplication.
Strategy Menu:
```typescript
// STRATEGY SELECTION BASED ON DUPLICATION TYPE

// ────────────────────────────────────────────────────────────
// SCENARIO 1: Repeated Calculation (Extract Function)
// ────────────────────────────────────────────────────────────

// Before:
function processOrder(order: Order) {
  const subtotal = order.items.reduce((sum, i) => sum + i.price * i.qty, 0);
  const tax = subtotal * 0.08;
  const total = subtotal + tax;
  // ... use total
}

function displayCart(items: CartItem[]) {
  const subtotal = items.reduce((sum, i) => sum + i.price * i.qty, 0);
  const tax = subtotal * 0.08;
  const total = subtotal + tax;
  // ... display total
}

// After: Extract function
const TAX_RATE = 0.08; // And extract constant!

function calculateOrderTotal(items: { price: number; qty: number }[]): {
  subtotal: number;
  tax: number;
  total: number;
} {
  const subtotal = items.reduce((sum, i) => sum + i.price * i.qty, 0);
  const tax = subtotal * TAX_RATE;
  return { subtotal, tax, total: subtotal + tax };
}

// ────────────────────────────────────────────────────────────
// SCENARIO 2: Structural Pattern (Extract Class)
// ────────────────────────────────────────────────────────────

// Before: HTTP call pattern repeated across services
async function fetchUsers(): Promise<User[]> {
  try {
    const response = await fetch('/api/users');
    if (!response.ok) throw new Error('User fetch failed');
    return response.json();
  } catch (error) {
    logger.error('Fetch users error', error);
    throw error;
  }
}

async function fetchOrders(): Promise<Order[]> {
  try {
    const response = await fetch('/api/orders');
    if (!response.ok) throw new Error('Order fetch failed');
    return response.json();
  } catch (error) {
    logger.error('Fetch orders error', error);
    throw error;
  }
}

// After: Extract class with parameterization
class ApiClient {
  async fetch<T>(endpoint: string, entityName: string): Promise<T> {
    try {
      const response = await fetch(endpoint); // global fetch, not this.fetch
      if (!response.ok) throw new Error(`${entityName} fetch failed`);
      // await here so that JSON parsing failures are also logged below
      return await response.json();
    } catch (error) {
      logger.error(`Fetch ${entityName} error`, error);
      throw error;
    }
  }
}

// ────────────────────────────────────────────────────────────
// SCENARIO 3: Varying Logic (Strategy Pattern)
// ────────────────────────────────────────────────────────────

// Before: Discount calculation with variations
function applyDiscount(order: Order, type: string): number {
  // Shared: get base price
  const base = calculateSubtotal(order);

  // Varying: discount logic
  if (type === 'percentage') {
    return base * (1 - order.discountValue / 100);
  } else if (type === 'fixed') {
    return Math.max(0, base - order.discountValue);
  } else if (type === 'buyOneGetOne') {
    // Complex BOGO logic
    return calculateBOGOPrice(order);
  }
  return base;
}

// After: Strategy pattern
// Assumed minimal shape for the discount parameters
interface DiscountParams {
  value: number;
}

interface DiscountStrategy {
  apply(basePrice: number, params: DiscountParams): number;
}

class PercentageDiscount implements DiscountStrategy {
  apply(base: number, params: DiscountParams): number {
    return base * (1 - params.value / 100);
  }
}

class FixedDiscount implements DiscountStrategy {
  apply(base: number, params: DiscountParams): number {
    return Math.max(0, base - params.value);
  }
}

// New discounts added by creating new classes, not modifying code
```

Let's examine the most frequently encountered DRY violations in professional codebases and their standard remedies.
1. Validation Duplication:
Validation logic is one of the most commonly duplicated types of knowledge. The same business rules are often enforced in frontend, backend, API layer, and database.
```typescript
// ❌ VIOLATION: Validation rules in multiple places

// Frontend (React)
function validateEmail(email: string): boolean {
  const pattern = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
  return pattern.test(email) && email.length <= 254;
}

// Backend (Express)
function validateEmailRequest(email: string): boolean {
  const pattern = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
  return pattern.test(email) && email.length <= 254;
}

// Database (Check constraint duplicates the 254 limit)
// API Docs (Describes the same rules in prose)

// ✅ FIX: Single source of truth for validation schemas
import { z } from 'zod';

// Shared validation schema (could be JSON Schema, Zod, Yup, etc.)
const emailSchema = z.string()
  .email("Invalid email format")
  .max(254, "Email too long");

// Generated/derived everywhere else:
// - Frontend: import { emailSchema } from '@shared/schemas'
// - Backend: same schema, same library
// - API Docs: generated from schema (OpenAPI integration)
// - Database: could generate CHECK constraint from schema

// Now the rule exists in ONE place
```

2. Configuration Duplication:
The same configuration values are often scattered across files, environments, and services.
```typescript
// ❌ VIOLATION: Same API URL in multiple places

// service-a/config.ts
const API_URL = "https://api.example.com/v2";

// service-b/config.ts
const API_BASE = "https://api.example.com/v2";

// frontend/.env
//   REACT_APP_API=https://api.example.com/v2

// tests/fixtures.ts
const testApi = "https://api.example.com/v2";
```

```typescript
// ✅ FIX: Centralized configuration with derivation

// Option 1: Environment variables from single source
// .env.template (source of truth)
//   API_URL=https://api.example.com/v2

// All services read from environment
const apiUrl = process.env.API_URL;
```

```typescript
// Option 2: Configuration service
// config-service exposes typed configuration
const config = await ConfigService.get();
const apiUrl = config.api.baseUrl; // Single source
```

```hcl
# Option 3: Infrastructure as Code
# Terraform/Pulumi defines URL once, injects to services
resource "aws_ssm_parameter" "api_url" {
  name  = "/app/api-url"
  value = "https://api.example.com/v2"
}
# All services read from SSM, configuration is infrastructure
```

3. Data Transformation Duplication:
The same data transformation logic is often repeated in different parts of the system.
```typescript
// ❌ VIOLATION: Same transformation in multiple places

// In API handler
app.get('/users/:id', async (req, res) => {
  const user = await db.findUser(req.params.id);
  res.json({
    id: user.id,
    name: `${user.firstName} ${user.lastName}`,
    email: user.email,
    avatarUrl: user.avatarId
      ? `https://cdn.example.com/${user.avatarId}`
      : null,
  });
});

// In GraphQL resolver
const resolvers = {
  User: {
    fullName: (user) => `${user.firstName} ${user.lastName}`,
    avatarUrl: (user) => user.avatarId
      ? `https://cdn.example.com/${user.avatarId}`
      : null,
  }
};

// In event handler
function onUserCreated(user: User) {
  sendWelcomeEmail({
    name: `${user.firstName} ${user.lastName}`,
    // ...
  });
}

// ✅ FIX: Domain model with transformation methods

// The CDN base URL is itself knowledge that was duplicated above
const CDN_BASE_URL = "https://cdn.example.com";

class UserPresenter {
  constructor(private user: User) {}

  get fullName(): string {
    return `${this.user.firstName} ${this.user.lastName}`;
  }

  get avatarUrl(): string | null {
    return this.user.avatarId
      ? `${CDN_BASE_URL}/${this.user.avatarId}`
      : null;
  }

  toApiResponse(): UserApiResponse {
    return {
      id: this.user.id,
      name: this.fullName,
      email: this.user.email,
      avatarUrl: this.avatarUrl,
    };
  }
}

// Or value object approach
class UserDisplayInfo {
  // Constructor added so that fromUser() can actually build an instance
  private constructor(
    readonly fullName: string,
    readonly avatarUrl: string | null,
  ) {}

  static fromUser(user: User): UserDisplayInfo {
    return new UserDisplayInfo(
      `${user.firstName} ${user.lastName}`,
      user.avatarId ? `${CDN_BASE_URL}/${user.avatarId}` : null
    );
  }
}
```

The most challenging DRY violations span system boundaries: frontend and backend, different microservices, different programming languages, or code and documentation. These require specialized solutions.
Client-Server Schema Duplication:
When clients and servers need to agree on data shapes, the schema is often defined twice—once in each codebase. This leads to runtime errors when they drift.
```typescript
// ❌ VIOLATION: Schema defined in both client and server

// Server: TypeScript types
interface User {
  id: string;
  email: string;
  name: string;
  role: 'admin' | 'user' | 'guest';
}

// Client: Separate TypeScript types (duplicated!)
interface User {
  id: string;
  email: string;
  name: string;
  role: 'admin' | 'user' | 'guest';
}

// When server adds a field, client might not know
// When server changes 'role' values, client breaks at runtime

// ✅ FIX: Shared schema with code generation
```

```yaml
# Approach 1: OpenAPI / Swagger
# Define schema once in openapi.yaml
components:
  schemas:
    User:
      type: object
      properties:
        id: { type: string }
        email: { type: string }
        name: { type: string }
        role: { type: string, enum: [admin, user, guest] }

# Generate server types: npx openapi-typescript
# Generate client types: same command on client
```

```protobuf
// Approach 2: Protocol Buffers / gRPC
// Define schema once in .proto file
message User {
  string id = 1;
  string email = 2;
  string name = 3;
  Role role = 4;
}

enum Role {
  ADMIN = 0;
  USER = 1;
  GUEST = 2;
}
// Generate for all languages: protoc --ts_out=... --go_out=...
```

```graphql
# Approach 3: GraphQL
# Schema is the contract, shared by definition
type User {
  id: ID!
  email: String!
  name: String!
  role: Role!
}

enum Role { ADMIN USER GUEST }
# Codegen generates types for both: npx graphql-codegen
```

```typescript
// Approach 4: Shared Package (Monorepo)
// packages/shared-types/User.ts
export interface User { ... }

// Server imports: import { User } from '@acme/shared-types'
// Client imports: import { User } from '@acme/shared-types'
// One definition, multiple consumers
```

Code-Documentation Duplication:
When documentation describes what code does, it becomes stale as code changes. The knowledge is duplicated between the code and the docs.
For cross-system duplication, always ask: "What is the single source of truth?" Everything else should be generated, derived, or reference that source. If you must duplicate manually, create processes to detect and alert on drift.
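When a copy has to be maintained by hand, a drift check can run in CI and fail the build as soon as the copy disagrees with the source of truth. The following is a minimal sketch under that assumption; `detectDrift`, `ConfigSnapshot`, and the sample keys are hypothetical names chosen for illustration.

```typescript
// A flat key/value view of some configuration or rule set.
type ConfigSnapshot = Record<string, string>;

interface Drift {
  key: string;
  expected: string;           // value in the source of truth
  actual: string | undefined; // value in the copy; undefined if the key is missing
}

// Lists every key whose value in the copy differs from the source of truth.
function detectDrift(source: ConfigSnapshot, copy: ConfigSnapshot): Drift[] {
  return Object.keys(source)
    .filter((key) => copy[key] !== source[key])
    .map((key) => ({ key, expected: source[key], actual: copy[key] }));
}

const sourceOfTruth: ConfigSnapshot = {
  API_URL: "https://api.example.com/v2",
  EMAIL_MAX_LENGTH: "254",
};

// A manually maintained copy that has fallen behind.
const frontendCopy: ConfigSnapshot = {
  API_URL: "https://api.example.com/v1",
  EMAIL_MAX_LENGTH: "254",
};

const drift = detectDrift(sourceOfTruth, frontendCopy);
for (const d of drift) {
  console.error(`Drift in ${d.key}: expected ${d.expected}, found ${d.actual}`);
}
```

A CI job could exit with a non-zero status whenever the returned list is non-empty. This does not remove the duplication, but it turns silent drift into a visible failure.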
Refactoring to eliminate DRY violations requires care. Aggressive refactoring can introduce bugs or create the wrong abstraction. Here's a systematic approach:
Step 1: Confirm Essential Duplication
Before refactoring, verify that the duplication is essential (same knowledge), not coincidental (similar code, different concepts). Apply the tests from the previous page.
Step 2: Identify the Boundaries
Find all instances of the duplicated knowledge. Search for:
Step 3: Design the Abstraction
Design the abstraction that will hold the single source of truth. Ask:
```typescript
// REFACTORING PROCESS: Step by Step

// BEFORE: Duplication across order service and invoice service

// order-service/pricing.ts
function calculateOrderTotal(items: OrderItem[]): OrderPricing {
  const subtotal = items.reduce((sum, i) => sum + i.price * i.quantity, 0);
  const discount = subtotal >= 100 ? subtotal * 0.10 :
                   subtotal >= 50 ? subtotal * 0.05 : 0;
  const afterDiscount = subtotal - discount;
  const tax = afterDiscount * 0.08;
  const shipping = afterDiscount >= 100 ? 0 : 9.99;
  return { subtotal, discount, tax, shipping, total: afterDiscount + tax + shipping };
}

// invoice-service/invoice-generator.ts
function generateInvoice(items: InvoiceItem[]): Invoice {
  const subtotal = items.reduce((sum, i) => sum + i.price * i.quantity, 0);
  const discount = subtotal >= 100 ? subtotal * 0.10 :
                   subtotal >= 50 ? subtotal * 0.05 : 0;
  const afterDiscount = subtotal - discount;
  const tax = afterDiscount * 0.08;
  const shipping = afterDiscount >= 100 ? 0 : 9.99;
  // ... generate invoice with these values
}

// STEP 1: Confirm Essential Duplication
// ✓ Same discount tiers (business rule)
// ✓ Same tax rate (tax law)
// ✓ Same free shipping threshold (business policy)
// Verdict: Essential - this is ONE piece of knowledge

// STEP 2: Identify Boundaries
// Found in: order-service, invoice-service
// Potentially also in: cart-preview, admin-pricing-tool

// STEP 3: Design Abstraction
// Create a Pricing module that owns these rules

// AFTER: Shared pricing domain

// @acme/pricing/src/calculator.ts
export const PRICING_CONFIG = {
  // Tiers must stay ordered from highest threshold to lowest,
  // because calculatePricing() takes the first tier that matches.
  DISCOUNT_TIERS: [
    { threshold: 100, percent: 10 },
    { threshold: 50, percent: 5 },
  ],
  TAX_RATE: 0.08,
  FREE_SHIPPING_THRESHOLD: 100,
  STANDARD_SHIPPING: 9.99,
} as const;

export interface LineItem {
  price: number;
  quantity: number;
}

export interface PricingBreakdown {
  subtotal: number;
  discount: number;
  afterDiscount: number;
  tax: number;
  shipping: number;
  total: number;
}

export function calculatePricing(items: LineItem[]): PricingBreakdown {
  const subtotal = items.reduce((sum, i) => sum + i.price * i.quantity, 0);

  const discountTier = PRICING_CONFIG.DISCOUNT_TIERS
    .find(tier => subtotal >= tier.threshold);
  const discount = discountTier
    ? subtotal * (discountTier.percent / 100)
    : 0;

  const afterDiscount = subtotal - discount;
  const tax = afterDiscount * PRICING_CONFIG.TAX_RATE;
  const shipping = afterDiscount >= PRICING_CONFIG.FREE_SHIPPING_THRESHOLD
    ? 0
    : PRICING_CONFIG.STANDARD_SHIPPING;

  return {
    subtotal,
    discount,
    afterDiscount,
    tax,
    shipping,
    total: afterDiscount + tax + shipping,
  };
}

// STEP 4: Migrate Consumers
// order-service/pricing.ts
import { calculatePricing, PricingBreakdown } from '@acme/pricing';

function calculateOrderTotal(items: OrderItem[]): PricingBreakdown {
  return calculatePricing(items.map(i => ({
    price: i.price,
    quantity: i.quantity
  })));
}

// invoice-service/invoice-generator.ts
import { calculatePricing } from '@acme/pricing';

function generateInvoice(items: InvoiceItem[]): Invoice {
  const pricing = calculatePricing(items);
  // ... generate invoice with pricing
}
```

Step 4: Migrate Consumers
Replace duplicated code with calls to the new abstraction. Do this incrementally:
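One possible way to migrate incrementally is to run the old and new implementations side by side for a while, keep returning the old result, and record any disagreement. This is a sketch of that idea rather than a prescribed technique; `runInParallel`, `Mismatch`, and the two tax functions are hypothetical, and comparing results via `JSON.stringify` is a simplifying assumption.

```typescript
// Details of one input on which the two implementations disagreed.
interface Mismatch<I, O> {
  input: I;
  legacy: O;
  replacement: O;
}

// Wraps a legacy function so the replacement runs alongside it.
// Callers still receive the legacy result until the new code is trusted.
function runInParallel<I, O>(
  legacy: (input: I) => O,
  replacement: (input: I) => O,
  onMismatch: (mismatch: Mismatch<I, O>) => void,
  isEqual: (a: O, b: O) => boolean = (a, b) =>
    JSON.stringify(a) === JSON.stringify(b),
): (input: I) => O {
  return (input: I): O => {
    const legacyResult = legacy(input);
    const replacementResult = replacement(input);
    if (!isEqual(legacyResult, replacementResult)) {
      onMismatch({ input, legacy: legacyResult, replacement: replacementResult });
    }
    return legacyResult;
  };
}

// Hypothetical duplicated rule: 8% tax on an amount in cents.
// The old copy truncates, the shared version rounds to the nearest cent.
const legacyTaxCents = (cents: number): number => Math.floor((cents * 8) / 100);
const sharedTaxCents = (cents: number): number => Math.round((cents * 8) / 100);

const mismatches: Mismatch<number, number>[] = [];
const taxCents = runInParallel(legacyTaxCents, sharedTaxCents, (m) => {
  mismatches.push(m);
});

const taxOnTenDollars = taxCents(1000); // both implementations agree here
const taxOnTenTen = taxCents(1010);     // the implementations disagree here

console.log(taxOnTenDollars, taxOnTenTen, mismatches);
```

Each recorded mismatch is a decision point: either the copies were never the same knowledge, or one of them has a bug. Once no mismatches have been seen for a while, the wrapper can return the replacement result and the legacy function can be deleted.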
Step 5: Delete the Duplicates
Once all consumers use the new abstraction, delete the original duplicated code. The abstraction is now the single source of truth.
Step 6: Document Ownership
Make clear who "owns" the new abstraction. When the knowledge needs to change, where should that change be made?
A practical heuristic for DRY: wait for three occurrences before abstracting. This "Rule of Three" guards against premature abstraction.
Why Three?
Once — You write code to solve a problem. No duplication yet.
Twice — You write similar code for a similar problem. Is it the same knowledge? Maybe, but maybe not. You don't have enough data points. Copy the code (tolerate duplication temporarily).
Thrice — Similar code appears a third time. Now you have a pattern. Three data points give confidence that:
Exceptions to the Rule:
Like all heuristics, the Rule of Three has exceptions:
Obviously shared knowledge — If the knowledge is clearly the same (a business rule, a calculation, a validation), you might abstract immediately. The "three" is about uncertainty; if you're certain, proceed.
Cross-system duplication — When duplication spans boundaries (frontend/backend), abstract sooner. The cost of inconsistency is higher.
High-change areas — If the duplicated code changes frequently, the synchronization cost is high. Abstract sooner.
Security/compliance code — Critical code should have a single authoritative implementation. Don't wait for three vulnerabilities.
When you encounter similar code for the second time, resist the urge to abstract immediately unless one of the exceptions above applies. You're seeing a possible pattern, but you don't yet know if it's real. Mark the duplication with a TODO comment and move on. The third occurrence will confirm whether to abstract.
We've covered practical techniques for identifying and fixing DRY violations. Let's consolidate the key takeaways:
What's next:
Now that we know how to fix DRY violations, we must address the flip side: when DRY goes too far. Overzealous application of DRY creates its own problems—inappropriate coupling, complex indirection, and the dreaded "wrong abstraction." The next page explores these pitfalls and how to balance DRY with other engineering concerns.
You now have practical skills for detecting DRY violations and systematically fixing them. You can classify violations, select appropriate remediation strategies, and refactor without creating harmful abstractions. Next, we'll explore the dangers of DRY taken too far.