The defining characteristic of refactoring is behavior preservation. Every transformation must leave the system doing exactly what it did before—no new bugs, no missing functionality, no subtle changes in edge case handling.
This is not a philosophical commitment; it's a practical necessity. Without confidence in behavior preservation, refactoring becomes indistinguishable from 'changing stuff and hoping it works.' The anxiety of uncertainty stifles aggressive improvement, leaving teams afraid to touch legacy code.
This page provides the techniques and disciplines that convert hope into confidence. With these practices, you can refactor boldly—knowing that your verification systems will catch mistakes before they escape into production.
By the end of this page, you will understand how to establish a safety net before refactoring begins, write characterization tests for legacy code, apply incremental refactoring with continuous verification, use version control strategically, and adopt the disciplines that make behavior preservation reliable.
Tests are the primary mechanism for behavior verification during refactoring. Without tests, you're relying on manual verification—a process that's slow, incomplete, and unreliable. With good tests, every refactoring step can be verified in seconds.
The ideal refactoring scenario: a comprehensive, fast test suite already covers the code you plan to change, so every step can be verified immediately.
The reality: Often, especially with legacy code, tests are inadequate or absent. This doesn't mean refactoring is impossible—it means preparation work is required.
Measuring readiness:
Before refactoring critical code, assess your test coverage honestly:
The level of test coverage should scale with the risk of the refactoring. Renaming a local variable needs less coverage than restructuring a payment processor.
High code coverage doesn't guarantee behavior coverage. Tests that execute lines without meaningful assertions provide false confidence. Review tests qualitatively: do they actually verify the behaviors you're modifying?
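As an illustration, compare a test that merely executes code with one that actually verifies it (hypothetical `applyDiscount` function, shown as a sketch):

```typescript
// Hypothetical function under test: SAVE10 takes a flat 10 off the total.
function applyDiscount(order: { total: number }, code: string): { total: number } {
  return code === 'SAVE10' ? { total: order.total - 10 } : order;
}

// Weak: executes every line, so coverage looks complete, but the assertion
// would still pass if the discount math were broken.
test('applyDiscount runs', () => {
  expect(applyDiscount({ total: 100 }, 'SAVE10')).toBeDefined();
});

// Meaningful: pins down the behavior a refactoring must preserve.
test('SAVE10 reduces a 100 order to 90', () => {
  expect(applyDiscount({ total: 100 }, 'SAVE10').total).toBe(90);
});
```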
When tests don't exist, characterization tests bridge the gap. Introduced by Michael Feathers in Working Effectively with Legacy Code, characterization tests document what code actually does—not what it should do, but what it currently does.
The characterization test philosophy:
"A characterization test is a test that characterizes the actual behavior of a piece of code. There's no 'golden' or 'mocked' behavior. You literally run the code and write down what happens."
This is behavior preservation at its most literal: capture current behavior, then ensure refactoring doesn't change it.
```typescript
// LEGACY CODE: What does this actually do?
// We're not sure, documentation is missing, original author left years ago
function calculateDiscount(customer: any, order: any): number {
  let discount = 0;
  if (customer.type === 'premium') {
    discount = 0.15;
  } else if (customer.type === 'business') {
    discount = 0.10;
  }
  if (order.total > 500) {
    discount += 0.05;
  }
  if (customer.memberSince < new Date('2020-01-01')) {
    discount += 0.02;
  }
  return Math.min(discount, 0.25);
}

// STEP 1: Write tests that capture current behavior
// Run the code with various inputs and record actual outputs

describe('calculateDiscount characterization tests', () => {
  // Capture premium customer behavior
  test('premium customer with small order', () => {
    const customer = { type: 'premium', memberSince: new Date('2021-01-01') };
    const order = { total: 100 };
    // First run: let the test fail to see actual value
    // expect(calculateDiscount(customer, order)).toBe(???);
    // After observing output, document it:
    expect(calculateDiscount(customer, order)).toBe(0.15);
  });

  test('premium customer with large order', () => {
    const customer = { type: 'premium', memberSince: new Date('2021-01-01') };
    const order = { total: 600 };
    expect(calculateDiscount(customer, order)).toBe(0.20); // 0.15 + 0.05
  });

  test('premium long-time customer with large order', () => {
    const customer = { type: 'premium', memberSince: new Date('2018-01-01') };
    const order = { total: 600 };
    expect(calculateDiscount(customer, order)).toBe(0.22); // 0.15 + 0.05 + 0.02
  });

  test('discount caps at 25%', () => {
    const customer = { type: 'premium', memberSince: new Date('2015-01-01') };
    const order = { total: 1000 };
    // Would be 0.15 + 0.05 + 0.02 = 0.22... actually no cap hit
    // Let's try business + premium... wait, it checks type directly
    // This characterization reveals that 25% cap is hard to reach
    expect(calculateDiscount(customer, order)).toBe(0.22);
  });

  test('business customer', () => {
    const customer = { type: 'business', memberSince: new Date('2021-01-01') };
    const order = { total: 100 };
    expect(calculateDiscount(customer, order)).toBe(0.10);
  });

  test('regular customer gets no base discount', () => {
    const customer = { type: 'regular', memberSince: new Date('2021-01-01') };
    const order = { total: 100 };
    expect(calculateDiscount(customer, order)).toBe(0);
  });

  test('unknown customer type treated as regular', () => {
    const customer = { type: 'unknown', memberSince: new Date('2021-01-01') };
    const order = { total: 100 };
    expect(calculateDiscount(customer, order)).toBe(0);
  });

  // Edge cases discovered during characterization
  test('null customer type is treated as regular', () => {
    const customer = { type: null, memberSince: new Date('2021-01-01') };
    const order = { total: 100 };
    expect(calculateDiscount(customer, order)).toBe(0);
  });

  test('boundary: exactly 500 does not trigger large order bonus', () => {
    const customer = { type: 'regular', memberSince: new Date('2021-01-01') };
    const order = { total: 500 };
    expect(calculateDiscount(customer, order)).toBe(0);
  });

  test('boundary: 500.01 triggers large order bonus', () => {
    const customer = { type: 'regular', memberSince: new Date('2021-01-01') };
    const order = { total: 500.01 };
    expect(calculateDiscount(customer, order)).toBe(0.05);
  });
});

// NOW: These tests protect behavior during refactoring
// Any change that alters behavior will cause a test failure
```

The characterization test workflow: run the code with a representative input, observe the actual output, write an assertion that documents it, then repeat for edge cases and boundaries until the behaviors you care about are covered.
Important: Characterization tests document actual behavior, including bugs. If you discover the code handles null incorrectly, the characterization test asserts the incorrect behavior. Fixing the bug is a separate (behavior-changing) step, done after refactoring is complete.
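For example, a test might deliberately pin down a known quirk (hypothetical `parseQuantity` helper, shown as a sketch):

```typescript
// Characterization: the legacy parser yields NaN for an empty string rather
// than 0 or an error. We record that as-is; changing it is a separate,
// behavior-changing commit made after the refactoring is done.
function parseQuantity(input: string): number {
  return parseInt(input, 10); // parseInt('') is NaN
}

test('BUG (documented, not endorsed): empty string yields NaN', () => {
  expect(parseQuantity('')).toBeNaN();
});
```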
Writing characterization tests is one of the best ways to understand legacy code. Each test forces you to think about inputs, outputs, and edge cases. The resulting test suite serves as executable documentation of actual behavior.
The core discipline of safe refactoring is taking small steps with continuous verification. Each step should be tiny enough that if something goes wrong, the cause is obvious and the fix is simple.
The refactoring rhythm: run the tests, make one small change, run the tests again, commit, and repeat.
If tests fail after a step, you know exactly which change caused the failure, and the fix is simple: correct the mistake or revert that single change.
| Refactoring | Safe Step | Risky Big Step |
|---|---|---|
| Extract Method | Create method, copy code, call method, then delete original code (3 commits) | Create method, delete original code in one step |
| Rename | Use IDE rename refactoring (atomic) | Manual find-and-replace across files |
| Move Method | Copy to destination, add delegation from source, update callers incrementally, remove delegation | Cut from source, paste to destination, fix all callers at once |
| Extract Interface | Create interface, implement in class, change one caller at a time, repeat | Create interface, change all callers simultaneously |
| Replace Conditional with Polymorphism | Create base class/interface, one subclass at a time, move one case at a time | Create full hierarchy and switch statement replacement together |
```typescript
// GOAL: Move calculateShipping from OrderService to Order

// STEP 1: Current state (tests pass ✓)
class OrderService {
  calculateShipping(order: Order): number {
    if (order.total > 100) return 0;
    return order.items.length * 2.5;
  }
}

// STEP 2: Copy method to target (tests still pass ✓)
class Order {
  items: OrderItem[];
  total: number;

  // New method, copy of original
  calculateShipping(): number {
    if (this.total > 100) return 0;
    return this.items.length * 2.5;
  }
}

// STEP 3: Delegate from original to new (tests still pass ✓)
class OrderService {
  calculateShipping(order: Order): number {
    return order.calculateShipping(); // Delegate
  }
}

// STEP 4: Update one caller at a time (tests pass after each ✓)
// Before:
const shipping = orderService.calculateShipping(order);
// After:
const shipping = order.calculateShipping();

// STEP 5: After all callers updated, remove original method (tests pass ✓)
class OrderService {
  // calculateShipping removed - no longer needed
}

// Each step is verified; mistakes are immediately visible
```

The delegation step (Step 3) allows both old and new call paths to work simultaneously. This is especially valuable when callers are in different codebases or deployed systems—you can migrate callers incrementally over time.
Version control is your second line of defense after tests. Proper version control practices make refactoring reversible and reviewable.
Commit practices for refactoring:
```bash
# Start a refactoring branch
git checkout -b refactor/extract-payment-processing

# After each successful step, commit
git add .
git commit -m "Extract PaymentValidator from OrderProcessor"

# Run tests before moving on
npm test

# Next step
git commit -m "Move payment validation to PaymentValidator.validate()"

# If something goes wrong, easy revert options:
git revert HEAD          # Revert last commit, keep history
git reset --soft HEAD~1  # Undo last commit, keep changes staged
git reset --hard HEAD~1  # Undo last commit, discard changes

# Before merging, clean up history if needed
git rebase -i main

# Final merge (or PR)
git checkout main
git merge refactor/extract-payment-processing
```

The revert recovery pattern:
When refactoring goes wrong and tests fail unexpectedly, resist the urge to 'just fix it forward.' Instead, revert to the last known-good commit, work out what went wrong, and retry the step with that understanding.
This discipline prevents cascading errors where fixing one problem introduces another, which introduces another, until you're lost in an unfamiliar codebase of your own making.
When a refactoring step breaks tests, there's a strong temptation to make a quick fix and continue. This often leads to a chain of quick fixes that leave you far from a working state. Revert first, think second, try again with clear understanding.
Modern IDEs provide automated refactoring tools that execute transformations reliably. These tools understand the code's structure and update all references correctly—something manual refactoring often fails to do.
Why prefer IDE refactoring:
| Operation | Manual Risk | IDE Benefit | Typical Shortcut |
|---|---|---|---|
| Rename Symbol | Missing references, case sensitivity issues | Updates all references across project | F2 / Shift+F6 |
| Extract Method | Incorrect parameter handling, missed variables | Correctly identifies parameters and return values | Ctrl+Alt+M |
| Extract Variable | Choosing wrong extraction scope | Respects scope and suggests replacements | Ctrl+Alt+V |
| Inline Variable/Method | Missing usages, complex expressions | Replaces all usages correctly | Ctrl+Alt+N |
| Change Signature | Missing callers, incorrect parameter passing | Updates all call sites with default values | Ctrl+F6 |
| Move to File/Class | Forgotten imports, incomplete moves | Updates imports and references | F6 |
| Extract Interface | Missing methods, incomplete implementation | Generates interface with selected methods | Ctrl+Shift+I |
When manual refactoring is necessary:
Some refactorings aren't well supported by IDE tools, particularly those involving code reached through dynamic string-based references, reflection, or other mechanisms the IDE can't trace.
For these cases, combine IDE assistance where possible with careful manual changes and comprehensive testing.
IDE refactoring tools are good but not perfect. Always run tests after IDE-assisted refactoring. Review the changes in version control diff to ensure the IDE did what you expected. Some edge cases (dynamic references, reflection) may be missed.
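For example, an IDE rename of a property updates typed references but can silently miss dynamic, string-based access (illustrative snippet, not tied to any particular IDE):

```typescript
interface UserRecord {
  fullName: string;
  email: string;
}

// Renaming `fullName` with the IDE updates the interface and the typed
// access below, but not the string literal or the raw JSON payload.
function getField(record: UserRecord, field: string): unknown {
  return Reflect.get(record, field); // reflection: invisible to rename tooling
}

const typed = ({ fullName: 'Ada', email: 'ada@example.com' } as UserRecord).fullName; // updated by rename
const dynamic = getField({ fullName: 'Ada', email: 'ada@example.com' }, 'fullName');  // missed by rename
const payload = JSON.parse('{"fullName": "Ada"}'); // external data never renames itself
```

A project-wide text search for the old name, plus the test suite, is the usual backstop for these cases.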
For larger refactorings that can't be completed in a single commit cycle, Branch by Abstraction provides a safe migration path. Instead of modifying existing code directly, you build the new implementation alongside the old, then gradually migrate.
The pattern:
```typescript
// SCENARIO: Replace legacy payment processor with new one
// Constraint: Can't break production during migration

// STEP 1: Create abstraction covering both implementations
interface PaymentProcessor {
  processPayment(amount: number, cardToken: string): Promise<PaymentResult>;
  refund(transactionId: string, amount: number): Promise<RefundResult>;
}

// STEP 2: Wrap existing implementation
class LegacyPaymentProcessor implements PaymentProcessor {
  constructor(private readonly legacyService: LegacyPaymentService) {}

  async processPayment(amount: number, cardToken: string): Promise<PaymentResult> {
    // Adapt legacy method signature to new interface
    const legacyResult = await this.legacyService.charge(cardToken, amount * 100); // cents
    return {
      success: legacyResult.status === 'OK',
      transactionId: legacyResult.txn_id,
      message: legacyResult.message
    };
  }

  async refund(transactionId: string, amount: number): Promise<RefundResult> {
    // Similar adaptation
  }
}

// STEP 3: Build new implementation with same interface
class StripePaymentProcessor implements PaymentProcessor {
  constructor(private stripe: Stripe) {}

  async processPayment(amount: number, cardToken: string): Promise<PaymentResult> {
    const intent = await this.stripe.paymentIntents.create({
      amount: Math.round(amount * 100),
      currency: 'usd',
      payment_method: cardToken,
      confirm: true
    });
    return {
      success: intent.status === 'succeeded',
      transactionId: intent.id,
      message: intent.status
    };
  }

  async refund(transactionId: string, amount: number): Promise<RefundResult> {
    // Stripe implementation
  }
}

// STEP 4: Feature flag controls which implementation is used
class PaymentProcessorFactory {
  static create(config: Config): PaymentProcessor {
    if (config.featureFlags.useStripeProcessor) {
      return new StripePaymentProcessor(new Stripe(config.stripeKey));
    }
    return new LegacyPaymentProcessor(new LegacyPaymentService(config.legacyUrl));
  }
}

// STEP 5: Gradual rollout
// - 0% traffic to new implementation (baseline)
// - 1% traffic to new implementation (testing)
// - 10% traffic (monitoring)
// - 50% traffic (confidence building)
// - 100% traffic (migration complete)

// STEP 6: Remove old implementation once new is fully deployed
// LegacyPaymentProcessor can be safely deleted
// Factory simplified to always return Stripe
```

Benefits of Branch by Abstraction: the system remains releasable at every step, the feature flag makes rollback immediate, and risk is contained by shifting traffic gradually.
This pattern is essential for large-scale refactorings in production systems where downtime or risk must be minimized.
Advanced applications of Branch by Abstraction include 'dark launching'—running both implementations simultaneously and comparing outputs. Discrepancies indicate behavior differences to investigate before cutover.
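A minimal sketch of that comparison wrapper, reusing the PaymentProcessor interface from the example above. The logger is a placeholder, and for side-effecting calls like real charges the candidate would need to run against a sandbox/test mode or be limited to read-only paths:

```typescript
// The legacy implementation stays authoritative; the candidate runs in the
// shadow and any disagreement is logged for investigation before cutover.
class ComparingPaymentProcessor implements PaymentProcessor {
  constructor(
    private readonly current: PaymentProcessor,    // legacy, source of truth
    private readonly candidate: PaymentProcessor,  // new implementation (sandbox mode)
    private readonly logDiscrepancy: (details: Record<string, unknown>) => void
  ) {}

  async processPayment(amount: number, cardToken: string): Promise<PaymentResult> {
    const currentResult = await this.current.processPayment(amount, cardToken);
    try {
      const candidateResult = await this.candidate.processPayment(amount, cardToken);
      if (candidateResult.success !== currentResult.success) {
        this.logDiscrepancy({ amount, currentResult, candidateResult });
      }
    } catch (error) {
      // The candidate must never affect the caller; record and move on.
      this.logDiscrepancy({ amount, error: String(error) });
    }
    return currentResult; // callers only ever see the legacy behavior
  }

  async refund(transactionId: string, amount: number): Promise<RefundResult> {
    // Same pattern as processPayment; comparison omitted for brevity.
    return this.current.refund(transactionId, amount);
  }
}
```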
Beyond automated tests, certain semantic properties require explicit attention during refactoring. Tests verify specific inputs and outputs, but some behaviors span across many scenarios.
```typescript
// EXAMPLE 1: Accidentally changing lazy to eager
// BEFORE: Lazy evaluation
function getUsers(): Iterable<User> {
  return {
    *[Symbol.iterator]() {
      for (const record of database.scan('users')) {
        yield transform(record); // Only processes records as consumed
      }
    }
  };
}

// AFTER (WRONG): Accidentally loads all users into memory
function getUsers(): User[] {
  return database.scan('users').map(transform); // Eager - loads everything
}

// EXAMPLE 2: Accidentally changing error behavior
// BEFORE: Throws on first invalid item
function processItems(items: Item[]): void {
  for (const item of items) {
    if (!item.isValid()) {
      throw new Error(`Invalid item: ${item.id}`);
    }
    process(item);
  }
}

// AFTER (WRONG): Collects all errors - behavior change
function processItems(items: Item[]): void {
  const errors = items.filter(item => !item.isValid());
  if (errors.length > 0) {
    throw new Error(`Invalid items: ${errors.map(e => e.id).join(', ')}`);
  }
  items.forEach(process);
}

// EXAMPLE 3: Accidentally changing order sensitivity
// BEFORE: Processes in insertion order
function getConfigValue(key: string): string | undefined {
  for (const source of this.sources) { // First match wins
    const value = source.get(key);
    if (value !== undefined) return value;
  }
  return undefined;
}

// AFTER (WRONG): Map iteration order might differ
function getConfigValue(key: string): string | undefined {
  const allValues = new Map<ConfigSource, string>();
  for (const source of this.sources) {
    const value = source.get(key);
    if (value !== undefined) allValues.set(source, value);
  }
  return allValues.values().next().value; // Order not guaranteed!
}
```

Many semantic changes (lazy vs eager, order sensitivity, thread safety) aren't caught by typical unit tests. Review refactored code specifically for these properties. When in doubt, add characterization tests that explicitly verify the semantic property.
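For instance, the lazy-vs-eager property from Example 1 can be captured with a test that counts how much work happens before consumption. This sketch uses a stand-in lazyMap helper rather than the database-backed getUsers above:

```typescript
// Laziness is a semantic property: the test asserts how much work is done,
// not what the output values are.
function* lazyMap<T, U>(items: Iterable<T>, fn: (item: T) => U): IterableIterator<U> {
  for (const item of items) {
    yield fn(item);
  }
}

test('elements are transformed only as they are consumed', () => {
  let calls = 0;
  const mapped = lazyMap([1, 2, 3, 4, 5], (n) => { calls++; return n * 2; });

  const iterator = mapped[Symbol.iterator]();
  iterator.next(); // consume one element
  iterator.next(); // consume a second

  expect(calls).toBe(2); // an eager rewrite would transform all 5 and fail here
});
```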
Maintaining behavior during refactoring requires a combination of safety nets, verification disciplines, and careful attention to semantic properties. Here's the complete framework:
You now understand the disciplines that make refactoring safe and reliable. With these practices, you can improve code structure with confidence that behavior remains unchanged. Next, we'll explore continuous improvement—how to make refactoring a sustainable, ongoing practice rather than an occasional heroic effort.