Every line of production code carries an implicit contract with its users: "I will behave in this specific way." These contracts extend far beyond documented APIs—they include error handling quirks, timing behaviors, edge case responses, and countless other details that users (whether human or other code) have come to depend on.
When refactoring from inheritance to composition, your primary obligation is preserving these contracts. A refactoring that changes behavior—even fixing apparent "bugs"—can cause production incidents. Users who depended on the old behavior now find their assumptions violated.
Refactoring is structure change, not behavior change. If you discover bugs or suboptimal behavior during refactoring, document them but DO NOT fix them as part of the refactoring. Fix them in separate, targeted changes after the refactoring is complete.
By the end of this page, you will understand techniques for ensuring behavioral equivalence during refactoring: contract documentation, characterization testing, shadow execution, behavioral diff detection, and strategies for handling discovered edge cases.
Before we can preserve behavior, we must understand what behavior actually means in the context of software systems. Behavior is multi-dimensional:
These are the documented, intentional behaviors:
These are undocumented behaviors that users nonetheless depend on:
These arise from the interaction of multiple components:
```java
// Examples of behavioral contracts (explicit and implicit)
class OrderProcessor {

    // EXPLICIT: Documented to throw on invalid order
    /**
     * Processes the order and returns confirmation number.
     * @throws InvalidOrderException if order validation fails
     */
    public String processOrder(Order order) throws InvalidOrderException {
        // Implementation...
    }

    // IMPLICIT: Users depend on specific exception type
    // Even though not documented, changing to a different exception
    // would break callers who catch InvalidOrderException specifically

    // IMPLICIT: Users may depend on empty string vs null
    public String getOrderNotes(Order order) {
        if (order.getNotes() == null) {
            return ""; // Returning null would break callers doing .length()
        }
        return order.getNotes();
    }

    // EMERGENT: The specific ordering of these calls matters
    // because downstream systems expect this sequence
    public void fulfillOrder(Order order) {
        inventoryService.reserve(order);     // Must happen first
        paymentService.charge(order);        // Must happen second
        shippingService.schedule(order);     // Must happen third
        notificationService.notify(order);   // Must happen last
        // Reordering would violate contracts even if each call succeeds
    }
}
```

"With a sufficient number of users of an API, it does not matter what you promise in the contract: all observable behaviors of your system will be depended on by somebody." (Hyrum Wright, Google.) This is why preserving ALL observable behavior, not just documented behavior, is critical.
Characterization tests (also called Golden Master tests or Approval tests) capture the current behavior of a system without asserting that the behavior is correct—only that it is consistent.
Traditional unit tests assert expected behavior:
assertEquals(expected, actual);
Characterization tests assert consistent behavior:
assertEquals(previouslyRecordedOutput, actual);
The previously recorded output becomes the "golden master"—any deviation triggers a test failure, requiring explicit acknowledgment of the behavior change.
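The record-then-compare mechanism can be sketched with nothing but the standard library. This is a minimal illustration, not a substitute for an approval-testing library: `GoldenMaster`, `Outcome`, and `legacyFormat` are hypothetical names invented for this example, and the "legacy" formatter is a stand-in for whatever code is being characterized.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: records output on the first run and compares
// against that recording on every later run.
final class GoldenMaster {
    enum Outcome { RECORDED, MATCHED, CHANGED }

    static Outcome verify(Path dir, String name, String actual) throws IOException {
        Path golden = dir.resolve(name + ".txt");
        if (Files.notExists(golden)) {
            // First run: whatever the code does today becomes the reference.
            Files.createDirectories(dir);
            Files.writeString(golden, actual, StandardCharsets.UTF_8);
            return Outcome.RECORDED;
        }
        String recorded = Files.readString(golden, StandardCharsets.UTF_8);
        return recorded.equals(actual) ? Outcome.MATCHED : Outcome.CHANGED;
    }

    // Stand-in for legacy code; note the quirk of printing "null" literally.
    static String legacyFormat(String subject) {
        return "Subject: " + (subject == null ? "null" : subject.trim());
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("golden");
        System.out.println(verify(dir, "null-subject", legacyFormat(null))); // RECORDED
        System.out.println(verify(dir, "null-subject", legacyFormat(null))); // MATCHED
        // A "fix" that drops the literal "null" is reported as a behavior change.
        System.out.println(verify(dir, "null-subject", "Subject: "));        // CHANGED
    }
}
```

The helper never judges whether `Subject: null` is right. It only reports that the output differs from what was recorded, which is the property a refactoring needs.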
To build characterization tests for refactoring:
```java
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.mockito.ArgumentMatchers.any;
import static org.mockito.Mockito.mockStatic;

import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

import lombok.Data;
import org.junit.jupiter.api.Test;
import org.mockito.MockedStatic;

// Characterization test framework for refactoring validation
public class NotificationBehaviorCharacterization {

    private static final Path GOLDEN_MASTER_DIR =
        Paths.get("src/test/resources/golden-masters/notifications");

    @Test
    void emailNotification_standardCase() throws Exception {
        EmailNotification email = createEmailNotification(
            "user@example.com",
            "Test Subject",
            "This is the body content"
        );

        // Capture all observable behaviors in one pass so send() runs only once
        CharacterizationResult result = captureAllBehaviors(email);

        // Compare against golden master
        assertMatchesGoldenMaster("email-standard", result);
    }

    @Test
    void emailNotification_nullSubject() throws Exception {
        EmailNotification email = createEmailNotification(
            "user@example.com",
            null, // Edge case: null subject
            "Body content"
        );

        CharacterizationResult result = captureAllBehaviors(email);
        assertMatchesGoldenMaster("email-null-subject", result);
    }

    @Test
    void emailNotification_emptyBody() throws Exception {
        EmailNotification email = createEmailNotification(
            "user@example.com",
            "Subject",
            "" // Edge case: empty body
        );

        CharacterizationResult result = captureAllBehaviors(email);
        assertMatchesGoldenMaster("email-empty-body", result);
    }

    @Test
    void emailNotification_invalidRecipient() throws Exception {
        EmailNotification email = createEmailNotification(
            "not-an-email", // Edge case: invalid recipient
            "Subject",
            "Body"
        );

        CharacterizationResult result = captureAllBehaviors(email);
        // This might capture an exception, which is still valid behavior
        assertMatchesGoldenMaster("email-invalid-recipient", result);
    }

    // Helper to capture all observable behaviors
    private CharacterizationResult captureAllBehaviors(EmailNotification email) {
        CharacterizationResult result = new CharacterizationResult();

        // Capture formatting behavior
        try {
            result.formattedContent = email.formatContent();
        } catch (Exception e) {
            result.formatException = e.getClass().getName() + ": " + e.getMessage();
        }

        // Capture validation behavior
        try {
            result.validationPassed = email.validate();
            result.validationMessages = email.getValidationMessages();
        } catch (Exception e) {
            result.validationException = e.getClass().getName() + ": " + e.getMessage();
        }

        // Capture send behavior (mocked, so no real email leaves the test)
        try (MockedStatic<SmtpClient> mockedSmtp = mockStatic(SmtpClient.class)) {
            List<Object[]> calls = new ArrayList<>();
            mockedSmtp.when(() -> SmtpClient.send(any(), any(), any()))
                .thenAnswer(inv -> {
                    calls.add(inv.getArguments());
                    return "MSG-ID-123";
                });

            email.send();
            result.smtpCalls = calls;
        } catch (Exception e) {
            result.sendException = e.getClass().getName() + ": " + e.getMessage();
        }

        return result;
    }

    private void assertMatchesGoldenMaster(String testName, CharacterizationResult result)
            throws Exception {
        Path goldenPath = GOLDEN_MASTER_DIR.resolve(testName + ".json");
        String resultJson = toJson(result);

        if (Files.exists(goldenPath)) {
            // Compare with existing golden master
            String goldenJson = Files.readString(goldenPath);
            assertEquals(goldenJson, resultJson,
                "Behavior changed! If this is intentional, update golden master.");
        } else {
            // First run: create golden master (and its directory, if missing)
            Files.createDirectories(goldenPath.getParent());
            Files.writeString(goldenPath, resultJson);
            System.out.println("Created new golden master: " + goldenPath);
        }
    }

    @Data
    static class CharacterizationResult {
        String formattedContent;
        String formatException;
        boolean validationPassed;
        List<String> validationMessages;
        String validationException;
        List<Object[]> smtpCalls;
        String sendException;
    }
}
```

Use production logs and monitoring data to identify real-world input patterns. The most valuable characterization tests use inputs that actually occur in production, not just theoretical edge cases.
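One way to put recorded production inputs to work is to replay them through both implementations and list every input on which the observable outcome differs. The sketch below assumes the inputs have already been extracted from logs into a list. `ReplayComparison`, `observe`, and `mismatches` are hypothetical names, and the two lambdas in `main` are toy stand-ins for the legacy and refactored code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

final class ReplayComparison {

    // Flattens an outcome (return value or exception type) into one string,
    // so that "throws" is compared just like any other behavior.
    static String observe(Function<String, String> fn, String input) {
        try {
            return "OK:" + fn.apply(input);
        } catch (RuntimeException e) {
            return "THROWS:" + e.getClass().getName();
        }
    }

    // Returns one line per input on which the two implementations disagree.
    static List<String> mismatches(List<String> inputs,
                                   Function<String, String> legacy,
                                   Function<String, String> candidate) {
        List<String> report = new ArrayList<>();
        for (String input : inputs) {
            String legacyOutcome = observe(legacy, input);
            String candidateOutcome = observe(candidate, input);
            if (!legacyOutcome.equals(candidateOutcome)) {
                report.add(input + " -> legacy=" + legacyOutcome
                    + ", candidate=" + candidateOutcome);
            }
        }
        return report;
    }

    public static void main(String[] args) {
        // Illustrative inputs; real ones would come from production logs.
        List<String> recorded = Arrays.asList("  hi  ", "", null);
        Function<String, String> legacy = s -> s.trim();                       // NPE on null
        Function<String, String> candidate = s -> s == null ? "" : s.trim();   // "fixed"
        mismatches(recorded, legacy, candidate).forEach(System.out::println);
    }
}
```

Here the candidate quietly "fixes" the null case, and the replay reports it as a mismatch. That is exactly the kind of well-intentioned behavior change a refactoring should not include.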
Shadow execution (also called dark launching or parallel running) is a production verification technique where both old and new implementations run simultaneously, but only the old implementation's results are used. This catches discrepancies on real production traffic without impacting users.
```java
import java.util.Objects;
import java.util.concurrent.CompletableFuture;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Shadow execution wrapper for safe production verification
public class ShadowExecutionNotificationService implements NotificationService {

    private final NotificationService legacyService; // Old inheritance-based
    private final NotificationService newService;    // New composition-based
    private final ShadowMetrics metrics;
    private final Logger logger;

    // Configuration
    private final double shadowTrafficPercentage; // Fraction of traffic, e.g., 0.1 = 10%
    private final boolean logDiscrepancies;

    public ShadowExecutionNotificationService(
            NotificationService legacyService,
            NotificationService newService,
            ShadowMetrics metrics,
            double shadowTrafficPercentage) {
        this.legacyService = legacyService;
        this.newService = newService;
        this.metrics = metrics;
        this.shadowTrafficPercentage = shadowTrafficPercentage;
        this.logDiscrepancies = true;
        this.logger = LoggerFactory.getLogger(getClass());
    }

    @Override
    public SendResult send(Notification notification) {
        // Always execute legacy path (this is production)
        SendResult legacyResult = null;
        RuntimeException legacyException = null;
        long legacyStart = System.nanoTime();

        try {
            legacyResult = legacyService.send(notification);
        } catch (RuntimeException e) {
            legacyException = e;
        }
        long legacyDuration = System.nanoTime() - legacyStart;

        // Conditionally execute shadow path
        if (shouldRunShadow()) {
            runShadowAsync(notification, legacyResult, legacyException, legacyDuration);
        }

        // Return legacy result (production behavior unchanged).
        // Rethrow the original exception as-is: wrapping it in a new
        // RuntimeException would change the type that callers observe.
        if (legacyException != null) {
            throw legacyException;
        }
        return legacyResult;
    }

    private boolean shouldRunShadow() {
        return Math.random() < shadowTrafficPercentage;
    }

    private void runShadowAsync(
            Notification notification,
            SendResult legacyResult,
            Exception legacyException,
            long legacyDuration) {

        CompletableFuture.runAsync(() -> {
            SendResult newResult = null;
            Exception newException = null;
            long newStart = System.nanoTime();

            try {
                newResult = newService.send(notification);
            } catch (Exception e) {
                newException = e;
            }
            long newDuration = System.nanoTime() - newStart;

            // Compare and record
            compareBehaviors(
                notification,
                legacyResult, legacyException, legacyDuration,
                newResult, newException, newDuration
            );
        });
    }

    private void compareBehaviors(
            Notification notification,
            SendResult legacyResult, Exception legacyException, long legacyDuration,
            SendResult newResult, Exception newException, long newDuration) {

        boolean outcomeMatches = compareOutcomes(
            legacyResult, legacyException,
            newResult, newException
        );

        // Record metrics
        metrics.recordShadowExecution(
            notification.getType(),
            outcomeMatches,
            legacyDuration,
            newDuration
        );

        if (!outcomeMatches && logDiscrepancies) {
            logger.warn("Shadow execution discrepancy detected. " +
                "Notification: {}, Legacy: {}/{}, New: {}/{}",
                notification.getId(),
                legacyResult, legacyException,
                newResult, newException);

            // Store for later analysis
            metrics.recordDiscrepancy(new DiscrepancyRecord(
                notification,
                legacyResult, legacyException,
                newResult, newException
            ));
        }
    }

    private boolean compareOutcomes(
            SendResult legacy, Exception legacyEx,
            SendResult newR, Exception newEx) {

        // Both exceptions
        if (legacyEx != null && newEx != null) {
            return compareExceptions(legacyEx, newEx);
        }

        // One exception, one success = mismatch
        if ((legacyEx != null) != (newEx != null)) {
            return false;
        }

        // Both success: compare results
        return compareResults(legacy, newR);
    }

    private boolean compareExceptions(Exception legacy, Exception newEx) {
        // Same exception type?
        return legacy.getClass().equals(newEx.getClass());
    }

    private boolean compareResults(SendResult legacy, SendResult newR) {
        // A null result on both sides matches; on only one side it is a mismatch
        if (legacy == null || newR == null) {
            return legacy == newR;
        }
        // Compare relevant fields (not things like timestamps)
        return legacy.isSuccess() == newR.isSuccess()
            && Objects.equals(legacy.getRecipient(), newR.getRecipient());
    }
}
```

Monitor shadow execution to build confidence before switching:
| Metric | Description | Target Before Cutover |
|---|---|---|
| Match Rate | Percentage of requests with identical outcomes | 99.9% |
| Latency Comparison | New vs. legacy execution time | New ≤ Legacy + 10% |
| Exception Parity | Same exceptions thrown for same inputs | 100% |
| Discrepancy Categories | Classification of mismatches by type | All explained |
| Shadow Volume | Percentage of traffic shadow-executed | Gradually increase to 100% |
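The numeric targets in the table can be turned into an explicit gate that a dashboard or release script evaluates. The sketch below is one possible reading, under stated assumptions: the thresholds are copied from the table, while `CutoverGate`, its parameter names, and the choice of latency statistic (the table does not say whether mean, median, or a percentile is meant) are assumptions. The "all discrepancies explained" and shadow-volume rows still need human review and are not modeled.

```java
final class CutoverGate {

    static final double MIN_MATCH_RATE = 0.999;     // "Match Rate" target: 99.9%
    static final double MAX_LATENCY_RATIO = 1.10;   // "New <= Legacy + 10%"

    // Returns true only when every numeric target from the table is met.
    // legacyLatencyMillis/newLatencyMillis are whichever statistic the team
    // has agreed on (an assumption here; the table leaves it open).
    static boolean readyForCutover(long shadowedRequests,
                                   long matchedRequests,
                                   long exceptionMismatches,
                                   double legacyLatencyMillis,
                                   double newLatencyMillis) {
        if (shadowedRequests == 0) {
            return false; // no evidence yet, so no cutover
        }
        double matchRate = (double) matchedRequests / shadowedRequests;
        boolean latencyOk = newLatencyMillis <= legacyLatencyMillis * MAX_LATENCY_RATIO;
        boolean exceptionParity = exceptionMismatches == 0; // target: 100%
        return matchRate >= MIN_MATCH_RATE && exceptionParity && latencyOk;
    }

    public static void main(String[] args) {
        // 99.95% match, no exception mismatches, 21 ms vs 20 ms: passes.
        System.out.println(readyForCutover(100_000, 99_950, 0, 20.0, 21.0)); // true
        // 99.8% match: fails the match-rate target.
        System.out.println(readyForCutover(1_000, 998, 0, 20.0, 21.0));      // false
    }
}
```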
Be careful with side effects! If your notification service actually sends emails, the shadow path should use a mock or dev endpoint. Shadow execution should never cause duplicate side effects in production.
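Because the new design is composed, one way to keep the shadow path free of side effects is to build the shadow instance with a delivery component that records what it was asked to send instead of sending it. The sketch below illustrates the idea with hypothetical types (`Gateway`, `RecordingGateway`, `NewNotifier`); a real system would substitute its own delivery interface.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

final class ShadowSideEffects {

    // Hypothetical delivery boundary; stands in for the real channel interface.
    interface Gateway {
        String deliver(String recipient, String body);
    }

    // Shadow-only gateway: remembers each attempted delivery, sends nothing.
    // CopyOnWriteArrayList keeps it safe when shadow calls run on other threads.
    static final class RecordingGateway implements Gateway {
        private final List<String> attempts = new CopyOnWriteArrayList<>();

        @Override
        public String deliver(String recipient, String body) {
            attempts.add(recipient + "|" + body);
            return "SHADOW-NOOP";
        }

        List<String> attempts() {
            return List.copyOf(attempts);
        }
    }

    // Stand-in for the new composition-based service under shadow test.
    static final class NewNotifier {
        private final Gateway gateway;

        NewNotifier(Gateway gateway) {
            this.gateway = gateway;
        }

        String send(String recipient, String body) {
            return gateway.deliver(recipient, body.trim());
        }
    }

    public static void main(String[] args) {
        RecordingGateway shadowGateway = new RecordingGateway();
        NewNotifier shadow = new NewNotifier(shadowGateway);

        shadow.send("user@example.com", " Hello ");

        // The intended side effect is available for comparison, but no message left the process.
        System.out.println(shadowGateway.attempts()); // [user@example.com|Hello]
    }
}
```

The recorded attempts can then be compared with what the legacy path actually sent, so the side effect itself becomes part of the behavioral comparison without being duplicated.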
When shadow execution reveals discrepancies, you need systematic techniques to identify the root cause and determine whether the difference is a bug in the new implementation, a bug in the old implementation, or acceptable variance.
```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Automated discrepancy analysis system
public class DiscrepancyAnalyzer {

    public AnalysisReport analyze(List<DiscrepancyRecord> discrepancies) {
        AnalysisReport report = new AnalysisReport();

        // Categorize discrepancies
        Map<DiscrepancyCategory, List<DiscrepancyRecord>> categorized =
            discrepancies.stream()
                .collect(Collectors.groupingBy(this::categorize));

        // Analyze each category
        for (var entry : categorized.entrySet()) {
            CategoryAnalysis analysis = analyzeCategory(entry.getKey(), entry.getValue());
            report.addCategoryAnalysis(analysis);
        }

        // Generate recommendations
        report.setRecommendations(generateRecommendations(categorized));

        return report;
    }

    private DiscrepancyCategory categorize(DiscrepancyRecord record) {
        // Exception vs. success mismatch
        if (record.legacySucceeded() != record.newSucceeded()) {
            return record.legacySucceeded()
                ? DiscrepancyCategory.NEW_FAILS_WHERE_LEGACY_SUCCEEDS
                : DiscrepancyCategory.NEW_SUCCEEDS_WHERE_LEGACY_FAILS;
        }

        // Both threw exceptions but different types.
        // (Both sides failed here, so newException() is non-null as well.)
        if (record.legacyException() != null
                && !record.legacyException().getClass()
                    .equals(record.newException().getClass())) {
            return DiscrepancyCategory.EXCEPTION_TYPE_MISMATCH;
        }

        // Both succeeded but different results
        if (record.legacyResult() != null && record.newResult() != null) {
            return DiscrepancyCategory.RESULT_VALUE_MISMATCH;
        }

        return DiscrepancyCategory.OTHER;
    }

    private CategoryAnalysis analyzeCategory(
            DiscrepancyCategory category,
            List<DiscrepancyRecord> records) {

        CategoryAnalysis analysis = new CategoryAnalysis(category);
        analysis.setCount(records.size());
        analysis.setExamples(records.stream().limit(5).toList());

        // Look for patterns
        analysis.setInputPatterns(findInputPatterns(records));
        analysis.setTimePatterns(findTimePatterns(records));

        // Severity assessment
        analysis.setSeverity(assessSeverity(category, records));

        return analysis;
    }

    private List<String> findInputPatterns(List<DiscrepancyRecord> records) {
        // Identify common characteristics of inputs that cause discrepancies
        List<String> patterns = new ArrayList<>();

        long nullRecipients = records.stream()
            .filter(r -> r.notification().getRecipient() == null)
            .count();
        if (nullRecipients > records.size() * 0.5) {
            patterns.add(">50% have null recipient");
        }

        long emptyContent = records.stream()
            .filter(r -> r.notification().getContent().isEmpty())
            .count();
        if (emptyContent > records.size() * 0.5) {
            patterns.add(">50% have empty content");
        }

        // Add more pattern detection as needed
        return patterns;
    }

    enum DiscrepancyCategory {
        NEW_FAILS_WHERE_LEGACY_SUCCEEDS,  // Critical: new impl more strict
        NEW_SUCCEEDS_WHERE_LEGACY_FAILS,  // Often OK: new impl more lenient
        EXCEPTION_TYPE_MISMATCH,          // May matter for catch blocks
        RESULT_VALUE_MISMATCH,            // Investigate case by case
        OTHER
    }
}
```

For each discrepancy category, you must decide the disposition:
| Disposition | When to Use | Action Required |
|---|---|---|
| Fix New | New implementation has a bug | Modify new code to match legacy |
| Document Legacy Bug | Legacy has a known bug being preserved | Add comment explaining the preserved bug |
| Accept Variance | Difference is acceptable (e.g., timing) | Adjust comparison logic to allow this |
| Upgrade Behavior | Both should change | Create follow-up ticket for post-refactor fix |
| Investigate Further | Root cause unclear | Gather more data before deciding |
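For the "Accept Variance" disposition, adjusting the comparison logic usually means normalizing away fields that are expected to differ before comparing. The sketch below shows the idea on results flattened to string maps. The ignored field names (`timestamp`, `messageId`) and the class `VarianceTolerantComparison` are assumptions for illustration; which fields count as acceptable variance is a decision to record per system.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

final class VarianceTolerantComparison {

    // Fields assumed to vary legitimately between the two implementations.
    static final Set<String> IGNORED_FIELDS = Set.of("timestamp", "messageId");

    // Copies the result and drops the fields with accepted variance.
    static Map<String, String> normalize(Map<String, String> result) {
        Map<String, String> copy = new TreeMap<>(result);
        copy.keySet().removeAll(IGNORED_FIELDS);
        return copy;
    }

    // Two results are equivalent when everything outside the ignored set matches.
    static boolean equivalent(Map<String, String> legacy, Map<String, String> candidate) {
        return normalize(legacy).equals(normalize(candidate));
    }

    public static void main(String[] args) {
        Map<String, String> legacy = Map.of(
            "recipient", "user@example.com", "success", "true", "timestamp", "1000");
        Map<String, String> candidate = Map.of(
            "recipient", "user@example.com", "success", "true", "timestamp", "1007");

        System.out.println(equivalent(legacy, candidate)); // true: only the timestamp differs
    }
}
```

Keeping the ignored set explicit and small matters: every field added to it is a behavior that shadow execution no longer verifies.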
Refactoring often surfaces edge cases that the original developers didn't consciously design for—behavior that emerged from specific implementation details rather than intentional design.
When you discover an edge case during refactoring, work through these questions:
```java
// Documenting preserved edge case behavior
public class SmsFormatter implements ContentFormatter {

    /*
     * EDGE CASE DOCUMENTATION:
     *
     * Legacy Behavior: When content contains a null character (\0), the SMS
     * gateway truncates at that point. We preserve this behavior even though
     * it's technically a gateway bug, because:
     * 1. Some integrations might rely on this for content termination
     * 2. The new formatter should be behaviorally identical during refactoring
     *
     * Post-Refactoring: Consider sanitizing null characters. See JIRA-4521.
     */
    @Override
    public String format(String rawContent, FormattingContext context) {
        // Preserve legacy null-character truncation behavior
        int nullIndex = rawContent.indexOf('\0');
        if (nullIndex >= 0) {
            rawContent = rawContent.substring(0, nullIndex);
        }

        // Normal formatting continues...
        return truncateToSmsLimit(rawContent);
    }
}
```

Every discovered edge case should become a test case. This serves multiple purposes:
```java
// Explicit edge case tests from discoveries during refactoring
class SmsFormatterEdgeCaseTests {

    @Test
    @DisplayName("Edge Case: Null character in content causes truncation")
    void format_contentWithNullCharacter_truncatesAtNullChar() {
        // DISCOVERED: 2024-01-15 during refactoring
        // ROOT CAUSE: SMS gateway behavior, not intentional design
        // DECISION: Preserve for backward compatibility

        // The constructor argument is taken to be a suffix appended after truncation
        SmsFormatter formatter = new SmsFormatter(" STOP");
        String contentWithNull = "Hello\0World";

        String result = formatter.format(contentWithNull, context());

        assertEquals("Hello STOP", result); // "World" is truncated
    }

    @Test
    @DisplayName("Edge Case: Unicode emoji counts as 2 characters for length limit")
    void format_contentWithEmoji_countsEmojiAsMultipleChars() {
        // DISCOVERED: 2024-01-16 during shadow execution
        // ROOT CAUSE: Emoji are outside GSM-7 and outside the Basic Multilingual
        //             Plane, so each one occupies two UTF-16 code units
        // DECISION: Preserve; encoding limitation, not our bug

        SmsFormatter formatter = new SmsFormatter("");
        String contentWithEmoji = "Test 😀 message"; // 😀 = 2 chars

        String result = formatter.format(contentWithEmoji, context());

        // String.length() counts UTF-16 code units, so the emoji counts twice
        int codePoints = contentWithEmoji.codePointCount(0, contentWithEmoji.length());
        assertEquals(codePoints + 1, contentWithEmoji.length());
        assertTrue(contentWithEmoji.length() < 160); // Looks short
        // But effective length includes emoji multi-char counting
    }

    @Test
    @DisplayName("Edge Case: Leading whitespace in content is preserved")
    void format_contentWithLeadingWhitespace_preservesWhitespace() {
        // DISCOVERED: 2024-01-17 during code review
        // ROOT CAUSE: Intentional design (some clients use for alignment)
        // DECISION: Document and preserve

        SmsFormatter formatter = new SmsFormatter("");
        String content = " Indented message";

        String result = formatter.format(content, context());

        assertTrue(result.startsWith(" "));
    }
}
```

Maintain a living document cataloging all discovered edge cases. This becomes invaluable for future maintainers and helps when similar refactoring is done elsewhere in the system.
When breaking apart an inheritance hierarchy into composed components, you're introducing new boundaries. These boundaries need contract tests to ensure components continue to work together correctly.
For each interface created during extraction, write tests that verify any implementation meets the interface's behavioral contract:
```java
// Contract tests that any DeliveryChannel implementation must pass
public abstract class DeliveryChannelContractTest {

    // Subclasses provide the implementation under test
    protected abstract DeliveryChannel createChannel();

    @Test
    void deliver_validNotification_returnsSuccessResult() {
        DeliveryChannel channel = createChannel();
        FormattedNotification notification = createValidNotification();

        DeliveryResult result = channel.deliver(notification);

        assertTrue(result.success());
        assertNotNull(result.messageId());
        assertNull(result.error());
    }

    @Test
    void deliver_validNotification_doesNotThrow() {
        DeliveryChannel channel = createChannel();
        FormattedNotification notification = createValidNotification();

        assertDoesNotThrow(() -> channel.deliver(notification));
    }

    @Test
    void deliver_nullNotification_throwsNullPointerException() {
        DeliveryChannel channel = createChannel();

        assertThrows(NullPointerException.class, () -> channel.deliver(null));
    }

    @Test
    void supportsRecipient_validForChannel_returnsTrue() {
        DeliveryChannel channel = createChannel();
        String validRecipient = getValidRecipientForChannel();

        assertTrue(channel.supportsRecipient(validRecipient));
    }

    @Test
    void supportsRecipient_invalidForChannel_returnsFalse() {
        DeliveryChannel channel = createChannel();
        String invalidRecipient = getInvalidRecipientForChannel();

        assertFalse(channel.supportsRecipient(invalidRecipient));
    }

    // Abstract methods for subclass configuration
    protected abstract FormattedNotification createValidNotification();
    protected abstract String getValidRecipientForChannel();
    protected abstract String getInvalidRecipientForChannel();
}

// Concrete test for SMTP channel
class SmtpDeliveryChannelContractTest extends DeliveryChannelContractTest {

    private SmtpClient mockClient;

    @BeforeEach
    void setUp() {
        mockClient = mock(SmtpClient.class);
        when(mockClient.send(any(), any(), any())).thenReturn("MSG-123");
    }

    @Override
    protected DeliveryChannel createChannel() {
        return new SmtpDeliveryChannel(mockClient);
    }

    @Override
    protected FormattedNotification createValidNotification() {
        return new FormattedNotification(
            "Content",
            Map.of("to", "user@example.com", "subject", "Test")
        );
    }

    @Override
    protected String getValidRecipientForChannel() {
        return "user@example.com"; // Valid email
    }

    @Override
    protected String getInvalidRecipientForChannel() {
        return "+1234567890"; // Phone number, not email
    }
}
```

Test that composed components work correctly together:
```java
// Integration tests for composed notification system
class NotificationSystemIntegrationTest {

    @Test
    void fullPipeline_validEmail_sendsFormattedContent() {
        // Compose the full system
        InMemoryTemplateEngine templateEngine = new InMemoryTemplateEngine();
        templateEngine.register("email-template.html", "<html>{{content}}</html>");

        SmtpClient mockSmtp = mock(SmtpClient.class);
        when(mockSmtp.send(any(), any(), any())).thenReturn("MSG-123");

        ContentFormatter formatter = new HtmlEmailFormatter(templateEngine);
        DeliveryChannel channel = new SmtpDeliveryChannel(mockSmtp);
        NotificationValidator validator = new CompositeValidator();
        RetryPolicy retryPolicy = new NoRetryPolicy();

        // Create the notification by wiring the components together directly
        EmailNotification notification = new EmailNotification(
            "user@example.com",
            "Hello, World!",
            "Test Subject",
            formatter,
            channel,
            validator,
            retryPolicy
        );

        // Execute
        notification.send();

        // Verify the formatted content reached the SMTP client
        verify(mockSmtp).send(
            eq("user@example.com"),
            eq("Test Subject"),
            eq("<html>Hello, World!</html>")
        );
    }

    @Test
    void fullPipeline_validationFails_doesNotSend() {
        // Compose with a validator that rejects
        NotificationValidator rejectingValidator = new NotificationValidator() {
            @Override
            public ValidationResult validate(NotificationData data) {
                return new ValidationResult(false, List.of("Invalid recipient"));
            }
        };

        SmtpClient mockSmtp = mock(SmtpClient.class);

        EmailNotification notification = new EmailNotification(
            "invalid",
            "Content",
            "Subject",
            new HtmlEmailFormatter(new DummyTemplateEngine()),
            new SmtpDeliveryChannel(mockSmtp),
            rejectingValidator,
            new NoRetryPolicy()
        );

        assertThrows(InvalidNotificationException.class, notification::send);
        verifyNoInteractions(mockSmtp); // Should not reach delivery
    }
}
```

The final page covers testing the refactored design: comprehensive testing strategies that validate not just behavioral correctness, but also that the new composition-based design achieves its goals of flexibility, maintainability, and extensibility.