Designing Cloud File Storage Systems

Level: Advanced

Duration: 90 mins

Topic: Google Drive / Dropbox

Conflict Resolution

The Conflict Problem

Conflict resolution is the most challenging aspect of distributed file synchronization. When the same file is modified on multiple devices simultaneously, the system faces a fundamental question: Which version is correct?

Unlike traditional client-server systems where the server always has the latest version, cloud storage systems allow concurrent modifications across many devices. A user might edit a document on a laptop while a colleague edits the same document on a desktop, and the user's phone, still offline, holds its own cached version. When these devices sync, conflicts emerge.

The Goal of Conflict Resolution:

Never lose user data. When conflicts occur, preserve all work and help users reconcile differences with minimal confusion.

Learning Objectives

By the end of this page, you'll understand: (1) How to reliably detect conflicts in distributed systems, (2) Different resolution strategies and their trade-offs, (3) Automatic vs manual resolution and when to use each, (4) Operational Transformation (OT) and CRDTs for real-time collaboration, and (5) How production systems handle edge cases.

Conflict Detection

Before resolving conflicts, we must detect them. Conflict detection requires tracking the history of modifications and identifying when changes diverge from a common ancestor.

Version Vectors:

The most robust approach uses version vectors (closely related to vector clocks). Each device keeps its own counter, incremented on every local modification, and the combined vector of all counters summarizes which modifications a given copy has seen:

Version Vector Example:

Device A creates file:    {A:1}
Device A modifies:        {A:2}
Device A syncs to B:      B receives {A:2}
Device B modifies:        {A:2, B:1}

Now if Device A modifies offline:  {A:3}
And Device B modifies offline:     {A:2, B:2}

These versions CONFLICT because:
- {A:3} is ahead on A (3 > 2) but has no entry for B, so it has not seen B's edits
- {A:2, B:2} is ahead on B but has not seen A:3
- Neither "happens before" the other
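
The comparison in the example above can be sketched in code. This is a minimal illustration, assuming a version vector is a plain map from device ID to counter and that a missing entry counts as 0; the names `VersionVector` and `compareVectors` are hypothetical, not from any particular sync product.

```typescript
// Assumed representation: device ID -> number of modifications seen from it.
type VersionVector = Record<string, number>;
type Ordering = 'equal' | 'before' | 'after' | 'concurrent';

// Compare two vectors component by component.
// 'concurrent' means neither side has seen all of the other's edits: a conflict.
function compareVectors(a: VersionVector, b: VersionVector): Ordering {
    let aAhead = false;
    let bAhead = false;
    const devices = Object.keys(a).concat(Object.keys(b)); // duplicates are harmless
    for (const device of devices) {
        const av = a[device] ?? 0;
        const bv = b[device] ?? 0;
        if (av > bv) aAhead = true;
        if (bv > av) bAhead = true;
    }
    if (aAhead && bAhead) return 'concurrent';
    if (aAhead) return 'after';
    if (bAhead) return 'before';
    return 'equal';
}

// The two offline edits from the example:
console.log(compareVectors({ A: 3 }, { A: 2, B: 2 }));  // concurrent
// A plain fast-forward: B's copy already contains everything A's copy has.
console.log(compareVectors({ A: 2 }, { A: 2, B: 1 }));  // before
```

Only the 'concurrent' result needs conflict handling; 'before' and 'after' are ordinary fast-forward syncs.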

Conflict Detection Approaches

•Server Revision Numbers — Used by many cloud services. Each upload must specify the parent revision it was based on. If that parent doesn't match the server's current revision, a conflict is detected. Simple, but it records only a linear history and cannot tell how concurrent edits relate to each other.
•Version Vectors — Track modification history across all devices. Can determine exactly which changes conflict and which can be merged. More complex but enables smart merging.
•Content Hashing — Compare hashes: if content differs from expected, conflict exists. Doesn't track history but simple to implement. Used as a verification layer.
•Merkle Trees — Hash tree over file/folder structure. Efficient detection of which parts differ. Used by git and some sync protocols for efficient comparison.

Conflict Detection Method Comparison
Method	Complexity	Information Captured	Use Case
Server Revision	Low	Linear sequence only	Simple file storage (Dropbox model)
Version Vectors	Medium	Full causality graph	Collaborative editing, complex merge
Content Hash	Low	Current state only	Verification, deduplication
Merkle Tree	High	Structural differences	Large directory sync, git-like systems
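
To make the Merkle tree row more concrete, here is a small sketch of hash-based subtree comparison. It is an illustration under stated assumptions: the `TreeNode` shape is invented for the example, and `fnv1a` is a short non-cryptographic stand-in so the block has no dependencies. A real system would use a cryptographic hash such as SHA-256 and cache hashes instead of recomputing them.

```typescript
// Hypothetical in-memory tree; real clients would read this from disk or metadata.
type TreeNode =
    | { kind: 'file'; content: string }
    | { kind: 'dir'; children: Record<string, TreeNode> };

// FNV-1a (32-bit). Stand-in only: NOT collision resistant enough for production.
function fnv1a(text: string): string {
    let h = 0x811c9dc5;
    for (let i = 0; i < text.length; i++) {
        h ^= text.charCodeAt(i);
        h = Math.imul(h, 0x01000193) >>> 0;
    }
    return h.toString(16);
}

// A directory's hash covers the names and hashes of all its children.
function merkleHash(node: TreeNode): string {
    if (node.kind === 'file') return fnv1a('file:' + node.content);
    const parts = Object.keys(node.children).sort()
        .map(name => name + '=' + merkleHash(node.children[name]));
    return fnv1a('dir:' + parts.join(';'));
}

// Walk both trees, skipping any subtree whose hashes already match.
function changedPaths(a: TreeNode | undefined, b: TreeNode | undefined, path = ''): string[] {
    if (a && b && merkleHash(a) === merkleHash(b)) return [];
    if (!a || !b || a.kind === 'file' || b.kind === 'file') return [path || '/'];
    const names = Object.keys(a.children);
    for (const name of Object.keys(b.children)) {
        if (!(name in a.children)) names.push(name);
    }
    const result: string[] = [];
    for (const name of names.sort()) {
        result.push(...changedPaths(a.children[name], b.children[name], path + '/' + name));
    }
    return result;
}
```

The benefit is pruning: when two folder hashes match, nothing beneath that folder needs to be compared or transferred.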

conflict_detection.ts
// Server revision-based conflict detection
interface FileVersion {
    path: string;
    revision: number;      // Server-assigned, monotonically increasing
    contentHash: string;   // SHA-256 of file content
    parentRevision: number; // The revision this was based on
}
 
class ConflictDetector {
    // Called when client attempts to upload
    detectConflict(
        serverVersion: FileVersion,
        clientUpload: { parentRevision: number; contentHash: string }
    ): 'no_conflict' | 'conflict' | 'no_change' {
        
        // Client's parent matches the server's current revision: client is in sync
        if (clientUpload.parentRevision === serverVersion.revision) {
            if (clientUpload.contentHash === serverVersion.contentHash) {
                return 'no_change';  // Same content, no upload needed
            }
            return 'no_conflict';    // Normal upload proceeds
        }
        
        // Client's parent doesn't match server's current revision
        // Server has been modified since client's last sync
        if (clientUpload.parentRevision < serverVersion.revision) {
            // But content is same? No real conflict
            if (clientUpload.contentHash === serverVersion.contentHash) {
                return 'no_change';
            }
            return 'conflict';  // True conflict: different content
        }
        
        // Client claims newer parent than server knows?
        // This shouldn't happen - indicates client bug
        throw new Error('Invalid state: client revision ahead of server');
    }
}

The Same-Content Optimization

Notice the 'no_change' case: if two users make the same changes independently, there's technically a conflict (divergent edits) but effectively no conflict (same result). Smart systems detect this and skip unnecessary conflict resolution, improving user experience.

Resolution Strategies

Once a conflict is detected, the system must resolve it. Different strategies have different trade-offs between simplicity, user experience, and data preservation.

Last Write Wins (LWW)

•Simple: use timestamp, keep newest
•Automatic: no user intervention
•DANGER: Silently loses data!
•Clock sync issues cause wrong winner
•Never use for important files
•Acceptable only for ephemeral data

Copy-Both Strategy

•Create conflict copies of all versions
•User sees: 'file.txt' and 'file (conflict).txt'
•No data loss—user decides
•Can accumulate many copies over time
•User must manually merge
•Industry standard for file sync

Conflict Copy Naming Conventions:

Dropbox:     "report (John's conflicted copy 2024-01-15).docx"
Google:      "report.docx" stays, older becomes "report.docx.backup"
OneDrive:    "report-LAPTOP-JOHN.docx"
iCloud:      "report 2.docx"

Best Practice: Include user/device and timestamp
Pattern: "filename (Device's conflicted copy YYYY-MM-DD HH-mm).ext"
Note: avoid ':' in the time portion; it is not a valid file name character on Windows.
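
A naming helper along these lines might look like the sketch below. The function name `conflictCopyName` and the choice of UTC are assumptions made for the example. The time is written with '-' rather than ':' because a colon is not allowed in Windows file names.

```typescript
// Build "stem (Device's conflicted copy YYYY-MM-DD HH-mm).ext".
// Hypothetical helper; uses UTC so every device produces the same name.
function conflictCopyName(fileName: string, device: string, when: Date): string {
    const pad = (n: number): string => (n < 10 ? '0' : '') + n;
    const stamp =
        when.getUTCFullYear() + '-' + pad(when.getUTCMonth() + 1) + '-' + pad(when.getUTCDate()) +
        ' ' + pad(when.getUTCHours()) + '-' + pad(when.getUTCMinutes());

    // Keep the extension at the end so the file still opens in the right application.
    const dot = fileName.lastIndexOf('.');
    const hasExtension = dot > 0;   // dot === 0 is a dotfile such as ".bashrc"
    const stem = hasExtension ? fileName.slice(0, dot) : fileName;
    const extension = hasExtension ? fileName.slice(dot) : '';

    return `${stem} (${device}'s conflicted copy ${stamp})${extension}`;
}

console.log(conflictCopyName('report.docx', 'John', new Date(Date.UTC(2024, 0, 15, 14, 30))));
// report (John's conflicted copy 2024-01-15 14-30).docx
```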

Advanced Resolution Strategies

•Automatic Merge (for supported formats) — For text files, attempt 3-way merge using common ancestor. If no conflicts at line level, merge succeeds. Git does this successfully for code.
•Semantic Merge — Understand file format and merge at semantic level. XML/JSON can be merged by structure. Spreadsheets can merge by cell. Much more complex to implement.
•Operational Transformation (OT) — Transform operations rather than state. Used by Google Docs for real-time collaboration. Each concurrent operation is adjusted against the operations it has not yet seen, so that all replicas converge on the same result.
•CRDTs (Conflict-free Replicated Data Types) — Design data structures where all possible merges are valid. No conflicts by design. Used by Figma (in a CRDT-inspired form) and reportedly by Apple Notes. Requires application redesign.
•User-Assisted Resolution — Present diff to user, let them choose per-section. Best user experience but requires user attention. Common for document editors.

Resolution Strategy Selection Guide
File Type	Recommended Strategy	Rationale
Binary files (images, videos)	Copy-Both	Cannot be automatically merged
Text files (code, notes)	3-way merge, fallback to copy-both	Often mergeable, fall back if conflicts
Documents (docx, pdf)	Copy-Both with visual diff	Format-specific merge too complex
Realtime docs (Google Docs)	OT or CRDT	Designed for concurrent editing
Database files	Application-specific	Requires semantic understanding
Config files	3-way merge + validation	Merge then validate syntax
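
The selection guide above can be encoded as a simple lookup. This is only a sketch: the extension lists and the `Strategy` names are assumptions chosen for the example, and a real system would probably also inspect MIME types or file contents.

```typescript
type Strategy =
    | 'copy_both'
    | 'three_way_merge'
    | 'three_way_merge_then_validate'
    | 'realtime_ot_or_crdt'
    | 'application_specific';

// Illustrative mapping only; extend or replace to suit the product.
const STRATEGY_BY_EXTENSION: Record<string, Strategy> = {
    jpg: 'copy_both', png: 'copy_both', mp4: 'copy_both',            // binary media
    docx: 'copy_both', pdf: 'copy_both',                             // complex documents
    txt: 'three_way_merge', md: 'three_way_merge', ts: 'three_way_merge',
    json: 'three_way_merge_then_validate', yaml: 'three_way_merge_then_validate',
    gdoc: 'realtime_ot_or_crdt',
    sqlite: 'application_specific',
};

function chooseStrategy(fileName: string): Strategy {
    const extension = fileName.slice(fileName.lastIndexOf('.') + 1).toLowerCase();
    // hasOwnProperty guards against names like "constructor" matching Object.prototype.
    if (Object.prototype.hasOwnProperty.call(STRATEGY_BY_EXTENSION, extension)) {
        return STRATEGY_BY_EXTENSION[extension];
    }
    return 'copy_both';  // Safe default: never lose data
}
```

Falling back to copy-both for unknown types follows the page's main rule: when in doubt, keep every version.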

Three-Way Merge Deep Dive

Three-way merge is the standard automatic resolution technique for text files. It uses the common ancestor to determine what each version changed, then combines non-conflicting changes.

How Three-Way Merge Works:

Common Ancestor (Base):    Version A (Local):       Version B (Remote):
┌─────────────────────┐    ┌─────────────────────┐   ┌─────────────────────┐
│ Line 1: Hello       │    │ Line 1: Hello       │   │ Line 1: Hello       │
│ Line 2: World       │    │ Line 2: World       │   │ Line 2: Everyone    │ ← B changed Line 2
│ Line 3: Foo         │    │ Line 3: Bar         │   │ Line 3: Foo         │ ← A changed Line 3
│ Line 4: End         │    │ Line 4: End         │   │ Line 4: End         │
└─────────────────────┘    └─────────────────────┘   └─────────────────────┘

Merge Analysis:
- Line 1: Same in all → keep
- Line 2: A=Base, B changed → use B's version
- Line 3: A changed, B=Base → use A's version
- Line 4: Same in all → keep

Merged Result:
┌─────────────────────┐
│ Line 1: Hello       │
│ Line 2: Everyone    │ ← From B
│ Line 3: Bar         │ ← From A  
│ Line 4: End         │
└─────────────────────┘

✓ Merge successful! Both changes preserved.

When Three-Way Merge Fails (True Conflict):

Base:              Version A:          Version B:
│ Line 2: World    │ Line 2: Earth     │ Line 2: Everyone
                         ↓                    ↓
                   A changed to Earth  B changed to Everyone

BOTH changed the same line differently!

Result: CONFLICT - cannot auto-merge

Typical output:
┌─────────────────────┐
│ Line 1: Hello       │
│ <<<<<<< LOCAL       │
│ Line 2: Earth       │
│ =======             │
│ Line 2: Everyone    │
│ >>>>>>> REMOTE      │
│ Line 3: Bar         │
└─────────────────────┘

User must manually resolve the conflict markers.

three_way_merge.ts
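// Simplified three-way merge algorithm (line-based)
// Sketch only: uses an O(n*m) LCS diff. Production tools use faster diffs
// and stricter hunk alignment (Git, for example, also flags adjacent edits).
type MergeResult =
    | { status: 'success'; content: string[] }
    | { status: 'conflict'; content: string[]; conflicts: ConflictRegion[] };

interface ConflictRegion {
    lineStart: number;        // Index in merged content where the markers begin
    localContent: string[];
    remoteContent: string[];
}

// One edit relative to base: replace base[start, end) with newLines.
// start === end means a pure insertion before base[start].
interface Change {
    start: number;
    end: number;
    newLines: string[];
}

function arraysEqual(a: string[], b: string[]): boolean {
    return a.length === b.length && a.every((line, i) => line === b[i]);
}

// Diff `other` against `base` using a longest-common-subsequence table
function computeDiff(base: string[], other: string[]): Change[] {
    const dp: number[][] = [];
    for (let i = 0; i <= base.length; i++) {
        dp.push(new Array<number>(other.length + 1).fill(0));
    }
    for (let i = base.length - 1; i >= 0; i--) {
        for (let j = other.length - 1; j >= 0; j--) {
            dp[i][j] = base[i] === other[j]
                ? dp[i + 1][j + 1] + 1
                : Math.max(dp[i + 1][j], dp[i][j + 1]);
        }
    }

    const changes: Change[] = [];
    let current: Change | null = null;
    let i = 0, j = 0;
    while (i < base.length || j < other.length) {
        if (i < base.length && j < other.length && base[i] === other[j]) {
            current = null;          // A matching line closes any open change
            i++; j++;
            continue;
        }
        if (current === null) {
            current = { start: i, end: i, newLines: [] };
            changes.push(current);
        }
        if (j >= other.length || (i < base.length && dp[i + 1][j] >= dp[i][j + 1])) {
            i++;
            current.end = i;                  // Base line removed by this side
        } else {
            current.newLines.push(other[j]);  // Line added by this side
            j++;
        }
    }
    return changes;
}

// Rebuild one side's version of base[start, end) by applying its changes
function applyChanges(base: string[], changes: Change[], start: number, end: number): string[] {
    const out: string[] = [];
    let idx = start;
    for (const change of changes) {
        out.push(...base.slice(idx, change.start), ...change.newLines);
        idx = change.end;
    }
    out.push(...base.slice(idx, end));
    return out;
}

function threeWayMerge(
    base: string[],
    local: string[],
    remote: string[]
): MergeResult {
    const localChanges = computeDiff(base, local);   // What local changed
    const remoteChanges = computeDiff(base, remote); // What remote changed

    const content: string[] = [];
    const conflicts: ConflictRegion[] = [];
    let baseIdx = 0, li = 0, ri = 0;

    while (li < localChanges.length || ri < remoteChanges.length) {
        // A region starts at the earliest pending change from either side
        const start = Math.min(
            li < localChanges.length ? localChanges[li].start : Infinity,
            ri < remoteChanges.length ? remoteChanges[ri].start : Infinity
        );
        let end = start;
        const mine: Change[] = [];
        const theirs: Change[] = [];

        // Absorb every change that overlaps the growing region [start, end)
        let grew = true;
        while (grew) {
            grew = false;
            const l = localChanges[li];
            if (l !== undefined && (l.start < end || l.start === start)) {
                mine.push(l); end = Math.max(end, l.end); li++; grew = true;
            }
            const r = remoteChanges[ri];
            if (r !== undefined && (r.start < end || r.start === start)) {
                theirs.push(r); end = Math.max(end, r.end); ri++; grew = true;
            }
        }

        content.push(...base.slice(baseIdx, start));  // Unchanged lines before the region
        const localContent = applyChanges(base, mine, start, end);
        const remoteContent = applyChanges(base, theirs, start, end);

        if (theirs.length === 0 || arraysEqual(localContent, remoteContent)) {
            content.push(...localContent);    // Only local changed, or both made the same change
        } else if (mine.length === 0) {
            content.push(...remoteContent);   // Only remote changed
        } else {
            // Different changes to overlapping lines - TRUE CONFLICT
            conflicts.push({ lineStart: content.length, localContent, remoteContent });
            content.push('<<<<<<< LOCAL', ...localContent, '=======', ...remoteContent, '>>>>>>> REMOTE');
        }
        baseIdx = end;
    }
    content.push(...base.slice(baseIdx));     // Unchanged tail of the file

    return conflicts.length > 0
        ? { status: 'conflict', content, conflicts }
        : { status: 'success', content };
}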

Semantic Conflict Detection

Simple three-way merge doesn't understand semantics. If Alice adds a function foo() and Bob also adds a different foo() in different locations, merge succeeds but code won't compile! Advanced systems can detect such semantic conflicts through AST analysis or test execution.

Operational Transformation and CRDTs

For real-time collaborative editing (Google Docs, Figma, Notion), file-level conflict resolution isn't sufficient. We need character-level or element-level conflict handling that works in real-time. Two approaches dominate: Operational Transformation (OT) and Conflict-free Replicated Data Types (CRDTs).

Operational Transformation (OT):

OT works by transforming operations against concurrent operations to preserve intent:

Initial: "HELLO"

User A: INSERT 'X' at position 1 → "HXELLO"
User B: INSERT 'Y' at position 3 → "HELYLO"

Problem: If we apply B's operation unchanged to A's result:
  "HXELLO" + INSERT 'Y' at position 3 → "HXEYLLO"
  
But B intended 'Y' to land after the first 'L', not before it!

OT Solution: Transform B's operation against A's:
  A inserted at 1, which is before B's position 3
  So B's position shifts: 3 + 1 = 4
  New B operation: INSERT 'Y' at position 4
  
Apply in sequence:
  "HELLO" → "HXELLO" → "HXELYLO"
  
Both insertions preserved at intended locations! ✓
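
The position shift in this example can be written as a small transform function. This is a minimal sketch for insert-only operations, assuming 0-based positions and a `site` identifier used to break ties when two users insert at the same position. The `InsertOp` shape and function names are hypothetical, and real OT systems also handle deletes and longer operation histories.

```typescript
interface InsertOp {
    position: number;  // 0-based index in the document the op was created against
    text: string;
    site: string;      // Assumed unique per user; used only for tie-breaking
}

function applyInsert(doc: string, op: InsertOp): string {
    return doc.slice(0, op.position) + op.text + doc.slice(op.position);
}

// Rewrite `op` so it can be applied AFTER `applied`, when both were created
// against the same original document.
function transformInsert(op: InsertOp, applied: InsertOp): InsertOp {
    const shifts =
        applied.position < op.position ||
        (applied.position === op.position && applied.site < op.site);
    return shifts ? { ...op, position: op.position + applied.text.length } : op;
}

const a: InsertOp = { position: 1, text: 'X', site: 'A' };
const b: InsertOp = { position: 3, text: 'Y', site: 'B' };

// Replica 1 applies A first, then B transformed against A (3 becomes 4).
const replica1 = applyInsert(applyInsert('HELLO', a), transformInsert(b, a));
// Replica 2 applies B first, then A transformed against B (unchanged).
const replica2 = applyInsert(applyInsert('HELLO', b), transformInsert(a, b));

console.log(replica1, replica2);  // HXELYLO HXELYLO
```

Both replicas reach the same text regardless of the order in which the operations arrive, which is the convergence property OT aims for.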

OT Characteristics

•Operations transformed, not data
•Typically relies on a central server for ordering
•Proven at scale (Google Docs)
•Complex transformation functions
•Hard to prove correctness
•Best for text editing

CRDT Characteristics

•Data structures designed for merge
•Can be fully decentralized, no server required
•Mathematical guarantee: replicas converge
•Higher storage/bandwidth overhead
•Easier to prove correctness
•Available for many data types (counters, sets, maps, text)

CRDT Example - G-Counter (Grow-only Counter):

// Each node maintains its own counter
Node A: {A: 5, B: 3}  // A incremented 5 times, knows B did 3
Node B: {A: 2, B: 7}  // B incremented 7 times, knows A did 2

Merge: Take maximum of each component
       {A: max(5,2), B: max(3,7)} = {A: 5, B: 7}

Total count: 5 + 7 = 12

This works because:
- Each node only increments its own component
- Merge is commutative: merge(A,B) = merge(B,A)
- Merge is idempotent: merge(A,A) = A
- Merge is associative: merge(A,merge(B,C)) = merge(merge(A,B),C)
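
The same counter can be sketched as a small class. This is an illustration rather than a production CRDT library; the class name `GCounter` and its methods are chosen for the example.

```typescript
class GCounter {
    private counts: Record<string, number> = {};

    constructor(private readonly nodeId: string) {}

    // A node may only ever increase its OWN component.
    increment(by = 1): void {
        if (by < 0) throw new Error('G-Counter can only grow');
        this.counts[this.nodeId] = (this.counts[this.nodeId] ?? 0) + by;
    }

    // Merge = component-wise maximum (commutative, associative, idempotent).
    merge(other: GCounter): void {
        for (const node of Object.keys(other.counts)) {
            this.counts[node] = Math.max(this.counts[node] ?? 0, other.counts[node]);
        }
    }

    value(): number {
        let total = 0;
        for (const node of Object.keys(this.counts)) total += this.counts[node];
        return total;
    }
}

// Reproduce the states from the example: A = {A:5, B:3}, B = {A:2, B:7}
const nodeA = new GCounter('A');
const nodeB = new GCounter('B');
nodeA.increment(2);
nodeB.increment(3);
nodeA.merge(nodeB); nodeB.merge(nodeA);  // both now {A:2, B:3}
nodeA.increment(3);                      // A: {A:5, B:3}
nodeB.increment(4);                      // B: {A:2, B:7}

nodeA.merge(nodeB);
console.log(nodeA.value());  // 12
```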

Common CRDT Types:

CRDT	Use Case	How It Works
G-Counter	Likes, views	Each node tracks own increments, merge = max
PN-Counter	Balance, inventory	Two G-Counters (positive, negative)
LWW-Register	Last-write-wins cell	Timestamp determines winner
OR-Set	Sets with add/remove	Each element tagged with unique ID
RGA	Collaborative text	Characters have unique position IDs; deletions leave tombstones
Automerge (library)	JSON documents	Combines multiple CRDTs for complex structures
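
As one more illustration from the table, here is a minimal OR-Set sketch. It assumes each replica has a unique ID and tags every add with `replicaId:counter`. The class and method names are hypothetical, and this simple version keeps tombstones forever, which real implementations try to avoid.

```typescript
class ORSet<T> {
    private adds = new Map<string, T>();   // unique tag -> element
    private removed = new Set<string>();   // tags that have been removed (tombstones)
    private counter = 0;

    constructor(private readonly replicaId: string) {}

    add(element: T): void {
        this.counter++;
        this.adds.set(this.replicaId + ':' + this.counter, element);
    }

    // Remove only the tags this replica has observed for the element.
    remove(element: T): void {
        this.adds.forEach((value, tag) => {
            if (value === element) this.removed.add(tag);
        });
    }

    has(element: T): boolean {
        let found = false;
        this.adds.forEach((value, tag) => {
            if (value === element && !this.removed.has(tag)) found = true;
        });
        return found;
    }

    // Merge = union of adds and union of tombstones.
    merge(other: ORSet<T>): void {
        other.adds.forEach((value, tag) => this.adds.set(tag, value));
        other.removed.forEach(tag => this.removed.add(tag));
    }
}
```

Because a remove only covers the tags it has seen, a concurrent add of the same element on another replica survives the merge ("add wins").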

The CRDT Trade-off

CRDTs guarantee conflict-free merge but don't guarantee user intent. If two users both delete the same paragraph and add different replacements, both additions survive even though the intent was replacement. This 'zombie data' problem requires careful UX design to handle gracefully.

Conflict Prevention Strategies

The best conflict is one that never happens. Production systems employ various strategies to prevent conflicts before they occur, reducing the need for complex resolution.

Prevention Strategies

•File Locking — Acquire exclusive lock before editing. Only one user can edit at a time. Works for desktop apps (Office). Problematic for web apps and offline use.
•Live Presence — Show who's currently viewing/editing. Users naturally avoid editing same sections. Reduces accidental conflicts through social awareness.
•Section Locking — Lock specific sections (paragraphs, cells) rather than entire file. Finer granularity allows more concurrent editing.
•Automatic Partitioning — Design data model to reduce conflicts. Each user edits their own record. Conflicts only possible on shared records.
•Frequent Syncing — Sync more frequently = smaller divergence window = fewer/smaller conflicts. Real-time sync (OT/CRDT) takes this to extreme.
•Conflict Warning — Notify users when someone else is editing. 'Sarah is also editing this file' prompts coordination.

Lock Types and Trade-offs
Lock Type	Granularity	Compatibility	Deadlock Risk
Exclusive (Write)	Entire file	Blocks all other writers	Possible if nested locks
Shared (Read)	Entire file	Multiple readers OK	None
Section Lock	Paragraph/cell	Other sections editable	Low
Intent Lock	Hierarchical	Signals upcoming access	Medium
Optimistic Lock	Conceptual	Check at commit time	None (but conflicts)

Lock-Free Optimistic Approach (Most Common in Cloud):

Cloud storage systems typically don't use locks because:

•Locks don't work well with offline editing
•Lock state is hard to maintain across network partitions
•Abandoned locks can block files forever

Instead, they use optimistic concurrency:

1. User A starts editing (no lock acquired)
2. User B also starts editing (no lock acquired)
3. User A saves → succeeds (becomes revision 5)
4. User B saves with parent=4 → CONFLICT detected
   (server has revision 5, not 4)
5. User B must resolve conflict before saving

This approach maximizes concurrency at the cost of occasional conflicts, which is the right trade-off for most collaboration scenarios.
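
The five steps above amount to a compare-and-set on the revision number. A minimal in-memory sketch, assuming a single file and a hypothetical `RevisionStore` class (a real server would do the check and the write atomically in its metadata database):

```typescript
type SaveResult =
    | { ok: true; revision: number }
    | { ok: false; currentRevision: number };

class RevisionStore {
    constructor(private revision = 0, private content = '') {}

    read(): { revision: number; content: string } {
        return { revision: this.revision, content: this.content };
    }

    // Accept the save only if the client edited the current revision.
    save(parentRevision: number, newContent: string): SaveResult {
        if (parentRevision !== this.revision) {
            return { ok: false, currentRevision: this.revision };  // Conflict
        }
        this.content = newContent;
        this.revision++;
        return { ok: true, revision: this.revision };
    }
}

const store = new RevisionStore(4, 'original');
const parentSeenByA = store.read().revision;  // 4
const parentSeenByB = store.read().revision;  // 4

const saveA = store.save(parentSeenByA, "A's edit");  // succeeds, becomes revision 5
const saveB = store.save(parentSeenByB, "B's edit");  // rejected: server is at 5, not 4
console.log(saveA.ok, saveB.ok);  // true false
```

On rejection the client would fetch revision 5, then merge or create a conflict copy before retrying.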

The Lease Pattern

When locks are necessary, use leases (time-limited locks) instead of permanent locks. A lease automatically expires if not renewed, preventing the 'orphaned lock' problem when a client crashes. Typical lease duration: 30-60 seconds, renewed every 10-20 seconds during active editing.
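
A lease can be sketched as a lock with an expiry time. The example below is an assumption-laden illustration: `LeaseManager` is a hypothetical in-memory class, the clock is injected so expiry can be demonstrated without waiting, and a real service would store leases in a shared, consistent store.

```typescript
class LeaseManager {
    private leases = new Map<string, { holder: string; expiresAt: number }>();

    constructor(
        private readonly durationMs = 30_000,
        private readonly now: () => number = () => Date.now()
    ) {}

    // Acquire or renew. Fails only while someone else holds an unexpired lease.
    acquire(path: string, holder: string): boolean {
        const current = this.leases.get(path);
        const heldByOther =
            current !== undefined && current.holder !== holder && current.expiresAt > this.now();
        if (heldByOther) return false;
        this.leases.set(path, { holder, expiresAt: this.now() + this.durationMs });
        return true;
    }

    release(path: string, holder: string): void {
        const current = this.leases.get(path);
        if (current !== undefined && current.holder === holder) this.leases.delete(path);
    }
}

let fakeTime = 0;
const manager = new LeaseManager(30_000, () => fakeTime);

const aliceGot = manager.acquire('report.docx', 'alice');    // true
const bobBlocked = manager.acquire('report.docx', 'bob');    // false: lease is live
fakeTime = 30_001;                                           // alice crashed and never renewed
const bobAfterExpiry = manager.acquire('report.docx', 'bob'); // true: lease expired
```

Renewal is just another `acquire` call by the current holder, which pushes the expiry forward.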

Conflict Edge Cases

Real-world systems encounter edge cases that naive implementations handle poorly. Let's examine common edge cases and their solutions:

File System Conflicts

•Delete vs Edit — User A deletes file, User B edits same file. Resolution: Keep both actions—restore file with B's edits. Never lose active work.
•Rename Conflicts — Both users rename same file to different names. Resolution: One wins (first to sync), other creates 'file (was: oldname).txt'.
•Directory Conflicts — User A creates folder/file.txt, User B creates folder/file.txt with different content. Resolution: Regular file conflict handling.
•Move Cycles — User A moves folder X into folder Y, User B moves folder Y into folder X. Resolution: Reject one move, apply other. Cycle detection required.
•Case Sensitivity — User A on Linux creates 'File.txt', User B on Windows creates 'file.txt'. Resolution: Rename on case-insensitive systems.

edge_case_handler.ts
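// Edge case handlers for file system conflicts
// NOTE: the operation and resolution types below are assumed shapes, added so
// the example type-checks. Timestamps are assumed to be server-assigned
// receipt times, since client clocks are unreliable.
interface DeleteOperation { path: string; userId: string; timestamp: number; }
interface EditOperation { path: string; userEmail: string; newContent: string; }
interface RenameOperation { oldPath: string; newPath: string; timestamp: number; }
interface MoveOperation { sourcePath: string; destPath: string; userId: string; timestamp: number; }
type Step = { op: string; [key: string]: unknown };
interface Resolution { action: string; steps: Step[]; }

class EdgeCaseHandler {

    // Insert a suffix before the extension: ('readme.md', '-1') -> 'readme-1.md'
    private addSuffix(path: string, suffix: string): string {
        const dot = path.lastIndexOf('.');
        const slash = path.lastIndexOf('/');
        return dot > slash + 1
            ? path.slice(0, dot) + suffix + path.slice(dot)
            : path + suffix;
    }

    // 'report.docx' -> 'report (1).docx'
    private deduplicateName(path: string): string {
        return this.addSuffix(path, ' (1)');
    }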
    
    // Delete vs Edit: User A deletes, User B edits
    handleDeleteVsEdit(
        deleteOp: DeleteOperation,
        editOp: EditOperation
    ): Resolution {
        // Never lose edits - restore file with new edits
        return {
            action: 'restore_and_apply',
            steps: [
                { op: 'create', path: editOp.path, content: editOp.newContent },
                { 
                    op: 'notify', 
                    user: deleteOp.userId,
                    message: `${editOp.userEmail} edited ${editOp.path} ` +
                                `which you deleted. File has been restored.`
                }
            ]
        };
    }
    
    // Rename to same name by different users
    handleRenameNameConflict(
        rename1: RenameOperation,
        rename2: RenameOperation
    ): Resolution {
        // Both trying to rename to 'report.docx'
        // First one wins, second gets deduplicated name
        const winner = rename1.timestamp < rename2.timestamp ? rename1 : rename2;
        const loser = winner === rename1 ? rename2 : rename1;
        
        return {
            action: 'deduplicate',
            steps: [
                { op: 'rename', from: winner.oldPath, to: winner.newPath },
                { 
                    op: 'rename', 
                    from: loser.oldPath, 
                    to: this.deduplicateName(loser.newPath)  // 'report (1).docx'
                }
            ]
        };
    }
    
    // Folder move creating cycle
    handleMoveCycle(
        moveA: MoveOperation,  // X into Y
        moveB: MoveOperation   // Y into X
    ): Resolution {
        // Detect cycle: if we apply both, neither can be root
        // Resolution: Apply first, reject second with explanation
        const first = moveA.timestamp < moveB.timestamp ? moveA : moveB;
        const second = first === moveA ? moveB : moveA;
        
        return {
            action: 'partial_apply',
            steps: [
                { op: 'move', operation: first },
                { 
                    op: 'reject', 
                    operation: second,
                    reason: 'Would create folder cycle'
                },
                {
                    op: 'notify',
                    user: second.userId,
                    message: `Could not move ${second.sourcePath} - would create ` +
                            `circular structure.`
                }
            ]
        };
    }
    
    // Cross-platform case sensitivity
    handleCaseSensitivityConflict(
        file1: string,  // 'README.md' (created on Linux)
        file2: string   // 'readme.md' (created on Windows)
    ): Resolution {
        // These are same file on Windows/macOS, different on Linux
        // Auto-rename one to avoid conflict on case-insensitive systems
        return {
            action: 'auto_rename',
            steps: [
                { 
                    op: 'rename',
                    from: file2,
                    to: this.addSuffix(file2, '-1'),  // 'readme-1.md'
                    platform: 'case_insensitive'
                },
                {
                    op: 'notify',
                    broadcast: true,
                    message: `Renamed '${file2}' to avoid conflict with '${file1}' ` +
                        `on case-insensitive systems.`
                }
            ]
        };
    }
}
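
The move-cycle handler above assumes the cycle has already been detected. One way to detect it is to walk up from the destination folder and check whether the folder being moved appears among its ancestors. A minimal sketch, assuming the folder tree is available as a child-to-parent map (the `parentOf` shape and function name are hypothetical):

```typescript
// parentOf maps a folder ID to its parent folder ID (null for top-level folders).
function wouldCreateCycle(
    parentOf: Record<string, string | null>,
    folder: string,
    newParent: string
): boolean {
    const seen = new Set<string>();
    let current: string | null = newParent;
    while (current !== null) {
        if (current === folder) return true;   // folder is an ancestor of its new parent
        if (seen.has(current)) return true;    // tree is already corrupt; refuse the move
        seen.add(current);
        current = parentOf[current] ?? null;
    }
    return false;
}

const tree: Record<string, string | null> = { X: null, Y: null };

// User A's move (X into Y) arrives first and is safe.
const firstMoveCycles = wouldCreateCycle(tree, 'X', 'Y');   // false
tree.X = 'Y';

// User B's move (Y into X) arrives second: Y is now an ancestor of X.
const secondMoveCycles = wouldCreateCycle(tree, 'Y', 'X');  // true
```

The check must run against the tree state after earlier moves are applied, which is why each move is valid on its own but the pair is not.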

The Emoji/Unicode Problem

File names with emoji or special Unicode characters can cause unexpected conflicts. Unicode normalization (NFC vs NFD) means 'café' can be encoded differently by different systems. macOS has historically stored file names in a decomposed (NFD-style) form, while most other systems produce NFC. Always normalize file names to a consistent form (typically NFC) on the server.
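
The normalization step can be shown with the standard `String.prototype.normalize` method. The helper names below are hypothetical, and using `toLowerCase()` as a case-insensitive comparison key is a simplifying assumption; full Unicode case folding has more rules.

```typescript
// Canonical form stored on the server.
function canonicalName(name: string): string {
    return name.normalize('NFC');
}

// Key for spotting names that collide on case-insensitive file systems.
function collisionKey(name: string): string {
    return canonicalName(name).toLowerCase();
}

const composed = 'caf\u00e9.txt';     // 'é' as one code point (NFC)
const decomposed = 'cafe\u0301.txt';  // 'e' plus combining acute accent (NFD)

console.log(composed === decomposed);                                // false
console.log(canonicalName(composed) === canonicalName(decomposed));  // true
console.log(collisionKey('README.md') === collisionKey('readme.md')); // true
```

Comparing canonical names rather than raw bytes prevents two visually identical files from appearing side by side.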

Conflict Resolution UX

Technical conflict resolution is only half the battle. Users must understand what happened and how to resolve it. Poor UX turns minor conflicts into major user frustration.

UX Best Practices

•Clear Naming — Conflict copies should clearly indicate source. Include device name and timestamp: 'report (John's MacBook 2024-01-15 14-30).docx' is clearer than 'report (1).docx'. Avoid ':' in the timestamp, since it is not valid in Windows file names.
•Proactive Notification — Alert users immediately when conflicts occur. Don't let conflicts accumulate unnoticed for weeks. Desktop notifications + email for important files.
•Visual Diff — Provide a visual comparison tool to see differences. Highlight changes, additions, deletions. Make it easy to cherry-pick changes from each version.
•One-Click Resolution — Offer quick actions: 'Keep This', 'Keep That', 'Keep Both'. Most users don't want to manually merge—give them simple choices.
•Undo/Recovery — Make it easy to undo a resolution choice. Keep the 'losing' version in revision history. Users fear making the wrong choice.
•Prevention Hints — After resolution, suggest how to avoid future conflicts: 'Consider using live collaboration for this shared document.'

Conflict Resolution Flows by Provider
Provider	Detection	Notification	Resolution
Dropbox	Server-side on upload	System notification + badge	Both files visible, manual merge
Google Drive	Real-time (OT)	In-editor banner	Auto-merge or fork document
OneDrive	Server-side on sync	Activity center	Keep both + visual diff tool
iCloud	Device sync	Finder displays both	Choose version to keep
Git	Merge/pull time	CLI output	Edit conflict markers, commit

The 'Conflicted Copy' Fatigue

Users who frequently see conflict copies often start ignoring them—they accumulate, take up space, and become noise. Combat this with: (1) Conflict folder that groups all conflicts, (2) Age-based cleanup prompts, (3) Aggressive prevention through real-time collaboration features, (4) Analytics to identify frequently-conflicting files that should be collaboration docs.

Summary: Mastering Conflict Resolution

Conflict resolution is one of the most challenging aspects of distributed file systems. Let's consolidate the key insights:

Key Takeaways

•Detection uses version tracking — Server revisions or version vectors identify when concurrent modifications create conflicts. Same-content conflicts can be optimized away.
•Copy-Both is the safe default — For binary files and complex documents, keeping both versions ensures no data loss. Last-Write-Wins loses data silently—never use it for important files.
•Three-way merge works for text — Using the common ancestor, we can automatically merge non-overlapping changes. True conflicts require user intervention.
•OT and CRDTs enable real-time collaboration — When edits must appear within milliseconds, transforming operations (OT) or using conflict-free data structures (CRDTs) replaces file-level conflicts with automatic, fine-grained merging.
•Prevention beats resolution — Locking, presence awareness, and frequent sync reduce conflicts. Optimistic concurrency with good detection is the cloud storage standard.
•UX makes or breaks the feature — Clear naming, proactive notification, visual diffs, and one-click resolution turn conflicts from frustrating to manageable.

What's Next:

With synchronization and conflict resolution covered, the next page explores Chunked Uploads—how systems handle large file uploads reliably. We'll cover resumable uploads, parallel chunking, and the protocols that enable multi-gigabyte transfers over unreliable networks.

Page Complete

You now understand conflict detection, resolution strategies, and the advanced techniques (OT, CRDTs) that enable real-time collaboration. The key principle is clear: never lose user data, and make resolution as painless as possible. Next, we tackle the challenge of reliable large file uploads.

3 / 6

Loading learning content...

System Design (HLD)Google Drive / Dropbox

Designing Cloud File Storage Systems

LevelAdvanced

Duration90 mins

TopicGoogle Drive / Dropbox

3 / 6

Conflict Resolution

The Conflict Problem

The Goal of Conflict Resolution:

Never lose user data. When conflicts occur, preserve all work and help users reconcile differences with minimal confusion.

Learning Objectives

Conflict Detection

Before resolving conflicts, we must detect them. Conflict detection requires tracking the history of modifications and identifying when changes diverge from a common ancestor.

Version Vectors:

The most robust approach uses version vectors (vector clocks). Each device maintains a version number, and the combined vector represents the full history:

Version Vector Example:

Device A creates file:    {A:1}
Device A modifies:        {A:2}
Device A syncs to B:      B receives {A:2}
Device B modifies:        {A:2, B:1}

Now if Device A modifies offline:  {A:3}
And Device B modifies offline:     {A:2, B:2}

These versions CONFLICT because:
- A:3 has A:3 > A:2 but doesn't know about B:2
- A:2,B:2 has B:2 > B:1 but doesn't know about A:3
- Neither "happens before" the other

Conflict Detection Approaches

•Server Revision Numbers — Simple, used by most cloud services. Each upload must specify the parent revision. If parent doesn't match server's current, conflict detected. Simple but loses concurrent ordering information.
•Version Vectors — Track modification history across all devices. Can determine exactly which changes conflict and which can be merged. More complex but enables smart merging.
•Content Hashing — Compare hashes: if content differs from expected, conflict exists. Doesn't track history but simple to implement. Used as a verification layer.
•Merkle Trees — Hash tree over file/folder structure. Efficient detection of which parts differ. Used by git and some sync protocols for efficient comparison.

Conflict Detection Method Comparison
Method	Complexity	Information Captured	Use Case
Server Revision	Low	Linear sequence only	Simple file storage (Dropbox model)
Version Vectors	Medium	Full causality graph	Collaborative editing, complex merge
Content Hash	Low	Current state only	Verification, deduplication
Merkle Tree	High	Structural differences	Large directory sync, git-like systems

conflict_detection.ts
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
// Server revision-based conflict detection
interface FileVersion {
    path: string;
    revision: number;      // Server-assigned, monotonically increasing
    contentHash: string;   // SHA-256 of file content
    parentRevision: number; // The revision this was based on
}
 
class ConflictDetector {
    // Called when client attempts to upload
    detectConflict(
        serverVersion: FileVersion,
        clientUpload: { parentRevision: number; contentHash: string }
    ): 'no_conflict' | 'conflict' | 'no_change' {
        
        // Client is up-to-date in sync with server
        if (clientUpload.parentRevision === serverVersion.revision) {
            if (clientUpload.contentHash === serverVersion.contentHash) {
                return 'no_change';  // Same content, no upload needed
            }
            return 'no_conflict';    // Normal upload proceeds
        }
        
        // Client's parent doesn't match server's current revision
        // Server has been modified since client's last sync
        if (clientUpload.parentRevision < serverVersion.revision) {
            // But content is same? No real conflict
            if (clientUpload.contentHash === serverVersion.contentHash) {
                return 'no_change';
            }
            return 'conflict';  // True conflict: different content
        }
        
        // Client claims newer parent than server knows?
        // This shouldn't happen - indicates client bug
        throw new Error('Invalid state: client revision ahead of server');
    }
}

The Same-Content Optimization

Resolution Strategies

Once a conflict is detected, the system must resolve it. Different strategies have different trade-offs between simplicity, user experience, and data preservation.

Last Write Wins (LWW)

•Simple: use timestamp, keep newest
•Automatic: no user intervention
•DANGER: Silently loses data!
•Clock sync issues cause wrong winner
•Never use for important files
•Acceptable only for ephemeral data

Copy-Both Strategy

•Create conflict copies of all versions
•User sees: 'file.txt' and 'file (conflict).txt'
•No data loss—user decides
•Can accumulate many copies over time
•User must manually merge
•Industry standard for file sync

Conflict Copy Naming Conventions:

Dropbox:     "report (John's conflicted copy 2024-01-15).docx"
Google:      "report.docx" stays, older becomes "report.docx.backup"
OneDrive:    "report-LAPTOP-JOHN.docx"
iCloud:      "report 2.docx"

Best Practice: Include user/device and timestamp
Pattern: "filename (Device's conflicted copy YYYY-MM-DD HH:mm).ext"
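
As a rough illustration of the recommended pattern, the sketch below builds a conflict-copy name from a file name, a device label, and a timestamp. The function name conflictCopyName and the choice to format the time in UTC are assumptions made for this example, not part of any provider's API.

```typescript
// Hypothetical helper: builds a conflict-copy name following the pattern
// "filename (Device's conflicted copy YYYY-MM-DD HH:mm).ext".
function conflictCopyName(fileName: string, device: string, when: Date): string {
    const pad = (n: number): string => String(n).padStart(2, '0');

    // Assumption: format in UTC so every device derives the same name
    // for the same instant.
    const stamp =
        `${when.getUTCFullYear()}-${pad(when.getUTCMonth() + 1)}-${pad(when.getUTCDate())} ` +
        `${pad(when.getUTCHours())}:${pad(when.getUTCMinutes())}`;

    // Split off the extension; a leading dot (".bashrc") is not an extension.
    const dot = fileName.lastIndexOf('.');
    const hasExt = dot > 0;
    const stem = hasExt ? fileName.slice(0, dot) : fileName;
    const ext = hasExt ? fileName.slice(dot) : '';

    return `${stem} (${device}'s conflicted copy ${stamp})${ext}`;
}
```

Note that ':' is not a valid character in Windows file names, so a cross-platform client would likely substitute another separator in the time portion.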

Advanced Resolution Strategies

•Automatic Merge (for supported formats) — For text files, attempt 3-way merge using common ancestor. If no conflicts at line level, merge succeeds. Git does this successfully for code.
•Semantic Merge — Understand file format and merge at semantic level. XML/JSON can be merged by structure. Spreadsheets can merge by cell. Much more complex to implement.
•Operational Transformation (OT) — Transform operations rather than state. Used by Google Docs for real-time collaboration. Each incoming operation is adjusted against the concurrent operations already applied, so every replica converges on the same document.
•CRDTs (Conflict-free Replicated Data Types) — Design data structures where all possible merges are valid. No conflicts by design. Used in some form by Figma (a CRDT-inspired design) and Apple Notes. Requires application redesign.
•User-Assisted Resolution — Present diff to user, let them choose per-section. Best user experience but requires user attention. Common for document editors.

Resolution Strategy Selection Guide
File Type	Recommended Strategy	Rationale
Binary files (images, videos)	Copy-Both	Cannot be automatically merged
Text files (code, notes)	3-way merge, fallback to copy-both	Often mergeable, fall back if conflicts
Documents (docx, pdf)	Copy-Both with visual diff	Format-specific merge too complex
Realtime docs (Google Docs)	OT or CRDT	Designed for concurrent editing
Database files	Application-specific	Requires semantic understanding
Config files	3-way merge + validation	Merge then validate syntax
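
The table above can be read as a small dispatch function. The sketch below is one possible encoding; the extension lists and the isRealtimeDoc flag are illustrative assumptions, and a real system would more likely key off MIME types or application metadata.

```typescript
type Strategy = 'copy_both' | 'three_way_merge' | 'ot_or_crdt' | 'application_specific';

// Hypothetical extension lists chosen for illustration only.
const TEXT_EXT = new Set(['txt', 'md', 'ts', 'js', 'py', 'json', 'yaml', 'yml', 'ini']);
const DATABASE_EXT = new Set(['sqlite', 'db', 'mdb']);

function chooseStrategy(fileName: string, isRealtimeDoc = false): Strategy {
    if (isRealtimeDoc) return 'ot_or_crdt';

    const dot = fileName.lastIndexOf('.');
    const ext = dot > 0 ? fileName.slice(dot + 1).toLowerCase() : '';

    // Text and config files: try a 3-way merge, fall back to copy-both on conflict.
    if (TEXT_EXT.has(ext)) return 'three_way_merge';
    if (DATABASE_EXT.has(ext)) return 'application_specific';

    // Binaries, docx/pdf, and anything unrecognized: never guess, keep both.
    return 'copy_both';
}
```

Defaulting unknown types to copy-both keeps the "never lose user data" goal intact when the file type cannot be classified.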

Three-Way Merge Deep Dive

Three-way merge is the standard automatic resolution technique for text files. It uses the common ancestor to determine what each version changed, then combines non-conflicting changes.

How Three-Way Merge Works:

Common Ancestor (Base):    Version A (Local):         Version B (Remote):
┌─────────────────────┐    ┌─────────────────────┐    ┌─────────────────────┐
│ Line 1: Hello       │    │ Line 1: Hello       │    │ Line 1: Hello       │
│ Line 2: World       │    │ Line 2: World       │    │ Line 2: Everyone  ← │  B changed
│ Line 3: Foo         │    │ Line 3: Bar       ← │    │ Line 3: Foo         │  A changed
│ Line 4: End         │    │ Line 4: End         │    │ Line 4: End         │
└─────────────────────┘    └─────────────────────┘    └─────────────────────┘

Merge Analysis:
- Line 1: Same in all → keep
- Line 2: A=Base, B changed → use B's version
- Line 3: A changed, B=Base → use A's version
- Line 4: Same in all → keep

Merged Result:
┌─────────────────────┐
│ Line 1: Hello       │
│ Line 2: Everyone    │ ← From B
│ Line 3: Bar         │ ← From A  
│ Line 4: End         │
└─────────────────────┘

✓ Merge successful! Both changes preserved.

When Three-Way Merge Fails (True Conflict):

Base:              Version A:          Version B:
│ Line 2: World    │ Line 2: Earth     │ Line 2: Everyone
                         ↓                    ↓
                   A changed to Earth  B changed to Everyone

BOTH changed the same line differently!

Result: CONFLICT - cannot auto-merge

Typical output:
┌─────────────────────┐
│ Line 1: Hello       │
│ <<<<<<< LOCAL       │
│ Line 2: Earth       │
│ =======             │
│ Line 2: Everyone    │
│ >>>>>>> REMOTE      │
│ Line 3: Bar         │
│ Line 4: End         │
└─────────────────────┘

User must manually resolve the conflict markers.

three_way_merge.ts
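// Simplified three-way merge algorithm.
// computeDiff and arraysEqual are assumed helpers that are not shown here.
// computeDiff(base, other) is assumed to return an object where
// hasChange(i) reports whether a change starts at base line i and
// getChange(i) returns { baseLines, newLines } for that change.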
type MergeResult = 
    | { status: 'success'; content: string[] }
    | { status: 'conflict'; conflicts: ConflictRegion[] };
 
interface ConflictRegion {
    lineStart: number;
    localContent: string[];
    remoteContent: string[];
}
 
function threeWayMerge(
    base: string[],
    local: string[],
    remote: string[]
): MergeResult {
    // Compute differences
    const localDiff = computeDiff(base, local);   // What local changed
    const remoteDiff = computeDiff(base, remote); // What remote changed
    
    const result: string[] = [];
    const conflicts: ConflictRegion[] = [];
    
    let baseIdx = 0, localIdx = 0, remoteIdx = 0;
    
    // Base drives the loop: once it is exhausted there is nothing left to
    // compare, and reading base[baseIdx] would push undefined into the result.
    while (baseIdx < base.length) {
        const localChanged = localDiff.hasChange(baseIdx);
        const remoteChanged = remoteDiff.hasChange(baseIdx);
        
        if (!localChanged && !remoteChanged) {
            // Neither changed - keep base
            result.push(base[baseIdx]);
            baseIdx++; localIdx++; remoteIdx++;
        } 
        else if (localChanged && !remoteChanged) {
            // Only local changed - use local
            const change = localDiff.getChange(baseIdx);
            result.push(...change.newLines);
            baseIdx += change.baseLines;
            localIdx += change.newLines.length;
            remoteIdx += change.baseLines;
        }
        else if (!localChanged && remoteChanged) {
            // Only remote changed - use remote  
            const change = remoteDiff.getChange(baseIdx);
            result.push(...change.newLines);
            baseIdx += change.baseLines;
            localIdx += change.baseLines;
            remoteIdx += change.newLines.length;
        }
        else {
            // BOTH changed - check if same change
            const localChange = localDiff.getChange(baseIdx);
            const remoteChange = remoteDiff.getChange(baseIdx);
            
            if (arraysEqual(localChange.newLines, remoteChange.newLines)) {
                // Same change - no conflict
                result.push(...localChange.newLines);
            } else {
                // Different changes - TRUE CONFLICT
                conflicts.push({
                    lineStart: result.length,
                    localContent: localChange.newLines,
                    remoteContent: remoteChange.newLines,
                });
                // Add conflict markers
                result.push('<<<<<<< LOCAL');
                result.push(...localChange.newLines);
                result.push('=======');
                result.push(...remoteChange.newLines);
                result.push('>>>>>>> REMOTE');
            }
            baseIdx += Math.max(localChange.baseLines, remoteChange.baseLines);
            localIdx += localChange.newLines.length;
            remoteIdx += remoteChange.newLines.length;
        }
    }
    
    return conflicts.length > 0 
        ? { status: 'conflict', conflicts }
        : { status: 'success', content: result };
}
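
The listing above relies on computeDiff, which is not shown. As a self-contained variant, the sketch below handles only the simplest case, under the stated assumption that all three versions have the same number of lines (in-place edits, no insertions or deletions). The names mergeAlignedLines and LineMerge are made up for this example. It reproduces both worked examples from this section.

```typescript
type LineMerge =
    | { status: 'success'; content: string[] }
    | { status: 'conflict'; content: string[]; conflictLines: number[] };

// Assumption: base, local, and remote have the same number of lines,
// so line i in each version corresponds to line i in the others.
function mergeAlignedLines(base: string[], local: string[], remote: string[]): LineMerge {
    if (local.length !== base.length || remote.length !== base.length) {
        throw new Error('This sketch only handles in-place line edits');
    }
    const content: string[] = [];
    const conflictLines: number[] = [];

    for (let i = 0; i < base.length; i++) {
        const localChanged = local[i] !== base[i];
        const remoteChanged = remote[i] !== base[i];

        if (!localChanged) {
            content.push(remote[i]);      // remote changed it, or nobody did
        } else if (!remoteChanged || local[i] === remote[i]) {
            content.push(local[i]);       // only local changed, or both agree
        } else {
            conflictLines.push(i + 1);    // 1-based line number in the base
            content.push('<<<<<<< LOCAL', local[i], '=======', remote[i], '>>>>>>> REMOTE');
        }
    }
    return conflictLines.length > 0
        ? { status: 'conflict', content, conflictLines }
        : { status: 'success', content };
}
```

Unlike the listing above, the conflict result here also carries the marked-up content, so the caller can write the conflict markers to disk for manual resolution.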

Semantic Conflict Detection

Operational Transformation and CRDTs

Operational Transformation (OT):

OT works by transforming operations against concurrent operations to preserve intent:

Initial: "HELLO"

User A: INSERT 'X' at position 1 → "HXELLO"
User B: INSERT 'Y' at position 3 → "HELYLO"

Problem: If we apply B's operation unchanged to A's result:
  "HXELLO" + INSERT 'Y' at position 3 → "HXEYLLO"
  
But B intended 'Y' to go after the first 'L', not before it!

OT Solution: Transform B's operation against A's:
  A inserted at 1, which is before B's position 3
  So B's position shifts: 3 + 1 = 4
  New B operation: INSERT 'Y' at position 4
  
Apply in sequence:
  "HELLO" → "HXELLO" → "HXELYLO"
  
Both insertions preserved at intended locations! ✓
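
A minimal sketch of the transformation step for insert operations only is shown below. The InsertOp shape, the function names, and the siteId tie-breaker are assumptions for illustration; production OT systems also handle deletes and rely on a server-defined operation order.

```typescript
interface InsertOp {
    pos: number;     // 0-based index at which to insert
    text: string;
    siteId: string;  // hypothetical tie-breaker for inserts at the same position
}

function applyInsert(doc: string, op: InsertOp): string {
    return doc.slice(0, op.pos) + op.text + doc.slice(op.pos);
}

// Rewrite `op` so it can be applied after the concurrent `applied` insert.
function transformInsert(op: InsertOp, applied: InsertOp): InsertOp {
    const appliedIsBefore =
        applied.pos < op.pos ||
        (applied.pos === op.pos && applied.siteId < op.siteId);

    // Text inserted before our position shifts our position to the right.
    return appliedIsBefore ? { ...op, pos: op.pos + applied.text.length } : op;
}
```

With the example above, transforming B's insert (position 3) against A's insert (position 1) yields position 4, while transforming A's insert against B's leaves it at position 1. Either order of application then produces "HXELYLO".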

OT Characteristics

•Operations transformed, not data
•Requires central server for ordering
•Proven at scale (Google Docs)
•Complex transformation functions
•Hard to prove correctness
•Best for text editing

CRDT Characteristics

•Data structures designed for merge
•Truly decentralized, no server needed
•Mathematical guarantee: replicas that receive the same updates converge
•Higher storage/bandwidth overhead
•Easier to prove correctness
•Works for many data types (counters, sets, maps, text)

CRDT Example - G-Counter (Grow-only Counter):

// Each node maintains its own counter
Node A: {A: 5, B: 3}  // A incremented 5 times, knows B did 3
Node B: {A: 2, B: 7}  // B incremented 7 times, knows A did 2

Merge: Take maximum of each component
       {A: max(5,2), B: max(3,7)} = {A: 5, B: 7}

Total count: 5 + 7 = 12

This works because:
- Each node only increments its own component
- Merge is commutative: merge(A,B) = merge(B,A)
- Merge is idempotent: merge(A,A) = A
- Merge is associative: merge(A,merge(B,C)) = merge(merge(A,B),C)
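
The G-Counter described above fits in a few lines. The sketch below is a minimal version; the type and function names (GCounter, gIncrement, gMerge, gValue) are chosen for this example.

```typescript
// One component per node ID; each node only ever increments its own.
type GCounter = Record<string, number>;

function gIncrement(counter: GCounter, nodeId: string, by = 1): GCounter {
    return { ...counter, [nodeId]: (counter[nodeId] ?? 0) + by };
}

// Merge takes the per-node maximum, which makes it commutative,
// associative, and idempotent.
function gMerge(a: GCounter, b: GCounter): GCounter {
    const out: GCounter = { ...a };
    for (const [node, count] of Object.entries(b)) {
        out[node] = Math.max(out[node] ?? 0, count);
    }
    return out;
}

// The counter's value is the sum of all components.
function gValue(counter: GCounter): number {
    return Object.values(counter).reduce((sum, n) => sum + n, 0);
}
```

Merging {A: 5, B: 3} with {A: 2, B: 7} gives {A: 5, B: 7}, whose value is 12, matching the worked example.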

Common CRDT Types:

CRDT	Use Case	How It Works
G-Counter	Likes, views	Each node tracks own increments, merge = max
PN-Counter	Balance, inventory	Two G-Counters (positive, negative)
LWW-Register	Last-write-wins cell	Timestamp determines winner
OR-Set	Sets with add/remove	Each element tagged with unique ID
RGA	Collaborative text	Characters have unique position IDs; deletions leave tombstones
Automerge (library)	JSON documents	Combines multiple CRDTs for complex structures

The CRDT Trade-off

Conflict Prevention Strategies

The best conflict is one that never happens. Production systems employ various strategies to prevent conflicts before they occur, reducing the need for complex resolution.

Prevention Strategies

•File Locking — Acquire exclusive lock before editing. Only one user can edit at a time. Works for desktop apps (Office). Problematic for web apps and offline use.
•Live Presence — Show who's currently viewing/editing. Users naturally avoid editing same sections. Reduces accidental conflicts through social awareness.
•Section Locking — Lock specific sections (paragraphs, cells) rather than entire file. Finer granularity allows more concurrent editing.
•Automatic Partitioning — Design data model to reduce conflicts. Each user edits their own record. Conflicts only possible on shared records.
•Frequent Syncing — Sync more frequently = smaller divergence window = fewer/smaller conflicts. Real-time sync (OT/CRDT) takes this to extreme.
•Conflict Warning — Notify users when someone else is editing. 'Sarah is also editing this file' prompts coordination.

Lock Types and Trade-offs
Lock Type	Granularity	Compatibility	Deadlock Risk
Exclusive (Write)	Entire file	Blocks all other writers	Possible if nested locks
Shared (Read)	Entire file	Multiple readers OK	None
Section Lock	Paragraph/cell	Other sections editable	Low
Intent Lock	Hierarchical	Signals upcoming access	Medium
Optimistic Lock	Conceptual	Check at commit time	None (but conflicts)

Lock-Free Optimistic Approach (Most Common in Cloud):

Cloud storage systems typically don't use locks because:

•Locks don't work well with offline editing
•Lock state is hard to maintain across network partitions
•Abandoned locks can block files forever

Instead, they use optimistic concurrency:

1. User A starts editing (no lock acquired)
2. User B also starts editing (no lock acquired)
3. User A saves → succeeds (becomes revision 5)
4. User B saves with parent=4 → CONFLICT detected
   (server has revision 5, not 4)
5. User B must resolve conflict before saving

This approach maximizes concurrency at the cost of occasional conflicts, which is the right trade-off for most collaboration scenarios.
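
The five steps above amount to a compare-and-set on the revision number. The sketch below uses an in-memory map as a stand-in for the metadata server; RevisionStore and its method names are hypothetical. In a real service, the revision check and the write would need to happen atomically, for example inside a transaction or a conditional write.

```typescript
interface StoredFile { revision: number; content: string; }

type SaveResult =
    | { ok: true; revision: number }
    | { ok: false; reason: 'conflict'; serverRevision: number };

// In-memory stand-in for the metadata server (hypothetical).
class RevisionStore {
    private files = new Map<string, StoredFile>();

    read(path: string): StoredFile {
        return this.files.get(path) ?? { revision: 0, content: '' };
    }

    // Accept the write only if the client edited the server's current revision.
    save(path: string, parentRevision: number, content: string): SaveResult {
        const current = this.read(path);
        if (parentRevision !== current.revision) {
            return { ok: false, reason: 'conflict', serverRevision: current.revision };
        }
        const revision = current.revision + 1;
        this.files.set(path, { revision, content });
        return { ok: true, revision };
    }
}
```

If two users both read revision 4, the first save succeeds and becomes revision 5, while the second save (still carrying parent 4) is rejected with serverRevision 5 and must go through conflict resolution.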

The Lease Pattern

Conflict Edge Cases

Real-world systems encounter edge cases that naive implementations handle poorly. Let's examine common edge cases and their solutions:

File System Conflicts

•Delete vs Edit — User A deletes file, User B edits same file. Resolution: Keep both actions—restore file with B's edits. Never lose active work.
•Rename Conflicts — Both users rename same file to different names. Resolution: One wins (first to sync), other creates 'file (was: oldname).txt'.
•Directory Conflicts — User A creates folder/file.txt, User B creates folder/file.txt with different content. Resolution: Regular file conflict handling.
•Move Cycles — User A moves folder X into folder Y, User B moves folder Y into folder X. Resolution: Reject one move, apply other. Cycle detection required.
•Case Sensitivity — User A on Linux creates 'File.txt', User B on Windows creates 'file.txt'. Resolution: Rename on case-insensitive systems.

edge_case_handler.ts
// Edge case handlers for file system conflicts
class EdgeCaseHandler {
    
    // Delete vs Edit: User A deletes, User B edits
    handleDeleteVsEdit(
        deleteOp: DeleteOperation,
        editOp: EditOperation
    ): Resolution {
        // Never lose edits - restore file with new edits
        return {
            action: 'restore_and_apply',
            steps: [
                { op: 'create', path: editOp.path, content: editOp.newContent },
                { 
                    op: 'notify', 
                    user: deleteOp.userId,
                    message: `${editOp.userEmail} edited ${editOp.path} ` +
                                `which you deleted. File has been restored.`
                }
            ]
        };
    }
    
    // Rename to same name by different users
    handleRenameNameConflict(
        rename1: RenameOperation,
        rename2: RenameOperation
    ): Resolution {
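        // Both trying to rename to 'report.docx'
        // First one wins, second gets deduplicated name.
        // Timestamps are assumed to be server-assigned arrival times,
        // since client clocks can be skewed.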
        const winner = rename1.timestamp < rename2.timestamp ? rename1 : rename2;
        const loser = winner === rename1 ? rename2 : rename1;
        
        return {
            action: 'deduplicate',
            steps: [
                { op: 'rename', from: winner.oldPath, to: winner.newPath },
                { 
                    op: 'rename', 
                    from: loser.oldPath, 
                    to: this.deduplicateName(loser.newPath)  // 'report (1).docx'
                }
            ]
        };
    }
    
    // Folder move creating cycle
    handleMoveCycle(
        moveA: MoveOperation,  // X into Y
        moveB: MoveOperation   // Y into X
    ): Resolution {
        // Detect cycle: if we apply both, neither can be root
        // Resolution: Apply first, reject second with explanation
        const first = moveA.timestamp < moveB.timestamp ? moveA : moveB;
        const second = first === moveA ? moveB : moveA;
        
        return {
            action: 'partial_apply',
            steps: [
                { op: 'move', operation: first },
                { 
                    op: 'reject', 
                    operation: second,
                    reason: 'Would create folder cycle'
                },
                {
                    op: 'notify',
                    user: second.userId,
                    message: `Could not move ${second.sourcePath} - would create ` +
                            `circular structure.`
                }
            ]
        };
    }
    
    // Cross-platform case sensitivity
    handleCaseSensitivityConflict(
        file1: string,  // 'README.md' (created on Linux)
        file2: string   // 'readme.md' (created on Windows)
    ): Resolution {
        // These are same file on Windows/macOS, different on Linux
        // Auto-rename one to avoid conflict on case-insensitive systems
        return {
            action: 'auto_rename',
            steps: [
                { 
                    op: 'rename',
                    from: file2,
                    to: this.addSuffix(file2, '-1'),  // 'readme-1.md'
                    platform: 'case_insensitive'
                },
                {
                    op: 'notify',
                    broadcast: true,
                    message: `Renamed '${file2}' to avoid conflict with '${file1}' ` +
                        `on case-insensitive systems.`
                }
            ]
        };
    }
}
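
handleMoveCycle above assumes the cycle has already been detected. One way to perform that check, sketched under the assumption that the server keeps a map from each folder ID to its parent ID (FolderTree, wouldCreateCycle, and applyMove are names invented for this example), is to walk up from the destination and see whether the folder being moved appears among its ancestors.

```typescript
// Maps each folder ID to its parent folder ID (null for the root).
// Assumption: the existing tree is acyclic.
type FolderTree = Map<string, string | null>;

// Moving `folder` under `newParent` creates a cycle if `newParent` is
// `folder` itself or one of its descendants.
function wouldCreateCycle(parentOf: FolderTree, folder: string, newParent: string): boolean {
    let cursor: string | null = newParent;
    while (cursor !== null) {
        if (cursor === folder) return true;
        cursor = parentOf.get(cursor) ?? null;
    }
    return false;
}

// Applies the move if it is safe; returns false when it was rejected.
function applyMove(parentOf: FolderTree, folder: string, newParent: string): boolean {
    if (wouldCreateCycle(parentOf, folder, newParent)) return false;
    parentOf.set(folder, newParent);
    return true;
}
```

Because the check runs against the tree as it stands after earlier moves were applied, whichever of the two conflicting moves the server processes second is the one that gets rejected.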

The Emoji/Unicode Problem

Conflict Resolution UX

Technical conflict resolution is only half the battle. Users must understand what happened and how to resolve it. Poor UX turns minor conflicts into major user frustration.

UX Best Practices

•Clear Naming — Conflict copies should clearly indicate source. Include device name and timestamp: 'report (John's MacBook 2024-01-15 14:30).docx' is clearer than 'report (1).docx'.
•Proactive Notification — Alert users immediately when conflicts occur. Don't let conflicts accumulate unnoticed for weeks. Desktop notifications + email for important files.
•Visual Diff — Provide a visual comparison tool to see differences. Highlight changes, additions, deletions. Make it easy to cherry-pick changes from each version.
•One-Click Resolution — Offer quick actions: 'Keep This', 'Keep That', 'Keep Both'. Most users don't want to manually merge—give them simple choices.
•Undo/Recovery — Make it easy to undo a resolution choice. Keep the 'losing' version in revision history. Users fear making the wrong choice.
•Prevention Hints — After resolution, suggest how to avoid future conflicts: 'Consider using live collaboration for this shared document.'

Conflict Resolution Flows by Provider
Provider	Detection	Notification	Resolution
Dropbox	Server-side on upload	System notification + badge	Both files visible, manual merge
Google Drive	Real-time (OT)	In-editor banner	Auto-merge or fork document
OneDrive	Server-side on sync	Activity center	Keep both + visual diff tool
iCloud	Device sync	Finder displays both	Choose version to keep
Git	Merge/pull time	CLI output	Edit conflict markers, commit

The 'Conflicted Copy' Fatigue

Summary: Mastering Conflict Resolution

Conflict resolution is one of the most challenging aspects of distributed file systems. Let's consolidate the key insights:

Key Takeaways

•Detection uses version tracking — Server revisions or version vectors identify when concurrent modifications create conflicts. Same-content conflicts can be optimized away.
•Copy-Both is the safe default — For binary files and complex documents, keeping both versions ensures no data loss. Last-Write-Wins loses data silently—never use it for important files.
•Three-way merge works for text — Using the common ancestor, we can automatically merge non-overlapping changes. True conflicts require user intervention.
•OT and CRDTs enable real-time collaboration — When edits must merge continuously with low latency, transforming operations (OT) or using conflict-free data structures (CRDTs) removes the need for file-level conflict copies.
•Prevention beats resolution — Locking, presence awareness, and frequent sync reduce conflicts. Optimistic concurrency with good detection is the cloud storage standard.
•UX makes or breaks the feature — Clear naming, proactive notification, visual diffs, and one-click resolution turn conflicts from frustrating to manageable.
