You add an object to a HashSet. Later, you modify one of its fields. You try to find it again—and it's gone. Not removed, not null, just... invisible. The object is still in the set's memory, but you can never access it again. It's a ghost in the machine.
This isn't a bug in the collection. It's a violation of the consistency requirement—one of the most subtle and dangerous aspects of equals() and hashCode(). This final page explores what consistency means, why it matters, and how to design objects that never haunt your collections.
By the end of this page, you will understand: (1) What consistency means for equals and hashCode, (2) How mutable fields corrupt hash collections, (3) The relationship between immutability and consistency, (4) Strategies for handling mutable objects, and (5) Best practices for production systems.
The equals() and hashCode() contracts include a crucial requirement that's easy to overlook:
For equals(): Multiple invocations of equals() on the same two objects must consistently return the same result, provided no information used in equals comparisons is modified.
For hashCode(): Multiple invocations of hashCode() on the same object must consistently return the same integer, provided no information used in equals comparisons is modified.
The key phrase is 'provided no information used in equals comparisons is modified.' This creates a contract between you and the collections framework:
If you change fields used in equals/hashCode, the object's identity changes, and all bets are off.
| Scenario | Consistency | Collection Behavior |
|---|---|---|
| Object unchanged | Guaranteed | Works correctly |
| Non-equality fields changed | Guaranteed | Works correctly |
| Equality fields changed | VIOLATED | UNPREDICTABLE |
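The three rows of the table can be reproduced with a small sketch. The `Account` class and its fields are hypothetical, chosen only so that one field (`email`) takes part in equality and another (`displayName`) does not.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical class: email is used by equals/hashCode, displayName is not.
class Account {
    private String email;
    private String displayName;

    Account(String email, String displayName) {
        this.email = email;
        this.displayName = displayName;
    }

    void setEmail(String email) { this.email = email; }
    void setDisplayName(String displayName) { this.displayName = displayName; }

    @Override
    public boolean equals(Object o) {
        return o instanceof Account && email.equals(((Account) o).email);
    }

    @Override
    public int hashCode() {
        return email.hashCode();
    }
}

class ConsistencyScenarios {
    // Returns whether the set can still find the account in each table row.
    static boolean[] run() {
        Set<Account> accounts = new HashSet<>();
        Account a = new Account("a@example.com", "Alice");
        accounts.add(a);

        boolean unchanged = accounts.contains(a);           // row 1: object unchanged
        a.setDisplayName("Alice B.");
        boolean nonEqualityChanged = accounts.contains(a);  // row 2: non-equality field changed
        a.setEmail("alice@example.com");
        boolean equalityChanged = accounts.contains(a);     // row 3: equality field changed
        return new boolean[] { unchanged, nonEqualityChanged, equalityChanged };
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println(r[0] + " " + r[1] + " " + r[2]);
    }
}
```

Only the third lookup fails, because it is the only one that happens after the hash code has changed.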
```java
import java.util.HashSet;
import java.util.Set;

public class User {
    private String username; // Used in equals/hashCode

    public User(String username) {
        this.username = username;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof User)) return false;
        return username.equals(((User) o).username);
    }

    @Override
    public int hashCode() {
        return username.hashCode();
    }

    public void setUsername(String username) {
        this.username = username; // CHANGES EQUALITY FIELD!
    }
}

// The disaster unfolds:
Set<User> users = new HashSet<>();
User alice = new User("alice");

// Step 1: Add to set
users.add(alice);
// The set files alice under the bucket derived from "alice".hashCode()
// (call it bucket 7; bucket numbers here are illustrative)

// Step 2: Verify presence
System.out.println(users.contains(alice)); // true

// Step 3: Mutate the equality field
alice.setUsername("alice_updated");
// alice.hashCode() is NOW "alice_updated".hashCode(), a different value;
// for illustration, say it maps to bucket 2

// Step 4: Try to find again
System.out.println(users.contains(alice)); // false!
// Looks in bucket 2, but alice is stored in bucket 7

// Step 5: Even worse - can't remove it either!
users.remove(alice);              // Returns false - can't find it
System.out.println(users.size()); // Still 1! Ghost object!

// The object stays in the set until the set is cleared
// or the set itself becomes unreachable
```

A ghost object is an entry in a hash collection that lookup-based operations such as contains() and remove(Object) can no longer find. It still occupies memory, still counts toward size(), and still turns up during iteration, but it stays reachable for as long as the collection does unless the collection is cleared. In long-running applications this behaves like a memory leak and is one of the most insidious bugs to track down.
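Lookup-based operations miss the ghost entry, but operations that do not rehash the element still reach it. A minimal sketch, using a hypothetical `MutableKey` class with the same shape as `User`, shows that iteration still visits the entry and that `clear()` releases it:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical mutable key: its only field takes part in equals/hashCode.
class MutableKey {
    String name;

    MutableKey(String name) { this.name = name; }

    @Override
    public boolean equals(Object o) {
        return o instanceof MutableKey && name.equals(((MutableKey) o).name);
    }

    @Override
    public int hashCode() { return name.hashCode(); }
}

class GhostInspection {
    // Counts elements by iterating, which does not call hashCode() on them.
    static int countByIteration(Set<MutableKey> set) {
        int n = 0;
        for (MutableKey ignored : set) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        Set<MutableKey> set = new HashSet<>();
        MutableKey key = new MutableKey("alice");
        set.add(key);
        key.name = "alice_updated";                 // mutate while stored

        System.out.println(set.contains(key));      // hash-based lookup misses it
        System.out.println(countByIteration(set));  // iteration still sees it
        set.clear();                                // drops every entry, ghosts included
        System.out.println(set.isEmpty());
    }
}
```

This is a diagnostic aid rather than a fix: if a set's `size()` disagrees with what `contains()` reports for its own elements, a mutated key is a likely cause.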
Let's trace exactly what happens when you mutate an object in a hash collection. Understanding this mechanism helps you recognize and avoid the pattern.
```text
TIME 0: Initial State
══════════════════════════════════════════════════════════════
HashSet buckets (16 total; bucket numbers are illustrative)
Bucket 7: [User(username="alice"), stored hash = hash("alice")]
Other buckets: empty

set.contains(new User("alice")) →
  1. Compute hash: hash("alice")
  2. Find bucket: 7
  3. Search bucket 7: Found!
  Result: TRUE ✓

══════════════════════════════════════════════════════════════
TIME 1: Mutation Occurs
alice.setUsername("alice_updated")
══════════════════════════════════════════════════════════════

What DOESN'T happen:
  ✗ The HashSet is not notified
  ✗ The object is not moved to a new bucket
  ✗ The internal index is not updated

What DOES happen:
  The object's state changes, but it stays in bucket 7

HashSet buckets (unchanged!):
Bucket 7: [User(username="alice_updated"), hashCode() = hash("alice_updated")]
                                           ↑ New hash, WRONG bucket!

══════════════════════════════════════════════════════════════
TIME 2: Lookup Attempt
set.contains(alice)
══════════════════════════════════════════════════════════════

  1. Compute hash: hash("alice_updated") (NEW hash)
  2. Find bucket: 2 (WRONG bucket)
  3. Search bucket 2: Empty!
  Result: FALSE ✗

The object is in bucket 7 but we looked in bucket 2.
Hash-based lookups will never find it.

══════════════════════════════════════════════════════════════
TIME 3: Attempted Removal
set.remove(alice)
══════════════════════════════════════════════════════════════

  1. Compute hash: hash("alice_updated")
  2. Find bucket: 2
  3. Search bucket 2: Nothing to remove
  Result: FALSE (not removed)

GHOST OBJECT CREATED:
  - size() returns 1
  - contains() returns false for alice and for new User("alice")
  - remove() cannot find it
  - iteration still visits it
  - Memory leak: object and bucket entry live as long as the set does
```

Hash collections don't monitor your objects for changes. They trust that you won't modify fields used in hashCode() while the object is stored. This trust model is why mutation causes silent corruption rather than exceptions.
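How a hash code becomes a bucket number is an implementation detail, and bucket numbers in a trace like this one are best read as illustrative. As a sketch of one real implementation: OpenJDK's `HashMap` (Java 8 and later) mixes the high bits of the hash into the low bits and then masks with the table length minus one. The helper names below are hypothetical, and other implementations may differ.

```java
class BucketIndex {
    // Mirrors the bit-spreading step OpenJDK's HashMap applies to hashCode().
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    // tableLength must be a power of two, as it is in HashMap.
    static int indexFor(Object key, int tableLength) {
        return spread(key.hashCode()) & (tableLength - 1);
    }

    public static void main(String[] args) {
        System.out.println("alice".hashCode());
        System.out.println(indexFor("alice", 16));
        System.out.println(indexFor("alice_updated", 16));
    }
}
```

For `"alice"`, whose hash code is 92903040, this yields index 9 in a 16-slot table. Because of the mask, the index is never negative even when the hash code is.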
The cleanest solution to the consistency problem is immutability. If the fields used in equals() and hashCode() cannot change, consistency is guaranteed.
The Immutable Approach:
```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/**
 * An immutable user identifier.
 *
 * Safe to use as a HashMap key or HashSet element.
 * Equality fields (id, domain) are final and cannot change.
 */
public final class UserId {
    private final String id;
    private final String domain;

    // Optional cached hash code (safe for immutable objects)
    private int cachedHashCode;

    public UserId(String id, String domain) {
        // Validate and store - these can never change
        this.id = Objects.requireNonNull(id, "id cannot be null");
        this.domain = Objects.requireNonNull(domain, "domain cannot be null");
    }

    // Only getters, no setters
    public String getId() { return id; }
    public String getDomain() { return domain; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        UserId userId = (UserId) o;
        return id.equals(userId.id) && domain.equals(userId.domain);
    }

    @Override
    public int hashCode() {
        // Lazy initialization of cached hash
        int h = cachedHashCode;
        if (h == 0) {
            h = Objects.hash(id, domain);
            cachedHashCode = h;
        }
        return h;
    }

    // Factory method for clarity
    public static UserId of(String id, String domain) {
        return new UserId(id, domain);
    }

    // Need a different id? Create a new object
    public UserId withId(String newId) {
        return new UserId(newId, this.domain);
    }
}

// Usage - impossible to corrupt a collection:
Map<UserId, UserData> users = new HashMap<>();
UserId aliceId = UserId.of("alice", "example.com");

users.put(aliceId, new UserData(...));

// This is the ONLY way to change the ID - creates new object
UserId updatedId = aliceId.withId("alice_new");

// Original still works:
users.get(aliceId);   // Still found!
users.get(updatedId); // Not found (different key)
```

Java records, Kotlin data classes (when declared with `val` properties), Scala case classes, and Python's frozen dataclasses generate correct equals/hashCode automatically and keep their fields from being reassigned. Prefer these constructs when possible. They eliminate entire categories of bugs.
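Final fields only make a key immutable if the objects they refer to cannot change either. A minimal sketch, assuming a hypothetical `TagSet` key that wraps a list: copying the list with `List.copyOf` (Java 10+) keeps later changes to the caller's list from altering the key's hash code.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical key type whose equality depends on a collection.
final class TagSet {
    private final List<String> tags;

    TagSet(List<String> tags) {
        // Defensive, unmodifiable copy: the caller's list is no longer shared.
        this.tags = List.copyOf(tags);
    }

    List<String> tags() { return tags; }

    @Override
    public boolean equals(Object o) {
        return o instanceof TagSet && tags.equals(((TagSet) o).tags);
    }

    @Override
    public int hashCode() { return tags.hashCode(); }
}

class DefensiveCopyDemo {
    // Adds a key to a set, mutates the caller's list, then looks the key up again.
    static boolean stillFoundAfterCallerMutation() {
        List<String> source = new ArrayList<>(List.of("a", "b"));
        TagSet key = new TagSet(source);
        Set<TagSet> set = new HashSet<>();
        set.add(key);
        source.add("c"); // does not reach the key's private copy
        return set.contains(key);
    }

    public static void main(String[] args) {
        System.out.println(stillFoundAfterCallerMutation());
    }
}
```

Without the copy, `source.add("c")` would change `tags.hashCode()` and the key would become a ghost entry despite its field being `final`.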
Sometimes immutability isn't practical. Entities in ORMs, domain objects with evolving state, and legacy code may require mutable objects. Here are strategies for handling them safely.
Strategy 1: Stable Identity

```java
import java.math.BigDecimal;
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.UUID;

/**
 * Entity with stable identity (UUID) separate from mutable state.
 * OrderStatus and OrderItem are assumed to be defined elsewhere.
 */
public class Order {
    // STABLE IDENTITY - never changes after construction
    private final UUID orderId;

    // MUTABLE STATE - can change freely
    private OrderStatus status;
    private BigDecimal total = BigDecimal.ZERO;
    private final List<OrderItem> items = new ArrayList<>(); // initialized so addItem() cannot throw NPE
    private LocalDateTime lastModified;

    public Order() {
        this.orderId = UUID.randomUUID(); // Immutable identity
        this.status = OrderStatus.DRAFT;
    }

    // Status changes DON'T affect equality/hashCode
    public void submit() {
        this.status = OrderStatus.SUBMITTED;
        this.lastModified = LocalDateTime.now();
    }

    public void addItem(OrderItem item) {
        this.items.add(item);
        recalculateTotal();
    }

    private void recalculateTotal() {
        // Recompute 'total' from 'items' (details omitted)
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        Order order = (Order) o;
        return orderId.equals(order.orderId); // ONLY identity field
    }

    @Override
    public int hashCode() {
        return orderId.hashCode(); // ONLY identity field
    }
}

// Now safe to use in collections despite mutations:
Set<Order> orders = new HashSet<>();
Order order = new Order();
orders.add(order);

order.addItem(new OrderItem("Widget", 2)); // Mutate state
order.submit();                            // More mutation

orders.contains(order); // Still TRUE! Identity unchanged
```
Strategy 2: Remove-Modify-Add

```java
import java.util.Set;

// PATTERN: Remove-Modify-Add (use sparingly!)
public void updateUsername(Set<User> users, User user, String newName) {
    // Step 1: Remove with current hash
    boolean wasPresent = users.remove(user);

    // Step 2: Modify
    user.setUsername(newName);

    // Step 3: Re-add with new hash
    if (wasPresent) {
        users.add(user);
    }
}

// This works but is:
// - Error-prone (forget step 1 or 3 = bug)
// - Not thread-safe
// - Requires access to the collection
// - A code smell suggesting wrong design

// BETTER: Use Strategy 1 (stable identity) instead
```
Strategy 3: IdentityHashMap

```java
import java.util.IdentityHashMap;
import java.util.Map;

// IdentityHashMap: uses == for keys, not equals()
Map<User, String> userNotes = new IdentityHashMap<>();

User alice = new User("alice");
userNotes.put(alice, "VIP customer");

// Even if we mutate alice, she's found by reference identity
// (the map never calls User.equals() or User.hashCode())
alice.setUsername("alice_updated");
userNotes.get(alice); // "VIP customer" - still found!

// CAVEAT: Different semantics
User aliceCopy = new User("alice_updated");
userNotes.get(aliceCopy); // null - different object!

// Use when:
// - Tracking specific object instances
// - Implementing caches where identity matters
// - Working with objects you can't modify
```

JPA/Hibernate entities often use the database ID for equality. The pattern: use a surrogate key (Long id) that is stable after persist. Be careful with transient entities (the id is null before save), and consider using natural keys where a truly immutable identity exists.
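One commonly used way to handle the transient-entity problem, sketched here without any JPA dependency: treat two entities as equal only when the id is non-null and matches, and return a constant hash code so that assigning the id at save time cannot move the object between buckets. The `Customer` class and its `assignId` method are hypothetical stand-ins for an entity and for the persistence provider's id assignment.

```java
import java.util.HashSet;
import java.util.Set;

class Customer {
    private Long id; // null until "persisted"

    // Stand-in for what a persistence provider does on save (assumption).
    void assignId(long newId) { this.id = newId; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Customer)) return false;
        Customer other = (Customer) o;
        // Transient entities (id == null) are equal only to themselves.
        return id != null && id.equals(other.id);
    }

    @Override
    public int hashCode() {
        // Constant per class: identical before and after the id is assigned.
        return Customer.class.hashCode();
    }
}

class TransientEntityDemo {
    // Adds a transient entity to a set, "persists" it, then looks it up again.
    static boolean foundAfterPersist() {
        Set<Customer> set = new HashSet<>();
        Customer c = new Customer();
        set.add(c);        // id is still null here
        c.assignId(42L);   // id assigned later, hash code unchanged
        return set.contains(c);
    }

    public static void main(String[] args) {
        System.out.println(foundAfterPersist());
    }
}
```

The trade-off is that every instance lands in the same bucket, so lookups in large hash collections degrade toward scanning that bucket. That is usually acceptable for the small collections typical of entity relationships, but it is a design choice to weigh, not a universal rule.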
Let's crystallize the relationship between equals() and hashCode() with a complete mental model.
```text
THE FOUR RULES OF EQUALS AND HASHCODE
═══════════════════════════════════════════════════════════════════

RULE 1: Equals implies same hash (MANDATORY)
──────────────────────────────────────────────
  a.equals(b) == true  →  a.hashCode() == b.hashCode()

  You MUST satisfy this. Violation breaks HashMap, HashSet, etc.

RULE 2: Same hash does NOT imply equals (EXPECTED)
──────────────────────────────────────────────────
  a.hashCode() == b.hashCode()  ↛  a.equals(b)

  Hash collisions are normal. Different objects can share a hash.
  Collections handle collisions by chaining in buckets.

RULE 3: Different hash implies NOT equals (DERIVED)
────────────────────────────────────────────────────
  a.hashCode() != b.hashCode()  →  a.equals(b) == false

  This is just the contrapositive of Rule 1.
  Collections use this for fast rejection: different hash = not equal.

RULE 4: Use same fields for both methods (BEST PRACTICE)
────────────────────────────────────────────────────────
  If equals() compares {a, b, c}, hashCode() combines {a, b, c}

  Not technically required, but guarantees Rule 1 is satisfied.

═══════════════════════════════════════════════════════════════════

VISUALIZATION: Hash as a Filter
────────────────────────────────────────────────────────────────────

              hashCode FILTER               equals JUDGE
              ┌─────────────┐              ┌─────────────┐
All Objects   │ Puts objects│   Bucket     │ Precise     │  Result
  ───────▶    │ into buckets│  Candidates  │ comparison  │ ───────▶
              │ (fast)      │─────────────▶│ (slower)    │
              └─────────────┘              └─────────────┘

hashCode: "Roughly where should I look?"  (coarse filter)
equals:   "Is this exactly what I want?"  (fine judgment)
```

hashCode() provides an approximate grouping ('these objects might be equal'). equals() provides the definitive answer ('these objects are equal'). The approximation must be sound: if objects are equal, they must be in the same approximate group.
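Rule 2 is easy to observe with `String`: the strings `"Aa"` and `"BB"` have the same hash code under Java's documented `String.hashCode()` formula, yet they are not equal. A short sketch (the class and method names are hypothetical):

```java
import java.util.HashSet;
import java.util.Set;

class CollisionDemo {
    // True when two objects share a hash code but equals() tells them apart.
    static boolean collideButDiffer(Object a, Object b) {
        return a.hashCode() == b.hashCode() && !a.equals(b);
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode()); // 65 * 31 + 97
        System.out.println("BB".hashCode()); // 66 * 31 + 66
        System.out.println(collideButDiffer("Aa", "BB"));

        // A hash collection keeps both: the hash narrows the search,
        // equals() makes the final decision.
        Set<String> set = new HashSet<>();
        set.add("Aa");
        set.add("BB");
        System.out.println(set.size());
    }
}
```

Both hash codes come out as 2112, and the set still ends up with two elements.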
A correct equals/hashCode implementation should be verified through testing. Here's a comprehensive test suite pattern.
```java
import static org.junit.jupiter.api.Assertions.*;

import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

import org.junit.jupiter.api.Test;
import nl.jqno.equalsverifier.EqualsVerifier;

class UserIdTest {

    // AUTOMATED: Use EqualsVerifier library for comprehensive checking
    // (classes with extra fields, such as a cached hash, may need
    // additional EqualsVerifier configuration)
    @Test
    void equalsContractVerified() {
        EqualsVerifier.forClass(UserId.class)
            .verify(); // Checks reflexivity, symmetry, transitivity, etc.
    }

    // MANUAL: Explicit tests for documentation and edge cases

    @Test
    void reflexive_objectEqualsItself() {
        UserId id = new UserId("alice", "example.com");
        assertEquals(id, id, "Object should equal itself");
    }

    @Test
    void symmetric_equalObjectsInBothDirections() {
        UserId id1 = new UserId("alice", "example.com");
        UserId id2 = new UserId("alice", "example.com");

        assertEquals(id1, id2, "id1 should equal id2");
        assertEquals(id2, id1, "id2 should equal id1 (symmetric)");
    }

    @Test
    void transitive_equalityChains() {
        UserId id1 = new UserId("alice", "example.com");
        UserId id2 = new UserId("alice", "example.com");
        UserId id3 = new UserId("alice", "example.com");

        assertEquals(id1, id2);
        assertEquals(id2, id3);
        assertEquals(id1, id3, "If id1=id2 and id2=id3, then id1=id3");
    }

    @Test
    void nullSafe_neverEqualsNull() {
        UserId id = new UserId("alice", "example.com");
        // Call equals(null) directly so the method under test actually runs
        assertFalse(id.equals(null), "Should not equal null");
    }

    @Test
    void consistent_multipleCallsSameResult() {
        UserId id1 = new UserId("alice", "example.com");
        UserId id2 = new UserId("alice", "example.com");

        // Multiple calls should return same result
        for (int i = 0; i < 100; i++) {
            assertTrue(id1.equals(id2));
            assertEquals(id1.hashCode(), id2.hashCode());
        }
    }

    @Test
    void hashCodeConsistent_equalObjectsSameHash() {
        UserId id1 = new UserId("alice", "example.com");
        UserId id2 = new UserId("alice", "example.com");

        assertEquals(id1, id2); // Precondition
        assertEquals(id1.hashCode(), id2.hashCode(),
            "Equal objects MUST have same hashCode");
    }

    @Test
    void worksInHashSet() {
        Set<UserId> set = new HashSet<>();
        UserId id1 = new UserId("alice", "example.com");
        UserId id2 = new UserId("alice", "example.com"); // Equal to id1

        set.add(id1);
        assertTrue(set.contains(id1));
        assertTrue(set.contains(id2), "Equal object should be found");

        set.add(id2); // Should not add duplicate
        assertEquals(1, set.size(), "Should not allow duplicates");
    }

    @Test
    void worksAsHashMapKey() {
        Map<UserId, String> map = new HashMap<>();
        UserId id1 = new UserId("alice", "example.com");
        UserId id2 = new UserId("alice", "example.com"); // Equal to id1

        map.put(id1, "original");
        assertEquals("original", map.get(id2), "Equal key should find value");

        map.put(id2, "updated"); // Should overwrite
        assertEquals(1, map.size());
        assertEquals("updated", map.get(id1));
    }
}
```

The EqualsVerifier library (for Java) automatically tests all contract properties with carefully crafted edge cases. One line of code replaces dozens of manual tests. In other languages, property-based testing libraries (for example, Hypothesis for Python) can be used to check the same properties.
Based on everything we've learned, here are battle-tested best practices for production code.
| Scenario | equals() Fields | hashCode() Fields | Notes |
|---|---|---|---|
| Value Object | All meaningful fields | Same as equals | Consider immutability |
| Entity (ORM) | ID only (after persist) | ID only | Handle transient state carefully |
| DTO | All fields | All fields | Transfer objects are values |
| Singleton | Don't override (use identity) | Don't override | Only one instance exists |
| Enum | Don't override (identity is correct) | Don't override | Each constant is unique |
```java
// MODERN JAVA 16+: Use Records
public record UserId(String id, String domain) {
    // equals(), hashCode(), toString() generated automatically
    // All fields are final (immutable)
    // Compact, correct, and clear
}
```

```kotlin
// KOTLIN: Use data class
data class UserId(val id: String, val domain: String)
// Same benefits as Java record
```

```python
# PYTHON 3.7+: Use frozen dataclass
from dataclasses import dataclass

@dataclass(frozen=True)
class UserId:
    id: str
    domain: str
    # __eq__ and __hash__ generated, immutable
```

These modern constructs:
✓ Generate correct equals/hashCode
✓ Are immutable by default
✓ Reduce boilerplate
✓ Are less error-prone
✓ Communicate intent clearly

We've completed our deep dive into object identity, equality, and hashing. Let's consolidate the essential knowledge from this entire module.
These concepts are fundamental to professional software development. Nearly every bug involving 'object not found in collection' or 'duplicate entries' traces back to incorrect identity/equality implementations. Master these concepts, and you'll avoid an entire category of subtle, time-consuming bugs.
Moving Forward:
With a solid understanding of how objects relate to each other through identity and equality, you're prepared to design clean, correct classes that work reliably in collections, comparisons, and throughout your systems.
You've mastered object identity, equality, and hashing—from conceptual understanding to correct implementation to production best practices. You can now design objects that work correctly in hash collections, avoid the ghost object problem, and implement the equals/hashCode contract without errors.