A candidate designs a beautiful messaging system. The architecture flows logically—users send messages through an API, messages are stored in a database, recipients retrieve them. The interviewer nods along, then asks: "What happens when you have a million concurrent users? How do you guarantee message delivery? What's your latency target for message receipt?"
The candidate pauses. They hadn't thought about any of that.
This is the second most damaging mistake in system design interviews: treating non-functional requirements (NFRs) as optional. The candidate built something that would work on a laptop. The interviewer wanted something that would work in production at scale.
The distinction between a system that works and a system that works well is entirely defined by non-functional requirements. Ignoring them doesn't just miss interview points—it reveals a fundamental gap in engineering maturity.
Many engineers spend most of their time building features—functional requirements. NFRs are often handled by infrastructure teams, senior architects, or inherited from existing systems. This creates a dangerous blind spot: engineers who can build anything but can't reason about whether their build will survive real-world conditions.
Non-functional requirements describe how a system should behave, as opposed to what it should do. They're also called quality attributes, operational requirements, or system properties. While functional requirements describe features, NFRs describe constraints and qualities.
The critical NFRs for system design:
| NFR | Definition | Example Specification | Why It Matters |
|---|---|---|---|
| Scalability | Ability to handle growth in load, data, or users | Support 10x traffic increase with linear cost scaling | Systems that can't scale become bottlenecks or require rewrites |
| Availability | Percentage of time the system is operational | 99.99% uptime (52 minutes downtime/year) | Unavailability directly translates to lost revenue and user trust |
| Latency | Time to complete an operation | p99 response time < 100ms for API calls | User experience and system usability depend on responsiveness |
| Consistency | Data correctness guarantees across operations | Strong consistency for account balances | Incorrect data can cause business-critical failures |
| Durability | Data survival across failures | Zero data loss for confirmed writes | Lost data destroys user trust and may have legal implications |
| Security | Protection against unauthorized access | Encryption in transit and at rest; audit logging | Breaches cause legal, financial, and reputational damage |
| Maintainability | Ease of operating and evolving the system | New features deployable in < 1 week | Systems that can't evolve become legacy anchors |
Why NFRs are easy to miss:
NFRs are often implicit. When someone says "build a URL shortener," they don't explicitly say "and it should handle 10,000 requests per second, never lose data, and be available 99.9% of the time." But these requirements exist, and ignoring them produces toy solutions.
In an interview, the interviewer expects you to surface these requirements through questions. Failing to ask about NFRs signals that you would ship production systems without considering operational characteristics—a concerning liability for any engineering team.
Functional requirements are the visible tip of the iceberg—what users see and interact with. Non-functional requirements are the 90% below the waterline—invisible to users but essential for the system to actually work. Interviews probe both, but NFRs often carry more weight for senior roles.
Ignoring NFRs isn't a single behavior—it manifests in several recognizable patterns. Understanding these patterns helps you catch yourself before they damage your interview performance.
The prompt: "Design a system for storing and serving user photos, like Instagram's core storage."
The NFR-ignoring response: "We'll have an API that receives photo uploads, stores them in a database, and serves them when requested. I'll use a CDN for faster access. For the database, we can use PostgreSQL with the photo as a blob column..."
What's missing?
Without addressing these, the candidate has designed a college project, not Instagram.
When a candidate ignores NFRs, interviewers conclude: "This person has never operated a production system. They don't understand what makes systems succeed or fail in the real world. They would require constant senior oversight." For senior roles, this is disqualifying.
Top candidates don't just remember to ask about NFRs—they follow a systematic approach that ensures comprehensive coverage. This framework should become second nature.
Always start here. Scale determines almost everything else.
Essential scale questions:
Why this matters: A system for 1,000 users needs a fundamentally different architecture than one for 100 million. Caching strategies, database choices, and replication needs all change with scale. You cannot make intelligent design decisions without understanding scale.
How much downtime is acceptable?
Essential availability questions:
Understanding the numbers:
| Availability | Downtime/Year | Downtime/Day | Typical Use Case |
|---|---|---|---|
| 99% | 3.65 days | ~14.4 minutes | Internal tools |
| 99.9% | 8.76 hours | ~86 seconds | Business apps |
| 99.99% | 52.6 minutes | ~8.6 seconds | E-commerce, payments |
| 99.999% | 5.26 minutes | ~0.86 seconds | Critical infrastructure |
Each additional 9 cuts the downtime budget by a factor of ten and requires substantially more architectural investment. Designing for 99.999% without knowing it's needed wastes engineering effort.
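The figures in the availability table follow from simple arithmetic. A minimal sketch, assuming a 365-day year and treating the function name `allowed_downtime_seconds` as illustrative rather than part of any library:

```python
SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY  # assumption: non-leap year


def allowed_downtime_seconds(availability_percent: float, period_seconds: int) -> float:
    """Return the downtime budget, in seconds, for a period at the given availability."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    return (1 - availability_percent / 100) * period_seconds


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99, 99.999):
        per_year_min = allowed_downtime_seconds(target, SECONDS_PER_YEAR) / 60
        per_day_sec = allowed_downtime_seconds(target, SECONDS_PER_DAY)
        print(f"{target}%: {per_year_min:,.1f} min/year, {per_day_sec:,.2f} s/day")
```

Being able to redo this calculation on the spot is usually more useful than memorizing the table, since interviewers may ask about a monthly or weekly budget instead.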
How fast must operations complete?
Essential latency questions:
Why percentiles matter: Average latency hides problems. A system with 50ms average latency but 5-second p99 latency means 1% of requests take at least 100x longer than the average. That 1% often includes your most engaged users (who make more requests), causing outsized negative impact.
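To see how an average can hide a slow tail, here is a small sketch. The sample data is hypothetical, and `percentile` is an illustrative helper using the nearest-rank method (one of several common percentile definitions):

```python
import math
import statistics


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


# Hypothetical sample: 98 fast requests and 2 slow ones, in milliseconds.
latencies_ms = [20.0] * 98 + [5000.0] * 2

mean_ms = statistics.mean(latencies_ms)
p50_ms = percentile(latencies_ms, 50)
p99_ms = percentile(latencies_ms, 99)

if __name__ == "__main__":
    print(f"mean={mean_ms:.1f} ms, p50={p50_ms:.0f} ms, p99={p99_ms:.0f} ms")
```

In this sample the median is 20 ms and the mean is about 120 ms, yet the p99 is 5 seconds. Neither the mean nor the median reveals what the slowest requests experience.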
What data correctness guarantees are required?
Essential consistency questions:
The consistency-availability trade-off: The CAP theorem tells us we can't have perfect consistency and perfect availability during network partitions. Strong consistency typically costs availability or latency. Understanding requirements helps navigate this trade-off.
Eliciting NFRs is only half the job. You must demonstrate how requirements drive design choices. This is where senior candidates distinguish themselves—they trace every architectural decision back to a requirement.
The decision traceability pattern:
"Because we need 99.99% availability [requirement], we'll deploy across multiple availability zones with automatic failover [design decision]. This trades some cost [trade-off] for resilience against datacenter failures [justification]."
This pattern—requirement → decision → trade-off → justification—demonstrates sophisticated engineering thinking.
| NFR | Requirement Example | Design Decision | Trade-off |
|---|---|---|---|
| Scalability | 10x traffic growth in 2 years | Stateless services + horizontal scaling | Increased operational complexity |
| Availability | 99.99% uptime | Multi-AZ deployment with failover | 2-3x infrastructure cost |
| Latency | < 100ms p99 response | Regional edge caching + CDN | Cache invalidation complexity |
| Consistency | Strong consistency for payments | Synchronous replication with consensus | Higher latency, lower availability |
| Durability | Zero data loss for user content | Multi-region replication + WAL | Increased write latency |
| Security | PCI compliance for payments | Encryption at rest + tokenization | Performance overhead, complexity |
Problem: Design a real-time bidding system for online advertising.
NFR Discovery:
Design decisions driven by NFRs:
Edge processing (driven by the latency requirement): Bid servers run at edge locations rather than centrally, because network latency to a central location would consume the entire budget.
In-memory data stores (driven by the latency requirement): There is no time for disk I/O, so all campaign data is cached in memory with background sync.
Eventual consistency (driven by the consistency and latency requirements): Campaign updates propagate asynchronously; a few stale bids are acceptable.
Multi-region active-active (driven by availability and geography): There is no single point of failure; each region operates independently.
Aggressive timeouts (driven by the latency requirement): If a component doesn't respond within 10ms, skip it rather than delay the response.
Every decision traces directly to an NFR. This is the thought process interviews evaluate.
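The aggressive-timeout decision can be sketched with Python's `asyncio.wait_for`. Everything here other than the 10ms figure is an assumption for illustration: `fetch_signal` stands in for a remote component, and the bid values are made up.

```python
import asyncio

COMPONENT_TIMEOUT_S = 0.010  # the 10 ms per-component limit from the text


async def call_with_budget(coro, timeout_s: float, fallback):
    """Await coro, but return fallback if it takes longer than timeout_s."""
    try:
        return await asyncio.wait_for(coro, timeout=timeout_s)
    except asyncio.TimeoutError:
        return fallback


async def fetch_signal(delay_s: float, value: float) -> float:
    """Hypothetical stand-in for a remote scoring component."""
    await asyncio.sleep(delay_s)
    return value


async def compute_bid() -> float:
    base_bid = 1.00
    # Both optional signals are queried concurrently; a slow one is skipped.
    fast, slow = await asyncio.gather(
        call_with_budget(fetch_signal(0.0, 0.25), COMPONENT_TIMEOUT_S, 0.0),
        call_with_budget(fetch_signal(0.5, 0.40), COMPONENT_TIMEOUT_S, 0.0),
    )
    return base_bid + fast + slow


if __name__ == "__main__":
    print(asyncio.run(compute_bid()))
```

The slow component's contribution is dropped rather than awaited, so the response degrades in quality instead of missing its deadline. Whether a degraded bid is acceptable is itself a requirement to confirm with the interviewer.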
Don't assume interviewers will follow your reasoning. Explicitly state the connection: "Given the latency requirement of 100ms, I'm choosing in-memory caching because the several database round-trips each request needs would consume most of that budget on their own." This demonstrates intentional design rather than pattern-matching.
Different system types have different critical NFRs. Knowing which NFRs matter most for each system type prevents you from asking irrelevant questions while missing essential ones.
| System Type | Most Critical NFRs | Commonly Overlooked NFR | Why It's Critical |
|---|---|---|---|
| Messaging (WhatsApp) | Latency, Consistency (ordering) | Delivery guarantees | Users notice missed messages immediately |
| Social Feed (Twitter) | Latency, Scalability | Consistency (timeline) | Stale feeds frustrate users; duplicate posts confuse |
| Payments (Stripe) | Consistency, Durability | Idempotency | Duplicate charges destroy trust |
| Video Streaming (Netflix) | Latency, Availability | Adaptive quality | Buffering causes abandonment |
| Search (Google) | Latency, Relevance | Freshness | Stale results reduce utility |
| Gaming (Fortnite) | Latency, Consistency | Fairness across latencies | High-ping players have bad experience |
| E-commerce (Amazon) | Availability, Consistency | Inventory accuracy | Overselling creates fulfillment nightmares |
Payment systems illustrate how missing a single NFR can invalidate an otherwise good design.
The overlooked requirement: Idempotency
What happens if a payment request is sent twice? Network issues, client retries, and load balancer quirks can all cause duplicate requests. Without idempotency guarantees, a user might be charged twice for the same purchase.
The inexperienced design: "The API receives a payment request, validates the card, and charges the user."
The problem: If the response is lost and the client retries, the user is charged twice.
The mature design: "Each payment request includes an idempotency key. The system checks if this key has been seen before. If yes, return the previous result. If no, process the payment and store the key-result mapping. This makes retries safe."
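The mature design can be sketched in a few lines. This is an in-memory illustration only: `PaymentService`, `PaymentResult`, and the `ch_` identifier format are hypothetical names, and a production system would need a durable store with an atomic check-and-insert so that two concurrent retries cannot both pass the check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PaymentResult:
    charge_id: str
    amount_cents: int


class PaymentService:
    """In-memory sketch; not safe for concurrent use or process restarts."""

    def __init__(self) -> None:
        self._results: dict[str, PaymentResult] = {}
        self._charges_made = 0

    def charge(self, idempotency_key: str, amount_cents: int) -> PaymentResult:
        # A key seen before means this is a retry: return the stored result.
        previous = self._results.get(idempotency_key)
        if previous is not None:
            return previous
        # First time: perform the (simulated) charge and remember the outcome.
        self._charges_made += 1
        result = PaymentResult(f"ch_{self._charges_made}", amount_cents)
        self._results[idempotency_key] = result
        return result

    @property
    def charges_made(self) -> int:
        return self._charges_made


if __name__ == "__main__":
    service = PaymentService()
    first = service.charge("order-1234", 4999)
    retry = service.charge("order-1234", 4999)  # simulated client retry
    print(first == retry, service.charges_made)
```

The retry returns the original result and no second charge is made, which is the property the interviewer is looking for.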
The lesson: Domain expertise includes knowing which NFRs are critical for that domain. For payments, it's idempotency. For messaging, it's ordering. For gaming, it's latency fairness.
Before interviews, research what NFRs matter for common system types. Reading engineering blogs from companies like Uber, Netflix, and Stripe reveals which requirements they prioritize. This preparation lets you ask informed questions and demonstrate domain awareness.
Here's the advanced insight: NFRs often conflict with each other. You can't maximize everything. Recognizing and articulating these trade-offs separates senior candidates from the rest.
The fundamental tensions:
When you identify a tension, don't just acknowledge it—explain your reasoning:
Weak articulation: "We have a trade-off between consistency and latency."
Strong articulation: "We have a trade-off between consistency and latency. For this use case—a social media feed—users expect sub-100ms response times. Strong consistency would require synchronizing across our database replicas, adding 50-100ms to every read. Users are more tolerant of occasionally seeing a slightly stale feed than waiting an extra 100ms. Therefore, I'd choose eventual consistency here, accepting that a user might not see a brand-new post for up to 30 seconds."
This demonstrates:
The trade-off matrix interview technique:
For complex systems, consider verbalizing a trade-off matrix: "Looking at our requirements, we have tension between latency and consistency. For the timeline read path, I'd prioritize latency. For the write path where we're storing user posts, I'd prioritize consistency to avoid data loss. By splitting the paths, we can optimize each for its priority."
Interviewers don't expect you to choose the 'right' trade-off—there often isn't one. They want to see that you recognize trade-offs exist, can articulate them clearly, and can make a reasoned decision. The quality of your reasoning matters more than the specific choice.
Abstract discussions of NFRs are less impressive than quantitative ones. Having reference numbers in your head allows you to make your points concrete and demonstrates practical experience.
Reference numbers every system designer should know:
| Metric | Typical Value | Context |
|---|---|---|
| L1 cache reference | ~1 ns | The fastest memory access |
| RAM read | ~100 ns | In-memory data structures |
| SSD read | ~100 μs | Fast persistent storage |
| Network round-trip (same datacenter) | ~500 μs | Service-to-service calls |
| HDD read | ~10 ms | Spinning disk access |
| Cross-continent network round-trip | ~100-200 ms | Geo-distributed systems |
| Human perception threshold | ~100 ms | Delays above this feel 'slow' |
| Acceptable web page load | < 3 seconds | Beyond this, users abandon |
Before (vague): "We'll cache the data because database lookups are slow."
After (quantified): "A database read takes about 5-10ms. With caching, we can get that down to under 1ms from in-memory access. Given our 100ms latency budget and the need for multiple lookups per request, caching is essential to hit our target."
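The arithmetic behind the quantified statement can be made explicit. In this sketch the per-read costs and the 100ms budget come from the text, while `LOOKUPS_PER_REQUEST = 8` is an assumption, since the text only says "multiple lookups":

```python
LATENCY_BUDGET_MS = 100.0  # end-to-end budget quoted in the text
DB_READ_MS = 10.0          # upper end of the 5-10 ms range quoted in the text
CACHE_READ_MS = 1.0        # "under 1ms" in the text, rounded up to be conservative
LOOKUPS_PER_REQUEST = 8    # assumption: the text only says "multiple lookups"


def lookup_cost_ms(lookups: int, per_lookup_ms: float) -> float:
    """Total time for lookups performed one after another."""
    return lookups * per_lookup_ms


def budget_fraction(cost_ms: float, budget_ms: float = LATENCY_BUDGET_MS) -> float:
    """Share of the latency budget consumed by cost_ms (1.0 means all of it)."""
    return cost_ms / budget_ms


if __name__ == "__main__":
    for label, per_lookup in (("database", DB_READ_MS), ("cache", CACHE_READ_MS)):
        cost = lookup_cost_ms(LOOKUPS_PER_REQUEST, per_lookup)
        print(f"{label}: {cost:.0f} ms, {budget_fraction(cost):.0%} of budget")
```

Under these assumptions, sequential database reads consume 80 ms of the 100 ms budget, while cached reads consume 8 ms, leaving room for network hops and application logic.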
The quantified version demonstrates:
Scaling numbers:
| Entity | Order of Magnitude | Reference |
|---|---|---|
| Tweets per day | ~500 million | Twitter's actual scale |
| Messages per day | ~100 billion | WhatsApp's actual scale |
| Searches per day | ~8 billion | Google's approximate scale |
| Hours of video uploaded per minute | ~500 | YouTube's actual scale |
| Transactions per second | ~65,000 | Visa's peak capacity |
Knowing these numbers lets you contextualize interview problems: "We're designing at Twitter scale, so roughly 500 million items per day, or about 6,000 per second average."
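The per-second conversion is a one-line calculation worth being able to do aloud. In this sketch the daily volume comes from the table above, while the peak factor of 3 is purely an assumption for illustration, since real peak-to-average ratios vary by product:

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_per_second(events_per_day: float) -> float:
    """Average event rate implied by a daily volume."""
    return events_per_day / SECONDS_PER_DAY


def peak_per_second(events_per_day: float, peak_factor: float) -> float:
    """peak_factor is an assumed peak-to-average ratio, not a figure from the text."""
    return average_per_second(events_per_day) * peak_factor


if __name__ == "__main__":
    tweets_per_day = 500_000_000
    print(f"average: {average_per_second(tweets_per_day):,.0f}/s")
    print(f"peak at an assumed 3x: {peak_per_second(tweets_per_day, 3.0):,.0f}/s")
```

The average works out to just under 5,800 per second, which rounds to the "about 6,000" quoted above. Systems should be sized for the peak, not the average, so stating an assumed peak factor is part of the estimate.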
You don't need exact numbers. The difference between 5ms and 7ms is rarely significant. What matters is order of magnitude: is it microseconds, milliseconds, or seconds? Is it thousands, millions, or billions? Getting these right demonstrates practical understanding.
Ignoring non-functional requirements is a career-limiting mistake in system design interviews. It signals inexperience with production systems and an inability to think beyond features to quality attributes.
Practice exercises:
Take any system design problem and list all possible NFRs before designing. Force yourself to specify numbers for each.
For a system you use daily (Netflix, Uber, Amazon), try to infer its NFRs from user experience. What availability do you observe? What latency?
Practice the trade-off articulation pattern. Pick two conflicting NFRs and explain which you'd prioritize for a specific system and why.
Create flashcards with reference numbers (latencies, scales, availability percentages). Being able to recall these quickly helps in interviews.
Read engineering blogs from major tech companies. Note which NFRs they emphasize for different products.
You now understand why ignoring non-functional requirements derails system design interviews, and you have frameworks for ensuring comprehensive NFR coverage. On the next page, we'll tackle the third critical mistake: not asking questions, the silence that speaks volumes about engineering maturity.