When you type a URL and press Enter, an extraordinary orchestration begins. Within milliseconds, dozens of systems across the globe coordinate to deliver content to your screen. DNS servers translate names to addresses. CDN edge nodes check caches. Load balancers distribute requests. Application servers process logic. Databases retrieve data. Reverse proxies compress and encrypt. Browsers parse, render, and execute.
This is web architecture—the intricate system of components that transforms a simple URL into a fully rendered, interactive web page.
Understanding web architecture is essential for building systems that scale, debugging production issues, and making informed infrastructure decisions. A URL might seem like a direct line to a server, but in reality, it traverses a complex ecosystem carefully designed for performance, reliability, and security.
This page maps the complete web architecture: every major component, how they interact, and why each exists. You'll finish with a comprehensive mental model of web infrastructure—from browser to server and back.
By the end of this page, you will understand the complete request journey from browser to server and back, the role of each infrastructure component (CDNs, load balancers, proxies, caches), how modern web architectures achieve global scale and high availability, browser architecture and rendering pipeline, and practical patterns for designing robust web systems.
Let's trace a complete request journey, from URL entry to rendered page. This reveals the full architecture in context.
Step 1: URL Parsing
You enter https://shop.example.com/products/shoes
The browser parses this into components:
- https (protocol)
- shop.example.com (host)
- /products/shoes (path)

Step 2: DNS Resolution
The browser needs an IP address for shop.example.com:
- Root servers → Referral to .com TLD
- .com TLD servers → Referral to example.com authoritative
- example.com authoritative → Returns IP: 93.184.216.34

Actual resolution: typically 20-100ms if uncached, <1ms if cached.
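As a toy sketch of this referral chain (the zone data and server names here are invented; this is not a real DNS client), each "server" either answers authoritatively or refers the resolver to a more specific server:

```python
# Toy model of iterative DNS resolution: each "server" either answers
# or refers to a more specific server. Real resolvers speak the DNS
# wire protocol over UDP/TCP; this only mirrors the control flow.
ZONES = {
    "root":         {"refer": {"com": "com-tld"}},
    "com-tld":      {"refer": {"example.com": "example-auth"}},
    "example-auth": {"answer": {"shop.example.com": "93.184.216.34"}},
}

def resolve(name, server="root"):
    zone = ZONES[server]
    # Authoritative server answers directly
    if "answer" in zone and name in zone["answer"]:
        return zone["answer"][name]
    # Otherwise follow the matching referral downward
    for suffix, next_server in zone.get("refer", {}).items():
        if name == suffix or name.endswith("." + suffix):
            return resolve(name, next_server)
    raise LookupError(f"cannot resolve {name}")
```

Caching (in the browser, OS, or recursive resolver) is what turns this multi-hop walk into the sub-millisecond case.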
Step 3: Connection Establishment
Browser initiates connection to 93.184.216.34:443:
The connection typically terminates at a CDN edge node in a nearby city, not necessarily at the origin server.
Step 4: HTTP Request
GET /products/shoes HTTP/2
Host: shop.example.com
Accept: text/html,application/xhtml+xml
Accept-Encoding: gzip, br
Cookie: session=abc123
User-Agent: Mozilla/5.0...
Step 5: CDN Processing
Request arrives at CDN edge node:
- /products/shoes → Cache miss (dynamic content)

Step 6: Load Balancer Distribution
Request reaches load balancer:
Step 7: Application Processing
App Server 3 processes request:
Step 8: Response Journey
Response travels back:
Step 9: Browser Processing
Browser receives HTML:
Total Time Budget:
| Phase | Typical Time |
|---|---|
| DNS Resolution | 0-100ms (cached: <1ms) |
| TCP/TLS Handshake | 50-200ms |
| Request to edge | <10ms |
| Edge to origin | 50-200ms |
| Server processing | 20-500ms |
| Response transit | 20-100ms |
| Browser parsing/rendering | 100-500ms |
| Total | 200ms-1.5s typical |
CDN caching eliminates origin roundtrips for cacheable content (static assets, some HTML). Server processing is often the largest variable—a slow database query dominates the time budget. Browser parallelism (HTTP/2) minimizes resource fetch delays.
Browsers are complex applications with multiple specialized components. Understanding browser architecture explains performance behaviors and debugging strategies.
Major Browser Components:
1. User Interface The browser chrome: address bar, bookmarks, navigation buttons. Not the web content area—that's rendered by the rendering engine.
2. Browser Engine Orchestrates between UI and rendering engine. Manages navigation, history, and coordinates components.
3. Rendering Engine
Parses HTML and CSS, constructs the DOM and CSSOM, calculates layout, and paints pixels.
4. JavaScript Engine
Parses, compiles, and executes JavaScript. Modern engines use JIT (Just-In-Time) compilation for performance.
5. Networking Handles HTTP/HTTPS, WebSocket, and other network protocols. Implements connection pooling, caching, cookie management, and security.
6. UI Backend Draws basic widgets (input boxes, buttons) using native OS capabilities.
7. Data Storage Manages cookies, localStorage, sessionStorage, IndexedDB, and cache storage.
| Stage | Input | Output | Blocking? |
|---|---|---|---|
| HTML Parsing | HTML bytes | DOM tree | Yes - must complete for rendering |
| CSS Parsing | CSS bytes | CSSOM tree | Yes - blocks rendering |
| Style Calculation | DOM + CSSOM | Styled elements | Required for layout |
| Layout | Styled elements | Element positions/sizes | Required for paint |
| Paint | Layout tree | Paint records (draw commands) | Creates visual representation |
| Composite | Paint records + layers | Final pixels | GPU-accelerated |
The Critical Rendering Path:
Browsers render pages through a defined pipeline:
Render-Blocking Resources:
CSS is render-blocking: Browser won't render until CSSOM is complete. Put CSS in <head>, load critical CSS first.
JavaScript can block parsing: Scripts in <head> block HTML parsing unless async or defer. Put scripts at end of <body> or use async/defer.
<script src="app.js"></script> <!-- Blocks parsing -->
<script src="app.js" async></script> <!-- Doesn't block, runs when ready -->
<script src="app.js" defer></script> <!-- Doesn't block, runs after parsing -->
Resource Priorities:
Browsers prioritize resource loading:
HTTP/2 priorities let servers optimize delivery order.
Browser DevTools reveal architecture in action. The Network tab shows request timing (queued, DNS, connection, TLS, TTFB, download). The Performance tab shows the rendering pipeline. The Application tab shows storage. Use these to diagnose bottlenecks.
CDNs are geographically distributed networks of servers that cache and serve content close to users. They're fundamental to modern web performance.
The Latency Problem:
Speed of light imposes physical limits. Data center in Virginia, user in Tokyo:
For each request-response, 110ms is unavoidable physics. Multiple round-trips compound this.
CDN Solution:
CDNs place edge servers globally. User in Tokyo connects to Tokyo edge node:
Edge serves cached content immediately—no origin roundtrip. Uncached requests still go to origin, but even then, CDN backbone optimization helps.
CDN Functionality:
CDN Caching Strategy:
CDN caching is header-driven:
Cache-Control: public, max-age=31536000, immutable
- public: CDN can cache
- max-age=31536000: Cache for 1 year
- immutable: Don't revalidate even on refresh

Best practices:
- Version static asset filenames (e.g., app.v2.js) so they can be cached aggressively
- Serve HTML with no-cache (always revalidate)

Cache Invalidation:
The hardest problem in CDN operation:
- Versioned URLs (e.g., app.v2.css) → old version stays cached, new version fetched fresh
- Explicit purge via the CDN's invalidation API
- Short TTLs so stale entries expire quickly

Popular CDNs:
| CDN | Strengths |
|---|---|
| Cloudflare | Free tier, DDoS protection, Workers edge compute |
| AWS CloudFront | AWS integration, Lambda@Edge |
| Akamai | Enterprise, largest network, security |
| Fastly | Instant purge, edge compute, configuration flexibility |
| Google Cloud CDN | GCP integration, global anycast |
For most applications, CDN should handle all static assets and, where possible, cache HTML. Configure long cache TTLs with asset versioning. Edge compute (Cloudflare Workers, Lambda@Edge) can handle personalization, A/B testing, and authentication at the edge—reducing origin load dramatically.
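As an illustration of the asset-versioning advice, a hypothetical `fingerprint` helper (our own name, not a library API) can embed a content hash in filenames so long cache TTLs stay safe:

```python
import hashlib
from pathlib import PurePosixPath

def fingerprint(filename: str, content: bytes) -> str:
    """Embed a short content hash in the filename, e.g. app.js -> app.<hash>.js.

    A changed file gets a new URL, so CDNs and browsers can cache the
    old one with max-age=31536000, immutable and never serve it stale.
    """
    digest = hashlib.sha256(content).hexdigest()[:8]
    p = PurePosixPath(filename)
    return f"{p.stem}.{digest}{p.suffix}"
```

Build tools (webpack, Vite, etc.) perform this same fingerprinting automatically at bundle time.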
Load balancers distribute incoming requests across multiple backend servers, enabling horizontal scaling and high availability.
Why Load Balancing?
Scalability: Single server capacity is limited. Load balancing distributes work across many servers.
Availability: If one server fails, load balancer routes to healthy servers. No single point of failure.
Maintenance: Update servers one at a time without downtime. Load balancer drains and excludes updating servers.
Performance: Route requests to least-loaded or geographically closest servers.
Load Balancing Algorithms:
| Algorithm | How It Works | Best For |
|---|---|---|
| Round Robin | Cycle through servers sequentially | Uniform servers, stateless workloads |
| Weighted Round Robin | Proportional to configured weights | Heterogeneous server capacity |
| Least Connections | Route to server with fewest active connections | Long-lived connections, varying request duration |
| Weighted Least Connections | Combines weights with connection count | Heterogeneous capacity + varying duration |
| IP Hash | Hash client IP to consistent server | Session affinity without cookies |
| Least Response Time | Route to fastest responding server | Latency-sensitive applications |
| Random | Random server selection | Simple, surprisingly effective |
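Two of the algorithms above can be sketched in a few lines (class names are our own, not from any load balancer's API):

```python
import itertools

class RoundRobin:
    """Cycle through servers sequentially."""
    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)

class LeastConnections:
    """Route to the server with the fewest active connections."""
    def __init__(self, servers):
        self.active = {s: 0 for s in servers}

    def pick(self):
        server = min(self.active, key=self.active.get)
        self.active[server] += 1      # connection opened
        return server

    def release(self, server):
        self.active[server] -= 1      # connection closed
```

Round robin needs no feedback from servers; least connections needs the balancer to track connection lifecycles, which is why it suits long-lived or variable-duration requests.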
Layer 4 vs Layer 7 Load Balancing:
Layer 4 (Transport Layer):
Layer 7 (Application Layer):
Layer 7 Capabilities:
# Route by path
/api/* → API servers
/static/* → Static servers
/admin/* → Admin servers
# Route by header
Host: api.example.com → API cluster
Host: www.example.com → Web cluster
# Route by cookie
session_id present → Sticky to specific server
no session_id → Round-robin
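The path rules above can be sketched as a longest-prefix routing table (backend names are hypothetical):

```python
# Longest-prefix-wins routing, as an L7 proxy might apply it.
ROUTES = [
    ("/api/",    "api_servers"),
    ("/static/", "static_servers"),
    ("/admin/",  "admin_servers"),
    ("/",        "web_servers"),     # default backend
]

def route(path: str) -> str:
    # Check the most specific (longest) prefixes first
    for prefix, backend in sorted(ROUTES, key=lambda r: -len(r[0])):
        if path.startswith(prefix):
            return backend
    raise ValueError(f"no route for {path}")
```

Real proxies (Nginx, Envoy, ALB) add host and header matching on top, but prefix matching is the core of path-based routing.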
Health Checks:
Load balancers continuously verify server health:
- Periodic requests to a health endpoint (e.g., /health)

# Healthy server
/health → 200 OK
# Unhealthy server (database connection lost)
/health → 503 Service Unavailable
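On the load balancer side, servers are usually removed or restored only after several consecutive checks agree, to avoid flapping. A sketch (the thresholds are illustrative; real defaults vary by product):

```python
class HealthTracker:
    """Track one backend: removed after `unhealthy_after` consecutive
    failed checks, restored after `healthy_after` consecutive passes."""

    def __init__(self, unhealthy_after=3, healthy_after=2):
        self.unhealthy_after = unhealthy_after
        self.healthy_after = healthy_after
        self.healthy = True
        self._streak = 0  # consecutive checks contradicting current state

    def record(self, status_code: int) -> bool:
        passed = 200 <= status_code < 300
        if passed == self.healthy:
            self._streak = 0            # state confirmed, reset counter
        else:
            self._streak += 1
            threshold = (self.healthy_after if not self.healthy
                         else self.unhealthy_after)
            if self._streak >= threshold:
                self.healthy = passed   # flip state
                self._streak = 0
        return self.healthy
```

A single 503 does not evict the server; three in a row do, and two clean 200s bring it back.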
Session Persistence (Sticky Sessions):
Some applications require requests from the same user to reach the same server (server-side sessions). Load balancers support:
Warning: Sticky sessions reduce scaling flexibility and complicate failover. Prefer stateless designs.
Load balancers themselves must be highly available—they're on the critical path. Cloud load balancers (ALB, NLB) are inherently redundant. Self-hosted load balancers (HAProxy, Nginx) need failover pairs with virtual IPs or DNS failover.
Reverse proxies sit between clients and servers, providing a unified interface while handling cross-cutting concerns. They're distinct from (though often combined with) load balancers.
Forward Proxy vs Reverse Proxy:

A forward proxy sits in front of clients and acts on their behalf, typically for egress control, filtering, or anonymity. A reverse proxy sits in front of servers and acts on theirs, receiving all inbound requests before the backends see them.
Reverse Proxy Functions:
1. SSL Termination Handle TLS encryption/decryption at the proxy. Backend servers receive unencrypted traffic, simplifying their configuration and reducing CPU load.
[Client] --HTTPS--> [Reverse Proxy] --HTTP--> [App Server]
2. Compression Compress responses (gzip, Brotli) before sending to clients. App servers send uncompressed, proxy compresses.
3. Static File Serving Serve static files directly without app server involvement. Nginx is extremely efficient at serving static content.
4. Request Routing Route requests to different backends based on path, headers, or other criteria.
5. Caching Cache responses to reduce backend load. Shared cache across multiple clients.
6. Rate Limiting Limit requests per client to protect backends from abuse.
7. Request/Response Modification Add headers, rewrite URLs, modify responses.
http {
    # Caching configuration
    proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=my_cache:10m max_size=1g;

    # Compression
    gzip on;
    gzip_types text/html text/css application/javascript application/json;

    # Rate-limit zone used by /api/login below (5 requests/minute per IP)
    limit_req_zone $binary_remote_addr zone=login_limit:10m rate=5r/m;

    upstream app_servers {
        server app1:8080;
        server app2:8080;
        server app3:8080;
    }

    server {
        listen 443 ssl http2;
        server_name example.com;

        # SSL termination
        ssl_certificate /etc/ssl/certs/example.com.crt;
        ssl_certificate_key /etc/ssl/private/example.com.key;

        # Static files - served directly by Nginx
        location /static/ {
            root /var/www;
            expires 1y;
            add_header Cache-Control "public, immutable";
        }

        # API routes - proxy to app servers
        location /api/ {
            proxy_pass http://app_servers;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
        }

        # Rate limiting for login endpoint
        location /api/login {
            limit_req zone=login_limit burst=5;
            proxy_pass http://app_servers;
        }
    }
}

API Gateways:
API gateways are specialized reverse proxies for API management:
| Function | Description |
|---|---|
| Authentication | Verify API keys, JWTs, OAuth tokens |
| Rate Limiting | Per-client request limits |
| Request Validation | Validate request format, schema |
| Request Transformation | Modify requests before backend |
| Response Transformation | Modify responses before client |
| Service Discovery | Find healthy backend instances |
| Circuit Breaker | Stop requests to failing services |
| Analytics | Track API usage, latency, errors |
Popular API Gateways: Kong, AWS API Gateway, Azure API Management, and Apigee are widely used.
Service Mesh:
In microservices architectures, sidecar proxies (Envoy, Linkerd) handle service-to-service communication:
[Service A] ←→ [Sidecar Proxy] ←→ [Sidecar Proxy] ←→ [Service B]
Sidecars provide load balancing, retries, timeouts, observability, and mTLS between services—without application code changes.
A typical production stack: [CDN Edge] → [Load Balancer] → [Reverse Proxy/Gateway] → [App Server]. Each layer handles specific concerns. CDN handles caching and geography. Load balancer handles distribution. Reverse proxy handles SSL, compression, and routing. App server handles business logic.
The terms 'web server' and 'application server' are often conflated but represent distinct roles in web architecture.
Web Server:
A web server handles HTTP protocol basics:
Examples: Nginx, Apache HTTP Server, Caddy, IIS
Pure web servers are efficient at static content but don't execute application logic.
Application Server:
An application server runs application code:
Examples depend on language:
- Python: Gunicorn, uWSGI
- Node.js: the runtime's built-in HTTP server
- Java: Tomcat, Jetty
- Go: the standard library net/http

Common Patterns:

A web server typically fronts the application server (e.g., Nginx proxying to Gunicorn), serving static files itself and forwarding dynamic requests.
Concurrency Models:
How servers handle multiple simultaneous requests:
| Model | Description | Examples |
|---|---|---|
| Process per request | Fork new process for each request | Apache prefork, CGI |
| Thread per request | Pool of threads, one handles each request | Java servlets, Apache worker |
| Event loop | Single thread, non-blocking I/O | Node.js, Nginx, Envoy |
| Async/Await | Single or few threads, coroutines | Python asyncio, Go goroutines |
| Hybrid | Event loop with thread pool for blocking ops | Gunicorn workers, Uvicorn |
Event loop vs thread-per-request:
For I/O-bound web applications, event loop or async models scale better. For CPU-bound work, thread/process pools parallelize computation.
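A small sketch of why event loops suit I/O-bound work: simulated I/O waits overlap on a single thread instead of adding up (the 50ms delay stands in for a database or API call):

```python
import asyncio
import time

async def handle_request(i):
    # Simulate an I/O wait (database/API call); no CPU is held meanwhile
    await asyncio.sleep(0.05)
    return f"response {i}"

async def serve(n):
    # All n waits run concurrently on one thread
    return await asyncio.gather(*(handle_request(i) for i in range(n)))

start = time.perf_counter()
responses = asyncio.run(serve(20))
elapsed = time.perf_counter() - start
# 20 requests x 50ms of I/O complete in roughly 50ms of wall time,
# not the 1 second that sequential handling would take
```

A thread-per-request model achieves the same overlap but pays per-thread memory and scheduling costs; the event loop pays neither, which is why Nginx and Node.js scale to many idle connections cheaply.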
Worker Processes:
Python and Ruby commonly use worker processes:
Gunicorn master process
├── Worker 1 (handles requests)
├── Worker 2 (handles requests)
├── Worker 3 (handles requests)
└── Worker 4 (handles requests)
Each worker handles requests independently. Master manages worker lifecycle. Number of workers typically equals CPU cores.
Modern deployments often abstract servers entirely. Container orchestration (Kubernetes) manages server instances. Serverless platforms (Lambda, Cloud Functions) handle scaling automatically. The underlying web/app server concepts remain, but operational complexity shifts to platforms.
Caching is the most impactful performance optimization in web architecture. Avoiding work is faster than doing work efficiently.
Cache Layers:
Web requests traverse multiple cache layers: browser cache, CDN edge cache, reverse proxy cache, application cache (Redis/Memcached), and database query caches.
Each layer has different scope, size, and invalidation characteristics.
HTTP Caching Headers:
| Header | Purpose | Example |
|---|---|---|
| Cache-Control | Primary caching directive | public, max-age=3600 |
| ETag | Resource version identifier | "abc123" |
| Last-Modified | Last modification timestamp | Wed, 15 Jan 2025 08:00:00 GMT |
| Vary | Cache varies by these request headers | Accept-Encoding, Accept-Language |
| Expires | Absolute expiration (legacy) | Thu, 16 Jan 2025 08:00:00 GMT |
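A Cache-Control value is just a comma-separated directive list, which a simplified parser can split into a map (real parsing, specified in RFC 9111, also handles quoted values):

```python
def parse_cache_control(value: str) -> dict:
    """Parse a Cache-Control header into {directive: value or True}.

    Simplified sketch: valueless directives map to True, numeric
    arguments to int; quoted-string arguments are not handled.
    """
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.lower()] = int(arg) if arg.isdigit() else (arg or True)
    return directives
```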
Cache-Control Directives:
Cache-Control: public, max-age=31536000, immutable
- public: CDNs and browsers can cache
- private: Only browser can cache (user-specific content)
- no-cache: Cache but revalidate before use
- no-store: Don't cache at all
- max-age=N: Fresh for N seconds
- s-maxage=N: Max-age for shared caches (CDN)
- immutable: Never revalidate (for versioned assets)
- stale-while-revalidate=N: Serve stale while fetching fresh

Conditional Requests (ETag/Last-Modified):
# Initial request
GET /product/123 HTTP/1.1
# Response with ETag
HTTP/1.1 200 OK
ETag: "abc123"
Cache-Control: no-cache
...
# Later request - validate cache
GET /product/123 HTTP/1.1
If-None-Match: "abc123"
# If unchanged:
HTTP/1.1 304 Not Modified
# If changed:
HTTP/1.1 200 OK
ETag: "xyz789"
...new content...
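The server side of this exchange can be sketched with a hash-based ETag (the helper names are our own, not a framework API):

```python
import hashlib

def make_etag(body: bytes) -> str:
    # Strong ETag derived from content; any change yields a new tag
    return '"' + hashlib.sha256(body).hexdigest()[:12] + '"'

def respond(body: bytes, if_none_match=None):
    """Return (status, etag, body) the way the exchange above behaves."""
    etag = make_etag(body)
    if if_none_match == etag:
        # Client's copy is current: headers only, no body transferred
        return 304, etag, b""
    return 200, etag, body

# First fetch: full 200 with ETag; matching revalidation: 304; change: 200
status1, etag, _ = respond(b"<html>v1</html>", None)
status2, _, body2 = respond(b"<html>v1</html>", etag)
status3, _, _ = respond(b"<html>v2</html>", etag)
```

The 304 saves bandwidth, not a round-trip: the request still reaches the server, which is why no-cache plus ETag suits HTML while max-age plus versioned URLs suits assets.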
Application Caching (Redis/Memcached):
Cache computed results to avoid repeated database queries:
import json

def get_product(product_id):
    # Try cache first
    cached = redis.get(f"product:{product_id}")
    if cached:
        return json.loads(cached)

    # Cache miss - query database
    product = db.query("SELECT * FROM products WHERE id = ?", product_id)

    # Populate cache for next request (TTL: 1 hour)
    redis.setex(f"product:{product_id}", 3600, json.dumps(product))
    return product
Cache Invalidation Strategies:
'There are only two hard things in Computer Science: cache invalidation and naming things.' Stale caches cause subtle bugs. Use short TTLs for mutable data. Version immutable assets. Design for eventual consistency. Test cache invalidation explicitly.
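One concrete invalidation pattern is delete-on-write: evict the cached entry whenever the underlying data changes, so the next read repopulates it. A sketch, with an in-memory dict standing in for Redis and the database:

```python
# In-memory stand-ins for Redis and the database; delete-on-write keeps
# reads fresh without waiting for a TTL to expire.
cache = {}
db = {123: {"name": "shoes", "price": 50}}

def get_product(product_id):
    key = f"product:{product_id}"
    if key not in cache:
        cache[key] = db[product_id]           # cache miss: load from "db"
    return cache[key]

def update_product(product_id, **changes):
    db[product_id] = {**db[product_id], **changes}
    cache.pop(f"product:{product_id}", None)  # invalidate immediately
```

In a distributed system the delete and the write are not atomic, so pair this with a short TTL as a backstop against races.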
Security must be considered at every layer of web architecture. Each component plays a role in defense-in-depth.
Transport Security (TLS):
HTTPS encrypts traffic between client and server:
Terminate TLS at appropriate layer:
Security Headers:
HTTP headers instruct browsers on security policies:
| Header | Purpose | Example Value |
|---|---|---|
| Strict-Transport-Security | Force HTTPS, prevent downgrade | max-age=31536000; includeSubDomains |
| Content-Security-Policy | Control resource loading, prevent XSS | default-src 'self'; script-src 'self' cdn.example.com |
| X-Content-Type-Options | Prevent MIME type sniffing | nosniff |
| X-Frame-Options | Prevent clickjacking (iframe embedding) | DENY or SAMEORIGIN |
| X-XSS-Protection | Browser XSS filter (legacy) | 1; mode=block |
| Referrer-Policy | Control referrer header leakage | strict-origin-when-cross-origin |
| Permissions-Policy | Control browser feature access | geolocation=(), camera=() |
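These headers are typically applied uniformly at one layer rather than per-handler. A sketch of a merge helper (the values are illustrative; tune CSP per application):

```python
# Baseline headers from the table above.
SECURITY_HEADERS = {
    "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
    "Content-Security-Policy": "default-src 'self'",
    "X-Content-Type-Options": "nosniff",
    "X-Frame-Options": "DENY",
    "Referrer-Policy": "strict-origin-when-cross-origin",
    "Permissions-Policy": "geolocation=(), camera=()",
}

def add_security_headers(response_headers: dict) -> dict:
    # Defaults first, so headers the application set explicitly win
    return {**SECURITY_HEADERS, **response_headers}
```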
CORS (Cross-Origin Resource Sharing):
Browsers block cross-origin requests by default. CORS headers explicitly allow them:
# Preflight request
OPTIONS /api/data HTTP/1.1
Origin: https://app.example.com
Access-Control-Request-Method: POST
# Preflight response
HTTP/1.1 200 OK
Access-Control-Allow-Origin: https://app.example.com
Access-Control-Allow-Methods: GET, POST, PUT
Access-Control-Allow-Headers: Authorization, Content-Type
Access-Control-Max-Age: 86400
# Actual request proceeds
POST /api/data HTTP/1.1
Origin: https://app.example.com
...
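The server's side of the preflight is just a policy check. A sketch (the allow-list is hypothetical; rejecting with 403 is one design choice, and many servers instead return 200 without CORS headers, which the browser treats the same way):

```python
ALLOWED_ORIGINS = {"https://app.example.com"}
ALLOWED_METHODS = {"GET", "POST", "PUT"}

def preflight(origin, requested_method):
    """Answer an OPTIONS preflight: CORS headers if allowed, else none.

    The browser enforces the outcome; the server only declares policy.
    """
    if origin not in ALLOWED_ORIGINS or requested_method not in ALLOWED_METHODS:
        return 403, {}
    return 200, {
        "Access-Control-Allow-Origin": origin,   # echo the single origin
        "Access-Control-Allow-Methods": ", ".join(sorted(ALLOWED_METHODS)),
        "Access-Control-Allow-Headers": "Authorization, Content-Type",
        "Access-Control-Max-Age": "86400",       # cache this decision 24h
    }
```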
WAF (Web Application Firewall):
WAFs inspect HTTP traffic for attacks:
Popular WAFs: Cloudflare WAF, AWS WAF, ModSecurity
DDoS Protection:
Distributed Denial of Service attacks overwhelm infrastructure:
CDNs provide first-line DDoS protection by absorbing attack traffic at edge.
Authentication Architecture:
| Method | Description | Best For |
|---|---|---|
| Session cookies | Server-side sessions, cookie identifier | Traditional web apps |
| JWT tokens | Stateless, signed tokens | APIs, SPAs, mobile |
| OAuth 2.0 | Delegated authorization | Third-party access |
| OIDC | Identity layer on OAuth | User authentication |
| API keys | Simple token for service access | Service-to-service |
Configure security headers at the CDN or reverse proxy layer—ensure every response includes them. Use HTTPS everywhere (Let's Encrypt provides free certificates). Implement CSP gradually, starting with report-only mode. Rate limit all authentication endpoints.
We've mapped the complete web architecture—from browser to server and back, through every major infrastructure component. Let's consolidate this comprehensive picture:
The Mental Model:
When reasoning about web performance, availability, or security, trace the request path:
User → Browser → DNS → CDN Edge → Load Balancer → Reverse Proxy → App Server → Database
Each component along the path exists to serve performance, reliability, or security, and each adds its own latency and operational complexity.
Module Complete:
This concludes the HTTP Overview module. You now understand HTTP's purpose and mechanics, its evolution across versions, and the web architecture it operates within.
With this foundation, subsequent modules will dive deeper into HTTP methods, status codes, headers, HTTP/2 and HTTP/3 specifics, and HTTPS security.
Congratulations! You've completed the HTTP Overview module. You now possess a comprehensive understanding of HTTP—its purpose, mechanics, versions, and the web architecture it operates within. This knowledge is foundational for building, debugging, and optimizing any web-based system.