Traditional logging produces output like this:
2024-01-15 10:23:45 INFO OrderService - Processing order 12345 for user john@example.com, total $149.99
2024-01-15 10:23:46 ERROR PaymentProcessor - Payment failed for order 12345: card declined (insufficient funds)
This looks readable to humans, but it's a nightmare for machines. Want to find all orders over $100? You need regex. Want to correlate by user email? More regex. Want to count payment failures by decline reason? Even more complex pattern matching.
Now consider the same information as structured logs:
{"timestamp":"2024-01-15T10:23:45.847Z","level":"INFO","event":"ORDER_PROCESSING_STARTED","orderId":"12345","userId":"usr_abc123","userEmail":"john@example.com","totalAmount":149.99,"currency":"USD"}
{"timestamp":"2024-01-15T10:23:46.234Z","level":"ERROR","event":"PAYMENT_FAILED","orderId":"12345","errorCode":"card_declined","declineReason":"insufficient_funds"}
Now you can query: event:PAYMENT_FAILED AND declineReason:insufficient_funds to find all insufficient funds declines. You can aggregate by declineReason to see failure distributions. You can join on orderId to trace complete order flows. The structure enables analysis that was previously impossible.
By the end of this page, you will understand why structured logging is essential at scale, how to design effective log schemas, implement structured logging in major languages, and avoid common pitfalls that undermine the benefits.
Structured logging is the practice of outputting log entries as structured data—typically JSON—rather than free-form text strings. This seemingly simple change has profound implications for how you can use your logs.
A query like errorCode:TIMEOUT AND service:payment instantly finds every payment-service timeout, with no regex required. Compare how common tasks differ between the two approaches:

| Task | Unstructured Approach | Structured Approach |
|---|---|---|
| Find all errors for user X | grep + regex extraction + hope format is consistent | level:ERROR AND userId:X |
| Count orders by status | Parse logs, extract status field, aggregate | event:ORDER_* \| stats count by status |
| Calculate p99 latency | Parse duration from log text, convert units, calculate | durationMs:* \| percentile(durationMs, 99) |
| Find slow database queries | grep 'database' + parse duration + compare threshold | event:DATABASE_QUERY AND durationMs:>100 |
| Correlate request to response | Match request ID across logs manually | correlationId:abc123 \| sort timestamp |
For a small application, unstructured logs might suffice. But as soon as you have multiple services, high volume, or need to investigate complex issues, structured logging becomes essential. Start structured from day one—retrofitting is painful.
While structured logging can use various formats (logfmt, XML, etc.), JSON has become the de facto standard: every major language can produce and parse it, it supports nested data, it remains human-readable, and virtually all log aggregation tools understand it natively.
```jsonc
// Each log entry is a single line of JSON (JSON Lines format)
// Basic structure: timestamp, level, event, then context-specific fields

// Application startup
{"timestamp":"2024-01-15T10:00:00.000Z","level":"INFO","event":"SERVICE_STARTED","service":"order-service","version":"2.4.1","environment":"production","instanceId":"i-0abc123","startupTimeMs":3247}

// Business operation
{"timestamp":"2024-01-15T10:00:15.123Z","level":"INFO","event":"ORDER_CREATED","correlationId":"req-xyz-789","userId":"usr-abc-123","orderId":"ord-456","itemCount":3,"totalAmount":149.99,"currency":"USD","source":"web"}

// External call with timing
{"timestamp":"2024-01-15T10:00:15.347Z","level":"INFO","event":"PAYMENT_PROCESSED","correlationId":"req-xyz-789","orderId":"ord-456","gateway":"stripe","transactionId":"txn-stripe-999","amount":149.99,"durationMs":224,"responseCode":"approved"}

// Error with full context
{"timestamp":"2024-01-15T10:05:23.847Z","level":"ERROR","event":"PAYMENT_FAILED","correlationId":"req-abc-456","orderId":"ord-789","gateway":"stripe","amount":299.99,"errorCode":"card_declined","declineReason":"insufficient_funds","cardLast4":"4242","retryable":false,"stack":"at PaymentProcessor.charge (payment.js:47)\n at OrderService.process (order.js:123)"}

// Nested structure for complex data
{"timestamp":"2024-01-15T10:10:00.000Z","level":"INFO","event":"BATCH_COMPLETED","batchId":"batch-123","stats":{"processed":1247,"failed":3,"skipped":12},"timing":{"totalMs":45230,"avgPerItemMs":36},"failures":[{"itemId":"item-1","error":"validation_failed"},{"itemId":"item-2","error":"duplicate_key"},{"itemId":"item-3","error":"timeout"}]}
```

JSON Lines Format:
Notice that each log entry is a single line of JSON. This is the JSON Lines (jsonl) format, and it's critical for log processing:
Tools like jq, grep, and log aggregators expect one JSON object per line. Never output pretty-printed (multi-line) JSON in production logs: it breaks log processing pipelines. Newlines within string values must be escaped (\n), and stack traces should be a single escaped string field, not multiple log lines.
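As a minimal sketch of this rule (the `emitLog` helper and field names are illustrative, not a particular library's API), JSON.stringify already escapes embedded newlines, so a multi-line stack trace stays on one output line:

```typescript
// Hypothetical helper: write a log entry as one JSON Lines record.
// JSON.stringify escapes newlines (\n) inside string values, so the
// multi-line stack trace remains a single line of output.
function emitLog(entry: Record<string, unknown>): void {
  process.stdout.write(JSON.stringify(entry) + "\n");
}

try {
  throw new Error("card declined");
} catch (err) {
  const e = err as Error;
  emitLog({
    timestamp: new Date().toISOString(),
    level: "ERROR",
    event: "PAYMENT_FAILED",
    errorMessage: e.message,
    // The whole stack becomes one escaped string field, not separate log lines.
    stack: e.stack,
  });
}
```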
While JSON is flexible, consistency is crucial. A well-designed log schema ensures that logs across your organization can be queried, aggregated, and analyzed together.
```typescript
/**
 * Core log schema defining required and optional fields.
 * Use this as a contract across all services.
 */

// Core fields present in every log entry
interface LogEntryCore {
  // ISO 8601 timestamp with timezone
  timestamp: string;

  // Standard log level
  level: 'TRACE' | 'DEBUG' | 'INFO' | 'WARN' | 'ERROR' | 'FATAL';

  // Event type identifier - SCREAMING_SNAKE_CASE
  event: string;

  // Service that produced the log
  service: string;

  // Service version (semantic version)
  version: string;
}

// Contextual fields (recommended for all logs)
interface LogContext {
  // Request correlation ID
  correlationId?: string;

  // Deployment environment
  environment?: 'production' | 'staging' | 'development';

  // Instance identification
  hostname?: string;
  instanceId?: string;

  // Actor information
  userId?: string;
  sessionId?: string;
}

// Timing information (for operations)
interface LogTiming {
  // Operation duration in milliseconds
  durationMs?: number;

  // Specific phase timings
  phases?: Record<string, number>;
}

// Error information (for ERROR/FATAL levels)
interface LogError {
  error?: {
    message: string;
    type: string;
    code?: string;
    stack?: string;
    cause?: string;
  };
}

// Complete log entry type
type LogEntry = LogEntryCore & LogContext & LogTiming & LogError & {
  // Additional event-specific fields
  [key: string]: unknown;
};

// Example usage with type safety
const logEntry: LogEntry = {
  timestamp: new Date().toISOString(),
  level: 'INFO',
  event: 'ORDER_CREATED',
  service: 'order-service',
  version: '2.4.1',
  correlationId: 'req-abc-123',
  environment: 'production',
  userId: 'usr-xyz-789',
  orderId: 'ord-456',
  totalAmount: 149.99,
  itemCount: 3,
  durationMs: 47
};
```

Consider using a shared logging library across services that enforces the core schema. This prevents drift and ensures queryability. The library can add common fields automatically (timestamp, service, version, hostname).
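As an illustration of that tip, here is a minimal sketch of such a shared wrapper, assuming the `LogEntry` and `LogContext` types from the schema above; the class name and service metadata values are hypothetical:

```typescript
import { hostname } from 'os';

// Minimal shared logger sketch: each service constructs it once with its own
// metadata, and the common fields are stamped onto every entry automatically.
class StructuredLogger {
  constructor(
    private readonly service: string,
    private readonly version: string,
    private readonly environment: 'production' | 'staging' | 'development',
  ) {}

  log(level: LogEntry['level'], event: string, fields: Record<string, unknown> = {}): void {
    const entry: LogEntry = {
      timestamp: new Date().toISOString(),
      level,
      event,
      service: this.service,
      version: this.version,
      environment: this.environment,
      hostname: hostname(),
      ...fields, // event-specific context supplied by the caller
    };
    // One JSON object per line (JSON Lines)
    process.stdout.write(JSON.stringify(entry) + '\n');
  }

  info(event: string, fields?: Record<string, unknown>): void { this.log('INFO', event, fields); }
  error(event: string, fields?: Record<string, unknown>): void { this.log('ERROR', event, fields); }
}

// Usage: service metadata is set once; call sites only supply event context.
const logger = new StructuredLogger('order-service', '2.4.1', 'production');
logger.info('ORDER_CREATED', { orderId: 'ord-456', totalAmount: 149.99, currency: 'USD' });
```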
Consistent naming conventions make logs queryable and understandable across teams. Establish standards early and enforce them through shared libraries and code review.
| Category | Convention | Examples |
|---|---|---|
| Field names | camelCase | userId, orderId, totalAmount, durationMs |
| Event names | SCREAMING_SNAKE_CASE | ORDER_CREATED, PAYMENT_FAILED, USER_LOGIN |
| Boolean fields | Positive phrasing, no is prefix in JSON | enabled, retryable, success (not isEnabled) |
| IDs | Include type in name | userId, orderId, transactionId (not just id) |
| Durations | Include unit in name | durationMs, timeoutSeconds, cacheTtlMinutes |
| Counts | Include 'Count' suffix | itemCount, retryCount, errorCount |
| Amounts | Include currency separately | amount: 149.99, currency: "USD" |
Avoid these naming anti-patterns:

- Ambiguous names: id (which ID?), value (what value?), data (what data?)
- Inconsistent casing: user_id, userId, UserId across services
- Abbreviations: amt, qty, msg — use full words for clarity
- Missing units: timeout: 30 (seconds? milliseconds? minutes?)
- Negative booleans: notFound, disabled — harder to reason about
{ "ts": "2024-01-15T10:00:00Z", "lvl": "err", "msg": "failed", "id": "123", "amt": 149.99, "dur": 250, "is_success": false, "retry_count": 3} // Problems:// - Abbreviations (ts, lvl, amt, dur)// - Ambiguous id// - Missing units on dur// - Inconsistent casing// - Negative boolean phrasing123456789101112131415161718
{ "timestamp": "2024-01-15T10:00:00Z", "level": "ERROR", "event": "PAYMENT_FAILED", "orderId": "ord-123", "amount": 149.99, "currency": "USD", "durationMs": 250, "success": false, "retryCount": 3} // Clear:// - Full field names// - Typed ID (orderId)// - Unit in name (durationMs)// - camelCase consistency// - Positive booleanMost modern logging frameworks support structured output natively or via configuration. Here's how to implement structured logging in common languages:
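For Node.js/TypeScript, one widely used option is pino, which emits JSON Lines by default. The sketch below is illustrative rather than prescriptive; the service metadata and field values are assumptions:

```typescript
import pino from 'pino';

// pino writes one JSON object per line by default.
// The `base` fields are stamped onto every log entry.
const logger = pino({
  level: 'info',
  base: { service: 'order-service', version: '2.4.1', environment: 'production' },
});

// Child loggers carry per-request context such as a correlation ID.
const requestLogger = logger.child({ correlationId: 'req-xyz-789', userId: 'usr-abc-123' });

const startTime = Date.now();
// ... create the order ...
requestLogger.info(
  {
    event: 'ORDER_CREATED',
    orderId: 'ord-456',
    itemCount: 3,
    totalAmount: 149.99,
    currency: 'USD',
    durationMs: Date.now() - startTime,
  },
  'ORDER_CREATED',
);
```

For Java, the example below uses SLF4J with Logback and the logstash encoder.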
```java
// Using SLF4J with Logback and logstash-logback-encoder

// 1. Add dependency: net.logstash.logback:logstash-logback-encoder

// 2. Configure logback.xml for JSON output
/*
<configuration>
  <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
    <encoder class="net.logstash.logback.encoder.LogstashEncoder">
      <includeMdcKeyName>correlationId</includeMdcKeyName>
      <includeMdcKeyName>userId</includeMdcKeyName>
      <customFields>{"service":"order-service","version":"2.4.1"}</customFields>
    </encoder>
  </appender>
  <root level="INFO">
    <appender-ref ref="STDOUT" />
  </root>
</configuration>
*/

// 3. Use structured logging in code
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.slf4j.MDC;
import net.logstash.logback.argument.StructuredArguments;
import static net.logstash.logback.argument.StructuredArguments.*;

public class OrderService {
    private static final Logger logger = LoggerFactory.getLogger(OrderService.class);

    public Order createOrder(CreateOrderRequest request) {
        // Set MDC context (included in all logs)
        MDC.put("correlationId", request.getCorrelationId());
        MDC.put("userId", request.getUserId());

        try {
            long startTime = System.currentTimeMillis();
            Order order = processOrder(request);

            // Structured logging with key-value pairs
            logger.info("ORDER_CREATED",
                kv("event", "ORDER_CREATED"),
                kv("orderId", order.getId()),
                kv("itemCount", order.getItems().size()),
                kv("totalAmount", order.getTotalAmount()),
                kv("currency", order.getCurrency()),
                kv("durationMs", System.currentTimeMillis() - startTime));

            return order;
        } catch (Exception e) {
            logger.error("ORDER_CREATION_FAILED",
                kv("event", "ORDER_CREATION_FAILED"),
                kv("errorCode", e instanceof BusinessException
                    ? ((BusinessException) e).getCode() : "UNKNOWN"),
                kv("errorMessage", e.getMessage()),
                e);
            throw e;
        } finally {
            MDC.clear();
        }
    }
}

/**
 * Output:
 * {"@timestamp":"2024-01-15T10:00:00.000Z","level":"INFO","logger":"OrderService",
 *  "message":"ORDER_CREATED","correlationId":"req-123","userId":"usr-456",
 *  "service":"order-service","version":"2.4.1","event":"ORDER_CREATED",
 *  "orderId":"ord-789","itemCount":3,"totalAmount":149.99,"currency":"USD",
 *  "durationMs":47}
 */
```

Structured logging can represent complex, nested data structures. This is powerful but requires thoughtful design to maintain queryability.
```javascript
// Nested objects for related data
logger.info({
  event: 'BATCH_PROCESSING_COMPLETE',
  batchId: 'batch-123',

  // Nested object for aggregate stats
  stats: {
    processed: 1247,
    failed: 3,
    skipped: 12,
    successRate: 99.76
  },

  // Nested object for timing breakdown
  timing: {
    totalMs: 45230,
    avgPerItemMs: 36,
    p99Ms: 127,
    phases: {
      fetchMs: 5000,
      transformMs: 35000,
      persistMs: 5230
    }
  },

  // Array of failure details (keep small!)
  failures: [
    { itemId: 'item-1', error: 'validation_failed', field: 'email' },
    { itemId: 'item-2', error: 'duplicate_key', key: 'user-123' },
    { itemId: 'item-3', error: 'timeout', operation: 'external_api' }
  ]
});

// Output enables queries like:
// - stats.successRate:>95
// - timing.phases.transformMs:>30000
// - failures.error:timeout

// ⚠️ CAUTION: Keep nested structures shallow
// Deep nesting (>3 levels) becomes hard to query

// ❌ BAD: Deeply nested, hard to query
logger.info({
  event: 'ORDER_PLACED',
  order: {
    customer: {
      address: {
        shipping: {
          city: 'Seattle' // 4 levels deep - hard to query!
        }
      }
    }
  }
});

// ✅ GOOD: Flattened for queryability
logger.info({
  event: 'ORDER_PLACED',
  orderId: 'ord-123',
  customerId: 'cust-456',
  shippingCity: 'Seattle',
  shippingState: 'WA',
  shippingCountry: 'US'
});
```

Use nesting to organize related fields logically (stats, timing, error), but avoid deep hierarchies. For fields you'll query frequently, consider flattening to top level with clear prefixes: shippingCity instead of address.shipping.city.
Even with structured logging, there are ways to undermine its benefits. Watch out for these common mistakes:
- String concatenation inside fields: event: 'Order ' + orderId + ' created' defeats the purpose. Use event: 'ORDER_CREATED', orderId: orderId.
- Inconsistent types: sometimes userId: '123', sometimes userId: 123. Pick one (string for IDs) and stick with it.
- Inconsistent naming: userId vs user_id vs UserId across services. Standardize and enforce.
```javascript
// ❌ String concatenation
logger.info({
  message: `Order ${orderId} created for user ${userId}`,
  amount: `$${amount}`
});

// ❌ Inconsistent types
logger.info({ userId: 123 });   // number
logger.info({ userId: '456' }); // string

// ❌ Unbounded array
logger.info({
  event: 'BATCH_COMPLETE',
  processedIds: allIds // could be 10,000 items!
});

// ❌ Massive object dump
logger.info({
  event: 'REQUEST_RECEIVED',
  body: request.body // could be megabytes
});
```
```javascript
// ✅ Structured fields
logger.info({
  event: 'ORDER_CREATED',
  orderId: orderId,
  userId: userId,
  amount: amount,
  currency: 'USD'
});

// ✅ Consistent string type for IDs
logger.info({ userId: '123' });
logger.info({ userId: '456' });

// ✅ Bounded summary
logger.info({
  event: 'BATCH_COMPLETE',
  processedCount: allIds.length,
  sampleIds: allIds.slice(0, 5)
});

// ✅ Extract relevant fields
logger.info({
  event: 'REQUEST_RECEIVED',
  contentType: request.contentType,
  contentLength: request.body.length,
  bodyHash: hash(request.body)
});
```

Structured logs shine when integrated with modern log management platforms. Here's how they enable powerful analysis:
| Platform | Key Features | Query Example |
|---|---|---|
| Elasticsearch/OpenSearch | Full-text + structured search, aggregations | event:PAYMENT_* AND durationMs:>1000 |
| Datadog | Live tail, pattern recognition, alerting | @event:ORDER_CREATED @totalAmount:>100 |
| Splunk | SPL queries, dashboards, ML | event=PAYMENT_FAILED \| stats count by declineReason |
| Grafana Loki | Label-based, Prometheus-like | {service="order-service"} \|= "ERROR" |
| AWS CloudWatch | Insights queries, alarms | fields event, durationMs \| filter level="ERROR" |
Dashboard and Alerting Capabilities:

With structured logs, you can build dashboards directly from log fields: for example, order volume by status, error counts by event and errorCode, and latency percentiles computed from durationMs. You can alert on the same fields: for example, a spike in PAYMENT_FAILED events, an error rate above a threshold, or p99 durationMs breaching an SLO.
None of this is practical with unstructured logs—you'd need complex regex that breaks every time the format changes.
Log platforms have limits on unique field names and cardinality. Avoid creating fields with unbounded values (like logging every unique user ID as a field name). Use fields for structure, values for data.
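A small sketch of what that means in practice (the event and field names are illustrative): unbounded values belong in the value position under a stable field name, never in the field name itself.

```jsonc
// ❌ Unbounded field names: every user creates a brand-new field,
//    quickly exhausting the platform's field and cardinality limits
{"event":"USER_LOGIN","user_usr-abc-123":true}
{"event":"USER_LOGIN","user_usr-def-456":true}

// ✅ Stable field name; the unbounded user ID stays in the value
{"event":"USER_LOGIN","userId":"usr-abc-123"}
{"event":"USER_LOGIN","userId":"usr-def-456"}
```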
Let's consolidate the key insights about structured logging:

- Emit logs as single-line JSON (JSON Lines), never pretty-printed, with newlines escaped and stack traces kept as one string field.
- Define a core schema (timestamp, level, event, service, version, correlationId) and enforce it with a shared logging library.
- Use consistent naming: camelCase fields, SCREAMING_SNAKE_CASE events, typed IDs, units in field names, positive booleans.
- Keep nesting shallow and flatten frequently queried fields; avoid string concatenation, unbounded arrays, and massive object dumps.
- Lean on your log platform for queries, aggregations, dashboards, and alerts, while respecting field and cardinality limits.
Module Complete:
You've now completed the comprehensive module on Logging in LLD.
With this knowledge, you can design logging that transforms your system from a black box into an observable, debuggable, measurable platform.
Congratulations! You've mastered Logging in LLD. You can now design and implement logging strategies that provide comprehensive observability, enable rapid debugging, and support operational excellence. Next, explore Observability Design to learn about metrics, tracing, and building fully observable systems.