When a breach is discovered—and breaches are discovered, always—the first question isn't 'How did they get in?' It's 'What did they access?'
In January 2021, the Accellion breach affected dozens of organizations including law firms, universities, and government agencies. For weeks after discovery, many victims couldn't answer basic questions: Which files were accessed? Whose data was exposed? What was the scope of the breach? Organizations with comprehensive access logging answered in hours; others took months—or never fully determined the impact.
Access logging is the discipline of recording every significant data access: who touched what data, when, from where, and with what result. It transforms the invisible flow of data through systems into an auditable, reconstructable history.
By the end of this page, you'll understand what access events to log, how to capture them consistently across heterogeneous systems, how to structure access logs for efficient investigation, and how to balance comprehensive logging with performance and privacy constraints.
Access logging is the systematic recording of data access events—any action where an identity (human or machine) reads, modifies, or interacts with protected resources. This goes beyond simple authentication logging ('user logged in') to capture the complete data access story.
The Access Questions
Comprehensive access logging answers the fundamental security questions:
| Log Type | Focus | Answers | Example |
|---|---|---|---|
| Authentication Logs | Identity verification | Did user prove identity? | 'User john@example.com logged in via SSO' |
| Authorization Logs | Permission decisions | Was access allowed? | 'User granted READ on /api/users/*' |
| Access Logs | Data interaction | What data was touched? | 'User read records 1-50 from Customer table' |
| Activity Logs | Business operations | What actions occurred? | 'User approved purchase order #12345' |
| Audit Logs | Complete trail | Full compliance record | 'Comprehensive capture of all above' |
The distinction matters: knowing 'user logged in' doesn't tell you what they accessed. Knowing 'user queried customer database' is better but not precise. Knowing 'user read customer records 451-475 in the Northwest region' is actionable for breach scope assessment.
Not every data access requires logging. The scope should be driven by sensitivity classification, regulatory requirements, and forensic value. Logging everything creates noise and cost; logging too little leaves investigative gaps.
Granularity Spectrum
Access logging granularity exists on a spectrum with cost/performance implications:
| Level | Description | Use Case | Cost |
|---|---|---|---|
| Endpoint | API route accessed | Basic compliance | Low |
| Resource | Specific object/record ID | Breach scope | Medium |
| Field | Individual data fields accessed | Healthcare, financial | High |
| Value | Actual data values returned | Rarely needed, high risk | Very High |
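Most organizations use resource-level logging for sensitive data and endpoint-level logging for general access. Field-level logging is usually reserved for the most sensitive data, such as PHI covered by HIPAA, where investigators need to know exactly which attributes were viewed.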
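As a rough illustration of how a granularity decision could be driven by data classification, here is a minimal sketch. The `chooseGranularity` function and its mapping are hypothetical, chosen only to mirror the table above; a real policy would come from your own classification scheme and regulatory obligations.

```typescript
type Classification = 'PUBLIC' | 'INTERNAL' | 'CONFIDENTIAL' | 'RESTRICTED';
type Granularity = 'NONE' | 'ENDPOINT' | 'RESOURCE' | 'FIELD';

interface ResourceSensitivity {
  level: Classification;
  categories: string[]; // e.g. ['PII', 'PHI', 'FINANCIAL']
}

// Hypothetical policy for illustration only; the thresholds are assumptions.
function chooseGranularity(resource: ResourceSensitivity): Granularity {
  // Health data gets field-level logging regardless of its level.
  if (resource.categories.includes('PHI')) {
    return 'FIELD';
  }
  switch (resource.level) {
    case 'RESTRICTED':
    case 'CONFIDENTIAL':
      return 'RESOURCE'; // record IDs are needed for breach scoping
    case 'INTERNAL':
      return 'ENDPOINT'; // route-level logging is enough
    case 'PUBLIC':
      return 'NONE';     // not every access needs an access log entry
  }
}

console.log(chooseGranularity({ level: 'CONFIDENTIAL', categories: ['PII'] }));
console.log(chooseGranularity({ level: 'INTERNAL', categories: ['PHI'] }));
```

Keeping the decision in one function means the cost/coverage trade-off is reviewed in a single place instead of being scattered across services.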
A well-designed access log schema captures all information needed for forensics and compliance while remaining queryable at scale. The schema must balance completeness with practical storage and query considerations.
```typescript
interface AccessLogEvent {
  // ========== Core Identification ==========
  id: string;                     // UUID v7 (time-ordered)
  timestamp: string;              // ISO 8601 with microseconds

  // ========== Actor (Who) ==========
  actor: {
    // Primary identifier - must be immutable
    type: 'USER' | 'SERVICE' | 'SYSTEM' | 'API_KEY';
    id: string;                   // Permanent, unique identifier

    // Human-readable attributes (may change over time)
    displayName?: string;         // e.g., "John Smith"
    email?: string;               // e.g., "john@company.com"

    // Authentication context
    authSession: {
      sessionId: string;          // Links to authentication event
      authMethod: string;         // 'SSO_SAML' | 'MFA_TOTP' | 'API_KEY'
      authTime: string;           // When authentication occurred
      mfaVerified: boolean;       // Whether MFA was used
    };

    // Role/permissions at access time
    roles: string[];              // ['admin', 'hr_viewer']
    effectivePermissions?: string[]; // Computed permissions used
  };

  // ========== Source (Where From) ==========
  source: {
    ip: string;                   // Client IP (consider proxies)
    originalIp?: string;          // X-Forwarded-For if applicable
    userAgent: string;            // Browser/client identification

    // Device identification (if available)
    device?: {
      id: string;                 // Device fingerprint or registered ID
      type: 'DESKTOP' | 'MOBILE' | 'TABLET' | 'API_CLIENT';
      trusted: boolean;           // Is this a known/trusted device
    };

    // Geolocation (privacy considerations apply)
    geo?: {
      country: string;            // ISO country code
      region?: string;            // State/province
      city?: string;              // Only if allowed by policy
      coordinates?: {             // Rarely needed
        lat: number;
        lon: number;
        accuracy: number;
      };
    };
  };

  // ========== Target (What Was Accessed) ==========
  target: {
    type: 'DATABASE_RECORD' | 'FILE' | 'API_RESOURCE' |
          'CONFIGURATION' | 'SECRET' | 'REPORT';

    // Resource identification
    id: string;                   // Unique resource identifier
    collection: string;           // Table, bucket, or path

    // For database access - record details
    records?: {
      count: number;              // Number of records accessed
      ids?: string[];             // Specific IDs (if reasonable count)
      range?: {                   // For range queries
        field: string;
        start: string;
        end: string;
      };
    };

    // Data classification
    classification: {
      level: 'PUBLIC' | 'INTERNAL' | 'CONFIDENTIAL' | 'RESTRICTED';
      categories: string[];       // ['PII', 'PHI', 'FINANCIAL']
    };

    // Fields accessed (for field-level logging)
    fields?: {
      requested: string[];        // Fields in the query
      returned: string[];         // Fields actually returned
      filtered: string[];         // Fields redacted by policy
    };
  };

  // ========== Access Details (How and What) ==========
  access: {
    operation: 'READ' | 'CREATE' | 'UPDATE' | 'DELETE' |
               'EXPORT' | 'SHARE' | 'DOWNLOAD' | 'PRINT';

    // How the access occurred
    channel: 'WEB_UI' | 'API' | 'MOBILE_APP' | 'CLI' |
             'DATABASE_DIRECT' | 'BATCH_JOB' | 'REPORT';

    // API/endpoint details
    endpoint?: {
      method: string;             // GET, POST, etc.
      path: string;               // /api/v1/customers
      queryParams?: object;       // Non-sensitive params
    };

    // Query details (for database access)
    query?: {
      type: 'SELECT' | 'INSERT' | 'UPDATE' | 'DELETE';
      sanitizedQuery?: string;    // Query with values removed
      executionTimeMs: number;
    };

    // Business justification (if captured)
    justification?: {
      required: boolean;          // Was break-glass required
      reason?: string;            // User-provided reason
      approver?: string;          // Who approved (if workflow)
    };
  };

  // ========== Result (What Happened) ==========
  result: {
    status: 'SUCCESS' | 'DENIED' | 'PARTIAL' | 'ERROR';

    // For denials
    denialReason?: string;        // 'INSUFFICIENT_PERMISSION' | 'RATE_LIMITED'
    policyViolated?: string;      // Which policy triggered denial

    // For successful access
    recordsReturned?: number;     // How many records
    bytesReturned?: number;       // Data volume
    truncated?: boolean;          // Was result set limited

    // Response time
    durationMs: number;
  };

  // ========== Context (Why) ==========
  context: {
    requestId: string;            // Correlation ID
    traceId?: string;             // Distributed tracing
    sessionId: string;            // User session

    // Application context
    application: string;          // Which app generated this
    environment: string;          // production, staging
    version: string;              // App version

    // Business context
    workflow?: string;            // Business process involved
    ticketId?: string;            // Support ticket triggering access
  };
}
```

Log what was accessed, never the actual values. 'User read SSN for record 12345' is appropriate. 'User read SSN 123-45-6789' is a security disaster: you've just duplicated sensitive data into your logging system, creating a new breach surface.
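The "names, not values" rule can be enforced mechanically at the point where a log entry is built. The sketch below is a minimal illustration; `describeFieldAccess` and `FieldAccessRecord` are hypothetical names, not part of any library.

```typescript
interface FieldAccessRecord {
  recordId: string;
  fields: string[]; // field names only, never the values
}

// Build a log-safe description of a row that was read.
// Only the keys of the row are copied; the values never leave this function.
function describeFieldAccess(
  recordId: string,
  row: Record<string, unknown>
): FieldAccessRecord {
  return { recordId, fields: Object.keys(row).sort() };
}

// Example row containing a sensitive value (sample data, not real).
const sampleRow = { ssn: '123-45-6789', name: 'Jane Doe' };
const logEntry = describeFieldAccess('12345', sampleRow);
console.log(JSON.stringify(logEntry));
```

Because the helper only ever sees `Object.keys`, a sensitive value cannot leak into the log even if a new column is added to the table later.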
Implementing comprehensive access logging requires instrumentation at multiple layers. The right pattern depends on your architecture, but most enterprise systems need logging at application, API gateway, and database layers.
```
┌────────────────────────────────────────────────────────────────────┐
│                           USER / CLIENT                            │
└─────────────────────────────────┬──────────────────────────────────┘
                                  │ Request with Auth Token
                                  ▼
┌────────────────────────────────────────────────────────────────────┐
│                         API GATEWAY LAYER                          │
│ ┌────────────────────────────────────────────────────────────────┐ │
│ │  Kong / AWS API Gateway / Envoy                                │ │
│ │                                                                │ │
│ │  ACCESS LOG #1: Gateway Access Log                             │ │
│ │  • Client IP, User-Agent                                       │ │
│ │  • Request path, method, headers                               │ │
│ │  • Authentication result (token validated)                     │ │
│ │  • Response code, latency                                      │ │
│ │  • Rate limiting decisions                                     │ │
│ └────────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────┬──────────────────────────────────┘
                                  │
                                  ▼
┌────────────────────────────────────────────────────────────────────┐
│                         APPLICATION LAYER                          │
│ ┌────────────────────────────────────────────────────────────────┐ │
│ │  Node.js / Python / Java Service                               │ │
│ │                                                                │ │
│ │  ACCESS LOG #2: Application Access Log                         │ │
│ │  • User identity (from token claims)                           │ │
│ │  • Business operation being performed                          │ │
│ │  • Authorization decision (permitted?)                         │ │
│ │  • Which resources/record IDs accessed                         │ │
│ │  • Data classification of accessed resources                   │ │
│ │  • Business context (workflow, ticket, reason)                 │ │
│ │                                                                │ │
│ │ ┌────────────────────────────────────────────────────────────┐ │ │
│ │ │  Access Logging Middleware / Interceptor                   │ │ │
│ │ │                                                            │ │ │
│ │ │  • Wraps all data access operations                        │ │ │
│ │ │  • Captures pre/post execution context                     │ │ │
│ │ │  • Handles async/batch operations                          │ │ │
│ │ │  • Standardized format across services                     │ │ │
│ │ └────────────────────────────────────────────────────────────┘ │ │
│ └────────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────┬──────────────────────────────────┘
                                  │
                                  ▼
┌────────────────────────────────────────────────────────────────────┐
│                           DATABASE LAYER                           │
│ ┌────────────────────────────────────────────────────────────────┐ │
│ │  PostgreSQL / MySQL / MongoDB / S3                             │ │
│ │                                                                │ │
│ │  ACCESS LOG #3: Database Audit Log                             │ │
│ │  • Actual SQL/queries executed                                 │ │
│ │  • Tables and columns accessed                                 │ │
│ │  • Rows affected/returned                                      │ │
│ │  • Connection identity                                         │ │
│ │  • Query execution time                                        │ │
│ │                                                                │ │
│ │  PostgreSQL: pgaudit extension                                 │ │
│ │  MySQL: Enterprise Audit                                       │ │
│ │  MongoDB: auditLog (Enterprise)                                │ │
│ │  S3: Server Access Logging                                     │ │
│ └────────────────────────────────────────────────────────────────┘ │
└────────────────────────────────────────────────────────────────────┘
```
```typescript
import { AccessLogger } from '@company/access-logging';
// Repository, DataClassification, RequestContext, QuerySpec, ExportCriteria,
// ExportResult, and AlertService are application types defined elsewhere.

/**
 * Data access wrapper that ensures all access is logged
 */
class AuditedDataAccess<T extends { id: string }> {
  private accessLogger: AccessLogger;
  private repository: Repository<T>;
  private alertService: AlertService;

  constructor(
    repository: Repository<T>,
    alertService: AlertService,
    config: {
      resourceType: string;
      classification: DataClassification;
    }
  ) {
    this.accessLogger = new AccessLogger(config);
    this.repository = repository;
    this.alertService = alertService;
  }

  async findById(
    id: string,
    context: RequestContext
  ): Promise<T | null> {
    const accessEvent = this.accessLogger.startAccess({
      actor: context.authenticatedUser,
      source: context.requestSource,
      target: {
        type: 'DATABASE_RECORD',
        id: id,
        collection: this.repository.tableName,
      },
      access: {
        operation: 'READ',
        channel: context.channel,
      },
    });

    try {
      const result = await this.repository.findById(id);

      // A miss is still a successful lookup; recordsReturned tells them apart
      await accessEvent.complete({
        status: 'SUCCESS',
        recordsReturned: result ? 1 : 0,
      });

      return result;
    } catch (error) {
      // Log failed access, then let the caller handle the error
      await accessEvent.error(error);
      throw error;
    }
  }

  async findByQuery(
    query: QuerySpec,
    context: RequestContext
  ): Promise<T[]> {
    const accessEvent = this.accessLogger.startAccess({
      actor: context.authenticatedUser,
      source: context.requestSource,
      target: {
        type: 'DATABASE_RECORD',
        collection: this.repository.tableName,
        // Log query parameters, not values
        records: {
          range: query.getRange(),
        },
      },
      access: {
        operation: 'READ',
        channel: context.channel,
        query: {
          type: 'SELECT',
          sanitizedQuery: query.toSanitizedString(), // No values!
        },
      },
    });

    try {
      const results = await this.repository.query(query);

      await accessEvent.complete({
        status: 'SUCCESS',
        recordsReturned: results.length,
        // Log IDs only when the count is small enough to store
        recordIds: results.length <= 100
          ? results.map(r => r.id)
          : undefined,
        truncated: results.length >= query.limit,
      });

      return results;
    } catch (error) {
      await accessEvent.error(error);
      throw error;
    }
  }

  /**
   * Bulk export requires extra logging
   */
  async exportBulk(
    criteria: ExportCriteria,
    context: RequestContext
  ): Promise<ExportResult> {
    // Bulk operations always require enhanced logging
    const accessEvent = this.accessLogger.startAccess({
      actor: context.authenticatedUser,
      source: context.requestSource,
      target: {
        type: 'DATABASE_RECORD',
        collection: this.repository.tableName,
        classification: { level: 'CONFIDENTIAL', categories: ['BULK_EXPORT'] },
      },
      access: {
        operation: 'EXPORT',
        channel: context.channel,
        justification: {
          required: true,
          reason: context.exportJustification,
          approver: context.exportApprover,
        },
      },
    });

    try {
      const result = await this.repository.exportBulk(criteria);

      await accessEvent.complete({
        status: 'SUCCESS',
        recordsReturned: result.recordCount,
        bytesReturned: result.sizeBytes,
      });

      // Additional alert for bulk export
      await this.alertService.notify('BULK_EXPORT', {
        user: context.authenticatedUser,
        recordCount: result.recordCount,
        destination: result.destination,
      });

      return result;
    } catch (error) {
      // Failed exports are logged too; attempts matter for forensics
      await accessEvent.error(error);
      throw error;
    }
  }
}
```

Application-layer logging captures intended access, but database-layer logging captures actual access. Both are essential: together they provide defense in depth and catch access that bypasses application controls (direct database connections, administrative queries).
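One way to make use of having both layers is to reconcile them: a database event with no matching application event suggests access that bypassed the application. The sketch below assumes the application propagates its request ID into the database session so that both layers can record it; that propagation mechanism, the `LayerEvent` shape, and `findUnmatchedDbAccess` are all illustrative assumptions.

```typescript
interface LayerEvent {
  requestId?: string; // missing on DB events that did not come via the app
  actorId: string;
  collection: string;
  timestamp: string;
}

// Return database-layer events that no application-layer event accounts for.
function findUnmatchedDbAccess(
  appEvents: LayerEvent[],
  dbEvents: LayerEvent[]
): LayerEvent[] {
  const knownRequestIds = new Set<string>();
  for (const event of appEvents) {
    if (event.requestId !== undefined) {
      knownRequestIds.add(event.requestId);
    }
  }
  return dbEvents.filter(
    event => event.requestId === undefined || !knownRequestIds.has(event.requestId)
  );
}

// Sample data (invented for illustration).
const appLayer: LayerEvent[] = [
  { requestId: 'req-1', actorId: 'u-7', collection: 'customers', timestamp: '2024-01-01T10:00:00Z' },
];
const dbLayer: LayerEvent[] = [
  { requestId: 'req-1', actorId: 'app_service_role', collection: 'customers', timestamp: '2024-01-01T10:00:00Z' },
  { actorId: 'dba_alice', collection: 'customers', timestamp: '2024-01-01T10:05:00Z' },
];
const suspicious = findUnmatchedDbAccess(appLayer, dbLayer);
console.log(suspicious.map(e => e.actorId));
```

Unmatched events are not necessarily malicious (migrations and administrative work also produce them), so in practice the output would feed a review queue, not an automatic block.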
```sql
-- Enable pgaudit extension
-- (requires shared_preload_libraries = 'pgaudit' and a server restart first)
CREATE EXTENSION IF NOT EXISTS pgaudit;

-- Configure audit logging for sensitive tables
-- pgaudit has no per-table setting; object auditing logs statements on
-- any object that the designated audit role holds privileges for
CREATE ROLE auditor NOLOGIN;
ALTER SYSTEM SET pgaudit.role = 'auditor';  -- apply with SELECT pg_reload_conf();
-- Log all DML on customer tables
GRANT SELECT, INSERT, UPDATE, DELETE ON customers TO auditor;

-- Configure role-based auditing
-- Log everything for admin role
ALTER ROLE admin_role SET pgaudit.log = 'all';

-- Application role: log only reads on sensitive data
ALTER ROLE app_service_role SET pgaudit.log = 'read';

-- Set session-level logging parameters
-- Log all DDL and DML statements
SET pgaudit.log = 'ddl, write';

-- Log the actual parameter values (careful: PII exposure!)
-- Usually disabled for sensitive data
SET pgaudit.log_parameter = off;

-- Include system catalog queries
SET pgaudit.log_catalog = on;

-- Log statement outcomes
SET pgaudit.log_level = log;

-- Example output in postgresql logs:
-- AUDIT: SESSION,1,1,READ,SELECT,TABLE,public.customers,
--   "SELECT id, email FROM customers WHERE id = $1"
```
```typescript
import { setTimeout as sleep } from 'node:timers/promises';
import { KinesisClient, GetRecordsCommand } from '@aws-sdk/client-kinesis';
import { KMSClient, DecryptCommand } from '@aws-sdk/client-kms';
// AccessLogIngester, AccessLogEvent, and generateUUID are application code
// defined elsewhere; decryptRecord and mapCommand are omitted for brevity.

type DbActivity = DatabaseActivityEvent['databaseActivityEvents'][number];

interface DatabaseActivityEvent {
  type: 'DatabaseActivityMonitoringRecords';
  version: '1.1';
  databaseActivityEvents: {
    logTime: string;
    statementId: number;
    substatementId: number;
    objectType: 'TABLE' | 'FUNCTION' | 'INDEX';
    command: 'SELECT' | 'INSERT' | 'UPDATE' | 'DELETE';
    objectName: string;
    databaseName: string;
    dbUserName: string;
    remoteHost: string;
    remotePort: number;
    sessionId: string;
    rowCount: number;
    commandText: string;  // Full SQL (be careful!)
    paramList: string[];
    errorMessage?: string;
  }[];
}

class RDSActivityMonitor {
  private kinesis: KinesisClient;
  private kms: KMSClient;
  private accessLogger: AccessLogIngester;

  async processActivityStream(shardIterator: string): Promise<void> {
    // NextShardIterator is undefined once the shard is closed, ending the loop
    let iterator: string | undefined = shardIterator;

    while (iterator) {
      const response = await this.kinesis.send(new GetRecordsCommand({
        ShardIterator: iterator,
        Limit: 1000,
      }));

      for (const record of response.Records || []) {
        // Activity stream records are encrypted
        const decrypted = await this.decryptRecord(record.Data);
        const events = JSON.parse(decrypted) as DatabaseActivityEvent;

        for (const dbEvent of events.databaseActivityEvents) {
          await this.transformAndIngest(dbEvent);
        }
      }

      iterator = response.NextShardIterator;
      await sleep(100); // Rate control
    }
  }

  private async transformAndIngest(event: DbActivity): Promise<void> {
    // Transform RDS format to standard access log format.
    // Fields the stream cannot supply (roles, context, durations) are
    // left for the ingester to fill in or default.
    const accessLog = {
      id: generateUUID(),
      timestamp: event.logTime,
      actor: {
        type: 'SERVICE',
        id: event.dbUserName,
        authSession: {
          sessionId: event.sessionId,
          authMethod: 'DATABASE_AUTH',
        },
      },
      source: {
        ip: event.remoteHost,
        port: event.remotePort,
      },
      target: {
        type: 'DATABASE_RECORD',
        id: event.objectName,
        collection: `${event.databaseName}.${event.objectName}`,
      },
      access: {
        operation: this.mapCommand(event.command),
        channel: 'DATABASE_DIRECT',
        query: {
          type: event.command,
          sanitizedQuery: this.sanitizeQuery(event.commandText),
        },
      },
      result: {
        status: event.errorMessage ? 'ERROR' : 'SUCCESS',
        recordsReturned: event.rowCount,
      },
    };

    await this.accessLogger.ingest(accessLog);
  }

  private sanitizeQuery(query: string): string {
    // Remove literal values to prevent PII logging
    return query
      .replace(/'(?:[^']|'')*'/g, '?')   // String literals, including '' escapes
      .replace(/\b\d+(?:\.\d+)?\b/g, '?') // Standalone numbers, not digits in names
      .replace(/\s+/g, ' ')               // Normalize whitespace
      .trim();
  }
}
```

APIs are the primary interface for modern applications. Comprehensive API access logging captures who calls which endpoints, with what parameters, and what data is returned, without logging sensitive values.
```typescript
import { Request, Response, NextFunction, RequestHandler } from 'express';
import { AccessLogger, AccessLogEvent } from '@company/audit';
// generateUUID, extractSource, extractResourceCollection,
// mapHttpMethodToOperation, and extractRecordCount are application helpers
// defined elsewhere.

interface AccessLoggingConfig {
  // Fields to exclude from logging (e.g., password, token)
  sensitiveParams: string[];
  sensitiveHeaders: string[];

  // Endpoints with special handling
  excludedPaths: string[];       // Don't log these (health checks)
  bulkOperationPaths: string[];  // Enhanced logging for these

  // Response body logging (careful!)
  logResponseBody: boolean;      // Usually false
  responseBodyMaxSize: number;   // Truncate if enabled
}

function createAccessLoggingMiddleware(
  accessLogger: AccessLogger,
  config: AccessLoggingConfig
): RequestHandler {
  return (req: Request, res: Response, next: NextFunction) => {
    // Skip excluded paths
    if (config.excludedPaths.some(p => req.path.startsWith(p))) {
      return next();
    }

    const startTime = Date.now();
    const requestId = (req.headers['x-request-id'] as string) || generateUUID();

    // Capture original write/end to intercept the response body
    const originalWrite = res.write;
    const originalEnd = res.end;
    const chunks: Buffer[] = [];

    if (config.logResponseBody) {
      res.write = function (chunk: any, ...args: any[]): boolean {
        chunks.push(Buffer.from(chunk));
        return originalWrite.apply(res, [chunk, ...args] as any);
      } as typeof res.write;

      res.end = function (chunk: any, ...args: any[]): Response {
        if (chunk) chunks.push(Buffer.from(chunk));
        return originalEnd.apply(res, [chunk, ...args] as any);
      } as typeof res.end;
    }

    // Create the access log event once the response has been sent
    res.on('finish', async () => {
      const duration = Date.now() - startTime;
      const denied = res.statusCode === 401 || res.statusCode === 403;

      const accessEvent: AccessLogEvent = {
        id: generateUUID(),
        timestamp: new Date().toISOString(),
        actor: extractActor(req),
        source: extractSource(req),
        target: {
          type: 'API_RESOURCE',
          id: req.path,
          collection: extractResourceCollection(req.path),
        },
        access: {
          operation: mapHttpMethodToOperation(req.method),
          channel: 'API',
          endpoint: {
            method: req.method,
            path: req.path,
            queryParams: sanitizeParams(req.query, config.sensitiveParams),
          },
        },
        result: {
          status: res.statusCode < 400 ? 'SUCCESS' : denied ? 'DENIED' : 'ERROR',
          durationMs: duration,
          bytesReturned: parseInt(String(res.get('content-length') || '0'), 10),
          // Extract record count from response if available
          recordsReturned: extractRecordCount(res),
        },
        context: {
          requestId,
          traceId: req.headers['x-trace-id'] as string,
          sessionId: req.session?.id,
          application: 'api-gateway',
          environment: process.env.NODE_ENV || 'development',
          version: process.env.APP_VERSION || 'unknown',
        },
      };

      // Enhanced logging for bulk operations
      if (config.bulkOperationPaths.some(p => req.path.startsWith(p))) {
        accessEvent.access.operation = 'EXPORT';
        accessEvent.target.classification = {
          level: 'CONFIDENTIAL',
          categories: ['BULK_OPERATION'],
        };
      }

      try {
        await accessLogger.log(accessEvent);
      } catch (error) {
        // A logging failure must not become an unhandled rejection
        console.error('access log write failed', error);
      }
    });

    next();
  };
}

// Helper functions
function extractActor(req: Request): AccessLogEvent['actor'] {
  const user = req.user; // From authentication middleware

  return {
    type: user?.serviceAccount ? 'SERVICE' : 'USER',
    id: user?.id || 'anonymous',
    displayName: user?.name,
    email: user?.email,
    authSession: {
      sessionId: req.session?.id || 'no-session',
      authMethod: req.authInfo?.method || 'UNKNOWN',
      authTime: req.authInfo?.authTime,
      mfaVerified: req.authInfo?.mfaVerified || false,
    },
    roles: user?.roles || [],
  };
}

function sanitizeParams(
  params: Record<string, unknown>,
  sensitiveFields: string[]
): Record<string, unknown> {
  const sanitized: Record<string, unknown> = { ...params };

  for (const field of sensitiveFields) {
    if (field in sanitized) {
      sanitized[field] = '[REDACTED]';
    }
  }

  return sanitized;
}
```

Individual access logs are data points; correlated access logs tell stories. Effective forensics requires connecting logs across layers, services, and time to reconstruct complete access patterns.
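The simplest form of correlation is grouping events from every layer by a shared request ID and ordering them in time. The following is a minimal in-memory sketch; the `CorrelatableEvent` shape and `groupByRequest` are hypothetical, and it assumes all timestamps are ISO 8601 strings in UTC with the same precision, so that plain string comparison matches chronological order.

```typescript
interface CorrelatableEvent {
  requestId: string;
  layer: 'GATEWAY' | 'APPLICATION' | 'DATABASE';
  timestamp: string; // ISO 8601, UTC, uniform precision (assumption)
  summary: string;
}

// Group events by request ID, each group sorted chronologically.
function groupByRequest(
  events: CorrelatableEvent[]
): Map<string, CorrelatableEvent[]> {
  const journeys = new Map<string, CorrelatableEvent[]>();

  for (const event of events) {
    const journey = journeys.get(event.requestId) ?? [];
    journey.push(event);
    journeys.set(event.requestId, journey);
  }

  journeys.forEach(journey => {
    journey.sort((a, b) =>
      a.timestamp < b.timestamp ? -1 : a.timestamp > b.timestamp ? 1 : 0
    );
  });

  return journeys;
}

// Sample events (invented), deliberately out of order.
const sampleEvents: CorrelatableEvent[] = [
  { requestId: 'req-42', layer: 'DATABASE', timestamp: '2024-01-01T10:00:00.300Z', summary: 'SELECT on customers' },
  { requestId: 'req-42', layer: 'GATEWAY', timestamp: '2024-01-01T10:00:00.100Z', summary: 'GET /api/v1/customers' },
  { requestId: 'req-99', layer: 'GATEWAY', timestamp: '2024-01-01T10:00:00.150Z', summary: 'GET /health' },
  { requestId: 'req-42', layer: 'APPLICATION', timestamp: '2024-01-01T10:00:00.200Z', summary: 'READ customers' },
];
const journeys = groupByRequest(sampleEvents);
console.log(journeys.get('req-42')?.map(e => e.layer));
```

At production volume the same grouping would normally be done by the log store, but the underlying requirement is identical: every layer must record the same correlation ID.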
```typescript
// ElasticsearchClient, AccessSummary, AccessorList, RequestJourney,
// BreachScopeReport, AnomalyDetectionOptions, and Anomaly are application
// types defined elsewhere; the detect* helpers are omitted for brevity.

interface AccessInvestigationService {
  /**
   * What did a user access during breach window?
   */
  getUserAccessHistory(
    userId: string,
    startTime: Date,
    endTime: Date
  ): Promise<AccessSummary>;

  /**
   * Who accessed a specific record?
   */
  getRecordAccessHistory(
    recordId: string,
    options?: { limit?: number; includeServiceAccess?: boolean }
  ): Promise<AccessorList>;

  /**
   * Reconstruct a complete request journey
   */
  traceRequest(requestId: string): Promise<RequestJourney>;
}

// Example: OpenSearch/Elasticsearch queries for investigation
class AccessInvestigator {
  private esClient: ElasticsearchClient;

  /**
   * Breach scope assessment: What sensitive data did this user access?
   */
  async assessBreachScope(
    compromisedUserId: string,
    breachStart: Date,
    breachEnd: Date
  ): Promise<BreachScopeReport> {
    const response = await this.esClient.search({
      index: 'access-logs-*',
      body: {
        size: 0, // Aggregations only, no individual hits
        query: {
          bool: {
            must: [
              { term: { 'actor.id': compromisedUserId } },
              {
                range: {
                  timestamp: {
                    gte: breachStart.toISOString(),
                    lte: breachEnd.toISOString(),
                  },
                },
              },
              { terms: { 'target.classification.categories': ['PII', 'PHI', 'FINANCIAL'] } },
              { term: { 'result.status': 'SUCCESS' } },
            ],
          },
        },
        aggs: {
          by_classification: {
            terms: { field: 'target.classification.categories' },
          },
          by_resource: {
            terms: { field: 'target.collection', size: 100 },
            aggs: {
              unique_records: {
                cardinality: { field: 'target.id' },
              },
              sample_ids: {
                terms: { field: 'target.id', size: 10 },
              },
            },
          },
          by_operation: {
            terms: { field: 'access.operation' },
          },
          unique_records_accessed: {
            cardinality: { field: 'target.id' },
          },
          access_timeline: {
            date_histogram: {
              field: 'timestamp',
              calendar_interval: 'hour',
            },
          },
        },
      },
    });

    return this.formatBreachReport(response.aggregations);
  }

  /**
   * Detect anomalous access patterns
   */
  async detectAnomalies(options: AnomalyDetectionOptions): Promise<Anomaly[]> {
    // Volume-based: unusual access count
    const volumeAnomalies = await this.detectVolumeAnomalies(options);

    // Time-based: access outside normal hours
    const timeAnomalies = await this.detectTimeAnomalies(options);

    // Geography-based: access from unusual locations
    const geoAnomalies = await this.detectGeoAnomalies(options);

    // Behavior-based: unusual resources accessed
    const behaviorAnomalies = await this.detectBehaviorAnomalies(options);

    return [...volumeAnomalies, ...timeAnomalies, ...geoAnomalies, ...behaviorAnomalies];
  }
}
```

Access logging is the heartbeat of security observability. Without it, breaches are invisible, forensics are impossible, and compliance is unverifiable. With comprehensive access logging, organizations can answer the most critical security question: 'What data was accessed?'
What's Next
With access logging in place, we complete the module with compliance reporting—transforming raw audit and access data into the reports, dashboards, and attestations that auditors require. You'll learn how to automate evidence generation, demonstrate continuous compliance, and respond effectively to audit requests.
You now understand how to implement comprehensive access logging across application layers. From schema design through multi-layer implementation to correlation and analysis, you can build systems that answer the critical forensic question: 'What did they access?'