Audit and Logging for Compliance

Level: Advanced

Duration: 60 mins

Access Logging

The Question That Matters Most

When a breach is discovered—and breaches are discovered, always—the first question isn't 'How did they get in?' It's 'What did they access?'

In January 2021, the Accellion breach affected dozens of organizations including law firms, universities, and government agencies. For weeks after discovery, many victims couldn't answer basic questions: Which files were accessed? Whose data was exposed? What was the scope of the breach? Organizations with comprehensive access logging answered in hours; others took months—or never fully determined the impact.

Access logging is the discipline of recording every significant data access: who touched what data, when, from where, and with what result. It transforms the invisible flow of data through systems into an auditable, reconstructable history.

What You Will Learn

By the end of this page, you'll understand what access events to log, how to capture them consistently across heterogeneous systems, how to structure access logs for efficient investigation, and how to balance comprehensive logging with performance and privacy constraints.

Defining Access Logging

Access logging is the systematic recording of data access events—any action where an identity (human or machine) reads, modifies, or interacts with protected resources. This goes beyond simple authentication logging ('user logged in') to capture the complete data access story.

The Access Questions

Comprehensive access logging answers the fundamental security questions:

Who: Which identity performed the access? Human user? Service account? API key?
What: What resource was accessed? Specific records? Files? API endpoints?
When: Precise timestamp with sub-second granularity
Where: From what location? IP address, device, geographic region
How: Through what mechanism? Web UI, API, direct database, export?
Why: What was the business context? Request ID, session context, stated purpose
Result: Was access granted or denied? Full or partial? What was returned?

Access Logging vs. Related Logging Types
Log Type	Focus	Answers	Example
Authentication Logs	Identity verification	Did user prove identity?	'User john@example.com logged in via SSO'
Authorization Logs	Permission decisions	Was access allowed?	'User granted READ on /api/users/*'
Access Logs	Data interaction	What data was touched?	'User read records 1-50 from Customer table'
Activity Logs	Business operations	What actions occurred?	'User approved purchase order #12345'
Audit Logs	Complete trail	Full compliance record	Combined record of all of the above

Access Logging is Granular

The distinction matters: knowing 'user logged in' doesn't tell you what they accessed. Knowing 'user queried customer database' is better but not precise. Knowing 'user read customer records 451-475 in the Northwest region' is actionable for breach scope assessment.

Defining Access Log Scope

Not every data access requires logging. The scope should be driven by sensitivity classification, regulatory requirements, and forensic value. Logging everything creates noise and cost; logging too little leaves investigative gaps.

Always Log These Accesses

•All access to PII (names, SSN, addresses)
•All access to PHI (medical records, diagnoses)
•All access to financial data (accounts, transactions)
•All access to credentials or secrets
•All access to source code and IP
•All bulk data operations (exports, reports)
•All administrative/privileged access
•All API access from external parties
•All access denied events
•All access to audit logs themselves

Typically Excluded

•Public, read-only content
•Static assets (CSS, images)
•Health check endpoints
•Caching layer hits
•Internal service-to-service heartbeats
•Anonymous aggregate metrics
•Very high-volume, low-value reads
•Development/test environment access
•Automated monitoring queries
•CDN and edge cache operations
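The two lists above can be encoded as a small policy function. The following is a minimal sketch under stated assumptions, not a prescribed implementation: the `AccessDescriptor` shape, the path prefixes, and the name `shouldLogAccess` are all hypothetical and chosen only for illustration. It assumes that an "always log" rule takes precedence over any exclusion.

```typescript
// Hypothetical description of one access; every name here is illustrative.
interface AccessDescriptor {
  level: 'PUBLIC' | 'INTERNAL' | 'CONFIDENTIAL' | 'RESTRICTED';
  categories: string[];       // e.g. ['PII', 'PHI', 'FINANCIAL']; empty if none
  bulk: boolean;              // Export, report, or other bulk operation
  privileged: boolean;        // Administrative or privileged access
  externalParty: boolean;     // API call from an external party
  denied: boolean;            // The access was refused
  environment: 'production' | 'staging' | 'development' | 'test';
  path?: string;              // Request path, if the access came over HTTP
}

// Assumed prefixes for health checks and static assets.
const EXCLUDED_PATH_PREFIXES = ['/health', '/static/'];

function shouldLogAccess(a: AccessDescriptor): boolean {
  // "Always log" rules win over every exclusion below.
  if (a.denied || a.bulk || a.privileged || a.externalParty) return true;
  if (a.categories.length > 0) return true;

  // "Typically excluded" rules apply only to non-sensitive access.
  if (a.environment === 'development' || a.environment === 'test') return false;
  const path = a.path;
  if (path !== undefined && EXCLUDED_PATH_PREFIXES.some(p => path.indexOf(p) === 0)) {
    return false;
  }
  if (a.level === 'PUBLIC') return false;

  // Everything else is a policy decision; this sketch chooses to log it.
  return true;
}
```

Putting the inclusion rules first means a denied request to a health-check path, or PII touched in a test environment, is still recorded. Whether that precedence is right for a given system is a policy choice.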

Granularity Spectrum

Access logging granularity exists on a spectrum with cost/performance implications:

Level	Description	Use Case	Cost
Endpoint	API route accessed	Basic compliance	Low
Resource	Specific object/record ID	Breach scope	Medium
Field	Individual data fields accessed	Healthcare, financial	High
Value	Actual data values returned	Rarely needed, high risk	Very High

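Most organizations use resource-level logging for sensitive data and endpoint-level logging for general access. Field-level logging is commonly adopted for PHI: the HIPAA Security Rule requires audit controls (45 CFR 164.312(b)) but does not itself prescribe a granularity, so the level chosen is a risk-based decision.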
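As an illustration of how the granularity choice might be tied to data classification, here is a short sketch. The mapping itself (PHI to field level, other sensitive data to resource level, everything else to endpoint level) is an assumption drawn from the table above, and `granularityFor` is a hypothetical name.

```typescript
type ClassificationLevel = 'PUBLIC' | 'INTERNAL' | 'CONFIDENTIAL' | 'RESTRICTED';
type Granularity = 'ENDPOINT' | 'RESOURCE' | 'FIELD';

// Picks a logging granularity from the target's classification.
// Value-level logging is deliberately never selected.
function granularityFor(level: ClassificationLevel, categories: string[]): Granularity {
  // Assumed rule: health data gets field-level logging.
  if (categories.indexOf('PHI') !== -1) return 'FIELD';

  // Any other sensitive classification gets record-level detail.
  if (level === 'CONFIDENTIAL' || level === 'RESTRICTED' || categories.length > 0) {
    return 'RESOURCE';
  }

  // General access: the route alone is enough.
  return 'ENDPOINT';
}
```

A real policy would likely be configuration-driven rather than hard-coded, so that compliance teams can adjust it without a deployment.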

Access Log Event Schema

A well-designed access log schema captures all information needed for forensics and compliance while remaining queryable at scale. The schema must balance completeness with practical storage and query considerations.

interface AccessLogEvent {
  // ========== Core Identification ==========
  id: string;                    // UUID v7 (time-ordered)
  timestamp: string;             // ISO 8601 with microseconds
  
  // ========== Actor (Who) ==========
  actor: {
    // Primary identifier - must be immutable
    type: 'USER' | 'SERVICE' | 'SYSTEM' | 'API_KEY';
    id: string;                  // Permanent, unique identifier
    
    // Human-readable attributes (may change over time)
    displayName?: string;        // e.g., "John Smith"
    email?: string;              // e.g., "john@company.com"
    
    // Authentication context
    authSession: {
      sessionId: string;         // Links to authentication event
      authMethod: string;        // 'SSO_SAML' | 'MFA_TOTP' | 'API_KEY'
      authTime: string;          // When authentication occurred
      mfaVerified: boolean;      // Whether MFA was used
    };
    
    // Role/permissions at access time
    roles: string[];             // ['admin', 'hr_viewer']
    effectivePermissions?: string[]; // Computed permissions used
  };
  
  // ========== Source (Where From) ==========
  source: {
    ip: string;                  // Client IP (consider proxies)
    originalIp?: string;         // X-Forwarded-For if applicable
    userAgent: string;           // Browser/client identification
    
    // Device identification (if available)
    device?: {
      id: string;                // Device fingerprint or registered ID
      type: 'DESKTOP' | 'MOBILE' | 'TABLET' | 'API_CLIENT';
      trusted: boolean;          // Is this a known/trusted device
    };
    
    // Geolocation (privacy considerations apply)
    geo?: {
      country: string;           // ISO country code
      region?: string;           // State/province
      city?: string;             // Only if allowed by policy
      coordinates?: {            // Rarely needed
        lat: number;
        lon: number;
        accuracy: number;
      };
    };
  };
  
  // ========== Target (What Was Accessed) ==========
  target: {
    type: 'DATABASE_RECORD' | 'FILE' | 'API_RESOURCE' | 
          'CONFIGURATION' | 'SECRET' | 'REPORT';
    
    // Resource identification
    id: string;                  // Unique resource identifier
    collection: string;          // Table, bucket, or path
    
    // For database access - record details
    records?: {
      count: number;             // Number of records accessed
      ids?: string[];            // Specific IDs (if reasonable count)
      range?: {                  // For range queries
        field: string;
        start: string;
        end: string;
      };
    };
    
    // Data classification
    classification: {
      level: 'PUBLIC' | 'INTERNAL' | 'CONFIDENTIAL' | 'RESTRICTED';
      categories: string[];      // ['PII', 'PHI', 'FINANCIAL']
    };
    
    // Fields accessed (for field-level logging)
    fields?: {
      requested: string[];       // Fields in the query
      returned: string[];        // Fields actually returned
      filtered: string[];        // Fields redacted by policy
    };
  };
  
  // ========== Access Details (How and What) ==========
  access: {
    operation: 'READ' | 'CREATE' | 'UPDATE' | 'DELETE' | 
               'EXPORT' | 'SHARE' | 'DOWNLOAD' | 'PRINT';
    
    // How the access occurred
    channel: 'WEB_UI' | 'API' | 'MOBILE_APP' | 'CLI' | 
             'DATABASE_DIRECT' | 'BATCH_JOB' | 'REPORT';
    
    // API/endpoint details
    endpoint?: {
      method: string;            // GET, POST, etc.
      path: string;              // /api/v1/customers
      queryParams?: object;      // Non-sensitive params
    };
    
    // Query details (for database access)
    query?: {
      type: 'SELECT' | 'INSERT' | 'UPDATE' | 'DELETE';
      sanitizedQuery?: string;   // Query with values removed
      executionTimeMs: number;
    };
    
    // Business justification (if captured)
    justification?: {
      required: boolean;         // Was break-glass required
      reason?: string;           // User-provided reason
      approver?: string;         // Who approved (if workflow)
    };
  };
  
  // ========== Result (What Happened) ==========
  result: {
    status: 'SUCCESS' | 'DENIED' | 'PARTIAL' | 'ERROR';
    
    // For denials
    denialReason?: string;       // 'INSUFFICIENT_PERMISSION' | 'RATE_LIMITED'
    policyViolated?: string;     // Which policy triggered denial
    
    // For successful access
    recordsReturned?: number;    // How many records
    bytesReturned?: number;      // Data volume
    truncated?: boolean;         // Was result set limited
    
    // Response time
    durationMs: number;
  };
  
  // ========== Context (Why) ==========
  context: {
    requestId: string;           // Correlation ID
    traceId?: string;            // Distributed tracing
    sessionId: string;           // User session
    
    // Application context
    application: string;         // Which app generated this
    environment: string;         // production, staging
    version: string;             // App version
    
    // Business context
    workflow?: string;           // Business process involved
    ticketId?: string;           // Support ticket triggering access
  };
}

Never Log the Data Values

Log what was accessed, never the actual values. 'User read SSN for record 12345' is appropriate. 'User read SSN 123-45-6789' is a security disaster—you've just duplicated sensitive data into your logging system, creating a new breach surface.
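One way to follow this rule at field-level granularity is to derive the log entry from the field names of a record and discard the values entirely. The sketch below assumes a hypothetical helper, `summarizeFieldAccess`, and a caller-supplied list of fields that policy redacted; its output mirrors the `returned` and `filtered` arrays of the schema above.

```typescript
interface FieldAccessSummary {
  recordId: string;
  fields: { returned: string[]; filtered: string[] };
}

// Builds a log-safe summary of a record access: field names only, never values.
function summarizeFieldAccess(
  recordId: string,
  record: { [field: string]: unknown },
  redactedByPolicy: string[]
): FieldAccessSummary {
  const names = Object.keys(record);
  return {
    recordId,
    fields: {
      // Fields whose values reached the caller
      returned: names.filter(n => redactedByPolicy.indexOf(n) === -1),
      // Fields that policy withheld from the caller
      filtered: names.filter(n => redactedByPolicy.indexOf(n) !== -1),
    },
  };
}

// Usage: the SSN value never reaches the logging system.
const exampleSummary = summarizeFieldAccess(
  '12345',
  { ssn: '123-45-6789', email: 'jane@example.com' },
  ['ssn']
);
```

Because the function only ever reads `Object.keys(record)`, a value cannot leak into the log even if a new sensitive field is added to the record later.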

Access Logging Implementation

Implementing comprehensive access logging requires instrumentation at multiple layers. The right pattern depends on your architecture, but most enterprise systems need logging at application, API gateway, and database layers.

┌─────────────────────────────────────────────────────────────────────────────────┐
│                               USER / CLIENT                                      │
└───────────────────────────────────┬──────────────────────────────────────────────┘
                                    │ Request with Auth Token
                                    ▼
┌─────────────────────────────────────────────────────────────────────────────────┐
│                            API GATEWAY LAYER                                     │
│  ┌───────────────────────────────────────────────────────────────────────────┐  │
│  │                     Kong / AWS API Gateway / Envoy                         │  │
│  │                                                                            │  │
│  │  ACCESS LOG #1: Gateway Access Log                                         │  │
│  │  • Client IP, User-Agent                                                   │  │
│  │  • Request path, method, headers                                           │  │
│  │  • Authentication result (token validated)                                 │  │
│  │  • Response code, latency                                                  │  │
│  │  • Rate limiting decisions                                                 │  │
│  └───────────────────────────────────────────────────────────────────────────┘  │
└───────────────────────────────────┬──────────────────────────────────────────────┘
                                    │
                                    ▼
┌─────────────────────────────────────────────────────────────────────────────────┐
│                            APPLICATION LAYER                                     │
│  ┌───────────────────────────────────────────────────────────────────────────┐  │
│  │                     Node.js / Python / Java Service                        │  │
│  │                                                                            │  │
│  │  ACCESS LOG #2: Application Access Log                                     │  │
│  │  • User identity (from token claims)                                       │  │
│  │  • Business operation being performed                                      │  │
│  │  • Authorization decision (permitted?)                                     │  │
│  │  • Which resources/record IDs accessed                                     │  │
│  │  • Data classification of accessed resources                               │  │
│  │  • Business context (workflow, ticket, reason)                             │  │
│  │                                                                            │  │
│  │  ┌────────────────────────────────────────────────────────────────────┐   │  │
│  │  │  Access Logging Middleware / Interceptor                            │   │  │
│  │  │                                                                     │   │  │
│  │  │  • Wraps all data access operations                                │   │  │
│  │  │  • Captures pre/post execution context                             │   │  │
│  │  │  • Handles async/batch operations                                  │   │  │
│  │  │  • Standardized format across services                             │   │  │
│  │  └────────────────────────────────────────────────────────────────────┘   │  │
│  └───────────────────────────────────────────────────────────────────────────┘  │
└───────────────────────────────────┬──────────────────────────────────────────────┘
                                    │
                                    ▼
┌─────────────────────────────────────────────────────────────────────────────────┐
│                            DATABASE LAYER                                        │
│  ┌───────────────────────────────────────────────────────────────────────────┐  │
│  │                PostgreSQL / MySQL / MongoDB / S3                           │  │
│  │                                                                            │  │
│  │  ACCESS LOG #3: Database Audit Log                                         │  │
│  │  • Actual SQL/queries executed                                             │  │
│  │  • Tables and columns accessed                                             │  │
│  │  • Rows affected/returned                                                  │  │
│  │  • Connection identity                                                     │  │
│  │  • Query execution time                                                    │  │
│  │                                                                            │  │
│  │  PostgreSQL: pgaudit extension                                             │  │
│  │  MySQL: Enterprise Audit                                                   │  │
│  │  MongoDB: auditLog                                                         │  │
│  │  S3: Server Access Logging                                                 │  │
│  └───────────────────────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────────────────────┘

import { AccessLogger } from '@company/access-logging';
 
/**
 * Data access wrapper that ensures all access is logged
 */
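class AuditedDataAccess<T extends { id: string }> {
  private accessLogger: AccessLogger;
  private repository: Repository<T>;
  // Assumed notification service (hypothetical type); exportBulk uses it
  private alertService: AlertService;
  
  constructor(
    repository: Repository<T>,
    alertService: AlertService,
    config: {
      resourceType: string;
      classification: DataClassification;
    }
  ) {
    this.accessLogger = new AccessLogger(config);
    this.repository = repository;
    this.alertService = alertService;
  }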
  
  async findById(
    id: string,
    context: RequestContext
  ): Promise<T | null> {
    const accessEvent = this.accessLogger.startAccess({
      actor: context.authenticatedUser,
      source: context.requestSource,
      target: {
        type: 'DATABASE_RECORD',
        id: id,
        collection: this.repository.tableName,
      },
      access: {
        operation: 'READ',
        channel: context.channel,
      },
    });
    
    try {
      const result = await this.repository.findById(id);
      
      // Log successful access
      await accessEvent.complete({
        // A lookup that finds nothing is still a successful access;
        // recordsReturned distinguishes a hit (1) from a miss (0)
        status: 'SUCCESS',
        recordsReturned: result ? 1 : 0,
      });
      
      return result;
    } catch (error) {
      // Log failed access
      await accessEvent.error(error);
      throw error;
    }
  }
  
  async findByQuery(
    query: QuerySpec,
    context: RequestContext
  ): Promise<T[]> {
    const accessEvent = this.accessLogger.startAccess({
      actor: context.authenticatedUser,
      source: context.requestSource,
      target: {
        type: 'DATABASE_RECORD',
        collection: this.repository.tableName,
        // Log query parameters, not values
        records: {
          range: query.getRange(),
        },
      },
      access: {
        operation: 'READ',
        channel: context.channel,
        query: {
          type: 'SELECT',
          sanitizedQuery: query.toSanitizedString(), // No values!
        },
      },
    });
    
    try {
      const results = await this.repository.query(query);
      
      await accessEvent.complete({
        status: 'SUCCESS',
        recordsReturned: results.length,
        // Log IDs if reasonable count
        recordIds: results.length <= 100 
          ? results.map(r => r.id) 
          : undefined,
        truncated: results.length >= query.limit,
      });
      
      return results;
    } catch (error) {
      await accessEvent.error(error);
      throw error;
    }
  }
  
  /**
   * Bulk export requires extra logging
   */
  async exportBulk(
    criteria: ExportCriteria,
    context: RequestContext
  ): Promise<ExportResult> {
    // Bulk operations always require enhanced logging
    const accessEvent = this.accessLogger.startAccess({
      actor: context.authenticatedUser,
      source: context.requestSource,
      target: {
        type: 'DATABASE_RECORD',
        collection: this.repository.tableName,
        classification: { level: 'CONFIDENTIAL', categories: ['BULK_EXPORT'] },
      },
      access: {
        operation: 'EXPORT',
        channel: context.channel,
        justification: {
          required: true,
          reason: context.exportJustification,
          approver: context.exportApprover,
        },
      },
    });
    
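    let result: ExportResult;
    try {
      result = await this.repository.exportBulk(criteria);
    } catch (error) {
      // A failed export attempt is still an access event worth recording
      await accessEvent.error(error);
      throw error;
    }
    
    await accessEvent.complete({
      status: 'SUCCESS',
      recordsReturned: result.recordCount,
      bytesReturned: result.sizeBytes,
    });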
    
    // Additional alert for bulk export
    await this.alertService.notify('BULK_EXPORT', {
      user: context.authenticatedUser,
      recordCount: result.recordCount,
      destination: result.destination,
    });
    
    return result;
  }
}

Database-Level Access Logging

Application-layer logging captures intended access, but database-layer logging captures actual access. Both are essential—they provide defense in depth and catch access that bypasses application controls (direct database connections, administrative queries).
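To make the "bypass" case concrete, the sketch below compares the two layers and returns database events that no application event accounts for. It is a heuristic under stated assumptions: the `LayerEvent` shape and `findUnexplainedDbAccess` are hypothetical, and matching is done only on collection name and a time window. A propagated request identifier, where available, would likely give a more reliable match.

```typescript
// Simplified, hypothetical view of an event from either layer.
interface LayerEvent {
  timestampMs: number;   // Epoch milliseconds
  collection: string;    // Table or collection name
  actorId: string;       // App user (application layer) or DB role (database layer)
}

// Returns database events with no application event on the same
// collection within windowMs. These are candidates for direct or
// administrative access that bypassed the application.
function findUnexplainedDbAccess(
  appEvents: LayerEvent[],
  dbEvents: LayerEvent[],
  windowMs: number
): LayerEvent[] {
  return dbEvents.filter(db =>
    !appEvents.some(app =>
      app.collection === db.collection &&
      Math.abs(app.timestampMs - db.timestampMs) <= windowMs
    )
  );
}
```

The nested scan is quadratic in the number of events, which is acceptable for a bounded investigation window but would need an index or a sorted merge for continuous monitoring.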

Database Audit Capabilities

•PostgreSQL pgaudit — Extension logging SELECT, DML, DDL at configurable granularity. Can log by role, by database, by object. Supports session and object auditing.
•MySQL Enterprise Audit — Plugin providing query-level logging with JSON output. Configurable policies for which queries to audit.
•AWS RDS/Aurora Activity Streams — Real-time stream of database activity to Kinesis. Enables near-real-time monitoring without impacting database performance.
•MongoDB Audit Log — Built-in auditing for all operations. Configurable filters by user, role, operation type, and namespace.
•Azure SQL Auditing — Native auditing to Azure storage or Log Analytics with extended events.
•S3 Server Access Logging — Detailed logs of every request to S3 buckets, essential for file storage access tracking.

-- Enable pgaudit extension
-- (pgaudit must first be listed in shared_preload_libraries)
CREATE EXTENSION IF NOT EXISTS pgaudit;

-- Object-level auditing for sensitive tables.
-- pgaudit has no per-table setting; a statement is audited when the role
-- named in pgaudit.role holds the matching privilege on the object.
-- The role name "auditor" is illustrative.
CREATE ROLE auditor NOLOGIN;
SET pgaudit.role = 'auditor';
GRANT SELECT, INSERT, UPDATE, DELETE ON customers TO auditor;

-- Configure role-based session auditing
-- Log everything for admin role
ALTER ROLE admin_role SET pgaudit.log = 'all';

-- Application role: log only reads
ALTER ROLE app_service_role SET pgaudit.log = 'read';

-- Set session-level logging parameters (superuser only)
-- Log DDL and data-modifying statements
SET pgaudit.log = 'ddl, write';

-- Do not log statement parameter values (PII exposure risk)
SET pgaudit.log_parameter = off;

-- Include statements whose relations are all in pg_catalog
SET pgaudit.log_catalog = on;

-- Severity level at which audit entries are written to the log
SET pgaudit.log_level = log;

-- Example object-audit entry in the PostgreSQL log:
-- AUDIT: OBJECT,1,1,READ,SELECT,TABLE,public.customers,
--        "SELECT id, email FROM customers WHERE id = $1",<not logged>

import { KinesisClient, GetRecordsCommand } from '@aws-sdk/client-kinesis';
import { KMSClient, DecryptCommand } from '@aws-sdk/client-kms';
 
interface DatabaseActivityEvent {
  type: 'DatabaseActivityMonitoringRecords';
  version: '1.1';
  databaseActivityEvents: {
    logTime: string;
    statementId: number;
    substatementId: number;
    objectType: 'TABLE' | 'FUNCTION' | 'INDEX';
    command: 'SELECT' | 'INSERT' | 'UPDATE' | 'DELETE';
    objectName: string;
    databaseName: string;
    dbUserName: string;
    remoteHost: string;
    remotePort: number;
    sessionId: string;
    rowCount: number;
    commandText: string;  // Full SQL (be careful!)
    paramList: string[];
    errorMessage?: string;
  }[];
}
 
class RDSActivityMonitor {
  private kinesis: KinesisClient;
  private kms: KMSClient;
  private accessLogger: AccessLogIngester;
  
  async processActivityStream(shardIterator: string): Promise<void> {
    while (true) {
      const response = await this.kinesis.send(new GetRecordsCommand({
        ShardIterator: shardIterator,
        Limit: 1000,
      }));
      
      for (const record of response.Records || []) {
        // Activity stream records are encrypted
        const decrypted = await this.decryptRecord(record.Data);
        const events = JSON.parse(decrypted) as DatabaseActivityEvent;
        
        for (const dbEvent of events.databaseActivityEvents) {
          await this.transformAndIngest(dbEvent);
        }
      }
      
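      // NextShardIterator is undefined once the shard has been closed
      if (!response.NextShardIterator) break;
      shardIterator = response.NextShardIterator;
      await new Promise(resolve => setTimeout(resolve, 100)); // Rate control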
    }
  }
  
  private async transformAndIngest(event: any): Promise<void> {
    // Transform RDS format to standard access log format
    const accessLog: AccessLogEvent = {
      id: generateUUID(),
      timestamp: event.logTime,
      
      actor: {
        type: 'SERVICE',
        id: event.dbUserName,
        authSession: {
          sessionId: event.sessionId,
          authMethod: 'DATABASE_AUTH',
        },
      },
      
      source: {
        ip: event.remoteHost,
        port: event.remotePort,
      },
      
      target: {
        type: 'DATABASE_RECORD',
        id: event.objectName,
        collection: `${event.databaseName}.${event.objectName}`,
      },
      
      access: {
        operation: this.mapCommand(event.command),
        channel: 'DATABASE_DIRECT',
        query: {
          type: event.command,
          sanitizedQuery: this.sanitizeQuery(event.commandText),
          executionTimeMs: event.latency,
        },
      },
      
      result: {
        status: event.errorMessage ? 'ERROR' : 'SUCCESS',
        recordsReturned: event.rowCount,
      },
    };
    
    await this.accessLogger.ingest(accessLog);
  }
  
  private sanitizeQuery(query: string): string {
    // Remove literal values to prevent PII logging
    return query
      .replace(/'(?:[^']|'')*'/g, '?')                 // String literals, including '' escapes
      .replace(/(?<![\w$])\d+(?:\.\d+)?(?!\w)/g, '?')  // Numeric literals, but not table1 or $1
      .replace(/\s+/g, ' ')                            // Normalize whitespace
      .trim();
  }
}

API Access Logging

APIs are the primary interface for modern applications. Comprehensive API access logging captures who calls which endpoints, with what parameters, and what data is returned—without logging sensitive values.

import { Request, Response, NextFunction, RequestHandler } from 'express';
import { AccessLogger, AccessLogEvent } from '@company/audit';
 
interface AccessLoggingConfig {
  // Fields to exclude from logging (e.g., password, token)
  sensitiveParams: string[];
  sensitiveHeaders: string[];
  
  // Endpoints with special handling
  excludedPaths: string[];        // Don't log these (health checks)
  bulkOperationPaths: string[];   // Enhanced logging for these
  
  // Response body logging (careful!)
  logResponseBody: boolean;       // Usually false
  responseBodyMaxSize: number;    // Truncate if enabled
}
 
function createAccessLoggingMiddleware(
  accessLogger: AccessLogger,
  config: AccessLoggingConfig
): RequestHandler {
  
  return async (req: Request, res: Response, next: NextFunction) => {
    // Skip excluded paths
    if (config.excludedPaths.some(p => req.path.startsWith(p))) {
      return next();
    }
    
    const startTime = Date.now();
    const requestId = req.headers['x-request-id'] as string || generateUUID();
    
    // Capture original write to intercept response
    const originalWrite = res.write;
    const originalEnd = res.end;
    const chunks: Buffer[] = [];
    
    if (config.logResponseBody) {
      res.write = function(chunk: any, ...args: any[]): boolean {
        chunks.push(Buffer.from(chunk));
        return originalWrite.apply(res, [chunk, ...args]);
      };
      
      res.end = function(chunk: any, ...args: any[]): Response {
        if (chunk) chunks.push(Buffer.from(chunk));
        return originalEnd.apply(res, [chunk, ...args]);
      };
    }
    
    // Create access log event
    res.on('finish', async () => {
      const duration = Date.now() - startTime;
      
      const accessEvent: AccessLogEvent = {
        id: generateUUID(),
        timestamp: new Date().toISOString(),
        
        actor: extractActor(req),
        source: extractSource(req),
        
        target: {
          type: 'API_RESOURCE',
          id: req.path,
          collection: extractResourceCollection(req.path),
        },
        
        access: {
          operation: mapHttpMethodToOperation(req.method),
          channel: 'API',
          endpoint: {
            method: req.method,
            path: req.path,
            queryParams: sanitizeParams(req.query, config.sensitiveParams),
          },
        },
        
        result: {
          status: res.statusCode < 400 ? 'SUCCESS' : 
                  res.statusCode === 403 ? 'DENIED' : 'ERROR',
          durationMs: duration,
          bytesReturned: parseInt(res.get('content-length') || '0'),
          // Extract record count from response if available
          recordsReturned: extractRecordCount(res),
        },
        
        context: {
          requestId,
          traceId: req.headers['x-trace-id'] as string,
          sessionId: req.session?.id,
          application: 'api-gateway',
          environment: process.env.NODE_ENV || 'development',
          version: process.env.APP_VERSION || 'unknown',
        },
      };
      
      // Enhanced logging for bulk operations
      if (config.bulkOperationPaths.some(p => req.path.startsWith(p))) {
        accessEvent.access.operation = 'EXPORT';
        accessEvent.target.classification = {
          level: 'CONFIDENTIAL',
          categories: ['BULK_OPERATION'],
        };
      }
      
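      // The response is already sent; a logging failure must not
      // surface as an unhandled promise rejection
      try {
        await accessLogger.log(accessEvent);
      } catch (err) {
        console.error('Access log write failed', err);
      }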
    });
    
    next();
  };
}
 
// Helper functions
function extractActor(req: Request): AccessLogEvent['actor'] {
  const user = req.user; // From authentication middleware
  
  return {
    type: user?.serviceAccount ? 'SERVICE' : 'USER',
    id: user?.id || 'anonymous',
    displayName: user?.name,
    email: user?.email,
    authSession: {
      sessionId: req.session?.id || 'no-session',
      authMethod: req.authInfo?.method || 'UNKNOWN',
      authTime: req.authInfo?.authTime,
      mfaVerified: req.authInfo?.mfaVerified || false,
    },
    roles: user?.roles || [],
  };
}
 
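function sanitizeParams(
  params: Record<string, unknown>,
  sensitiveFields: string[]
): Record<string, unknown> {
  // Shallow copy so the original request object is never mutated
  const sanitized: Record<string, unknown> = { ...params };
  for (const field of sensitiveFields) {
    if (field in sanitized) {
      sanitized[field] = '[REDACTED]';
    }
  }
  return sanitized;
}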
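The configuration above declares `sensitiveHeaders`, but the middleware as shown never reads it. A header sanitizer along the following lines could fill that gap. This is a sketch only: `sanitizeHeaders` is a hypothetical helper, and it assumes header names should be compared case-insensitively, since HTTP header names are case-insensitive.

```typescript
type HeaderValue = string | string[] | undefined;

// Returns a copy of the headers with sensitive values replaced.
// Header names are compared case-insensitively.
function sanitizeHeaders(
  headers: { [name: string]: HeaderValue },
  sensitiveHeaders: string[]
): { [name: string]: HeaderValue } {
  const blocked = sensitiveHeaders.map(h => h.toLowerCase());
  const sanitized: { [name: string]: HeaderValue } = {};
  for (const name of Object.keys(headers)) {
    sanitized[name] =
      blocked.indexOf(name.toLowerCase()) !== -1 ? '[REDACTED]' : headers[name];
  }
  return sanitized;
}

// Usage with typical credentials-bearing headers
const exampleHeaders = sanitizeHeaders(
  { Authorization: 'Bearer abc123', 'user-agent': 'curl/8.0' },
  ['authorization', 'cookie']
);
```

Keeping the header name while redacting the value preserves the forensic signal that a credential was presented, without copying the credential into the log store.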

Access Log Correlation and Analysis

Individual access logs are data points; correlated access logs tell stories. Effective forensics requires connecting logs across layers, services, and time to reconstruct complete access patterns.

Correlation Keys

•Request ID — Unique identifier propagated through all services handling a single user request. Essential for tracing one action through microservices.
•Session ID — Links all actions by a user within a session. Key for investigating "what did this user do during this session?"
•User ID — Stable identifier across sessions. Required for answering "what has this user ever accessed?"
•Trace ID — Distributed tracing identifier (e.g., W3C Trace Context). Links logs to APM/distributed tracing systems.
•Resource ID — Target resource identifier. Enables "who has accessed this record?" queries.
•Time Window — Temporal correlation for events lacking explicit links. "What else happened in this 5-second window?"
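The first and last keys in this list can be sketched in a few lines. The example below groups events by request ID and, for events that carry no explicit link, falls back to a time window around an anchor timestamp. The `CorrelatableEvent` shape and both function names are assumptions made for illustration; a production system would normally run these as queries against the log store rather than in memory.

```typescript
// Minimal, hypothetical projection of an access log event.
interface CorrelatableEvent {
  timestamp: string;     // ISO 8601
  requestId?: string;    // Absent when the layer could not propagate it
  layer: 'GATEWAY' | 'APPLICATION' | 'DATABASE';
}

// Groups events by request ID; each group is in chronological order.
function groupByRequest(
  events: CorrelatableEvent[]
): { [requestId: string]: CorrelatableEvent[] } {
  const sorted = events
    .slice()
    .sort((a, b) => Date.parse(a.timestamp) - Date.parse(b.timestamp));
  const groups: { [requestId: string]: CorrelatableEvent[] } = {};
  for (const e of sorted) {
    if (e.requestId === undefined) continue; // Needs time-window correlation
    const bucket = groups[e.requestId] || [];
    bucket.push(e);
    groups[e.requestId] = bucket;
  }
  return groups;
}

// Temporal fallback: events within windowMs of the anchor time.
function eventsInWindow(
  events: CorrelatableEvent[],
  anchorIso: string,
  windowMs: number
): CorrelatableEvent[] {
  const anchor = Date.parse(anchorIso);
  return events.filter(e => Math.abs(Date.parse(e.timestamp) - anchor) <= windowMs);
}
```

Time-window matches are circumstantial: they show that two events were close together, not that one caused the other, so they are best treated as leads to confirm through an explicit key.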

// Interface members cannot carry the `async` modifier; returning a Promise
// is what makes each method asynchronous.
interface AccessInvestigationService {
  /**
   * What did a user access during breach window?
   */
  getUserAccessHistory(
    userId: string,
    startTime: Date,
    endTime: Date
  ): Promise<AccessSummary>;
  
  /**
   * Who accessed a specific record?
   */
  getRecordAccessHistory(
    recordId: string,
    options?: { limit?: number; includeServiceAccess?: boolean }
  ): Promise<AccessorList>;
  
  /**
   * Reconstruct a complete request journey
   */
  traceRequest(requestId: string): Promise<RequestJourney>;
}
 
// Example: OpenSearch/Elasticsearch queries for investigation
 
class AccessInvestigator {
  // Inject the client so the field is always initialized
  constructor(private readonly esClient: ElasticsearchClient) {}
  
  /**
   * Breach scope assessment: What sensitive data did this user access?
   */
  async assessBreachScope(
    compromisedUserId: string,
    breachStart: Date,
    breachEnd: Date
  ): Promise<BreachScopeReport> {
    
    const response = await this.esClient.search({
      index: 'access-logs-*',
      body: {
        size: 0,
        query: {
          bool: {
            must: [
              { term: { 'actor.id': compromisedUserId } },
              { range: { timestamp: { gte: breachStart, lte: breachEnd } } },
              { terms: { 'target.classification.categories': ['PII', 'PHI', 'FINANCIAL'] } },
              { term: { 'result.status': 'SUCCESS' } },
            ],
          },
        },
        aggs: {
          by_classification: {
            terms: { field: 'target.classification.categories' },
          },
          by_resource: {
            terms: { field: 'target.collection', size: 100 },
            aggs: {
              unique_records: {
                cardinality: { field: 'target.id' },
              },
              sample_ids: {
                terms: { field: 'target.id', size: 10 },
              },
            },
          },
          by_operation: {
            terms: { field: 'access.operation' },
          },
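          // cardinality is an approximate count (HyperLogLog++); page through
          // the matching IDs separately when an exact list is required
          unique_records_accessed: {
            cardinality: { field: 'target.id' },
          },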
          access_timeline: {
            date_histogram: {
              field: 'timestamp',
              calendar_interval: 'hour',
            },
          },
        },
      },
    });
    
    return this.formatBreachReport(response.aggregations);
  }
  
  /**
   * Detect anomalous access patterns
   */
  async detectAnomalies(options: AnomalyDetectionOptions): Promise<Anomaly[]> {
    // Volume-based: unusual access count
    const volumeAnomalies = await this.detectVolumeAnomalies(options);
    
    // Time-based: access outside normal hours
    const timeAnomalies = await this.detectTimeAnomalies(options);
    
    // Geography-based: access from unusual locations
    const geoAnomalies = await this.detectGeoAnomalies(options);
    
    // Behavior-based: unusual resources accessed
    const behaviorAnomalies = await this.detectBehaviorAnomalies(options);
    
    return [...volumeAnomalies, ...timeAnomalies, ...geoAnomalies, ...behaviorAnomalies];
  }
}
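
The detector methods called by `detectAnomalies` are not shown on this page. As one possible shape for the volume-based check, the sketch below flags actors whose access count in the window under review sits far above their own historical baseline. The `VolumeSample` type, the function name `findVolumeAnomalies`, the z-score approach, and the default threshold of 3 are all assumptions chosen for illustration.

```typescript
interface VolumeSample {
  actorId: string;
  baselineCounts: number[]; // e.g. daily access counts from prior days
  currentCount: number;     // access count in the window under review
}

interface VolumeAnomaly {
  actorId: string;
  currentCount: number;
  mean: number;
  zScore: number;
}

// Flag actors whose current count exceeds their baseline mean by more
// than `threshold` sample standard deviations.
function findVolumeAnomalies(
  samples: VolumeSample[],
  threshold = 3
): VolumeAnomaly[] {
  const anomalies: VolumeAnomaly[] = [];
  for (const s of samples) {
    const n = s.baselineCounts.length;
    if (n < 2) continue; // not enough history to estimate spread

    const mean = s.baselineCounts.reduce((a, b) => a + b, 0) / n;
    const variance =
      s.baselineCounts.reduce((a, b) => a + (b - mean) ** 2, 0) / (n - 1);
    // Floor the deviation at 1 so a perfectly flat baseline
    // does not cause a division by zero.
    const stdDev = Math.max(Math.sqrt(variance), 1);

    const zScore = (s.currentCount - mean) / stdDev;
    if (zScore > threshold) {
      anomalies.push({
        actorId: s.actorId,
        currentCount: s.currentCount,
        mean,
        zScore,
      });
    }
  }
  return anomalies;
}
```

The baseline counts could come from a per-actor `date_histogram` aggregation like the `access_timeline` one above. A per-actor baseline matters because a volume that is routine for a batch service account may be highly unusual for a human user.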

Summary: Access Logging

Access logging is the heartbeat of security observability. Without it, breaches are invisible, forensics are impossible, and compliance is unverifiable. With comprehensive access logging, organizations can answer the most critical security question: 'What data was accessed?'

Key Takeaways

•Access logging answers the five W's — Who accessed what, when, where (from), and why. This is distinct from authentication and authorization logging.
•Scope by sensitivity — Not all access needs logging. Focus on sensitive data classes, privileged operations, and bulk access. Exclude noise.
•Log what, never values — Log resource identifiers and field names, never the actual data values. Never log passwords, tokens, or the PII held in the accessed records.
•Multi-layer logging provides defense in depth — Application, API gateway, and database layers each capture different perspectives on access.
•Correlation keys enable investigation — Request IDs, session IDs, and user IDs link logs across systems for complete access reconstruction.
•Access logs are evidence — They must meet the same integrity and retention standards as other audit logs.

What's Next

With access logging in place, we complete the module with compliance reporting—transforming raw audit and access data into the reports, dashboards, and attestations that auditors require. You'll learn how to automate evidence generation, demonstrate continuous compliance, and respond effectively to audit requests.

Page Complete

You now understand how to implement comprehensive access logging across application layers. From schema design through multi-layer implementation to correlation and analysis, you can build systems that answer the critical forensic question: 'What did they access?'

API Access Logging

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
import { Request, Response, NextFunction } from 'express';
import { AccessLogger, AccessLogEvent } from '@company/audit';
 
interface AccessLoggingConfig {
  // Fields to exclude from logging (e.g., password, token)
  sensitiveParams: string[];
  sensitiveHeaders: string[];
  
  // Endpoints with special handling
  excludedPaths: string[];        // Don't log these (health checks)
  bulkOperationPaths: string[];   // Enhanced logging for these
  
  // Response body logging (careful!)
  logResponseBody: boolean;       // Usually false
  responseBodyMaxSize: number;    // Truncate if enabled
}
 
function createAccessLoggingMiddleware(
  accessLogger: AccessLogger,
  config: AccessLoggingConfig
): RequestHandler {
  
  return async (req: Request, res: Response, next: NextFunction) => {
    // Skip excluded paths
    if (config.excludedPaths.some(p => req.path.startsWith(p))) {
      return next();
    }
    
    const startTime = Date.now();
    const requestId = req.headers['x-request-id'] as string || generateUUID();
    
    // Capture original write to intercept response
    const originalWrite = res.write;
    const originalEnd = res.end;
    const chunks: Buffer[] = [];
    
    if (config.logResponseBody) {
      res.write = function(chunk: any, ...args: any[]): boolean {
        chunks.push(Buffer.from(chunk));
        return originalWrite.apply(res, [chunk, ...args]);
      };
      
      res.end = function(chunk: any, ...args: any[]): Response {
        if (chunk) chunks.push(Buffer.from(chunk));
        return originalEnd.apply(res, [chunk, ...args]);
      };
    }
    
    // Create access log event
    res.on('finish', async () => {
      const duration = Date.now() - startTime;
      
      const accessEvent: AccessLogEvent = {
        id: generateUUID(),
        timestamp: new Date().toISOString(),
        
        actor: extractActor(req),
        source: extractSource(req),
        
        target: {
          type: 'API_RESOURCE',
          id: req.path,
          collection: extractResourceCollection(req.path),
        },
        
        access: {
          operation: mapHttpMethodToOperation(req.method),
          channel: 'API',
          endpoint: {
            method: req.method,
            path: req.path,
            queryParams: sanitizeParams(req.query, config.sensitiveParams),
          },
        },
        
        result: {
          status: res.statusCode < 400 ? 'SUCCESS' : 
                  res.statusCode === 403 ? 'DENIED' : 'ERROR',
          durationMs: duration,
          bytesReturned: parseInt(res.get('content-length') || '0'),
          // Extract record count from response if available
          recordsReturned: extractRecordCount(res),
        },
        
        context: {
          requestId,
          traceId: req.headers['x-trace-id'] as string,
          sessionId: req.session?.id,
          application: 'api-gateway',
          environment: process.env.NODE_ENV || 'development',
          version: process.env.APP_VERSION || 'unknown',
        },
      };
      
      // Enhanced logging for bulk operations
      if (config.bulkOperationPaths.some(p => req.path.startsWith(p))) {
        accessEvent.access.operation = 'EXPORT';
        accessEvent.target.classification = {
          level: 'CONFIDENTIAL',
          categories: ['BULK_OPERATION'],
        };
      }
      
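      // The response is already sent at this point, so a failed log write must be
      // caught here or it becomes an unhandled promise rejection
      try {
        await accessLogger.log(accessEvent);
      } catch (err) {
        console.error('Failed to write access log event', err);
      }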
    });
    
    next();
  };
}
 
// Helper functions
function extractActor(req: Request): AccessLogEvent['actor'] {
  const user = req.user; // From authentication middleware
  
  return {
    type: user?.serviceAccount ? 'SERVICE' : 'USER',
    id: user?.id || 'anonymous',
    displayName: user?.name,
    email: user?.email,
    authSession: {
      sessionId: req.session?.id || 'no-session',
      authMethod: req.authInfo?.method || 'UNKNOWN',
      authTime: req.authInfo?.authTime,
      mfaVerified: req.authInfo?.mfaVerified || false,
    },
    roles: user?.roles || [],
  };
}
 
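function sanitizeParams(
  params: Record<string, unknown>,
  sensitiveFields: string[]
): Record<string, unknown> {
  // Shallow copy so the original query object is never mutated
  const sanitized: Record<string, unknown> = { ...params };
  for (const field of sensitiveFields) {
    if (field in sanitized) {
      // Keep the key so investigators can see the parameter was present
      sanitized[field] = '[REDACTED]';
    }
  }
  return sanitized;
}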
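
The middleware above calls `mapHttpMethodToOperation` and `extractResourceCollection` without showing them. The following is a minimal sketch of what such helpers might look like, not the course's own implementation. The operation names other than `EXPORT` and the `/api/v1/<collection>/<id>` path convention are assumptions chosen for illustration.

```typescript
// Hypothetical operation vocabulary; only 'EXPORT' appears in the middleware itself.
type AccessOperation = 'READ' | 'CREATE' | 'UPDATE' | 'DELETE' | 'EXPORT' | 'UNKNOWN';

// Map an HTTP verb to a coarse access operation for the access log.
function mapHttpMethodToOperation(method: string): AccessOperation {
  switch (method.toUpperCase()) {
    case 'GET':
    case 'HEAD':
      return 'READ';
    case 'POST':
      return 'CREATE';
    case 'PUT':
    case 'PATCH':
      return 'UPDATE';
    case 'DELETE':
      return 'DELETE';
    default:
      return 'UNKNOWN';
  }
}

// Assumes paths shaped like /api/v1/<collection>/<id>; adapt to the real routing scheme.
function extractResourceCollection(path: string): string {
  const segments = path.split('/').filter((s) => s.length > 0);
  let i = 0;
  if (segments[i] === 'api') i++;                 // skip an optional "api" prefix
  if (/^v\d+$/.test(segments[i] ?? '')) i++;      // skip an optional version segment
  return segments[i] ?? 'unknown';
}
```

Under these assumptions, a `GET /api/v1/patients/123` request would be logged as a `READ` on the `patients` collection. Note that treating every `POST` as `CREATE` is a simplification: search endpoints that use `POST` would need an explicit override.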

Access Log Correlation and Analysis

Individual access logs are data points; correlated access logs tell stories. Effective forensics requires connecting logs across layers, services, and time to reconstruct complete access patterns.

Correlation Keys

•Request ID — Unique identifier propagated through all services handling a single user request. Essential for tracing one action through microservices.
•Session ID — Links all actions by a user within a session. Key for investigating "what did this user do during this session?"
•User ID — Stable identifier across sessions. Required for answering "what has this user ever accessed?"
•Trace ID — Distributed tracing identifier (e.g., W3C Trace Context). Links logs to APM/distributed tracing systems.
•Resource ID — Target resource identifier. Enables "who has accessed this record?" queries.
•Time Window — Temporal correlation for events lacking explicit links. "What else happened in this 5-second window?"
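
To make the Request ID and Trace ID keys concrete, here is a small sketch of extracting the trace ID from a W3C Trace Context `traceparent` header and building the headers to forward to downstream services. The function names are hypothetical. The `x-request-id` header matches the middleware shown earlier, while the use of `traceparent` (rather than a custom trace header) is an assumption for this example.

```typescript
// traceparent layout: <version>-<32 hex trace-id>-<16 hex parent-id>-<2 hex flags>
const TRACEPARENT_PATTERN = /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

// Returns the trace ID, or undefined when the header is missing or malformed.
// This sketch strictly accepts the four-field layout shown above.
function parseTraceId(traceparent: string | undefined): string | undefined {
  if (!traceparent) return undefined;
  const match = TRACEPARENT_PATTERN.exec(traceparent.trim());
  if (!match) return undefined;
  const version = match[1];
  const traceId = match[2];
  const parentId = match[3];
  if (version === 'ff') return undefined;            // version ff is reserved as invalid
  if (/^0+$/.test(traceId) || /^0+$/.test(parentId)) return undefined; // all-zero IDs are invalid
  return traceId;
}

// Build the headers to attach to every downstream call so that each
// service logs the same correlation keys.
function correlationHeaders(requestId: string, traceparent?: string): Record<string, string> {
  const headers: Record<string, string> = { 'x-request-id': requestId };
  if (traceparent && parseTraceId(traceparent) !== undefined) {
    headers['traceparent'] = traceparent;
  }
  return headers;
}
```

Dropping a malformed `traceparent` instead of forwarding it keeps bad values out of downstream logs, at the cost of losing the trace link for that request. The request ID still ties the logs together in that case.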

// Interface methods cannot carry the `async` modifier in TypeScript;
// the Promise return type alone declares them as asynchronous.
interface AccessInvestigationService {
  /**
   * What did a user access during the breach window?
   */
  getUserAccessHistory(
    userId: string,
    startTime: Date,
    endTime: Date
  ): Promise<AccessSummary>;
  
  /**
   * Who accessed a specific record?
   */
  getRecordAccessHistory(
    recordId: string,
    options?: { limit?: number; includeServiceAccess?: boolean }
  ): Promise<AccessorList>;
  
  /**
   * Reconstruct a complete request journey
   */
  traceRequest(requestId: string): Promise<RequestJourney>;
}
 
// Example: OpenSearch/Elasticsearch queries for investigation
 
class AccessInvestigator {
  // Inject the client so the field is always initialized
  constructor(private readonly esClient: ElasticsearchClient) {}
  
  /**
   * Breach scope assessment: What sensitive data did this user access?
   */
  async assessBreachScope(
    compromisedUserId: string,
    breachStart: Date,
    breachEnd: Date
  ): Promise<BreachScopeReport> {
    
    const response = await this.esClient.search({
      index: 'access-logs-*',
      body: {
        size: 0,
        query: {
          bool: {
            must: [
              { term: { 'actor.id': compromisedUserId } },
              { range: { timestamp: { gte: breachStart, lte: breachEnd } } },
              { terms: { 'target.classification.categories': ['PII', 'PHI', 'FINANCIAL'] } },
              { term: { 'result.status': 'SUCCESS' } },
            ],
          },
        },
        aggs: {
          by_classification: {
            terms: { field: 'target.classification.categories' },
          },
          by_resource: {
            terms: { field: 'target.collection', size: 100 },
            aggs: {
              unique_records: {
                cardinality: { field: 'target.id' },
              },
              sample_ids: {
                terms: { field: 'target.id', size: 10 },
              },
            },
          },
          by_operation: {
            terms: { field: 'access.operation' },
          },
          unique_records_accessed: {
            cardinality: { field: 'target.id' },
          },
          access_timeline: {
            date_histogram: {
              field: 'timestamp',
              calendar_interval: 'hour',
            },
          },
        },
      },
    });
    
    return this.formatBreachReport(response.aggregations);
  }
  
  /**
   * Detect anomalous access patterns
   */
  async detectAnomalies(options: AnomalyDetectionOptions): Promise<Anomaly[]> {
    // Volume-based: unusual access count
    const volumeAnomalies = await this.detectVolumeAnomalies(options);
    
    // Time-based: access outside normal hours
    const timeAnomalies = await this.detectTimeAnomalies(options);
    
    // Geography-based: access from unusual locations
    const geoAnomalies = await this.detectGeoAnomalies(options);
    
    // Behavior-based: unusual resources accessed
    const behaviorAnomalies = await this.detectBehaviorAnomalies(options);
    
    return [...volumeAnomalies, ...timeAnomalies, ...geoAnomalies, ...behaviorAnomalies];
  }
}
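
The `detectVolumeAnomalies` method above is left undefined. One simple way to approach it, sketched here under stated assumptions, is a z-score check of each actor's current access count against their own history. The type and function names, the per-day granularity, and the default threshold of 3 are all illustrative choices, not part of the original design.

```typescript
interface VolumeAnomaly {
  actorId: string;
  observed: number;      // access count in the period under review
  baselineMean: number;  // mean of the actor's historical counts
  zScore: number;        // how many standard deviations above the mean
}

// baseline: actorId -> historical per-day access counts
// current:  actorId -> access count for the day under review
function findVolumeAnomalies(
  baseline: Map<string, number[]>,
  current: Map<string, number>,
  threshold: number = 3
): VolumeAnomaly[] {
  const anomalies: VolumeAnomaly[] = [];
  current.forEach((observed, actorId) => {
    const history = baseline.get(actorId) ?? [];
    if (history.length < 2) return; // too little history to estimate spread
    const mean = history.reduce((sum, n) => sum + n, 0) / history.length;
    const variance =
      history.reduce((sum, n) => sum + (n - mean) * (n - mean), 0) / (history.length - 1);
    // Floor the deviation at 1 so a perfectly flat history cannot divide by zero
    const stdDev = Math.max(Math.sqrt(variance), 1);
    const zScore = (observed - mean) / stdDev;
    if (zScore > threshold) {
      anomalies.push({ actorId, observed, baselineMean: mean, zScore });
    }
  });
  return anomalies;
}
```

In practice the two maps would likely be filled from aggregations over the access log index, for example a per-actor `date_histogram`. Actors with no history are skipped here, so a separate rule would be needed to flag heavy access by brand-new accounts.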

Summary: Access Logging

Key Takeaways

•Access logging answers the five W's — Who accessed what, when, where (from), and why. This is distinct from authentication and authorization logging.
•Scope by sensitivity — Not all access needs logging. Focus on sensitive data classes, privileged operations, and bulk access. Exclude noise.
•Log what, never values — Log resource identifiers and field names, never the actual data values. Never log passwords, tokens, or PII.
•Multi-layer logging provides defense in depth — Application, API gateway, and database layers each capture different perspectives on access.
•Correlation keys enable investigation — Request IDs, session IDs, and user IDs link logs across systems for complete access reconstruction.
•Access logs are evidence — They must meet the same integrity and retention standards as other audit logs.
