Encryption at Rest - Learning Module

Database Encryption

Protecting Your Most Valuable Asset

Databases are the crown jewels of most organizations. They contain customer records, financial transactions, intellectual property, healthcare data, and the operational state of business-critical systems. When attackers breach systems, databases are typically the ultimate target—the place where the most valuable data resides in a conveniently queryable format.

The database encryption challenge is unique:

Unlike file systems or block storage, databases have sophisticated internal structures. Indexes enable fast queries. Relationships link tables together. Query optimizers plan execution paths. Transaction logs ensure durability. Encryption must protect data while preserving these capabilities—a far more complex challenge than simply encrypting files on disk.

This page explores the spectrum of database encryption approaches, from the simplest transparent solutions to granular field-level protection, helping you understand which approach fits your specific requirements.

What You Will Learn

By the end of this page, you will understand the full range of database encryption options: Transparent Data Encryption (TDE), column-level encryption, tablespace encryption, and client-side encryption. You'll learn how each approach works internally, what security guarantees it provides, how it affects performance, and how to choose the right approach for your compliance and security requirements.

The Database Encryption Landscape

Database encryption can be implemented at multiple granularities in the data stack. Each approach has distinct characteristics regarding performance, security guarantees, and operational complexity.

The four primary approaches:

Database Encryption Approaches
Approach	Description	Encrypted Layer	Who Controls Keys
Transparent Data Encryption (TDE)	Database encrypts data files transparently	Data files, logs, backups	Database administrator or KMS
Column/Cell-Level Encryption	Specific columns are encrypted individually	Selected columns within tables	Application or database
Tablespace Encryption	Entire tablespaces (groups of tables) encrypted	Tablespace data files	Database administrator
Client-Side Encryption	Application encrypts before sending to database	Selected fields or records	Application

Understanding the tradeoffs:

The fundamental tradeoff is between transparency and protection depth:

More transparent solutions (TDE, tablespace encryption) are easier to implement and require no application changes, but they primarily protect against storage-level threats—the database server process has access to unencrypted data.
More granular solutions (column-level, client-side) require more implementation effort but can protect data even from database administrators and internal threats.

Most enterprise deployments combine approaches: TDE as a baseline for compliance and defense against physical threats, plus column-level or client-side encryption for the most sensitive data like social security numbers, credit cards, or protected health information.

The Queryability Problem

A fundamental challenge of database encryption is that encrypted data cannot be efficiently queried. With standard randomized encryption, ciphertext looks random: an index on it is useless, range queries are impossible, and SQL functions cannot operate on the encrypted columns. Techniques such as deterministic encryption, searchable encryption, and order-preserving encryption attempt to bridge this gap, but each leaks some information or costs performance. We'll explore these later in the page.

Transparent Data Encryption (TDE) Deep Dive

Transparent Data Encryption (TDE) is the most widely deployed database encryption approach. It encrypts data files, transaction logs, and backups at the database engine level, without requiring any application changes. The 'transparent' name comes from the fact that applications continue to issue queries exactly as before—encryption and decryption happen automatically within the database engine.

How TDE works internally:

The database maintains a Data Encryption Key (DEK) that encrypts and decrypts data pages
The DEK is itself encrypted by a Key Encryption Key (KEK), also called the master key
The encrypted DEK is stored in the database header or system tables
When the database starts, the KEK is used to decrypt the DEK, which is then held in memory
As data pages are read from disk, they are decrypted in memory using the DEK
As data pages are written to disk, they are encrypted using the DEK
Applications work with unencrypted data in memory—they never see encrypted content
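
The steps above follow the general envelope-encryption pattern. The following is a minimal sketch of that pattern using Node's built-in crypto module, not any database engine's actual implementation. The `Sealed` type, the `seal` and `unseal` helpers, and the in-process KEK are assumptions made for illustration; a real engine keeps the KEK in a certificate store, KMS, or HSM.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from 'crypto';

// Hypothetical container for an AES-256-GCM ciphertext.
interface Sealed {
    iv: Buffer;
    data: Buffer;
    tag: Buffer;
}

// Encrypt a buffer under the given 32-byte key.
function seal(key: Buffer, plaintext: Buffer): Sealed {
    const iv = randomBytes(12); // 96-bit nonce, fresh for every call
    const cipher = createCipheriv('aes-256-gcm', key, iv);
    const data = Buffer.concat([cipher.update(plaintext), cipher.final()]);
    return { iv, data, tag: cipher.getAuthTag() };
}

// Decrypt and authenticate; throws if the key is wrong or the data was altered.
function unseal(key: Buffer, sealed: Sealed): Buffer {
    const decipher = createDecipheriv('aes-256-gcm', key, sealed.iv);
    decipher.setAuthTag(sealed.tag);
    return Buffer.concat([decipher.update(sealed.data), decipher.final()]);
}

// KEK: assumed to be in process memory only for this sketch.
const kek = randomBytes(32);

// DEK: generated once and persisted only in wrapped (encrypted) form.
const dek = randomBytes(32);
const wrappedDek = seal(kek, dek);

// "Startup": unwrap the DEK with the KEK and hold it in memory.
const dekInMemory = unseal(kek, wrappedDek);

// "Write path": a page is encrypted before it reaches disk.
const page = Buffer.from('row 1: alice | row 2: bob');
const pageOnDisk = seal(dekInMemory, page);

// "Read path": the page is decrypted as it is loaded into memory.
const pageInMemory = unseal(dekInMemory, pageOnDisk);
console.log(pageInMemory.toString('utf8'));
```

Anything that can reach `pageInMemory` sees plaintext, which is why TDE alone does not restrict what a database user can read.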

TDE Architecture
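-- Step 1: Create a database master key (DMK) in the master database.
-- The DMK protects the private key of the certificate created in step 2.
-- The password below is a placeholder; use a strong, unique one.
USE master;
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'StrongPassword123!';
 
-- Step 2: Create a certificate in master; it acts as the KEK for the DEK.
-- Back up this certificate and its private key: without them the
-- encrypted database cannot be restored on another server.
CREATE CERTIFICATE TDE_Certificate WITH SUBJECT = 'TDE Certificate';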
 
-- Step 3: Create the database encryption key (DEK)
USE MyDatabase;
CREATE DATABASE ENCRYPTION KEY
    WITH ALGORITHM = AES_256
    ENCRYPTION BY SERVER CERTIFICATE TDE_Certificate;
 
-- Step 4: Enable TDE on the database
ALTER DATABASE MyDatabase SET ENCRYPTION ON;
 
-- Verify encryption status
SELECT 
    db.name,
    db.is_encrypted,
    dm.encryption_state,
    dm.percent_complete,
    dm.key_algorithm,
    dm.key_length
FROM sys.databases db
JOIN sys.dm_database_encryption_keys dm
    ON db.database_id = dm.database_id;

What TDE protects:

Physical theft of database files or storage media
Unauthorized access to backup files and tapes
Decommissioned storage devices with residual data
Malicious storage administrators with filesystem access
In cloud environments: Provider employees with storage-level access

What TDE does NOT protect:

Database users with legitimate query access (data is decrypted for all queries)
Database administrators who can query any data
SQL injection attacks that operate through the application layer
Memory scraping attacks on the database server
Data exported or replicated to non-encrypted destinations

TDE ≠ Access Control

TDE is often misunderstood as a data access control mechanism. It is not. A user or application with SELECT privileges can read all data, encrypted or not—the database decrypts transparently. TDE protects against storage-layer threats, not application-layer or authorized user threats. For protecting data from privileged users (including DBAs), you need column-level or client-side encryption.

TDE Key Hierarchy and Management

TDE implementations use a hierarchical key structure for efficiency and security. Understanding this hierarchy is critical for proper key management and disaster recovery.

The three-tier key hierarchy:

Key Hierarchy Layers

•Master Key / Root Key — The top of the hierarchy. Protects all keys below it. Typically stored in a Key Management Service (KMS) or Hardware Security Module (HSM). This key is the ultimate trust anchor.
•Key Encryption Key (KEK) — Protects Data Encryption Keys. Often implemented as database certificates, server keys, or KMS keys. Enables key rotation without re-encrypting all data.
•Data Encryption Key (DEK) — The actual key used to encrypt/decrypt data pages. Stored encrypted (by the KEK) within the database. Held in memory during database operation.

Why a hierarchy?

The key hierarchy serves several critical purposes:

Key Rotation Efficiency — When you rotate the master key or KEK, you only need to re-encrypt the DEKs (tiny), not all the data (potentially terabytes). This makes routine key rotation practical.
Blast Radius Limitation — Different databases can have different DEKs, all protected by the same KEK. Compromise of one DEK doesn't expose other databases.
Access Control Layers — Different teams can manage different layers. A security team controls the master key; DBAs manage certificates; the database engine handles DEKs.
Recovery Point Separation — Backing up the KEK separately from database backups means you need both to recover data, preventing single points of exposure.
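
To make the rotation point concrete, here is a hedged sketch of rotating a KEK by re-wrapping only the DEK. The `Sealed` type and the `seal`, `unseal`, and `rotateKek` helpers are hypothetical names, and the keys are held in process memory purely for illustration.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from 'crypto';

// Hypothetical container for an AES-256-GCM ciphertext.
interface Sealed {
    iv: Buffer;
    data: Buffer;
    tag: Buffer;
}

function seal(key: Buffer, plaintext: Buffer): Sealed {
    const iv = randomBytes(12);
    const cipher = createCipheriv('aes-256-gcm', key, iv);
    const data = Buffer.concat([cipher.update(plaintext), cipher.final()]);
    return { iv, data, tag: cipher.getAuthTag() };
}

function unseal(key: Buffer, sealed: Sealed): Buffer {
    const decipher = createDecipheriv('aes-256-gcm', key, sealed.iv);
    decipher.setAuthTag(sealed.tag);
    return Buffer.concat([decipher.update(sealed.data), decipher.final()]);
}

// Re-wrap a DEK under a new KEK. Only the small wrapped key is rewritten.
function rotateKek(oldKek: Buffer, newKek: Buffer, wrapped: Sealed): Sealed {
    const dek = unseal(oldKek, wrapped);
    return seal(newKek, dek);
}

const oldKek = randomBytes(32);
const dek = randomBytes(32);
let wrappedDek = seal(oldKek, dek);

// Stand-in for a large volume of encrypted pages.
const dataOnDisk = seal(dek, Buffer.from('sensitive table contents'));

// Rotation: 32 bytes of key material are re-encrypted; dataOnDisk is untouched.
const newKek = randomBytes(32);
wrappedDek = rotateKek(oldKek, newKek, wrappedDek);

// The data is still readable through the new KEK.
const recovered = unseal(unseal(newKek, wrappedDek), dataOnDisk);
console.log(recovered.toString('utf8'));

// The old KEK no longer unwraps the stored DEK: GCM authentication fails.
let oldKekStillWorks = true;
try {
    unseal(oldKek, wrappedDek);
} catch {
    oldKekStillWorks = false;
}
console.log('old KEK still works:', oldKekStillWorks);
```

Note that this rotates the KEK only. Rotating the DEK itself does require re-encrypting the data it protects.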

Key Management Options by Database
Database	Local Keys	Cloud KMS Integration	HSM Support
SQL Server	Database master key, certificates	Azure Key Vault, AWS KMS (via EKM)	Yes (EKM provider)
Oracle	Oracle Wallet, auto-login wallet	OCI Vault, AWS KMS	Yes (TDE HSM)
MySQL	Keyring file, keyring encrypted file	Keyring AWS KMS, Keyring OCI	keyring_hashicorp plugin
PostgreSQL	pgcrypto (application-managed)	RDS KMS, Cloud SQL CMEK	Via application layer
MongoDB	Local key file	AWS KMS, Azure Key Vault, GCP KMS	KMIP integration

Modern Best Practice: External KMS

Modern deployments should use an external Key Management Service rather than local key files or certificates stored on the database server. Cloud KMS (AWS KMS, Google Cloud KMS, Azure Key Vault) offers automatic key rotation, access auditing, and ensures that the master key is never on the same machine as the encrypted data. This separation means that an attacker who steals the database host or its disks does not also obtain the key needed to decrypt them.

Column-Level Encryption

Column-level encryption (also called cell-level encryption) encrypts specific columns within a table rather than entire data files. This provides granular protection for the most sensitive data while leaving other columns in plaintext for efficient querying.

When to use column-level encryption:

You need to protect data even from database administrators
Compliance requirements mandate encryption of specific data elements (SSN, credit card, PHI)
You need different access permissions for different columns
You're implementing field-level access control
You want encryption to persist through database exports and replication

Column Encryption Examples
-- Always Encrypted is SQL Server's client-side encryption feature
-- Keys never leave the application; database never sees plaintext
 
-- Step 1: Create Column Master Key (stored in Azure Key Vault or cert store)
CREATE COLUMN MASTER KEY CMK_Main
WITH (
    KEY_STORE_PROVIDER_NAME = 'AZURE_KEY_VAULT',
    KEY_PATH = 'https://myvault.vault.azure.net/keys/MyCMK/abc123'
);
 
-- Step 2: Create Column Encryption Key
CREATE COLUMN ENCRYPTION KEY CEK_SSN
WITH VALUES (
    COLUMN_MASTER_KEY = CMK_Main,
    ALGORITHM = 'RSA_OAEP',
    ENCRYPTED_VALUE = 0x01...  -- Encrypted key value
);
 
-- Step 3: Create table with encrypted columns
CREATE TABLE Customers (
    CustomerID INT PRIMARY KEY,
    Name NVARCHAR(100),
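    -- Deterministic encryption of character columns requires a BIN2 collation
    SSN NVARCHAR(11) COLLATE Latin1_General_BIN2 ENCRYPTED WITH (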
        COLUMN_ENCRYPTION_KEY = CEK_SSN,
        ENCRYPTION_TYPE = DETERMINISTIC,  -- Allows equality comparisons
        ALGORITHM = 'AEAD_AES_256_CBC_HMAC_SHA_256'
    ),
    CreditCard NVARCHAR(20) ENCRYPTED WITH (
        COLUMN_ENCRYPTION_KEY = CEK_SSN,
        ENCRYPTION_TYPE = RANDOMIZED,  -- More secure, but no comparisons
        ALGORITHM = 'AEAD_AES_256_CBC_HMAC_SHA_256'
    )
);
 
-- Application connection string includes encryption flag
-- "Column Encryption Setting=Enabled"

Column Encryption Advantages

•Protects sensitive data from DBAs (with client-side)
•Encryption persists in exports and replication
•Fine-grained access control per column
•Satisfies strict compliance requirements
•Can combine with TDE for defense in depth

Column Encryption Challenges

•Indexes are only useful on deterministically encrypted columns, and only for equality lookups
•Range queries not possible (only equality w/ deterministic)
•Requires application changes for client-side
•Key management complexity increases
•Performance overhead for encryption/decryption

Deterministic vs. Randomized Encryption

Deterministic encryption always produces the same ciphertext for the same plaintext, enabling equality comparisons and indexing. However, it leaks information: anyone who can read the column can see which rows hold equal values, which enables frequency analysis on low-cardinality data. Randomized encryption produces different ciphertext each time, so equal values are indistinguishable, but the server cannot compare or index them. Choose based on your threat model: deterministic for searchable fields where you accept some information leakage; randomized for everything else.
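
The difference can be sketched in a few lines. This is an illustration of the concept only and is not the algorithm Always Encrypted uses. The keys `encKey` and `ivKey` and the helper names are assumptions, and the deterministic variant derives its IV from a keyed hash of the plaintext so that equal inputs yield equal outputs.

```typescript
import { createCipheriv, createHmac, randomBytes } from 'crypto';

const encKey = randomBytes(32); // hypothetical data encryption key
const ivKey = randomBytes(32);  // hypothetical separate key, used only to derive IVs

// AES-256-GCM; output is base64(iv || ciphertext || auth tag).
function encrypt(plaintext: string, iv: Buffer): string {
    const cipher = createCipheriv('aes-256-gcm', encKey, iv);
    const data = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
    return Buffer.concat([iv, data, cipher.getAuthTag()]).toString('base64');
}

// Randomized: a fresh random IV on every call.
function encryptRandomized(plaintext: string): string {
    return encrypt(plaintext, randomBytes(12));
}

// Deterministic: the IV is a keyed hash of the plaintext, so the same
// plaintext always maps to the same IV and therefore the same ciphertext.
function encryptDeterministic(plaintext: string): string {
    const iv = createHmac('sha256', ivKey)
        .update(plaintext, 'utf8')
        .digest()
        .subarray(0, 12);
    return encrypt(plaintext, iv);
}

const ssn = '123-45-6789';

const randA = encryptRandomized(ssn);
const randB = encryptRandomized(ssn);
console.log('randomized equal:', randA === randB);

const detA = encryptDeterministic(ssn);
const detB = encryptDeterministic(ssn);
const detOther = encryptDeterministic('987-65-4321');
console.log('deterministic equal:', detA === detB);
console.log('different plaintexts equal:', detA === detOther);
```

A server can answer `WHERE ssn = ?` by comparing deterministic ciphertexts byte for byte, which is exactly the equality leak described above.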

Client-Side Encryption: The Strongest Guarantee

Client-side encryption (also called application-level encryption) is the strongest form of database encryption. The application encrypts data before sending it to the database and decrypts it after retrieval. The database never sees plaintext data—it only stores and returns encrypted blobs.

The trust boundary shifts fundamentally:

TDE trust boundary: The database server
Column encryption trust boundary: Typically still the database server
Client-side encryption trust boundary: Your application

With client-side encryption, even a fully compromised database server cannot access plaintext data. This provides protection against:

Malicious database administrators
Database server compromise
Cloud provider access
Database vendor backdoors (theoretical but considered in high-security contexts)
Court orders served to the database hosting provider (data is encrypted at rest and they don't have keys)

Client-Side Encryption Implementation
import { createCipheriv, createDecipheriv, randomBytes, scrypt } from 'crypto';
import { promisify } from 'util';
 
const scryptAsync = promisify(scrypt);
 
// In production, get this from a KMS, not environment variables
const ENCRYPTION_KEY = process.env.DATA_ENCRYPTION_KEY!;
 
interface EncryptedField {
    iv: string;       // Initialization vector (unique per encryption)
    data: string;     // Encrypted data
    authTag: string;  // Authentication tag for integrity
}
 
async function deriveKey(password: string, salt: Buffer): Promise<Buffer> {
    return (await scryptAsync(password, salt, 32)) as Buffer;
}
 
// Encrypt sensitive field before storing in database
export async function encryptField(plaintext: string): Promise<EncryptedField> {
    const iv = randomBytes(12); // 96-bit nonce, the recommended size for GCM
    const salt = randomBytes(16);
    const key = await deriveKey(ENCRYPTION_KEY, salt);
    
    const cipher = createCipheriv('aes-256-gcm', key, iv);
    
    let encrypted = cipher.update(plaintext, 'utf8', 'base64');
    encrypted += cipher.final('base64');
    
    const authTag = cipher.getAuthTag();
    
    return {
        iv: Buffer.concat([salt, iv]).toString('base64'),
        data: encrypted,
        authTag: authTag.toString('base64'),
    };
}
 
// Decrypt field after retrieving from database
export async function decryptField(encrypted: EncryptedField): Promise<string> {
    const ivBuffer = Buffer.from(encrypted.iv, 'base64');
    const salt = ivBuffer.subarray(0, 16); // First 16 bytes are the salt
    const iv = ivBuffer.subarray(16);      // Remaining bytes are the IV
    
    const key = await deriveKey(ENCRYPTION_KEY, salt);
    
    const decipher = createDecipheriv('aes-256-gcm', key, iv);
    decipher.setAuthTag(Buffer.from(encrypted.authTag, 'base64'));
    
    let decrypted = decipher.update(encrypted.data, 'base64', 'utf8');
    decrypted += decipher.final('utf8');
    
    return decrypted;
}
 
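// Usage example. Database, PatientInput, and Patient are placeholder types
// assumed for illustration; substitute your own data-access layer.
interface PatientInput { name: string; ssn: string; diagnosis: string; }
interface Patient { id: number; name: string; ssn: string; diagnosis?: string; }
interface Database {
    query(sql: string, params?: unknown[]): Promise<void>;
    queryOne(sql: string, params?: unknown[]): Promise<any>;
}
 
async function savePatientRecord(db: Database, patient: PatientInput) {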
    // Encrypt sensitive fields BEFORE database insertion
    const encryptedSSN = await encryptField(patient.ssn);
    const encryptedDiagnosis = await encryptField(patient.diagnosis);
    
    await db.query(`
        INSERT INTO patients (name, ssn_encrypted, ssn_iv, ssn_tag, 
                             diagnosis_encrypted, diagnosis_iv, diagnosis_tag)
        VALUES ($1, $2, $3, $4, $5, $6, $7)
    `, [
        patient.name,  // Not encrypted - allows searching
        encryptedSSN.data, encryptedSSN.iv, encryptedSSN.authTag,
        encryptedDiagnosis.data, encryptedDiagnosis.iv, encryptedDiagnosis.authTag
    ]);
}
 
async function getPatientRecord(db: Database, id: number): Promise<Patient> {
    const row = await db.queryOne('SELECT * FROM patients WHERE id = $1', [id]);
    
    // Decrypt sensitive fields AFTER retrieval
    const ssn = await decryptField({
        data: row.ssn_encrypted,
        iv: row.ssn_iv,
        authTag: row.ssn_tag
    });
    
    const diagnosis = await decryptField({
        data: row.diagnosis_encrypted,
        iv: row.diagnosis_iv,
        authTag: row.diagnosis_tag
    });
    
    return { id: row.id, name: row.name, ssn, diagnosis };
}

Use Authenticated Encryption

Always use authenticated encryption modes like AES-GCM or ChaCha20-Poly1305. These provide both confidentiality (data is encrypted) and integrity (tampering is detected). Without authentication, an attacker might modify encrypted data in ways that produce malicious results when decrypted. The 'auth tag' in our examples is what provides this integrity guarantee.
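
A short sketch of what the auth tag buys you, assuming AES-256-GCM from Node's crypto module. The sample plaintext and the `decrypt` helper are illustrative only. Flipping a single bit of the ciphertext causes decryption to throw rather than return altered plaintext.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from 'crypto';

const key = randomBytes(32);
const iv = randomBytes(12);

// Encrypt a sample value and capture the authentication tag.
const cipher = createCipheriv('aes-256-gcm', key, iv);
const ciphertext = Buffer.concat([cipher.update('balance=100', 'utf8'), cipher.final()]);
const authTag = cipher.getAuthTag();

// Decrypt with the original key, IV, and tag; final() verifies the tag.
function decrypt(data: Buffer): string {
    const decipher = createDecipheriv('aes-256-gcm', key, iv);
    decipher.setAuthTag(authTag);
    return Buffer.concat([decipher.update(data), decipher.final()]).toString('utf8');
}

// Untampered ciphertext decrypts normally.
const original = decrypt(ciphertext);
console.log(original);

// Simulate an attacker flipping one bit of the stored ciphertext.
const tampered = Buffer.from(ciphertext);
tampered.writeUInt8(tampered.readUInt8(0) ^ 0x01, 0);

let tamperDetected = false;
try {
    decrypt(tampered);
} catch {
    tamperDetected = true; // tag verification failed
}
console.log('tamper detected:', tamperDetected);
```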

Searchable Encryption and Advanced Techniques

A major challenge with encrypted data is that you cannot query it efficiently. Indexes don't work on random-looking ciphertext. This has driven research into searchable encryption—techniques that allow some query capabilities over encrypted data.

Primary approaches:

Searchable Encryption Techniques
Technique	Capabilities	Security Trade-off	Use Case
Deterministic Encryption	Equality queries, indexing	Reveals when values are equal	Unique identifiers, exact lookups
Order-Preserving Encryption (OPE)	Range queries, sorting	Reveals ordering of values	Dates, numeric ranges (with caution)
Homomorphic Encryption	Computation on encrypted data	Very high performance overhead	Analytics without decrypting (emerging)
Searchable Symmetric Encryption (SSE)	Keyword search on encrypted documents	Leaks access patterns	Document search, log analysis
Blind Indexes	Equality queries via hash	Reveals index collision patterns	Searching encrypted fields

Blind indexes: A practical middle ground

Blind indexing is a commonly used technique that enables searching encrypted data without fully revealing the plaintext. The approach:

When encrypting a field, also compute a cryptographic hash (HMAC) of the plaintext using a separate key
Store this hash as a "blind index" alongside the encrypted value
To search, compute the HMAC of the search term and query the blind index
The blind index does not expose the plaintext directly (it is a keyed hash), but it does reveal which rows share a value, and it allows efficient lookups

Limitations:

Only supports equality queries (exact match)
The blind index key must be protected as carefully as the encryption key
Reveals when two values are the same (same hash)
Vulnerable to frequency analysis if the value space is small, and to brute-force enumeration of that space if the index key leaks

Blind Index Implementation
TypeScript
import { createHmac } from 'crypto';
 
interface EncryptedWithIndex {
    encryptedValue: EncryptedField;
    blindIndex: string;
}
 
// Separate key for blind indexes - NEVER use the encryption key
const BLIND_INDEX_KEY = process.env.BLIND_INDEX_KEY!;
 
function computeBlindIndex(plaintext: string): string {
    // Use HMAC-SHA256 for the blind index
    const hmac = createHmac('sha256', BLIND_INDEX_KEY);
    hmac.update(plaintext.toLowerCase().trim()); // Normalize input
    
    // Return first 16 bytes (128 bits) as hex
    // Full hash would be more unique but larger storage
    return hmac.digest('hex').substring(0, 32);
}
 
async function encryptWithBlindIndex(plaintext: string): Promise<EncryptedWithIndex> {
    return {
        encryptedValue: await encryptField(plaintext),
        blindIndex: computeBlindIndex(plaintext),
    };
}
 
// Saving with blind index
async function savePatientWithIndex(db: Database, patient: PatientInput) {
    const encryptedSSN = await encryptWithBlindIndex(patient.ssn);
    
    await db.query(`
        INSERT INTO patients (name, ssn_encrypted, ssn_iv, ssn_tag, ssn_blind_index)
        VALUES ($1, $2, $3, $4, $5)
    `, [
        patient.name,
        encryptedSSN.encryptedValue.data,
        encryptedSSN.encryptedValue.iv,
        encryptedSSN.encryptedValue.authTag,
        encryptedSSN.blindIndex,  // Stored for searching
    ]);
}
 
// Searching by encrypted field using blind index
async function findPatientBySSN(db: Database, ssn: string): Promise<Patient | null> {
    const searchIndex = computeBlindIndex(ssn);
    
    // Query uses the blind index, not the encrypted value
    const row = await db.queryOne(`
        SELECT * FROM patients WHERE ssn_blind_index = $1
    `, [searchIndex]);
    
    if (!row) return null;
    
    // Decrypt the actual value after finding the row
    const decryptedSSN = await decryptField({
        data: row.ssn_encrypted,
        iv: row.ssn_iv,
        authTag: row.ssn_tag,
    });
    
    return { id: row.id, name: row.name, ssn: decryptedSSN };
}
 
// CREATE INDEX idx_patients_ssn_blind ON patients(ssn_blind_index);
// Now searches are O(log n) not O(n)!

Searchable Encryption Leakage

All searchable encryption techniques leak some information. Deterministic encryption leaks value equality. Order-preserving encryption leaks ordering. Blind indexes leak collision patterns. For highly sensitive data where any leakage is unacceptable, you may need to accept that searching requires full table scans with in-application decryption and filtering. Always evaluate whether the leakage is acceptable for your threat model.
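
As a rough illustration of this leakage, the sketch below builds blind indexes over a low-cardinality column and shows what someone with read access to the index column can learn without any key. The `blindIndex` helper, the randomly generated `indexKey`, and the sample blood-type data are assumptions for illustration.

```typescript
import { createHmac, randomBytes } from 'crypto';

const indexKey = randomBytes(32); // hypothetical blind-index key

// HMAC-SHA256 truncated to 128 bits, hex encoded.
function blindIndex(value: string): string {
    return createHmac('sha256', indexKey).update(value).digest('hex').substring(0, 32);
}

// A low-cardinality column: only a handful of distinct values exist.
const bloodTypes = ['O+', 'A+', 'O+', 'B+', 'O+', 'A+', 'AB-', 'O+'];
const storedIndexes = bloodTypes.map(blindIndex);

// What an attacker who can read the index column sees: opaque strings,
// but identical strings repeat wherever the plaintexts were identical.
const counts = new Map<string, number>();
for (const idx of storedIndexes) {
    counts.set(idx, (counts.get(idx) ?? 0) + 1);
}

// Frequency histogram, most common first.
const histogram = Array.from(counts.values()).sort((a, b) => b - a);
console.log(histogram); // [ 4, 2, 1, 1 ]
```

Matching that histogram against publicly known blood-type frequencies would let an attacker guess which opaque value corresponds to which plaintext, with no key required.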

Cloud Database Encryption Options

Cloud databases offer managed encryption options that significantly simplify implementation. Understanding these options—and their limitations—is critical for cloud-native architectures.

AWS RDS Encryption:

AWS RDS/Aurora Encryption

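•At-rest encryption is chosen at instance creation; an existing unencrypted instance cannot be encrypted in place (the usual workaround is to copy a snapshot with encryption enabled and restore from the copy)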
•Uses AES-256 encryption for data, logs, automated backups, read replicas, and snapshots
•AWS-managed keys (default) or Customer-managed CMKs in KMS
•With CMKs, you control key rotation, access policies, and can revoke access
•Encryption/decryption handled transparently by RDS—no application changes
•Limitation: RDS (AWS) has access to unencrypted data; protects against storage theft only

Cloud Database Encryption Comparison
Provider/Service	Default Encryption	Customer-Managed Keys	Client-Side Encryption
AWS RDS/Aurora	AWS-managed key	AWS KMS CMK	Application responsibility
AWS DynamoDB	AWS-owned key (default)	AWS KMS CMK	DynamoDB Encryption Client
Google Cloud SQL	Google-managed key	Cloud KMS CMEK	Application responsibility
Google Firestore	Google-managed key	Cloud KMS CMEK	Client library support
Azure SQL Database	Service-managed TDE	Azure Key Vault (BYOK)	Always Encrypted
Azure Cosmos DB	Service-managed key	Azure Key Vault CMK	Client-side encryption SDK
MongoDB Atlas	Automatic encryption	AWS/GCP/Azure KMS	Client-Side Field Level Encryption

When to use Customer-Managed Keys (CMK/CMEK/BYOK):

Regulatory requirements mandate customer key control
Revocation capability needed—ability to make data inaccessible by deleting the key
Multi-cloud or hybrid deployments where you want consistent key management
Separation of duties—security team controls keys, ops team manages databases
Audit requirements beyond what provider logging offers
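
The revocation point can be sketched with a toy key store. `ToyKeyStore`, `StoredRecord`, and the helper functions are hypothetical stand-ins, not any provider's API; a real KMS performs the cryptographic operations on your behalf instead of handing out key bytes. The idea shown is that once the key is destroyed, the stored ciphertext can no longer be read.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from 'crypto';

// Hypothetical in-memory stand-in for a customer-managed key service.
class ToyKeyStore {
    private keys = new Map<string, Buffer>();

    create(id: string): void {
        this.keys.set(id, randomBytes(32));
    }

    destroy(id: string): void {
        this.keys.delete(id);
    }

    get(id: string): Buffer {
        const key = this.keys.get(id);
        if (!key) throw new Error(`key ${id} is not available`);
        return key;
    }
}

interface StoredRecord {
    keyId: string;
    iv: Buffer;
    data: Buffer;
    tag: Buffer;
}

function writeRecord(keys: ToyKeyStore, keyId: string, plaintext: string): StoredRecord {
    const iv = randomBytes(12);
    const cipher = createCipheriv('aes-256-gcm', keys.get(keyId), iv);
    const data = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
    return { keyId, iv, data, tag: cipher.getAuthTag() };
}

function readRecord(keys: ToyKeyStore, record: StoredRecord): string {
    const decipher = createDecipheriv('aes-256-gcm', keys.get(record.keyId), record.iv);
    decipher.setAuthTag(record.tag);
    return Buffer.concat([decipher.update(record.data), decipher.final()]).toString('utf8');
}

const kms = new ToyKeyStore();
kms.create('tenant-42');

const record = writeRecord(kms, 'tenant-42', 'account notes');
const beforeRevocation = readRecord(kms, record);
console.log(beforeRevocation);

// Revocation: destroy the key. The ciphertext remains but is unreadable.
kms.destroy('tenant-42');

let readableAfterRevocation = true;
try {
    readRecord(kms, record);
} catch {
    readableAfterRevocation = false;
}
console.log('readable after revocation:', readableAfterRevocation);
```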

When default provider-managed encryption is sufficient:

Compliance frameworks accept provider-managed encryption (most do)
You trust the cloud provider's security practices
Simpler operations with no key management overhead
Lower cost (CMK/HSM keys have additional charges)

The Cloud Provider Trust Model

Even with customer-managed keys, cloud providers have technical access to unencrypted data while it's being processed. The keys are used by the provider's systems to encrypt/decrypt on your behalf. For true zero-trust where even the cloud provider cannot access data, you need client-side encryption with keys that never leave your environment. This is a fundamental architectural decision with significant complexity implications.

Summary: Database Encryption

We've explored the full spectrum of database encryption approaches. Let's consolidate the key insights:

Key Takeaways

•TDE is the baseline — Transparent Data Encryption protects against storage-level threats with minimal implementation effort. Every production database should have TDE or equivalent.
•TDE has limitations — It doesn't protect against authorized access abuse, SQL injection, or compromised database servers. Legitimate users see decrypted data.
•Column encryption adds granularity — Encrypt specific sensitive columns when you need per-column access control or compliance requires field-level protection.
•Client-side encryption is strongest — When even DBAs shouldn't see data, encrypt in the application before sending to the database. The trust boundary moves to your application.
•Searchability is the tradeoff — Encrypted data can't be efficiently queried. Blind indexes and deterministic encryption help, but leak some information.
•Key hierarchy is critical — TDE uses DEKs protected by KEKs protected by master keys. This enables efficient key rotation and controls blast radius.
•Cloud databases offer managed options — Provider-managed encryption is simple; customer-managed keys add control. Choose based on compliance and trust requirements.
•Layer your encryption — Best practice is TDE as a baseline plus column/client-side encryption for the most sensitive data.

What's next:

Having covered database encryption, we'll next explore file system encryption—protecting unstructured data, configuration files, logs, and any data that lives outside the database. We'll examine full-disk encryption, file-level encryption, and the unique challenges of encrypting data that applications directly read and write.

Page Complete

You now understand the full range of database encryption options—from transparent TDE to client-side field encryption. You can evaluate tradeoffs between transparency and protection depth, understand key hierarchies, implement blind indexes for searchability, and choose appropriate cloud encryption options. Next, we'll extend these concepts to file system encryption.