Loading content...
Databases are the crown jewels of most organizations. They contain customer records, financial transactions, intellectual property, healthcare data, and the operational state of business-critical systems. When attackers breach systems, databases are typically the ultimate target—the place where the most valuable data resides in a conveniently queryable format.
The database encryption challenge is unique:
Unlike file systems or block storage, databases have sophisticated internal structures. Indexes enable fast queries. Relationships link tables together. Query optimizers plan execution paths. Transaction logs ensure durability. Encryption must protect data while preserving these capabilities—a far more complex challenge than simply encrypting files on disk.
This page explores the spectrum of database encryption approaches, from the simplest transparent solutions to granular field-level protection, helping you understand which approach fits your specific requirements.
By the end of this page, you will understand the full range of database encryption options—Transparent Data Encryption (TDE), column-level encryption, tablespace encryption, and client-side encryption. You'll learn how each approach works internally, their security guarantees, performance implications, and how to choose the right approach for your compliance and security requirements.
Database encryption can be implemented at multiple granularities in the data stack. Each approach has distinct characteristics regarding performance, security guarantees, and operational complexity.
The four primary approaches:
| Approach | Description | Encrypted Layer | Who Controls Keys |
|---|---|---|---|
| Transparent Data Encryption (TDE) | Database encrypts data files transparently | Data files, logs, backups | Database administrator or KMS |
| Column/Cell-Level Encryption | Specific columns are encrypted individually | Selected columns within tables | Application or database |
| Tablespace Encryption | Entire tablespaces (groups of tables) encrypted | Tablespace data files | Database administrator |
| Client-Side Encryption | Application encrypts before sending to database | Selected fields or records | Application |
Understanding the tradeoffs:
The fundamental tradeoff is between transparency and protection depth:
More transparent solutions (TDE, tablespace encryption) are easier to implement and require no application changes, but they primarily protect against storage-level threats—the database server process has access to unencrypted data.
More granular solutions (column-level, client-side) require more implementation effort but can protect data even from database administrators and internal threats.
Most enterprise deployments combine approaches: TDE as a baseline for compliance and defense against physical threats, plus column-level or client-side encryption for the most sensitive data like social security numbers, credit cards, or protected health information.
A fundamental challenge of database encryption is that encrypted data cannot be efficiently queried. You cannot create an index on encrypted values (they appear random), you cannot perform range queries, and you cannot use SQL functions on encrypted columns. Advanced techniques like searchable encryption and order-preserving encryption attempt to bridge this gap, but with significant tradeoffs. We'll explore these later in the page.
Transparent Data Encryption (TDE) is the most widely deployed database encryption approach. It encrypts data files, transaction logs, and backups at the database engine level, without requiring any application changes. The 'transparent' name comes from the fact that applications continue to issue queries exactly as before—encryption and decryption happen automatically within the database engine.
How TDE works internally:
1234567891011121314151617181920212223242526
-- Step 1: Create a database master key (KEK)CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'StrongPassword123!'; -- Step 2: Create a certificate to protect the DEKCREATE CERTIFICATE TDE_Certificate WITH SUBJECT = 'TDE Certificate'; -- Step 3: Create the database encryption key (DEK)USE MyDatabase;CREATE DATABASE ENCRYPTION KEY WITH ALGORITHM = AES_256 ENCRYPTION BY SERVER CERTIFICATE TDE_Certificate; -- Step 4: Enable TDE on the databaseALTER DATABASE MyDatabase SET ENCRYPTION ON; -- Verify encryption statusSELECT db.name, db.is_encrypted, dm.encryption_state, dm.percent_complete, dm.key_algorithm, dm.key_lengthFROM sys.databases dbJOIN sys.dm_database_encryption_keys dm ON db.database_id = dm.database_id;What TDE protects:
What TDE does NOT protect:
TDE is often misunderstood as a data access control mechanism. It is not. A user or application with SELECT privileges can read all data, encrypted or not—the database decrypts transparently. TDE protects against storage-layer threats, not application-layer or authorized user threats. For protecting data from privileged users (including DBAs), you need column-level or client-side encryption.
TDE implementations use a hierarchical key structure for efficiency and security. Understanding this hierarchy is critical for proper key management and disaster recovery.
The three-tier key hierarchy:
Why a hierarchy?
The key hierarchy serves several critical purposes:
Key Rotation Efficiency — When you rotate the master key or KEK, you only need to re-encrypt the DEKs (tiny), not all the data (potentially terabytes). This makes routine key rotation practical.
Blast Radius Limitation — Different databases can have different DEKs, all protected by the same KEK. Compromise of one DEK doesn't expose other databases.
Access Control Layers — Different teams can manage different layers. A security team controls the master key; DBAs manage certificates; the database engine handles DEKs.
Recovery Point Separation — Backing up the KEK separately from database backups means you need both to recover data, preventing single points of exposure.
| Database | Local Keys | Cloud KMS Integration | HSM Support |
|---|---|---|---|
| SQL Server | Database master key, certificates | Azure Key Vault, AWS KMS (via EKM) | Yes (EKM provider) |
| Oracle | Oracle Wallet, auto-login wallet | OCI Vault, AWS KMS | Yes (TDE HSM) |
| MySQL | Keyring file, keyring encrypted file | Keyring AWS KMS, Keyring OCI | keyring_hashicorp plugin |
| PostgreSQL | pgcrypto (application-managed) | RDS KMS, Cloud SQL CMEK | Via application layer |
| MongoDB | Local key file | AWS KMS, Azure Key Vault, GCP KMS | KMIP integration |
Modern deployments should use an external Key Management Service rather than local key files or certificates stored on the database server. Cloud KMS (AWS KMS, Google Cloud KMS, Azure Key Vault) offers automatic key rotation, access auditing, and ensures that the master key is never on the same machine as the encrypted data. This separation is critical for true security.
Column-level encryption (also called cell-level encryption) encrypts specific columns within a table rather than entire data files. This provides granular protection for the most sensitive data while leaving other columns in plaintext for efficient querying.
When to use column-level encryption:
123456789101112131415161718192021222324252627282930313233343536
-- Always Encrypted is SQL Server's client-side encryption feature-- Keys never leave the application; database never sees plaintext -- Step 1: Create Column Master Key (stored in Azure Key Vault or cert store)CREATE COLUMN MASTER KEY CMK_MainWITH ( KEY_STORE_PROVIDER_NAME = 'AZURE_KEY_VAULT', KEY_PATH = 'https://myvault.vault.azure.net/keys/MyCMK/abc123'); -- Step 2: Create Column Encryption KeyCREATE COLUMN ENCRYPTION KEY CEK_SSNWITH VALUES ( COLUMN_MASTER_KEY = CMK_Main, ALGORITHM = 'RSA_OAEP', ENCRYPTED_VALUE = 0x01... -- Encrypted key value); -- Step 3: Create table with encrypted columnsCREATE TABLE Customers ( CustomerID INT PRIMARY KEY, Name NVARCHAR(100), SSN NVARCHAR(11) ENCRYPTED WITH ( COLUMN_ENCRYPTION_KEY = CEK_SSN, ENCRYPTION_TYPE = DETERMINISTIC, -- Allows equality comparisons ALGORITHM = 'AEAD_AES_256_CBC_HMAC_SHA_256' ), CreditCard NVARCHAR(20) ENCRYPTED WITH ( COLUMN_ENCRYPTION_KEY = CEK_SSN, ENCRYPTION_TYPE = RANDOMIZED, -- More secure, but no comparisons ALGORITHM = 'AEAD_AES_256_CBC_HMAC_SHA_256' )); -- Application connection string includes encryption flag-- "Column Encryption Setting=Enabled"Deterministic encryption always produces the same ciphertext for the same plaintext, enabling equality comparisons and indexing. However, it leaks information—attackers can see when two values are equal. Randomized encryption produces different ciphertext each time, revealing nothing, but prevents any server-side querying. Choose based on your threat model: deterministic for searchable fields where you accept some information leakage; randomized for maximum security.
Client-side encryption (also called application-level encryption) is the strongest form of database encryption. The application encrypts data before sending it to the database and decrypts it after retrieval. The database never sees plaintext data—it only stores and returns encrypted blobs.
The trust boundary shifts fundamentally:
With client-side encryption, even a fully compromised database server cannot access plaintext data. This provides protection against:
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990
import { createCipheriv, createDecipheriv, randomBytes, scrypt } from 'crypto';import { promisify } from 'util'; const scryptAsync = promisify(scrypt); // In production, get this from a KMS, not environment variablesconst ENCRYPTION_KEY = process.env.DATA_ENCRYPTION_KEY!; interface EncryptedField { iv: string; // Initialization vector (unique per encryption) data: string; // Encrypted data authTag: string; // Authentication tag for integrity} async function deriveKey(password: string, salt: Buffer): Promise<Buffer> { return (await scryptAsync(password, salt, 32)) as Buffer;} // Encrypt sensitive field before storing in databaseexport async function encryptField(plaintext: string): Promise<EncryptedField> { const iv = randomBytes(16); const salt = randomBytes(16); const key = await deriveKey(ENCRYPTION_KEY, salt); const cipher = createCipheriv('aes-256-gcm', key, iv); let encrypted = cipher.update(plaintext, 'utf8', 'base64'); encrypted += cipher.final('base64'); const authTag = cipher.getAuthTag(); return { iv: Buffer.concat([salt, iv]).toString('base64'), data: encrypted, authTag: authTag.toString('base64'), };} // Decrypt field after retrieving from databaseexport async function decryptField(encrypted: EncryptedField): Promise<string> { const ivBuffer = Buffer.from(encrypted.iv, 'base64'); const salt = ivBuffer.slice(0, 16); const iv = ivBuffer.slice(16); const key = await deriveKey(ENCRYPTION_KEY, salt); const decipher = createDecipheriv('aes-256-gcm', key, iv); decipher.setAuthTag(Buffer.from(encrypted.authTag, 'base64')); let decrypted = decipher.update(encrypted.data, 'base64', 'utf8'); decrypted += decipher.final('utf8'); return decrypted;} // Usage exampleasync function savePatientRecord(db: Database, patient: PatientInput) { // Encrypt sensitive fields BEFORE database insertion const encryptedSSN = await encryptField(patient.ssn); const encryptedDiagnosis = await encryptField(patient.diagnosis); await db.query(` INSERT INTO patients (name, ssn_encrypted, ssn_iv, ssn_tag, diagnosis_encrypted, diagnosis_iv, diagnosis_tag) VALUES ($1, $2, $3, $4, $5, $6, $7) `, [ patient.name, // Not encrypted - allows searching encryptedSSN.data, encryptedSSN.iv, encryptedSSN.authTag, encryptedDiagnosis.data, encryptedDiagnosis.iv, encryptedDiagnosis.authTag ]);} async function getPatientRecord(db: Database, id: number): Promise<Patient> { const row = await db.queryOne('SELECT * FROM patients WHERE id = $1', [id]); // Decrypt sensitive fields AFTER retrieval const ssn = await decryptField({ data: row.ssn_encrypted, iv: row.ssn_iv, authTag: row.ssn_tag }); const diagnosis = await decryptField({ data: row.diagnosis_encrypted, iv: row.diagnosis_iv, authTag: row.diagnosis_tag }); return { id: row.id, name: row.name, ssn, diagnosis };}Always use authenticated encryption modes like AES-GCM or ChaCha20-Poly1305. These provide both confidentiality (data is encrypted) and integrity (tampering is detected). Without authentication, an attacker might modify encrypted data in ways that produce malicious results when decrypted. The 'auth tag' in our examples is what provides this integrity guarantee.
A major challenge with encrypted data is that you cannot query it efficiently. Indexes don't work on random-looking ciphertext. This has driven research into searchable encryption—techniques that allow some query capabilities over encrypted data.
Primary approaches:
| Technique | Capabilities | Security Trade-off | Use Case |
|---|---|---|---|
| Deterministic Encryption | Equality queries, indexing | Reveals when values are equal | Unique identifiers, exact lookups |
| Order-Preserving Encryption (OPE) | Range queries, sorting | Reveals ordering of values | Dates, numeric ranges (with caution) |
| Homomorphic Encryption | Computation on encrypted data | Very high performance overhead | Analytics without decrypting (emerging) |
| Searchable Symmetric Encryption (SSE) | Keyword search on encrypted documents | Leaks access patterns | Document search, log analysis |
| Blind Indexes | Equality queries via hash | Reveals index collision patterns | Searching encrypted fields |
Blind indexes: A practical middle ground
Blind indexing is a commonly used technique that enables searching encrypted data without fully revealing the plaintext. The approach:
Limitations:
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566
import { createHmac, timingSafeEqual } from 'crypto'; interface EncryptedWithIndex { encryptedValue: EncryptedField; blindIndex: string;} // Separate key for blind indexes - NEVER use the encryption keyconst BLIND_INDEX_KEY = process.env.BLIND_INDEX_KEY!; function computeBlindIndex(plaintext: string): string { // Use HMAC-SHA256 for the blind index const hmac = createHmac('sha256', BLIND_INDEX_KEY); hmac.update(plaintext.toLowerCase().trim()); // Normalize input // Return first 16 bytes (128 bits) as hex // Full hash would be more unique but larger storage return hmac.digest('hex').substring(0, 32);} async function encryptWithBlindIndex(plaintext: string): Promise<EncryptedWithIndex> { return { encryptedValue: await encryptField(plaintext), blindIndex: computeBlindIndex(plaintext), };} // Saving with blind indexasync function savePatientWithIndex(db: Database, patient: PatientInput) { const encryptedSSN = await encryptWithBlindIndex(patient.ssn); await db.query(` INSERT INTO patients (name, ssn_encrypted, ssn_iv, ssn_tag, ssn_blind_index) VALUES ($1, $2, $3, $4, $5) `, [ patient.name, encryptedSSN.encryptedValue.data, encryptedSSN.encryptedValue.iv, encryptedSSN.encryptedValue.authTag, encryptedSSN.blindIndex, // Stored for searching ]);} // Searching by encrypted field using blind indexasync function findPatientBySSN(db: Database, ssn: string): Promise<Patient | null> { const searchIndex = computeBlindIndex(ssn); // Query uses the blind index, not the encrypted value const row = await db.queryOne(` SELECT * FROM patients WHERE ssn_blind_index = $1 `, [searchIndex]); if (!row) return null; // Decrypt the actual value after finding the row const decryptedSSN = await decryptField({ data: row.ssn_encrypted, iv: row.ssn_iv, authTag: row.ssn_tag, }); return { id: row.id, name: row.name, ssn: decryptedSSN };} // CREATE INDEX idx_patients_ssn_blind ON patients(ssn_blind_index);// Now searches are O(log n) not O(n)!All searchable encryption techniques leak some information. Deterministic encryption leaks value equality. Order-preserving encryption leaks ordering. Blind indexes leak collision patterns. For highly sensitive data where any leakage is unacceptable, you may need to accept that searching requires full table scans with in-application decryption and filtering. Always evaluate whether the leakage is acceptable for your threat model.
Cloud databases offer managed encryption options that significantly simplify implementation. Understanding these options—and their limitations—is critical for cloud-native architectures.
AWS RDS Encryption:
| Provider/Service | Default Encryption | Customer-Managed Keys | Client-Side Encryption |
|---|---|---|---|
| AWS RDS/Aurora | AWS-managed key | AWS KMS CMK | Application responsibility |
| AWS DynamoDB | AWS-owned key (default) | AWS KMS CMK | DynamoDB Encryption Client |
| Google Cloud SQL | Google-managed key | Cloud KMS CMEK | Application responsibility |
| Google Firestore | Google-managed key | Cloud KMS CMEK | Client library support |
| Azure SQL Database | Service-managed TDE | Azure Key Vault (BYOK) | Always Encrypted |
| Azure Cosmos DB | Service-managed key | Azure Key Vault CMK | Client-side encryption SDK |
| MongoDB Atlas | Automatic encryption | AWS/GCP/Azure KMS | Client-Side Field Level Encryption |
When to use Customer-Managed Keys (CMK/CMEK/BYOK):
When default provider-managed encryption is sufficient:
Even with customer-managed keys, cloud providers have technical access to unencrypted data while it's being processed. The keys are used by the provider's systems to encrypt/decrypt on your behalf. For true zero-trust where even the cloud provider cannot access data, you need client-side encryption with keys that never leave your environment. This is a fundamental architectural decision with significant complexity implications.
We've explored the full spectrum of database encryption approaches. Let's consolidate the key insights:
What's next:
Having covered database encryption, we'll next explore file system encryption—protecting unstructured data, configuration files, logs, and any data that lives outside the database. We'll examine full-disk encryption, file-level encryption, and the unique challenges of encrypting data that applications directly read and write.
You now understand the full range of database encryption options—from transparent TDE to client-side field encryption. You can evaluate tradeoffs between transparency and protection depth, understand key hierarchies, implement blind indexes for searchability, and choose appropriate cloud encryption options. Next, we'll extend these concepts to file system encryption.