What if you could have the reliability of a battle-tested distributed database and the flexibility to implement any data model you need? What if MongoDB, PostgreSQL, and Neo4j could all run on the same underlying storage engine, sharing the same consistency guarantees, the same operational tooling, and the same performance characteristics?
This isn't hypothetical—it's FoundationDB's layer architecture.
The insight is elegant: FoundationDB's core provides only the ordered key-value store with strict serializability. Everything else—document semantics, SQL queries, graph traversals, time-series optimizations—is implemented as a layer on top of this core. Layers are just libraries that translate higher-level operations into key-value reads and writes.
This architecture has profound implications: the core's correctness guarantees flow to every layer built on it; many data models can share one cluster, one consistency model, and one set of operational tooling; and new data models can be added without touching the distributed-systems code.
In this page, we'll explore how layers work, examine several official and community layers, and understand why this architecture represents a fundamental advance in database design.
By the end of this page, you will understand: (1) How the layer architecture separates concerns between data model and distributed systems; (2) How specific layers (Record Layer, Document Layer, SQL layers) map their models to key-value; (3) How to think about designing your own layers; (4) The composition possibilities when multiple layers share a core; and (5) Trade-offs between using layers vs. raw key-value access.
Traditional databases are monolithic: the data model (SQL, documents, graphs), the storage engine, the distribution layer, and the query processor are all tightly integrated. This has advantages—tight integration enables optimizations. But it also has severe disadvantages:
FoundationDB's Inversion:
FoundationDB inverts this structure. The core is minimal and maximally reliable:
Traditional Database: FoundationDB:
┌─────────────────────┐ ┌─────────────────────┐
│ Query Layer │ │ Your Layer │←─ Data model
├─────────────────────┤ ├─────────────────────┤ semantics
│ Data Model │ │ Your Layer │←─ Document/SQL/
├─────────────────────┤ ├─────────────────────┤ Graph APIs
│ Distribution │ │ │
├─────────────────────┤ │ FoundationDB │←─ Distribution,
│ Storage Engine │ │ Core │ transactions,
├─────────────────────┤ │ (Key-Value) │ durability
│ Durability │ │ │
└─────────────────────┘ └─────────────────────┘
Monolithic: Layered:
All correctness concerns Core is proven correct.
are interleaved. Bugs Layers inherit correctness.
can exist anywhere. Bugs are contained.
What a Layer Does:
A layer is fundamentally a translation library:
Schema Translation: Maps the layer's data model (tables, documents, nodes/edges) to key-value key structures.
Operation Translation: Converts high-level operations (INSERT, find(), traverse) into key-value reads, writes, range scans.
Query Planning: If the layer supports queries, it determines which keys to read and in what order.
Index Management: Maintains secondary indexes by issuing additional key-value writes.
Critically, all of this happens within FoundationDB transactions. A layer wraps operations in transactions, so the ACID guarantees flow through automatically.
Unlike database plugins or extensions that run inside the database process, FoundationDB layers are client-side libraries. They run in your application process, translating your calls to FoundationDB operations. This means you can inspect, modify, or replace a layer without any changes to FoundationDB itself.
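To make the "translation library" idea concrete, here is a toy sketch of a minimal layer. All names here are illustrative, and an in-memory class stands in for a FoundationDB transaction so the sketch runs without a cluster; a real layer would issue the same reads, writes, and range scans through `tr` inside `@fdb.transactional` functions.

```python
# A toy "named set" layer: high-level add/contains/members operations are
# translated into key-value writes, reads, and range scans.

class FakeTransaction:
    """In-memory stand-in for a transaction over an ordered key-value store."""
    def __init__(self):
        self.data = {}

    def set(self, key, value):
        self.data[key] = value

    def get(self, key):
        return self.data.get(key)

    def get_range_prefix(self, prefix):
        # Mimics a range scan over all keys sharing a prefix, in key order.
        return sorted((k, v) for k, v in self.data.items() if k.startswith(prefix))

class SetLayer:
    """High-level 'named set' API translated into key-value operations."""
    def __init__(self, prefix):
        self.prefix = prefix

    def add(self, tr, set_name, member):
        tr.set(f"{self.prefix}/{set_name}/{member}", b"")      # one KV write

    def contains(self, tr, set_name, member):
        return tr.get(f"{self.prefix}/{set_name}/{member}") is not None  # one KV read

    def members(self, tr, set_name):
        prefix = f"{self.prefix}/{set_name}/"
        return [k[len(prefix):] for k, _ in tr.get_range_prefix(prefix)]  # range scan

tr = FakeTransaction()
s = SetLayer("app/sets")
s.add(tr, "admins", "alice")
s.add(tr, "admins", "bob")
print(s.contains(tr, "admins", "alice"))  # True
print(s.members(tr, "admins"))            # ['alice', 'bob']
```

The layer never touches distribution or durability; it only decides how its data model becomes keys and values.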
The Record Layer is FoundationDB's most sophisticated official layer, developed and open-sourced by Apple. It powers critical Apple infrastructure, handling billions of records for iCloud services. Understanding the Record Layer illuminates both what layers can achieve and how complex data models map to key-value.
What the Record Layer Provides: structured records serialized as Protocol Buffers, secondary and unique indexes, a query planner, online index building, and support for schema evolution.
How Records Map to Keys:
The Record Layer uses a carefully designed key structure:
RECORD LAYER KEY ARCHITECTURE
═══════════════════════════════════════════════════════════════════

Record Store: Logical database within FoundationDB
- Each record store has a unique prefix
- Multiple isolated databases can coexist

KEY STRUCTURE:

1. RECORD DATA:
┌──────────────────────────────────────────────────────────────┐
│ [store_prefix] / RECORD / [primary_key] → [serialized_proto] │
└──────────────────────────────────────────────────────────────┘
Example:
  /my_store/RECORD/(123,) → Customer{id:123, name:"Alice", email:...}
  /my_store/RECORD/(124,) → Customer{id:124, name:"Bob", email:...}
- Records are stored ordered by primary key
- Serialized using Protocol Buffers (compact, fast)
- Range scans naturally iterate in key order

2. SECONDARY INDEXES:
┌────────────────────────────────────────────────────────────────┐
│ [store_prefix] / INDEX / [index_name] / [indexed_value(s)]     │
│   / [primary_key] → (empty or covering data)                   │
└────────────────────────────────────────────────────────────────┘
Example (index on email):
  /my_store/INDEX/by_email/alice@example.com/(123,) → (empty)
  /my_store/INDEX/by_email/bob@example.com/(124,) → (empty)
Query: "Find customer by email"
  1. Range scan index: /my_store/INDEX/by_email/alice@example.com/
  2. Get primary key: (123,)
  3. Fetch record: /my_store/RECORD/(123,)

3. UNIQUE INDEXES:
┌────────────────────────────────────────────────────────────────┐
│ [store_prefix] / UNIQUE / [index_name] / [indexed_value] → pk  │
└────────────────────────────────────────────────────────────────┘
Example (unique email constraint):
  /my_store/UNIQUE/email/alice@example.com → (123,)
- Only one primary key value allowed per indexed value
- Enforced by transactional read-before-write

4. SCHEMA METADATA:
┌────────────────────────────────────────────────────────────────┐
│ [store_prefix] / META / SCHEMA → [schema_version, proto_def]   │
│ [store_prefix] / META / INDEX_STATE / [index] → [state]        │
└────────────────────────────────────────────────────────────────┘
- Tracks schema version for evolution support
- Index state (BUILDING, READABLE, WRITE_ONLY) for online rebuilds

Query Execution:
The Record Layer includes a query planner that chooses execution strategies:
// Java example using Record Layer
RecordQuery query = RecordQuery.newBuilder()
.setRecordType("Customer")
.setFilter(Query.field("age").greaterThan(21))
.setSort(Key.Expressions.field("last_name"))
.build();
RecordCursor<FDBQueriedRecord<Message>> cursor =
recordStore.executeQuery(query);
// Under the hood:
// 1. Query planner checks for usable indexes
// 2. If age has an index: use index scan for age > 21, then sort
// 3. If no index: full scan with filter, then sort
// 4. Results streamed through RecordCursor (handles pagination)
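The indexed path the planner prefers can be sketched in a few lines of Python. Plain tuples stand in for tuple-layer keys (they compare lexicographically, much like packed keys), a dict stands in for the database, and all names are illustrative, not Record Layer API.

```python
# Fake ordered key-value store: tuple keys -> record values.
store = {}

def save_record(pk, record):
    """Write the record and maintain the by-email index in one logical step."""
    store[('my_store', 'RECORD', pk)] = record
    store[('my_store', 'INDEX', 'by_email', record['email'], pk)] = b''

def find_by_email(email):
    """The indexed plan: range-scan the index subspace, then fetch by primary key."""
    prefix = ('my_store', 'INDEX', 'by_email', email)
    pks = [key[-1] for key in sorted(store) if key[:len(prefix)] == prefix]
    return [store[('my_store', 'RECORD', pk)] for pk in pks]

save_record(123, {'id': 123, 'name': 'Alice', 'email': 'alice@example.com'})
save_record(124, {'id': 124, 'name': 'Bob', 'email': 'bob@example.com'})
print(find_by_email('alice@example.com'))
```

Because index keys sort by indexed value, "find by email" becomes a short range scan followed by point reads, exactly the shape of plan the real planner chooses when an index is usable.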
Online Index Building:
A killer feature is online index building. You can add a new index to a record store with billions of records without downtime: the new index starts in a write-only state so concurrent writes maintain it immediately, a background process scans existing records in batches to backfill index entries, and once the backfill completes the index is marked readable and becomes available to the query planner.
All of this happens transactionally—no inconsistent states, no data loss.
Schema Evolution:
The Record Layer handles schema changes gracefully: records are serialized with Protocol Buffers, so new fields can be added (and old ones deprecated) following protobuf evolution rules, and the stored schema version lets the layer interpret records written under earlier versions.
No expensive ALTER TABLE operations, no downtime migrations.
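The effect of protobuf-style evolution can be sketched with plain dicts standing in for deserialized records; the `loyalty_tier` field and its default are invented here for illustration.

```python
# Records written under an old schema remain readable after a field is added;
# the missing field simply takes its default under the current schema.

SCHEMA_V2_DEFAULTS = {'loyalty_tier': 'none'}  # hypothetical field added in v2

def read_customer(stored_fields):
    """Interpret a stored record under the current (v2) schema."""
    record = dict(SCHEMA_V2_DEFAULTS)
    record.update(stored_fields)  # fields actually present in storage win
    return record

old = read_customer({'id': 123, 'name': 'Alice'})                      # written under v1
new = read_customer({'id': 124, 'name': 'Bob', 'loyalty_tier': 'gold'})  # written under v2
print(old['loyalty_tier'])  # none
print(new['loyalty_tier'])  # gold
```

Old records are never rewritten; they are reinterpreted on read, which is why no bulk migration is needed.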
Apple uses the Record Layer for CloudKit, powering iCloud's database services. It handles schemas with hundreds of record types, indexes with billions of entries, and query workloads serving hundreds of millions of users. This isn't a toy layer—it's battle-tested infrastructure.
While the Record Layer is powerful, other use cases call for familiar APIs. FoundationDB has official and community layers that provide MongoDB-compatible document access and SQL interfaces.
The Document Layer:
The FoundationDB Document Layer implements the MongoDB wire protocol, allowing existing MongoDB applications to run against FoundationDB with minimal changes:
// MongoDB clients can connect directly
const client = new MongoClient('mongodb://localhost:27016');
const db = client.db('myapp');
const users = db.collection('users');
// These operations are translated to FoundationDB transactions
await users.insertOne({ _id: 'alice', email: 'alice@example.com' });
await users.findOne({ _id: 'alice' });
await users.updateOne({ _id: 'alice' }, { $set: { name: 'Alice Smith' } });
await users.deleteOne({ _id: 'alice' });
// This works because Document Layer:
// 1. Speaks MongoDB protocol
// 2. Translates operations to key-value
// 3. Wraps in FoundationDB transactions
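As a toy illustration of step 2, here is a sketch of the field-level translation the next section describes: a nested document is flattened into one key-value pair per leaf field, then reassembled with a prefix scan. The '/'-joined string keys are a simplification of my own; the real Document Layer packs keys with the tuple layer.

```python
# Flatten a nested document into per-field keys, and reassemble it from a scan.

def flatten(doc_id, doc, prefix='/db/users/docs'):
    """One key per leaf field, so partial updates touch single keys."""
    kvs = {}
    def walk(path, value):
        if isinstance(value, dict):
            for k, v in value.items():
                walk(f"{path}/{k}", v)
        else:
            kvs[path] = value
    walk(f"{prefix}/{doc_id}", doc)
    return kvs

def assemble(doc_id, store, prefix='/db/users/docs'):
    """Rebuild the full document from a range scan over the doc's prefix."""
    root = f"{prefix}/{doc_id}/"
    doc = {}
    for key in sorted(k for k in store if k.startswith(root)):
        parts = key[len(root):].split('/')
        node = doc
        for p in parts[:-1]:
            node = node.setdefault(p, {})
        node[parts[-1]] = store[key]
    return doc

kvs = flatten('alice', {'email': 'alice@example.com', 'age': 30,
                        'address': {'city': 'NYC', 'zip': '10001'}})
print(kvs['/db/users/docs/alice/address/city'])  # NYC
print(assemble('alice', kvs))
```

Updating one field is a single-key write, while reading the whole document costs a range scan, which is exactly the trade-off between the two storage options below.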
How Document Layer Maps Documents to Keys:
DOCUMENT LAYER KEY STRUCTURE
═══════════════════════════════════════════════════════════════════

Document:
  { _id: "alice",
    email: "alice@example.com",
    age: 30,
    address: { city: "NYC", zip: "10001" } }

KEY-VALUE REPRESENTATION:

Option 1: Whole-Document Storage
────────────────────────────────
/db/collection/documents/alice → {entire BSON document}
Pros: Simple, single read for entire document
Cons: Update requires rewriting entire document

Option 2: Field-Level Storage (FoundationDB Document Layer approach)
────────────────────────────────────────────────────────────────────
/db/users/docs/alice/email        → "alice@example.com"
/db/users/docs/alice/age          → 30
/db/users/docs/alice/address/city → "NYC"
/db/users/docs/alice/address/zip  → "10001"
Pros: Partial updates are efficient
Cons: Full document read requires range scan

INDEXES:
────────
Secondary index on email:
  /db/users/idx/email/alice@example.com/alice → (empty)
Index on nested field address.city:
  /db/users/idx/address.city/NYC/alice → (empty)

EXAMPLE QUERIES:
────────────────
db.users.find({ email: "alice@example.com" })
  1. Scan: /db/users/idx/email/alice@example.com/*
  2. Get _id: "alice"
  3. Range scan: /db/users/docs/alice/* (get all fields)
  4. Assemble document

db.users.find({ email: "alice@example.com" }, { email: 1 })
  1. Scan: /db/users/idx/email/alice@example.com/*
  2. Get _id: "alice"
  3. Single read: /db/users/docs/alice/email (only requested field)
  4. Return partial doc

db.users.updateOne({ _id: "alice" }, { $set: { age: 31 } })
  1. Single write: /db/users/docs/alice/age = 31
  2. (No index update needed - age not indexed)

db.users.updateOne({ _id: "alice" }, { $set: { email: "new@example.com" } })
  1. Read current: /db/users/docs/alice/email = "alice@example.com"
  2. Delete old index: del /db/users/idx/email/alice@example.com/alice
  3. Write new value: /db/users/docs/alice/email = "new@example.com"
  4. Write new index: /db/users/idx/email/new@example.com/alice = ()

SQL Layers:
Several SQL layers exist for FoundationDB:
1. FoundationDB SQL Layer (fdb-sql-layer) [Deprecated]
An older project that provided PostgreSQL wire protocol compatibility. It demonstrated the concept but is no longer maintained.
2. Community SQL Implementations
Various projects implement SQL parsing, planning, and execution against FoundationDB. These typically parse SQL into a logical plan, map tables and rows to tuple-encoded key ranges, maintain secondary indexes as additional key-value writes, and execute each statement inside a FoundationDB transaction.
3. Snowflake's Custom Layer
Snowflake built a custom metadata layer on FoundationDB (not the same as a SQL layer, but instructive). Their needs were specific: strictly consistent, transactional storage for the metadata of a cloud data warehouse, such as catalog state and transaction bookkeeping, under a high rate of small reads and writes.
They chose FoundationDB because they could design a layer perfectly suited to these specific requirements—something a general-purpose database couldn't match.
Why Build Custom vs. Use Existing DB?
You might wonder: if you need SQL, why not just use PostgreSQL? Valid question. Reasons to use a FoundationDB SQL layer: relational data can live in the same cluster, and the same transactions, as data from other layers; SQL semantics sit on top of FoundationDB's horizontal scaling and fault tolerance; and the layer can be tailored to query patterns a monolithic engine doesn't serve well.
While the Record Layer is production-hardened by Apple, other layers have varying maturity. The Document Layer is functional but less feature-complete than MongoDB. SQL layers are often experimental. Evaluate carefully before production use, and consider contributing to development!
One of FoundationDB's greatest strengths is that you can build your own layer. You're not limited to what FoundationDB or the community provides. If your application has specific data model needs, a custom layer gives you exactly the semantics you need with FoundationDB's guarantees.
When to Build a Custom Layer: when no existing layer matches your data model, when your access patterns are specific enough that a purpose-built key design outperforms a general-purpose one, or when you need semantics that off-the-shelf layers don't expose.
Layer Design Principles: design keys around your access patterns, keep every multi-key operation inside a single transaction, isolate the layer's data under its own subspace prefix, and respect FoundationDB's transaction size and duration limits.
# ============================================
# EXAMPLE: SIMPLE GRAPH LAYER
# ============================================

import fdb
from fdb.tuple import pack, unpack

fdb.api_version(720)
db = fdb.open()

class GraphLayer:
    """
    A custom graph layer supporting nodes, edges, and traversals.
    Demonstrates key design for graph data model.
    """

    def __init__(self, subspace_prefix):
        """Initialize with a subspace to isolate this graph."""
        self.nodes = fdb.Subspace((subspace_prefix, 'nodes'))
        self.edges_out = fdb.Subspace((subspace_prefix, 'edges_out'))
        self.edges_in = fdb.Subspace((subspace_prefix, 'edges_in'))
        self.node_props = fdb.Subspace((subspace_prefix, 'node_props'))
        self.edge_props = fdb.Subspace((subspace_prefix, 'edge_props'))

    @fdb.transactional
    def create_node(self, tr, node_id, properties=None):
        """Create a node with optional properties."""
        # Check if node exists (fdb Values compare == None when absent)
        if tr[self.nodes[node_id]] != None:
            raise ValueError(f"Node {node_id} already exists")

        # Mark node as existing
        tr[self.nodes[node_id]] = b'1'

        # Store properties
        if properties:
            for key, value in properties.items():
                tr[self.node_props[node_id][key]] = (
                    value.encode() if isinstance(value, str) else pack((value,))
                )

        return node_id

    @fdb.transactional
    def create_edge(self, tr, from_node, to_node, edge_type, properties=None):
        """
        Create a directed edge between nodes.
        Stores in both directions for efficient traversal.
        """
        # Verify nodes exist
        if tr[self.nodes[from_node]] == None:
            raise ValueError(f"Source node {from_node} does not exist")
        if tr[self.nodes[to_node]] == None:
            raise ValueError(f"Target node {to_node} does not exist")

        # Create unique edge ID
        edge_id = f"{from_node}-{edge_type}-{to_node}"

        # Store outgoing edge (for forward traversal)
        # Key: (from_node, edge_type, to_node)
        tr[self.edges_out[from_node][edge_type][to_node]] = edge_id.encode()

        # Store incoming edge (for reverse traversal)
        # Key: (to_node, edge_type, from_node)
        tr[self.edges_in[to_node][edge_type][from_node]] = edge_id.encode()

        # Store edge properties
        if properties:
            for key, value in properties.items():
                tr[self.edge_props[edge_id][key]] = (
                    value.encode() if isinstance(value, str) else pack((value,))
                )

        return edge_id

    @fdb.transactional
    def get_outgoing(self, tr, node_id, edge_type=None, limit=100):
        """
        Get nodes reachable via outgoing edges.
        Optionally filter by edge type.
        """
        results = []
        if edge_type:
            prefix = self.edges_out[node_id][edge_type]   # specific edge type
        else:
            prefix = self.edges_out[node_id]              # all edge types

        for key, _ in tr.get_range(prefix.range().start, prefix.range().stop,
                                   limit=limit):
            # Parse key to get target node
            # key_tuple = (node_id, edge_type, to_node)
            key_tuple = self.edges_out.unpack(key)
            results.append({'node': key_tuple[2], 'type': key_tuple[1]})

        return results

    @fdb.transactional
    def get_incoming(self, tr, node_id, edge_type=None, limit=100):
        """Get nodes with edges pointing to this node."""
        results = []
        if edge_type:
            prefix = self.edges_in[node_id][edge_type]
        else:
            prefix = self.edges_in[node_id]

        for key, _ in tr.get_range(prefix.range().start, prefix.range().stop,
                                   limit=limit):
            # key_tuple = (to_node, edge_type, from_node)
            key_tuple = self.edges_in.unpack(key)
            results.append({'node': key_tuple[2], 'type': key_tuple[1]})

        return results

    @fdb.transactional
    def traverse(self, tr, start_node, edge_types, depth=2):
        """
        Multi-hop traversal following specified edge types.
        Returns all reachable nodes within depth.
        """
        visited = set()
        current_layer = {start_node}
        all_reached = set()

        for d in range(depth):
            next_layer = set()
            for node in current_layer:
                if node in visited:
                    continue
                visited.add(node)
                for edge_type in edge_types:
                    neighbors = self.get_outgoing(tr, node, edge_type)
                    for n in neighbors:
                        next_layer.add(n['node'])
                        all_reached.add(n['node'])
            current_layer = next_layer

        return list(all_reached - {start_node})

# ============================================
# USAGE EXAMPLE
# ============================================

graph = GraphLayer('social')

# Create social network
graph.create_node(db, 'alice', {'name': 'Alice', 'age': 30})
graph.create_node(db, 'bob', {'name': 'Bob', 'age': 25})
graph.create_node(db, 'carol', {'name': 'Carol', 'age': 35})

graph.create_edge(db, 'alice', 'bob', 'FOLLOWS')
graph.create_edge(db, 'alice', 'carol', 'FOLLOWS')
graph.create_edge(db, 'bob', 'carol', 'FOLLOWS')
graph.create_edge(db, 'carol', 'alice', 'FOLLOWS')

# Query: Who does Alice follow?
following = graph.get_outgoing(db, 'alice', 'FOLLOWS')
# Returns: [{'node': 'bob', 'type': 'FOLLOWS'}, {'node': 'carol', 'type': 'FOLLOWS'}]

# Query: Friends of friends (2-hop traversal)
friends_of_friends = graph.traverse(db, 'alice', ['FOLLOWS'], depth=2)
# Returns: ['bob', 'carol'] (all reachable within 2 hops)

Key Design Decisions When Building Layers:
1. Key Structure for Access Patterns
The graph layer above stores edges in two directions (edges_out and edges_in) because traversal needs are bidirectional. This is denormalization for query performance—the same data is stored twice for different access patterns.
2. Atomicity Boundaries
Decide what operations should be atomic. In the graph example: creating a node writes the node marker and all of its properties in one transaction, and creating an edge writes the outgoing and incoming entries together, so the two directions can never disagree.
3. ID Generation
How are entities identified? The graph layer accepts caller-supplied IDs. Alternatives include random UUIDs (no coordination, but no ordering) and FoundationDB versionstamps (unique, monotonically increasing, assigned at commit time).
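The contention trade-off between sequential and random IDs can be sketched as follows. This is illustrative only; a real layer would run the counter update inside a transaction, for instance with FoundationDB's atomic add.

```python
import uuid

class CounterIds:
    """Sequential IDs from a single counter key: ordered and compact, but
    every allocation touches the same key, serializing concurrent inserts."""
    def __init__(self):
        self.store = {'next_id': 1}  # stand-in for a counter key in the KV store

    def allocate(self):
        nid = self.store['next_id']
        self.store['next_id'] = nid + 1
        return nid

def random_id():
    """Random UUIDs: no coordination and no write contention, but no ordering."""
    return uuid.uuid4().hex

ids = CounterIds()
print(ids.allocate(), ids.allocate())  # 1 2
print(len(random_id()))                # 32
```

Which to pick depends on whether the layer's range scans benefit from insertion-ordered keys more than its write throughput suffers from a hot counter key.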
4. Schema Flexibility vs. Strictness
The graph layer above is schemaless: any property key is accepted. A stricter layer can validate writes against stored schema metadata, trading flexibility for earlier error detection.
5. Query Capabilities
What queries will the layer support? Point lookups and prefix range scans fall out of the key design almost for free; anything richer, such as filtering or sorting on non-key fields, requires either secondary indexes or full scans.
Don't try to build a full-featured layer from scratch. Start with the minimal operations your application needs, test thoroughly, then add features. It's easier to add capabilities than to fix fundamental design mistakes.
One of the most powerful aspects of FoundationDB's architecture is layer composition—the ability to use multiple layers together, even in the same transaction. This enables true multi-model database capabilities without the complexity of polyglot persistence.
Why Multi-Model Matters:
Real applications often have multiple data modeling needs: documents for profiles and content, graphs for relationships, time-series for events and metrics, relational tables for transactional records, and a key-value cache for hot lookups.
Traditionally, this requires multiple specialized databases (MongoDB + Neo4j + TimescaleDB + PostgreSQL + Redis), each with its own consistency model, operational tooling, failure modes, and backup procedures, and with no way to update data across them atomically.
FoundationDB Multi-Model:
With FoundationDB, all data lives in the same cluster:
# ============================================
# MULTI-MODEL TRANSACTION EXAMPLE
# ============================================

import fdb
import json
import uuid
from datetime import datetime

fdb.api_version(720)
db = fdb.open()

# Different layers for different data models
from document_layer import DocumentLayer
from graph_layer import GraphLayer
from kv_cache import KeyValueCache

# Initialize layers (each uses a different key prefix)
docs = DocumentLayer('documents')
graph = GraphLayer('social')
cache = KeyValueCache('cache')

@fdb.transactional
def register_user_complete(tr, user_data):
    """
    Register a new user, updating multiple data models atomically.

    This SINGLE TRANSACTION:
    - Creates a user document
    - Adds user to social graph
    - Sets up caching entries
    - Creates follow relationships

    All succeed or all fail. No inconsistent states.
    """
    user_id = user_data['user_id']

    # 1. Store user profile as document
    docs.insert_one(tr, 'users', {
        '_id': user_id,
        'email': user_data['email'],
        'name': user_data['name'],
        'created_at': datetime.utcnow().isoformat(),
        'preferences': user_data.get('preferences', {}),
    })

    # 2. Create node in social graph
    graph.create_node(tr, user_id, {
        'type': 'user',
        'name': user_data['name'],
    })

    # 3. Set cache entries for fast lookup
    cache.set(tr, f'user:{user_id}:email', user_data['email'])
    cache.set(tr, f'email:{user_data["email"]}:user', user_id)

    # 4. Create initial follow relationships (e.g., follow official account)
    graph.create_edge(tr, user_id, 'official_account', 'FOLLOWS')

    # 5. Indexes in the document layer are maintained automatically

    # 6. Update counters atomically
    cache.increment(tr, 'stats:total_users')

    return user_id

@fdb.transactional
def get_user_feed(tr, user_id, limit=20):
    """
    Get personalized feed using multiple data sources.

    Combines:
    - Graph traversal (who does user follow?)
    - Document queries (posts from followed users)
    - Cache reads (pre-computed recommendations)
    """
    # 1. From graph: Get who user follows
    following = graph.get_outgoing(tr, user_id, 'FOLLOWS')
    followed_ids = [f['node'] for f in following]

    # 2. From cache: Get any pre-computed recommendations
    cached_recs = cache.get(tr, f'recommendations:{user_id}')
    if cached_recs:
        recommended_ids = json.loads(cached_recs)
    else:
        recommended_ids = []

    # 3. From documents: Get recent posts from followed users
    posts = []
    for author_id in followed_ids[:10]:  # Limit for transaction size
        author_posts = docs.find(tr, 'posts',
            filter={'author': author_id},
            sort={'created_at': -1},
            limit=5
        )
        posts.extend(author_posts)

    # 4. Sort combined results by time
    posts.sort(key=lambda p: p['created_at'], reverse=True)
    return posts[:limit]

@fdb.transactional
def transfer_with_audit(tr, from_account, to_account, amount):
    """
    Financial transfer with multi-model atomicity.

    Updates:
    - Account balances (document model)
    - Transaction history (time-series model)
    - Audit graph (graph model showing fund flow)
    """
    # 1. Load accounts (documents)
    from_doc = docs.find_one(tr, 'accounts', {'_id': from_account})
    to_doc = docs.find_one(tr, 'accounts', {'_id': to_account})

    if from_doc['balance'] < amount:
        raise ValueError("Insufficient funds")

    # 2. Update balances (documents)
    docs.update_one(tr, 'accounts',
        {'_id': from_account},
        {'$inc': {'balance': -amount}}
    )
    docs.update_one(tr, 'accounts',
        {'_id': to_account},
        {'$inc': {'balance': amount}}
    )

    # 3. Record transaction (time-series event)
    tx_id = str(uuid.uuid4())
    docs.insert_one(tr, 'transactions', {
        '_id': tx_id,
        'from': from_account,
        'to': to_account,
        'amount': amount,
        'timestamp': datetime.utcnow().isoformat(),
    })

    # 4. Update flow graph (for fraud detection, auditing)
    graph.create_edge(tr, from_account, to_account, 'TRANSFERRED', {
        'amount': amount,
        'tx_id': tx_id,
    })

    return tx_id

The examples above show operations across documents, graphs, and caches—all in single transactions. If any part fails, everything rolls back.
This eliminates the 'what do I do if the graph update fails but the document was already written?' problem that plagues polyglot systems.
The layer architecture is what transforms FoundationDB from an interesting key-value store into a universal database platform. By separating the hard problems (distribution, transactions, durability) from the flexible ones (data models, query languages, schemas), FoundationDB achieves both reliability and adaptability.
What's Next:
We've seen how FoundationDB works—its key-value core, strict serializability, and layer architecture. But who actually uses this in production? In the next page, we'll examine real-world deployments at Apple, Snowflake, and other organizations, understanding what problems they solved with FoundationDB and what lessons their experiences offer.
You now understand FoundationDB's layer architecture—how higher-level data models are built on the key-value foundation, how layers inherit ACID guarantees, and how layer composition enables multi-model databases. Next, we'll see these concepts in action at scale with real-world case studies.