Abstract architectural principles become concrete when examined through specific implementations. ArangoDB serves as an exemplary case study of native multi-model database design—a system built from inception to support document, graph, and key-value access patterns within a unified architecture.
Founded in 2011 and open-sourced in 2012, ArangoDB represents a deliberate attempt to solve the polyglot persistence problem through deep integration rather than bolted-on extensions. Its design choices illuminate both the possibilities and challenges of multi-model databases.
This page examines ArangoDB not as a product endorsement but as a lens through which to understand how multi-model concepts manifest in practice. The patterns we explore apply broadly to evaluating any multi-model system.
By the end of this page, you will understand ArangoDB's architecture, its AQL query language, how it handles documents and graphs, and how to evaluate its approach for your use cases. You'll see multi-model principles instantiated in a real system.
ArangoDB's architecture reflects its multi-model philosophy at every layer:
Core Architectural Principles:
High-Level Architecture:
┌─────────────────────────────────────────────────────────────┐
│ Client Applications │
├─────────────────────────────────────────────────────────────┤
│ HTTP/WebSocket API Layer │
│ (REST API, JavaScript SDK, etc.) │
├─────────────────────────────────────────────────────────────┤
│ AQL Query Engine │
│ ┌─────────────┬──────────────┬────────────────────────────┐ │
│ │ Document │ Graph │ Search/Analytics │ │
│ │ Operations │ Traversals │ Operations │ │
│ └─────────────┴──────────────┴────────────────────────────┘ │
├─────────────────────────────────────────────────────────────┤
│ Query Optimizer │
│ (Cost-based optimization across models) │
├─────────────────────────────────────────────────────────────┤
│ Collection/Graph Management │
├─────────────────────────────────────────────────────────────┤
│ RocksDB Storage Engine │
│ (LSM-tree; column families shared across collections) │
└─────────────────────────────────────────────────────────────┘
Data Organization:
ArangoDB organizes data into collections (similar to tables or MongoDB collections) and graphs (named sets of vertex and edge collections):
Document collections: documents with _key, _id, _rev attributes
Edge collections: documents that additionally carry _from and _to attributes
// Regular document in 'products' collection
{
"_key": "laptop_123",
"_id": "products/laptop_123",
"_rev": "12345678",
"name": "ProBook X1",
"category": "electronics",
"specs": { "cpu": "Intel i7", "ram": "16GB" }
}
// Edge document in 'purchased' edge collection
{
"_key": "123_456",
"_id": "purchased/123_456",
"_from": "users/alice",
"_to": "products/laptop_123",
"date": "2024-01-15",
"quantity": 1
}
ArangoDB's elegant insight is that graph edges ARE documents. They have all document capabilities (flexible schema, indexes, queries) plus special _from and _to attributes for graph semantics. This unification enables seamless mixing of document and graph operations.
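To make the "edges are documents" idea concrete, here is a minimal sketch in plain JavaScript that does not use any ArangoDB driver. The helpers `makeId` and `isEdge` are hypothetical names introduced only for illustration; they mirror the `collection/_key` convention and the `_from`/`_to` attributes shown above.

```javascript
// Illustrative only: plain objects standing in for stored documents.
// makeId and isEdge are hypothetical helpers, not ArangoDB API calls.
function makeId(collection, key) {
  return `${collection}/${key}`;
}

// An edge is just a document that also carries _from and _to.
function isEdge(doc) {
  return typeof doc._from === "string" && typeof doc._to === "string";
}

const product = { _key: "laptop_123", name: "ProBook X1" };
product._id = makeId("products", product._key);

const purchase = {
  _key: "123_456",
  _from: "users/alice",
  _to: product._id,
  quantity: 1, // ordinary document attribute on an edge
};
purchase._id = makeId("purchased", purchase._key);

console.log(product._id, isEdge(product), isEdge(purchase));
```

Because the edge is an ordinary object with two extra attributes, anything that works on documents (filtering on `quantity`, for example) applies to edges unchanged.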
AQL (ArangoDB Query Language) exemplifies unified multi-model query design. Unlike SQL with extensions or separate query languages per model, AQL handles documents, graphs, and analytical operations with consistent syntax.
AQL Core Concepts:
1. FOR Loops — The Foundation
AQL uses FOR loops as its primary iteration construct, analogous to SQL's FROM but more flexible:
// Iterate over documents
FOR product IN products
RETURN product
// Equivalent to: SELECT * FROM products
2. FILTER, SORT, LIMIT — Familiar Operations
FOR product IN products
FILTER product.category == "electronics"
FILTER product.price < 1000
SORT product.price DESC
LIMIT 10
RETURN { name: product.name, price: product.price }
3. Graph Traversals — Native Integration
Graph traversals use the same FOR syntax with traversal specifications:
// Find vertices reachable from 'users/alice' within 1 to 3 outbound hops
FOR vertex, edge, path
IN 1..3 // Depth 1 to 3
OUTBOUND 'users/alice' // Starting vertex
GRAPH 'social_network' // Named graph
RETURN { user: vertex, path: path }
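The meaning of a depth range such as 1..3 can be sketched outside the database. The following plain JavaScript walks a small, made-up edge list and emits every vertex reached within the given number of outbound hops. The function `traverseOutbound` and the sample data are assumptions for illustration; this is not how ArangoDB implements traversals internally.

```javascript
// Hypothetical in-memory edge list; not tied to any ArangoDB driver.
const edges = [
  { _from: "users/alice", _to: "users/bob" },
  { _from: "users/bob", _to: "users/carol" },
  { _from: "users/carol", _to: "users/dave" },
  { _from: "users/dave", _to: "users/erin" },
];

// Emit every vertex reached by following outbound edges for
// minDepth..maxDepth hops, never reusing an edge within one path.
function traverseOutbound(start, minDepth, maxDepth, edgeList) {
  const results = [];
  function visit(vertex, depth, usedEdges) {
    if (depth >= minDepth) results.push({ vertex, depth });
    if (depth === maxDepth) return; // do not expand past the upper bound
    for (const edge of edgeList) {
      if (edge._from === vertex && !usedEdges.has(edge)) {
        visit(edge._to, depth + 1, new Set([...usedEdges, edge]));
      }
    }
  }
  visit(start, 0, new Set());
  return results;
}

const found = traverseOutbound("users/alice", 1, 3, edges);
console.log(found.map((r) => `${r.vertex}@${r.depth}`));
```

With this sample data, users/erin sits four hops away and is therefore outside the 1..3 range.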
4. Cross-Model Queries — Seamless Mixing
The power emerges when combining patterns:
// Find electronics products purchased by high-influence users
// Combines: document filter, graph traversal, aggregation
FOR product IN products
  FILTER product.category == "electronics"
  FILTER product.price > 500
  // For each matching product, find who purchased it
  LET purchasers = (
    FOR user, edge IN 1 INBOUND product GRAPH 'purchases'
      // Calculate each user's social influence
      LET follower_count = LENGTH(
        FOR follower IN 1..2 INBOUND user GRAPH 'social'
          RETURN 1
      )
      FILTER follower_count > 100 // Only influential users
      RETURN {
        user: user.name,
        followers: follower_count,
        purchase_date: edge.date
      }
  )
  FILTER LENGTH(purchasers) > 0
  RETURN {
    product: product.name,
    price: product.price,
    influential_buyers: purchasers
  }
AQL Operations Reference:
| Operation | Syntax | Description |
|---|---|---|
| Document iteration | FOR doc IN collection | Iterate over all documents in collection |
| Filter | FILTER condition | Filter iteration results |
| Sort | SORT expr [ASC\|DESC] | Order results |
| Limit | LIMIT offset, count | Limit result count |
| Graph traversal | FOR v, e, p IN min..max DIRECTION start GRAPH name | Traverse named graph |
| Edge traversal | FOR v IN 1 OUTBOUND start edges | Traverse specific edge collection |
| Let binding | LET var = expression | Bind subquery or expression to variable |
| Collect/Aggregate | COLLECT key = expr AGGREGATE agg = func | Group and aggregate |
| Insert | INSERT doc INTO collection | Insert document |
| Update | UPDATE key WITH attrs IN collection | Update document |
| Remove | REMOVE key IN collection | Delete document |
AQL deliberately avoids SQL's keyword-heavy syntax for a more programmable, composable style. Subqueries, variable binding, and functional operations compose naturally—essential for complex cross-model queries.
As a document database, ArangoDB provides rich capabilities for JSON document storage and querying.
Document Structure and Keys:
Every document has system attributes:
_key: Unique identifier within collection (user-defined or auto-generated)
_id: Globally unique identifier: collection/_key
_rev: Revision identifier for optimistic locking
{
"_key": "user_alice", // User-specified
"_id": "users/user_alice", // Auto-derived
"_rev": "_abc123xyz", // System-managed
"name": "Alice Smith",
"email": "alice@example.com",
"preferences": { // Nested documents
"theme": "dark",
"notifications": true
},
"roles": ["admin", "developer"] // Arrays
}
CRUD Operations:
// Each statement below is a separate query
// INSERT - Create new document
INSERT {
  _key: "product_001",
  name: "Wireless Mouse",
  price: 29.99,
  inventory: 150
} INTO products
RETURN NEW
// UPDATE - Modify existing document (merges the given attributes)
UPDATE "product_001" WITH {
  price: 24.99,
  on_sale: true
} IN products
RETURN { old: OLD, new: NEW }
// REPLACE - Complete replacement
REPLACE "product_001" WITH {
  _key: "product_001",
  name: "Wireless Mouse Pro",
  price: 39.99,
  inventory: 200,
  features: ["ergonomic", "bluetooth"]
} IN products
// UPSERT - Insert or update
// The INSERT branch repeats _key so the new document matches the search example
UPSERT { _key: "product_001" }
INSERT { _key: "product_001", name: "New Product", price: 50 }
UPDATE { last_accessed: DATE_NOW() }
IN products
// REMOVE - Delete document
REMOVE "product_001" IN products
RETURN OLD
Querying Nested Documents:
AQL handles nested structures naturally:
// Query nested attributes
FOR user IN users
FILTER user.preferences.theme == "dark"
FILTER user.address.city IN ["NYC", "LA", "Chicago"]
RETURN user
// Query arrays
FOR user IN users
FILTER "admin" IN user.roles
RETURN user
// Array operations
FOR user IN users
FILTER LENGTH(user.roles) > 2
FILTER user.roles ANY == "developer" // Any element matches
FILTER user.roles ALL != "guest" // No element equals "guest"
RETURN user
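For readers more at home in JavaScript, the array filters above correspond closely to standard array methods. The sketch below is an analogy only, with a hypothetical `matchesRoleFilters` helper and made-up users; it does not call ArangoDB.

```javascript
// Plain JavaScript analogues of the AQL array filters shown above.
function matchesRoleFilters(user) {
  const roles = user.roles;
  return (
    roles.length > 2 &&                      // LENGTH(user.roles) > 2
    roles.some((r) => r === "developer") &&  // user.roles ANY == "developer"
    roles.every((r) => r !== "guest")        // user.roles ALL != "guest"
  );
}

const users = [
  { name: "Alice", roles: ["admin", "developer", "ops"] },
  { name: "Bob", roles: ["developer", "guest", "ops"] },
  { name: "Carol", roles: ["developer"] },
];

const matching = users.filter(matchesRoleFilters).map((u) => u.name);
console.log(matching);
```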
Indexes for Documents:
ArangoDB supports various index types for document queries:
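// Create indexes from arangosh (JavaScript API); AQL itself has no index DDL
db.products.ensureIndex({
type: "persistent", // Sorted index: supports equality, range, and sort
fields: ["category", "price"],
unique: false
});
db.products.ensureIndex({
type: "fulltext", // Deprecated in recent releases in favor of ArangoSearch and inverted indexes
fields: ["description"],
minLength: 3
});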
db.products.ensureIndex({
type: "geo",
fields: ["location"],
geoJson: true
});
db.products.ensureIndex({
type: "ttl", // Time-to-live
fields: ["expires_at"],
expireAfter: 0 // Delete when expires_at is reached
});
ArangoDB's document model encourages embedding related data when access patterns warrant. However, for data that will participate in graph relationships, use document references (_key, _id) rather than embedding—the graph model provides superior relationship traversal.
ArangoDB's graph capabilities are first-class, not bolted on. The key insight—edges are documents—enables rich graph operations while maintaining document flexibility.
Graph Definition:
Named graphs define which collections hold vertices and edges:
// Create a named graph (arangosh)
var graph_module = require("@arangodb/general-graph");
var g = graph_module._create("social_network",
[
{
collection: "follows", // Edge collection
from: ["users"], // Source vertices
to: ["users"] // Target vertices
},
{
collection: "likes",
from: ["users"],
to: ["posts"]
}
],
[] // Orphan collections: vertex collections used by no edge definition (none here)
);
Edge Documents:
Edges are documents with required _from and _to attributes:
// Edge in 'follows' collection
{
"_key": "alice_follows_bob",
"_from": "users/alice",
"_to": "users/bob",
"since": "2024-01-01",
"strength": 0.85,
"mutual": false
}
Graph Traversal Patterns:
// Basic outbound traversal: who does Alice follow?
FOR followed IN 1 OUTBOUND 'users/alice' GRAPH 'social_network'
  RETURN followed.name
// Inbound traversal: who follows Alice?
FOR follower IN 1 INBOUND 'users/alice' GRAPH 'social_network'
  RETURN follower.name
// Any direction
FOR connection IN 1 ANY 'users/alice' GRAPH 'social_network'
  RETURN connection
// Variable depth: friends of friends (depth 1-3)
FOR friend, edge, path IN 1..3 OUTBOUND 'users/alice' GRAPH 'social_network'
  RETURN {
    friend: friend.name,
    depth: LENGTH(path.edges),
    connection_path: path.vertices[*].name
  }
// Filter during traversal
FOR user, edge IN 1..2 OUTBOUND 'users/alice' GRAPH 'social_network'
  FILTER edge.strength > 0.5 // Only strong connections
  FILTER user.active == true // Only active users
  RETURN user
// Shortest path
FOR v, e IN OUTBOUND SHORTEST_PATH 'users/alice' TO 'users/zara' GRAPH 'social_network'
  RETURN { vertex: v.name, edge_type: e.type }
// All shortest paths (if multiple exist)
FOR path IN OUTBOUND ALL_SHORTEST_PATHS 'users/alice' TO 'users/zara' GRAPH 'social_network'
  RETURN path.vertices[*].name
// Pattern matching: find triangles (mutual friend groups)
FOR user IN users
  FILTER user._id != 'users/alice'
  LET mutual = (
    FOR m IN 1 OUTBOUND 'users/alice' GRAPH 'social_network'
      FILTER m._id != user._id
      FOR check IN 1 OUTBOUND user GRAPH 'social_network'
        FILTER check._id == m._id
        RETURN m
  )
  FILTER LENGTH(mutual) > 0
  RETURN { user: user.name, mutual_friends: mutual[*].name }
Graph-Specific Optimizations:
Edge Indexes:
Every edge collection automatically gets an edge index on _from and _to, so finding a vertex's neighbors does not require scanning the collection:
Edge Index:
_from: users/alice -> [edge_001, edge_002, edge_003]
_to: users/bob -> [edge_001, edge_100]
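The lookup structure pictured above can be approximated with two maps, one keyed by `_from` and one by `_to`. This is a conceptual sketch in plain JavaScript with hypothetical names (`buildEdgeIndex`, `byFrom`, `byTo`) and made-up edges; the real index lives inside the storage engine and is not built this way.

```javascript
// Conceptual model of an edge index: two lookup tables over the same edges.
function buildEdgeIndex(edgeDocs) {
  const byFrom = new Map(); // source vertex id -> edge keys
  const byTo = new Map();   // target vertex id -> edge keys
  for (const edge of edgeDocs) {
    if (!byFrom.has(edge._from)) byFrom.set(edge._from, []);
    byFrom.get(edge._from).push(edge._key);
    if (!byTo.has(edge._to)) byTo.set(edge._to, []);
    byTo.get(edge._to).push(edge._key);
  }
  return { byFrom, byTo };
}

// Sample edges chosen to reproduce the listing above.
const followsEdges = [
  { _key: "edge_001", _from: "users/alice", _to: "users/bob" },
  { _key: "edge_002", _from: "users/alice", _to: "users/carol" },
  { _key: "edge_003", _from: "users/alice", _to: "users/dave" },
  { _key: "edge_100", _from: "users/erin", _to: "users/bob" },
];

const edgeIndex = buildEdgeIndex(followsEdges);
console.log(edgeIndex.byFrom.get("users/alice")); // outbound edges of alice
console.log(edgeIndex.byTo.get("users/bob"));     // inbound edges of bob
```

An OUTBOUND step consults the `_from` table and an INBOUND step consults the `_to` table, which is why both directions are cheap.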
Vertex-Centric Indexes: For filtering edge properties during traversal:
// Create vertex-centric index: the edge endpoint (_from) comes first,
// followed by the edge attribute to filter on
db.follows.ensureIndex({
type: "persistent",
fields: ["_from", "strength"],
inBackground: true
});
// Now this outbound traversal can use the index for filtering:
// FOR user, edge IN 1 OUTBOUND 'users/alice' GRAPH 'social_network'
// FILTER edge.strength > 0.8
// RETURN user
Traversal Options:
// Control traversal behavior
FOR vertex, edge, path IN 1..5 OUTBOUND 'users/alice'
GRAPH 'social'
OPTIONS {
order: 'bfs', // Breadth-first (default: depth-first); older releases used bfs: true
uniqueVertices: 'path', // Don't revisit vertices in same path
uniqueEdges: 'path' // Don't reuse edges in same path
}
RETURN vertex
You can traverse without named graphs by specifying edge collections directly: FOR v IN 1 OUTBOUND start follows, likes. Named graphs provide schema enforcement (valid from/to collections) and semantic grouping but aren't required for traversal.
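The difference between breadth-first and depth-first ordering in the traversal options above is easiest to see on a tiny graph. The sketch below is plain JavaScript over a hypothetical adjacency list; `visitOrder` is an illustrative helper, and it visits each vertex at most once, which is stricter than the per-path uniqueness used in the options example.

```javascript
// Hypothetical adjacency list: a -> b, c; b -> d; c -> e.
const adjacency = {
  a: ["b", "c"],
  b: ["d"],
  c: ["e"],
  d: [],
  e: [],
};

// Return the order in which vertices are visited.
// "bfs" uses a queue (level by level); "dfs" uses a stack (branch by branch).
function visitOrder(start, graph, order) {
  const pending = [start];
  const seen = new Set([start]);
  const visited = [];
  while (pending.length > 0) {
    const vertex = order === "bfs" ? pending.shift() : pending.pop();
    visited.push(vertex);
    const next = graph[vertex] || [];
    // Reverse for the stack so neighbors are explored in listed order.
    const ordered = order === "bfs" ? next : [...next].reverse();
    for (const neighbor of ordered) {
      if (!seen.has(neighbor)) {
        seen.add(neighbor);
        pending.push(neighbor);
      }
    }
  }
  return visited;
}

console.log(visitOrder("a", adjacency, "bfs").join(" "));
console.log(visitOrder("a", adjacency, "dfs").join(" "));
```

Breadth-first finishes each depth level before going deeper, which matters when a query stops early with LIMIT and the nearest results are wanted first.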
ArangoDB provides ACID transactions that span collections and models—a key advantage over polyglot persistence.
Transaction Model:
ArangoDB uses single-collection transactions for simple operations and multi-collection transactions for complex cross-model operations.
Single-Collection Transaction (Implicit):
Simple operations are automatically transactional:
// This is automatically atomic
INSERT { name: "Product", price: 100 } INTO products
Multi-Collection Transaction:
// JavaScript transaction (multi-collection), run from arangosh
const db = require('@arangodb').db;
// Execute a transaction
db._executeTransaction({
  // Collections involved
  collections: {
    write: ['orders', 'purchased', 'inventory'],
    read: ['products', 'users']
  },
  // Transaction function
  action: function(params) {
    const db = require('@arangodb').db;
    // Get input
    const userId = params.userId;
    const productId = params.productId;
    const quantity = params.quantity;
    // Read product (consistent within transaction)
    const product = db.products.document(productId);
    // Check inventory (firstExample returns null when nothing matches)
    const inv = db.inventory.firstExample({ productId: productId });
    if (!inv || inv.available < quantity) {
      throw new Error('Insufficient inventory');
    }
    // Create order (document operation)
    const order = db.orders.insert({
      userId: userId,
      productId: productId,
      quantity: quantity,
      total: product.price * quantity,
      status: 'confirmed',
      createdAt: Date.now()
    });
    // Create edge (graph operation)
    db.purchased.insert({
      _from: 'users/' + userId,
      _to: 'products/' + productId,
      orderId: order._key,
      date: Date.now()
    });
    // Update inventory (document operation)
    db.inventory.update(inv._key, {
      available: inv.available - quantity
    });
    return order;
  },
  // Parameters passed to action
  params: {
    userId: 'alice',
    productId: 'laptop_123',
    quantity: 1
  }
});
AQL-Based Transactions:
For simpler multi-collection operations, AQL provides implicit transaction scope:
// On a single server, all operations in one AQL query commit or fail together
LET order = FIRST(
INSERT {
userId: "alice",
productId: "laptop_123",
total: 999.99
} INTO orders
RETURN NEW
)
LET edge = FIRST(
INSERT {
_from: "users/alice",
_to: "products/laptop_123",
orderId: order._key
} INTO purchased
RETURN NEW
)
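// Read the current product first; an UPDATE expression cannot refer to the collection name
LET product = DOCUMENT("products/laptop_123")
UPDATE product WITH {
inventory: product.inventory - 1
} IN products
RETURN { order: order, edge: edge }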
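The all-or-nothing behavior described in this section can be imitated in a few lines of plain JavaScript: apply every change to a private copy and publish the copy only if nothing throws. This is a conceptual sketch, not ArangoDB's mechanism; `runAtomically`, `placeOrder`, and the state layout are hypothetical.

```javascript
// Apply `action` to a deep copy of `state`; the caller's state is only
// replaced when the action finishes without throwing.
function runAtomically(state, action) {
  const draft = JSON.parse(JSON.stringify(state)); // private working copy
  action(draft); // any throw leaves `state` untouched
  return draft;
}

// Touches three "collections": orders, purchased (edges), products.
function placeOrder(draft) {
  draft.orders.push({ userId: "alice", productId: "laptop_123" });
  draft.purchased.push({ _from: "users/alice", _to: "products/laptop_123" });
  const product = draft.products["laptop_123"];
  if (product.inventory < 1) throw new Error("Insufficient inventory");
  product.inventory -= 1;
}

const inStock = { orders: [], purchased: [], products: { laptop_123: { inventory: 1 } } };
const afterOrder = runAtomically(inStock, placeOrder);

let soldOut = { orders: [], purchased: [], products: { laptop_123: { inventory: 0 } } };
let failed = false;
try {
  soldOut = runAtomically(soldOut, placeOrder);
} catch (err) {
  failed = true; // the order and edge written to the draft are discarded
}

console.log(afterOrder.orders.length, failed, soldOut.orders.length);
```

In the failing case the order and the edge had already been written to the draft before the inventory check threw, yet none of those partial writes is visible afterwards.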
Isolation Levels:
ArangoDB supports snapshot isolation through MVCC:
Each document's _rev is used for conflict detection
// Optimistic locking example
FOR product IN products
FILTER product._key == "laptop_123"
UPDATE product WITH {
inventory: product.inventory - 1
} IN products
OPTIONS { ignoreRevs: false } // Enable revision checking
RETURN { old: OLD, new: NEW }
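The revision check behind this kind of optimistic locking can be sketched as a compare-and-set over an in-memory store. Everything here is hypothetical: `updateIfRevMatches` is an illustrative helper, and the numeric `_rev` counter is an assumption made for readability (ArangoDB revisions are opaque strings).

```javascript
// Update a document only if the caller saw its latest revision.
function updateIfRevMatches(store, key, expectedRev, patch) {
  const current = store.get(key);
  if (!current) throw new Error("document not found");
  if (current._rev !== expectedRev) {
    throw new Error("conflict: revision mismatch");
  }
  // Assumption: revisions are modeled as an incrementing number.
  const next = { ...current, ...patch, _rev: current._rev + 1 };
  store.set(key, next);
  return next;
}

const store = new Map([
  ["laptop_123", { _key: "laptop_123", inventory: 5, _rev: 1 }],
]);

// Two clients read the same revision.
const seenByA = store.get("laptop_123");
const seenByB = store.get("laptop_123");

// Client A writes first and succeeds.
const afterA = updateIfRevMatches(store, "laptop_123", seenByA._rev, {
  inventory: seenByA.inventory - 1,
});

// Client B still holds the old revision, so its write is rejected.
let conflict = null;
try {
  updateIfRevMatches(store, "laptop_123", seenByB._rev, {
    inventory: seenByB.inventory - 1,
  });
} catch (err) {
  conflict = err.message;
}

console.log(afterA.inventory, afterA._rev, conflict);
```

The rejected client would typically re-read the document and retry, so neither decrement is silently lost.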
Distributed Transactions:
In cluster deployments, ArangoDB coordinates transactions across shards:
Larger transaction scopes (more collections, more operations) increase lock contention and coordination overhead. Design for minimal transaction scope when possible. For high-throughput scenarios, consider event sourcing or saga patterns for cross-aggregate consistency.
Real applications combine ArangoDB's models in specific patterns. Let's examine several production-proven approaches:
Pattern 1: E-Commerce with Social Features
Collections:
├── products (document) - Product catalog
├── users (document) - User profiles
├── orders (document) - Order records
├── reviews (document) - Product reviews
├── purchased (edge) - User → Product
├── viewed (edge) - User → Product
├── similar_to (edge) - Product → Product
└── follows (edge) - User → User
Graphs:
├── purchase_graph: users ─[purchased]→ products
├── social_graph: users ─[follows]→ users
└── product_graph: products ─[similar_to]→ products
Query Example: Personalized Recommendations
// Find recommendations for Alice based on:
// 1. Products similar to what she purchased
// 2. Products purchased by people she follows who have similar taste
LET alice_purchases = (
  FOR product IN 1 OUTBOUND 'users/alice' GRAPH 'purchase_graph'
    RETURN product._id
)
// Similar to purchased
LET similar_products = (
  FOR purchased_id IN alice_purchases
    FOR similar IN 1 OUTBOUND purchased_id GRAPH 'product_graph'
      FILTER similar._id NOT IN alice_purchases
      COLLECT product = similar WITH COUNT INTO score
      RETURN { product, score }
)
// From followed users with similar taste
LET social_recommendations = (
  FOR followed IN 1 OUTBOUND 'users/alice' GRAPH 'social_graph'
    // Find followed users who bought same things
    LET common_purchases = LENGTH(
      FOR their_purchase IN 1 OUTBOUND followed GRAPH 'purchase_graph'
        FILTER their_purchase._id IN alice_purchases
        RETURN 1
    )
    FILTER common_purchases > 2 // Similar taste threshold
    // Get their other purchases
    FOR their_product IN 1 OUTBOUND followed GRAPH 'purchase_graph'
      FILTER their_product._id NOT IN alice_purchases
      COLLECT product = their_product
      AGGREGATE trust_score = SUM(common_purchases)
      RETURN { product, trust_score }
)
// Combine and rank
FOR rec IN UNION(
  (FOR r IN similar_products RETURN { p: r.product, s: r.score * 2 }),
  (FOR r IN social_recommendations RETURN { p: r.product, s: r.trust_score })
)
  COLLECT product = rec.p AGGREGATE total_score = SUM(rec.s)
  SORT total_score DESC
  LIMIT 10
  RETURN {
    product: product.name,
    category: product.category,
    price: product.price,
    score: total_score
  }
Pattern 2: Identity and Access Management
Collections:
├── users (document) - User accounts
├── groups (document) - Security groups
├── roles (document) - Role definitions
├── resources (document) - Protected resources
├── member_of (edge) - User → Group
├── has_role (edge) - User/Group → Role
└── can_access (edge) - Role → Resource
Graph:
└── iam_graph: Connects all IAM relationships
Query: Check Access Permissions
// Can user access resource? Check all paths
LET user_id = 'users/alice'
LET resource_id = 'resources/sensitive_doc'
// Direct role assignment
LET direct_access = FIRST(
FOR role IN 1 OUTBOUND user_id GRAPH 'iam_graph'
OPTIONS { edgeCollections: ['has_role'] }
FOR resource IN 1 OUTBOUND role GRAPH 'iam_graph'
OPTIONS { edgeCollections: ['can_access'] }
FILTER resource._id == resource_id
RETURN true
)
// Via group membership
LET group_access = FIRST(
FOR group IN 1..3 OUTBOUND user_id GRAPH 'iam_graph'
OPTIONS { edgeCollections: ['member_of'] }
FOR role IN 1 OUTBOUND group GRAPH 'iam_graph'
OPTIONS { edgeCollections: ['has_role'] }
FOR resource IN 1 OUTBOUND role GRAPH 'iam_graph'
OPTIONS { edgeCollections: ['can_access'] }
FILTER resource._id == resource_id
RETURN true
)
RETURN direct_access == true OR group_access == true
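The logic of this access check can be restated as a small reachability test. The sketch below is plain JavaScript over a made-up edge set; `neighbors` and `canAccess` are hypothetical helpers, and the group-to-group `member_of` edge is an assumption that matches the 1..3 depth used for group membership in the query.

```javascript
// Hypothetical IAM edges, grouped by edge collection.
const iamEdges = {
  member_of: [
    { _from: "users/alice", _to: "groups/engineering" },
    { _from: "groups/engineering", _to: "groups/staff" }, // nested group (assumption)
  ],
  has_role: [{ _from: "groups/staff", _to: "roles/reader" }],
  can_access: [{ _from: "roles/reader", _to: "resources/sensitive_doc" }],
};

// Targets of all edges in one collection that start at `from`.
function neighbors(edgeList, from) {
  return edgeList.filter((e) => e._from === from).map((e) => e._to);
}

// True if the user, or any group reached within maxGroupDepth member_of
// hops, holds a role that can access the resource.
function canAccess(userId, resourceId, edges, maxGroupDepth = 3) {
  const principals = new Set([userId]);
  let frontier = [userId];
  for (let depth = 0; depth < maxGroupDepth; depth++) {
    const next = [];
    for (const principal of frontier) {
      for (const group of neighbors(edges.member_of, principal)) {
        if (!principals.has(group)) {
          principals.add(group);
          next.push(group);
        }
      }
    }
    frontier = next;
  }
  for (const principal of principals) {
    for (const role of neighbors(edges.has_role, principal)) {
      if (neighbors(edges.can_access, role).includes(resourceId)) return true;
    }
  }
  return false;
}

console.log(canAccess("users/alice", "resources/sensitive_doc", iamEdges));
console.log(canAccess("users/bob", "resources/sensitive_doc", iamEdges));
```

Treating the user and every reachable group as one set of principals covers both the direct and the group-based branch of the query in a single pass.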
Pattern 3: Content Management with Versioning
Store content as documents, relationships as edges, version history within documents:
// Content document with embedded version history
{
"_key": "article_001",
"title": "Multi-Model Databases",
"current_version": 3,
"content": "...",
"versions": [
{ "v": 1, "content": "...", "date": "..." },
{ "v": 2, "content": "...", "date": "..." }
],
"author": "users/alice"
}
// Relationship edges
{ "_from": "articles/article_001", "_to": "tags/database" }
{ "_from": "articles/article_001", "_to": "categories/tech" }
Use documents for self-contained entities, edges for relationships that benefit from traversal, and embedded data for version history or tightly-coupled sub-entities. The right choice depends on access patterns—ask 'how will this data be queried?'
ArangoDB illustrates how multi-model concepts manifest in a production system. Let's consolidate the key learnings:
What's Next:
Having examined ArangoDB as a concrete implementation, the next page explores the flexibility benefits of multi-model databases—the specific advantages organizations gain from consolidating on a unified multi-model system.
You now understand how a native multi-model database like ArangoDB implements multi-model concepts. This practical knowledge enables you to evaluate multi-model databases against your specific requirements.