Azure Blob Storage, part of Azure Storage accounts, represents Microsoft's approach to object storage. Launched as part of the broader Azure platform in 2010, Blob Storage reflects Microsoft's enterprise heritage while embracing cloud-native patterns. It powers not only external customer workloads but also internal Microsoft services including Microsoft 365, Xbox, and GitHub.
Blob Storage's architecture differs meaningfully from S3 and GCS. Rather than standalone buckets, Azure uses a hierarchical model: Storage Accounts contain Containers, which contain Blobs. This additional layer provides unique flexibility for organization, security, and billing that proves particularly valuable in enterprise environments with complex organizational structures.
By the end of this page, you'll understand Azure Blob Storage's architecture, the Storage Account model, access tiers, unique features like hierarchical namespaces (ADLS Gen2), and how to choose between Azure Blob, S3, and GCS for different scenarios.
Azure's storage hierarchy differs fundamentally from S3 and GCS. Understanding this model is essential for designing Azure storage solutions.
Storage Accounts: The Foundation
A storage account is the top-level resource that provides:
This is conceptually different from S3/GCS where each bucket is largely independent. In Azure, the storage account creates a management and security boundary around related storage resources.
Storage Account Types
Azure offers several storage account types:
Standard general-purpose v2: The default, supporting Blobs, Files, Queues, Tables. Uses HDD-backed storage. Most cost-effective for typical workloads.
Premium block blobs: SSD-backed storage optimized for block blob workloads requiring low latency and high transaction rates.
Premium file shares: SSD-backed Azure Files for high-performance file shares.
Premium page blobs: SSD-backed storage for page blobs, primarily used for Azure VM disks.
Containers: The Second Layer
Within a storage account, containers organize blobs:

    Azure Subscription
    └── Resource Group
        └── Storage Account (mystorageaccount)
            │
            ├── Blob Service (mystorageaccount.blob.core.windows.net)
            │   ├── Container: images
            │   │   ├── photo1.jpg (Block Blob)
            │   │   ├── photo2.jpg (Block Blob)
            │   │   └── photos/vacation/ (Virtual directory)
            │   │       └── beach.jpg (Block Blob)
            │   │
            │   └── Container: videos
            │       ├── intro.mp4 (Block Blob)
            │       └── vhd/disk.vhd (Page Blob)
            │
            ├── File Service (mystorageaccount.file.core.windows.net)
            │   └── File Share: documents
            │
            ├── Queue Service (mystorageaccount.queue.core.windows.net)
            │   └── Queue: tasks
            │
            └── Table Service (mystorageaccount.table.core.windows.net)
                └── Table: logs

Storage account names must be globally unique across all of Azure (3-24 characters, lowercase letters and digits only). Container names only need to be unique within their storage account. This is more flexible than S3/GCS, where bucket names are globally unique.
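The naming rule and the per-service endpoints from the diagram can be checked with a few lines of Python. This is a minimal sketch: `is_valid_account_name` and `service_endpoint` are hypothetical helper names, and only the length and character rule stated above is enforced.

```python
import re

# Rule stated above: 3-24 characters, lowercase letters and digits only.
_ACCOUNT_NAME = re.compile(r"[a-z0-9]{3,24}")

# The four services shown in the hierarchy diagram.
SERVICES = ("blob", "file", "queue", "table")


def is_valid_account_name(name: str) -> bool:
    """Return True if name satisfies the storage account naming rule."""
    return _ACCOUNT_NAME.fullmatch(name) is not None


def service_endpoint(account: str, service: str) -> str:
    """Build the endpoint host for one service of a storage account."""
    if not is_valid_account_name(account):
        raise ValueError(f"invalid storage account name: {account!r}")
    if service not in SERVICES:
        raise ValueError(f"unknown service: {service!r}")
    return f"{account}.{service}.core.windows.net"


print(service_endpoint("mystorageaccount", "blob"))
```

Global uniqueness cannot be checked locally; only the service can tell you whether a syntactically valid name is already taken.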
Azure distinguishes between three blob types, each optimized for different access patterns. This is more explicit than S3/GCS, which treat all objects uniformly.
Block Blobs
The most common blob type, optimized for streaming and cloud storage:
Block blobs support staged uploads similar to S3 multipart:
1. Put Block (upload individual blocks with IDs)
2. Put Block List (commit blocks in desired order)
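The two-step flow above can be illustrated with a toy in-memory model. This is a sketch, not the Azure SDK: `InMemoryBlockBlob` and `make_block_id` are hypothetical names, and the model ignores details such as re-committing previously committed blocks.

```python
import base64


class InMemoryBlockBlob:
    """Toy stand-in for a block blob; it does not talk to Azure."""

    def __init__(self) -> None:
        self._staged: dict[str, bytes] = {}  # uncommitted blocks, keyed by ID
        self.content = b""                   # committed blob content

    def put_block(self, block_id: str, data: bytes) -> None:
        """Step 1: stage one block. Readers do not see it yet."""
        self._staged[block_id] = data

    def put_block_list(self, block_ids: list[str]) -> None:
        """Step 2: commit staged blocks in the order given."""
        missing = [b for b in block_ids if b not in self._staged]
        if missing:
            raise KeyError(f"blocks were never staged: {missing}")
        self.content = b"".join(self._staged[b] for b in block_ids)
        self._staged.clear()


def make_block_id(index: int) -> str:
    """Build Base64 block IDs of equal length from a zero-padded index."""
    return base64.b64encode(f"{index:06d}".encode()).decode()


blob = InMemoryBlockBlob()
chunks = [b"hello ", b"block ", b"blob"]
ids = [make_block_id(i) for i in range(len(chunks))]

# Blocks can be staged in any order (or in parallel)...
for i in (2, 0, 1):
    blob.put_block(ids[i], chunks[i])

# ...because the commit call decides the final order.
blob.put_block_list(ids)
print(blob.content)
```

Separating staging from commit is what makes out-of-order and retried block uploads safe: nothing becomes visible until the block list is committed.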
This enables:
Append Blobs
Optimized for append-only scenarios:
Append blobs eliminate the need to manage staging files for log aggregation. Multiple writers can append concurrently (with appropriate concurrency controls).
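One form such a concurrency control can take is a writer stating the offset at which it expects its data to land. The sketch below models that idea in memory; `InMemoryAppendBlob` and the `expected_offset` parameter are hypothetical and only loosely mirror what a real service offers.

```python
from typing import Optional


class InMemoryAppendBlob:
    """Toy model of an append-only blob with an optional position check."""

    def __init__(self) -> None:
        self._data = bytearray()

    def append_block(self, data: bytes, expected_offset: Optional[int] = None) -> int:
        """Append data and return the offset at which it was written.

        If expected_offset is given and another writer has already moved
        the end of the blob, the append is rejected instead of interleaved.
        """
        offset = len(self._data)
        if expected_offset is not None and expected_offset != offset:
            raise RuntimeError(
                f"append position mismatch: expected {expected_offset}, blob is at {offset}"
            )
        self._data.extend(data)
        return offset

    def read(self) -> bytes:
        return bytes(self._data)


log = InMemoryAppendBlob()
log.append_block(b"writer-a: started\n")  # unconditional append
log.append_block(b"writer-b: started\n", expected_offset=len(log.read()))

try:
    # Writer A still believes the blob is empty, so this append is refused.
    log.append_block(b"writer-a: stale\n", expected_offset=0)
except RuntimeError as err:
    print("rejected:", err)
```

Writers that do not care about exact ordering can append unconditionally; writers that do can retry after re-reading the current length.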
Page Blobs
Optimized for random read/write operations:
Page blobs are conceptually different from object storage—they're more like block device storage exposed through the blob API.
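Because page blobs are addressed in fixed 512-byte pages, a custom client usually has to widen an arbitrary byte range to page boundaries before writing. A minimal sketch of that arithmetic, with `aligned_range` as a hypothetical helper name:

```python
PAGE_SIZE = 512  # bytes per page


def aligned_range(offset: int, length: int) -> tuple[int, int]:
    """Expand (offset, length) to the smallest enclosing page-aligned range."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    start = (offset // PAGE_SIZE) * PAGE_SIZE             # round down
    end = -(-(offset + length) // PAGE_SIZE) * PAGE_SIZE  # round up
    return start, end - start


print(aligned_range(700, 100))  # falls inside a single page
print(aligned_range(500, 100))  # straddles a page boundary
```

A range that straddles a boundary touches two pages, so the client would need to read, modify, and rewrite both.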
| Blob Type | Max Size | Composition | Best For | S3/GCS Equivalent |
|---|---|---|---|---|
| Block Blob | ~190.7 TiB | Blocks (up to 4000 MiB each) | General object storage | Standard objects |
| Append Blob | ~195 GiB | Append blocks only | Logging, streaming writes | No direct equivalent |
| Page Blob | 8 TiB | 512-byte pages | VM disks, random access | No direct equivalent |
For block blobs, larger block sizes (up to 4000 MiB, roughly 4 GB) reduce the number of Put Block operations and can improve upload throughput for large files. The Azure SDKs choose block sizes automatically. For custom implementations, a reasonable starting point is to upload small files as a single block and to split large files into parallel blocks sized according to available network bandwidth.
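For a custom uploader, the block-size decision is mostly arithmetic. The sketch below assumes a limit of 50,000 committed blocks per blob, which together with the 4000 MiB block limit yields the roughly 190.7 TiB maximum. `plan_blocks` and its 8 MiB default are hypothetical choices for illustration, not SDK behavior.

```python
MIB = 1024 * 1024
MAX_BLOCKS = 50_000          # assumed limit on committed blocks per blob
MAX_BLOCK_SIZE = 4000 * MIB  # per-block size limit (the roughly 4 GB cited above)


def plan_blocks(file_size: int, preferred_block_size: int = 8 * MIB) -> tuple[int, int]:
    """Return (block_size, block_count) that stays within both limits."""
    if file_size <= 0:
        raise ValueError("file_size must be positive")
    # Smallest block size that keeps the block count at or below MAX_BLOCKS.
    min_block_size = -(-file_size // MAX_BLOCKS)
    block_size = max(preferred_block_size, min_block_size)
    if block_size > MAX_BLOCK_SIZE:
        raise ValueError("file exceeds the maximum block blob size")
    block_count = -(-file_size // block_size)
    return block_size, block_count


print(plan_blocks(100 * MIB))
print(MAX_BLOCKS * MAX_BLOCK_SIZE / 1024**4, "TiB maximum")
```

The preferred size is kept for ordinary files and only grows when a file is large enough to hit the block-count limit.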
Azure provides more granular redundancy options than S3 or GCS, allowing precise tradeoffs between durability, availability, and cost.
Locally Redundant Storage (LRS)
The simplest option:
Zone-Redundant Storage (ZRS)
Data spread across availability zones:
Geo-Redundant Storage (GRS)
Cross-region replication:
Geo-Zone-Redundant Storage (GZRS)
The most resilient option:
| Option | Copies | Cross-Zone | Cross-Region | Durability | Read from Secondary |
|---|---|---|---|---|---|
| LRS | 3 | No | No | 11 nines | N/A |
| ZRS | 3 | Yes | No | 12 nines | N/A |
| GRS | 6 | No (primary) | Yes | 16 nines | No (RA-GRS: Yes) |
| GZRS | 6 | Yes (primary) | Yes | 16 nines | No (RA-GZRS: Yes) |
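The table can be read as a small decision function. The sketch below encodes only what the table states; `choose_redundancy` is a hypothetical helper, and a real choice would also weigh cost and regional availability.

```python
def choose_redundancy(cross_zone: bool, cross_region: bool, read_secondary: bool = False) -> str:
    """Map durability requirements onto the redundancy options in the table."""
    if read_secondary and not cross_region:
        raise ValueError("reading from a secondary requires cross-region replication")
    if cross_region:
        base = "GZRS" if cross_zone else "GRS"
        # The RA- prefix adds read access to the secondary region.
        return f"RA-{base}" if read_secondary else base
    return "ZRS" if cross_zone else "LRS"


print(choose_redundancy(cross_zone=True, cross_region=True, read_secondary=True))
```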
Read-Access Variants (RA-GRS, RA-GZRS)
Standard GRS and GZRS make the secondary copy readable only after a failover. The Read-Access variants (RA-GRS, RA-GZRS) allow reading from the secondary region at any time through a separate endpoint:
mystorageaccount-secondary.blob.core.windows.net
Comparison with S3/GCS
GRS/GZRS failover is a significant operation: the secondary becomes the new primary, DNS is updated, and the old primary (if recoverable) becomes the secondary. There may be data loss up to the last synchronization point (RPO). Test your failover procedures before you need them.
Azure Blob Storage offers access tiers that control storage cost and access pricing. A tier can be set on an individual blob (comparable to S3's per-object storage class) or as a default for the storage account.
Hot Tier
Optimized for data accessed frequently:
Cool Tier
For infrequently accessed data:
Cold Tier (Introduced 2023)
For rarely accessed data:
Archive Tier
For long-term archival:
Key Difference from GCS
Unlike GCS, where Archive has the same read latency as Standard, Azure Archive is offline storage. You cannot read an archived blob directly; it must first be rehydrated to an online tier such as Hot or Cool. This is similar to S3 Glacier.
| Tier | Storage Cost | Access Cost | Retrieval Time | Min Duration |
|---|---|---|---|---|
| Hot | Highest | Lowest | Immediate | None |
| Cool | ~50% of Hot | Higher | Immediate | 30 days |
| Cold | ~25% of Hot | Higher | Immediate | 90 days |
| Archive | ~10% of Hot | Highest | Rehydration: 1-15 hours | 180 days |
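The Min Duration column matters because deleting or re-tiering a blob before that period has elapsed is typically billed as an early deletion charge for the remaining days. A minimal sketch of that calculation, using only the durations from the table (`early_deletion_days` is a hypothetical helper, and actual billing rules should be confirmed against current pricing documentation):

```python
# Minimum storage duration per tier, in days, taken from the table above.
MIN_DAYS = {"hot": 0, "cool": 30, "cold": 90, "archive": 180}


def early_deletion_days(tier: str, age_days: int) -> int:
    """Days of the minimum duration still outstanding for a blob of this age."""
    if age_days < 0:
        raise ValueError("age_days must be >= 0")
    return max(0, MIN_DAYS[tier.lower()] - age_days)


print(early_deletion_days("Archive", 45))
print(early_deletion_days("Cool", 45))
```

A blob that has already met its minimum duration yields zero, so moving it again carries no early deletion penalty under this model.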
Lifecycle Management
Azure supports blob lifecycle management policies to automatically tier data:

    {
      "rules": [
        {
          "name": "archiveOldData",
          "enabled": true,
          "type": "Lifecycle",
          "definition": {
            "filters": {
              "blobTypes": ["blockBlob"],
              "prefixMatch": ["logs/"]
            },
            "actions": {
              "baseBlob": {
                "tierToCool": {"daysAfterModificationGreaterThan": 30},
                "tierToCold": {"daysAfterModificationGreaterThan": 90},
                "tierToArchive": {"daysAfterModificationGreaterThan": 180},
                "delete": {"daysAfterModificationGreaterThan": 365}
              }
            }
          }
        }
      ]
    }

Lifecycle policies support transitions based on modification date, access date (with access tracking enabled), version age, and more.
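To reason about which action a policy like the one above would apply to a given blob, it helps to evaluate the thresholds by hand. The sketch below assumes that when several thresholds are met, the least expensive outcome wins (delete, then archive, then cold, then cool). That precedence is an assumption to verify against the lifecycle management documentation, and `action_for` is a hypothetical helper.

```python
from typing import Optional

# Assumed precedence: cheapest outcome first.
ACTION_ORDER = ["delete", "tierToArchive", "tierToCold", "tierToCool"]

# The baseBlob actions from the policy shown above.
BASE_BLOB_ACTIONS = {
    "tierToCool": {"daysAfterModificationGreaterThan": 30},
    "tierToCold": {"daysAfterModificationGreaterThan": 90},
    "tierToArchive": {"daysAfterModificationGreaterThan": 180},
    "delete": {"daysAfterModificationGreaterThan": 365},
}


def action_for(actions: dict, days_since_modification: int) -> Optional[str]:
    """Return the action whose threshold is exceeded, or None if none applies."""
    for name in ACTION_ORDER:
        rule = actions.get(name)
        # "GreaterThan" is strict: a blob exactly at the threshold is untouched.
        if rule and days_since_modification > rule["daysAfterModificationGreaterThan"]:
            return name
    return None


for age in (10, 31, 100, 200, 400):
    print(age, action_for(BASE_BLOB_ACTIONS, age))
```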
Enable 'last access time tracking' on your storage account to create lifecycle policies based on actual access patterns rather than modification dates. This is essential for tiering based on real usage but incurs additional cost for tracking.
Azure Data Lake Storage Gen2 (ADLS Gen2) is a unique capability that transforms Blob Storage for big data analytics. It's implemented as an optional feature called Hierarchical Namespace (HNS) that can be enabled on storage accounts.
What Is Hierarchical Namespace?
Standard blob storage (like S3 and GCS) has a flat namespace—directories are an illusion created by key prefixes. Operations like 'rename folder' actually require copying and deleting every object in that prefix.
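The cost of a "folder rename" in a flat namespace can be made concrete with a toy model in which the object store is a plain dictionary keyed by full object name. `rename_prefix_flat` is a hypothetical function for illustration, not a storage API.

```python
def rename_prefix_flat(store: dict[str, bytes], old: str, new: str) -> int:
    """Rename a key prefix in a flat namespace.

    Returns the number of per-object operations performed, since every
    object under the prefix needs its own copy followed by a delete.
    """
    operations = 0
    for key in [k for k in store if k.startswith(old)]:
        store[new + key[len(old):]] = store[key]  # copy to the new name
        del store[key]                            # delete the original
        operations += 2
    return operations


store = {
    "raw/2024/a.parquet": b"a",
    "raw/2024/b.parquet": b"b",
    "raw/2024/c.parquet": b"c",
    "other/readme.txt": b"r",
}
print(rename_prefix_flat(store, "raw/2024/", "processed/2024/"))
print(sorted(store))
```

The operation count grows linearly with the number of objects under the prefix, and the rename is not atomic: a failure partway through leaves objects under both names.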
With HNS enabled, Azure Blob Storage gains a true hierarchical file system:
Why This Matters for Analytics
ADLS Gen2 API and Access
ADLS Gen2 supports multiple access patterns:
This dual-API approach lets you use blob storage patterns for applications while using file system patterns for analytics—on the same data.
When to Enable HNS
| Workload Type | Enable HNS? | Reasoning |
|---|---|---|
| Big data analytics (Spark, Databricks) | Yes | Significant performance improvement for typical ETL patterns |
| General object storage | Usually No | Slight overhead; some Blob API features may be unavailable or arrive later on HNS-enabled accounts |
| Hybrid analytics + object storage | Evaluate | HNS accounts support blob API, so hybrid is possible but with tradeoffs |
| Compliance requiring immutability | Depends | Some immutability features vary between HNS and non-HNS accounts |
Once enabled, Hierarchical Namespace cannot be disabled on a storage account. This is a permanent decision. Create a test account to validate your workloads before enabling HNS on production storage.
Azure Blob Storage provides multiple layers of access control, reflecting enterprise security requirements.
Storage Account Keys
Every storage account has two access keys that provide full administrative access:
Shared Access Signatures (SAS)
SAS tokens provide limited, time-bound access:
SAS tokens can specify:

    https://mystorageaccount.blob.core.windows.net/images/photo.jpg?
        sv=2021-06-08               # Storage version
        &ss=b                       # Blob service
        &srt=o                      # Object level
        &sp=r                       # Read permission
        &se=2024-12-31T23:59:59Z    # Expiry time
        &st=2024-01-01T00:00:00Z    # Start time
        &spr=https                  # Protocol (HTTPS only)
        &sig=<signature>            # HMAC signature

Azure AD Authentication (RBAC)
Azure AD (since renamed Microsoft Entra ID) integration provides identity-based access:
Azure AD is the recommended access pattern for applications running in Azure.
ACLs for ADLS Gen2
With Hierarchical Namespace enabled, POSIX-style ACLs provide granular control:
Container-Level Access
Containers can be configured for public access (not recommended) or private:
Public access can be completely disabled at the storage account level, preventing accidental exposure.
Stored access policies let you manage SAS tokens server-side. Create a policy on a container, then generate SAS tokens that reference that policy. If you need to revoke access, modify or delete the policy—all tokens using it are immediately invalidated. This solves the 'leaked SAS token' problem.
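Before handing out or using a SAS URL, it can be useful to inspect its validity window and permissions on the client side. The sketch below parses the query parameters from the SAS example earlier in this section. It does not verify the signature, which only the service can do, and `sas_allows` is a hypothetical helper. Tokens that reference a stored access policy may omit the expiry, which this sketch treats as an error.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlsplit

_TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def _parse_utc(value: str) -> datetime:
    return datetime.strptime(value, _TIME_FORMAT).replace(tzinfo=timezone.utc)


def sas_allows(url: str, permission: str, now: datetime) -> bool:
    """Client-side check of a SAS URL's time window and permission letters."""
    params = {k: v[0] for k, v in parse_qs(urlsplit(url).query).items()}
    if "se" not in params:
        raise ValueError("no expiry (se) in URL; it may come from a stored access policy")
    if "st" in params and now < _parse_utc(params["st"]):
        return False  # not valid yet
    if now > _parse_utc(params["se"]):
        return False  # expired
    return permission in params.get("sp", "")


# Placeholder signature; a real token carries an HMAC computed by the issuer.
url = (
    "https://mystorageaccount.blob.core.windows.net/images/photo.jpg"
    "?sv=2021-06-08&ss=b&srt=o&sp=r"
    "&se=2024-12-31T23:59:59Z&st=2024-01-01T00:00:00Z&spr=https&sig=PLACEHOLDER"
)
mid_2024 = datetime(2024, 6, 1, tzinfo=timezone.utc)
print(sas_allows(url, "r", mid_2024))
print(sas_allows(url, "w", mid_2024))
```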
Let's synthesize the architectural differences across all three major cloud object storage services.
Data Model Comparison
| Feature | AWS S3 | Google Cloud Storage | Azure Blob Storage |
|---|---|---|---|
| Hierarchy | Flat (bucket → objects) | Flat (bucket → objects) | Three-level (account → container → blob) |
| Bucket Naming | Globally unique | Globally unique | Account: globally unique; container: unique within account |
| Object Types | Single type | Single type | Block, Append, Page blobs |
| Max Object Size | 5 TB | 5 TB | ~190.7 TiB (block blob) |
| Hierarchical FS | No (prefix simulation) | No (prefix simulation) | Yes (HNS/ADLS Gen2) |
| Consistency | Strong (since 2020) | Strong (always) | Strong |
| Archive Retrieval | Hours (Glacier) | Immediate | Hours (rehydration) |
Redundancy Comparison
| Requirement | AWS S3 | GCS | Azure |
|---|---|---|---|
| Single zone | S3 One Zone-IA | Regional | LRS |
| Multi-zone (region) | S3 Standard | Regional | ZRS |
| Multi-region (manual) | Cross-Region Replication | — | — |
| Multi-region (automatic) | — | Multi-Region/Dual-Region | GRS/GZRS |
| Read from secondary | Read from the replica bucket | Multi-region read | RA-GRS/RA-GZRS |
When to Choose Each Provider
Choose AWS S3 when:
Choose Google Cloud Storage when:
Choose Azure Blob Storage when:
For multi-cloud deployments, consider abstraction layers: MinIO provides S3-compatible object storage that can run anywhere, rclone synchronizes data across providers, and cloud-agnostic SDKs such as Apache Libcloud abstract provider differences. The key is designing for portability from the start.
Let's consolidate the key insights about Azure Blob Storage:
Architectural Patterns for System Design
When designing systems with Azure Blob Storage:
What's Next:
The next page explores storage classes and tiering in depth across all providers, examining the cost optimization strategies that can dramatically reduce storage spend at scale.
You now understand Azure Blob Storage's architecture, its unique features like hierarchical namespace and append blobs, and how it compares to S3 and GCS. You can make informed decisions about when Azure is the right choice for storage workloads.