Amazon S3 has become the de facto standard API for object storage. Countless applications, frameworks, and tools are built assuming S3 compatibility. But what if you need to run object storage on-premises? What if regulatory requirements prevent cloud usage? What if you simply want to avoid cloud vendor lock-in?
MinIO answers these questions with radical simplicity. Instead of building a complex distributed system with multiple component types (like Ceph), MinIO focuses on doing one thing exceptionally well: providing high-performance S3-compatible object storage that's easy to deploy and operate.
Launched in 2014 and written in Go, MinIO has become the most popular self-hosted S3-compatible object storage, with over 44,000 GitHub stars and adoption by companies like Adobe, PayPal, and Siemens. Its appeal lies in its simplicity: a single binary that can run standalone on a laptop for development or distributed across hundreds of nodes for enterprise scale.
By the end of this page, you will understand MinIO's architecture, its erasure coding implementation, how data is distributed across drives and nodes, multi-site replication, bucket versioning, and the operational characteristics that make MinIO the go-to choice for self-hosted object storage.
MinIO was built with a contrarian philosophy: distributed storage doesn't need to be complicated. While systems like Ceph and HDFS introduce multiple specialized components, MinIO takes a minimalist approach.
The key insight: MinIO recognized that most distributed storage complexity exists to handle edge cases and provide features that many workloads don't need. By focusing on object storage (not file or block) and accepting certain constraints (immutable objects, erasure coding only), MinIO achieves remarkable simplicity.
Comparison of Architecture Complexity:
| System | Components | Dependencies |
|---|---|---|
| MinIO | MinIO server (single binary) | None (Go static binary) |
| Ceph | MON, MGR, OSD, MDS, RGW | Multiple daemons, Paxos-based MON quorum |
| HDFS | NameNode, DataNode, ZKFailoverController | JVM, ZooKeeper for HA |
| GlusterFS | glusterd, glusterfsd, geo-replication | Multiple daemons per node |
MinIO's simplicity translates directly to operational benefits: faster deployment, easier troubleshooting, lower training requirements, and fewer moving parts that can fail. For teams without dedicated storage engineers, this simplicity can be the deciding factor.
MinIO's architecture is elegantly simple compared to other distributed storage systems. A MinIO deployment consists of MinIO server processes, each managing local drives, with all nodes being equal peers (no master/slave distinction).
Key Architectural Concepts:
| Term | Description |
|---|---|
| Server | A MinIO process managing one or more drives |
| Drive | A mounted disk path (XFS-formatted recommended) |
| Server Pool | A group of server nodes whose drives are pooled together; each pool is subdivided into erasure sets |
| Erasure Set | A subset of drives across which one object's data/parity shards are distributed |
| Bucket | Logical container for objects (S3 concept) |
| Object | A file stored in MinIO, accessed via S3 API |
Deployment Modes:
Standalone (single-node single-drive) — For development and testing. No redundancy.
Standalone erasure code (single-node multi-drive) — One node with 4+ drives. Erasure coding within the node.
Distributed (multi-node) — Multiple nodes, each with multiple drives. Erasure coding spans nodes for node-level fault tolerance.
# Single node, 4 drives
minio server /data1 /data2 /data3 /data4
# Distributed, 4 nodes × 4 drives each
minio server http://node{1...4}/data{1...4}
The distributed syntax ({1...4}) is MinIO-specific shorthand for specifying multiple nodes and drives.
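The expansion is purely mechanical; here is a small Python sketch of what the `{a...b}` shorthand unrolls to (illustrative only—this is not MinIO's actual parser):

```python
import itertools
import re

def expand(template: str) -> list[str]:
    """Expand MinIO-style {a...b} ranges, e.g. http://node{1...4}/data{1...4}.
    Multiple ranges expand as a cross product, with earlier ranges varying slowest."""
    # re.split with two capture groups yields: literal, lo, hi, literal, lo, hi, ..., literal
    parts = re.split(r"\{(\d+)\.\.\.(\d+)\}", template)
    literals = parts[0::3]
    ranges = [range(int(lo), int(hi) + 1) for lo, hi in zip(parts[1::3], parts[2::3])]
    results = []
    for combo in itertools.product(*ranges):
        pieces = [literals[0]]
        for number, literal in zip(combo, literals[1:]):
            pieces.append(str(number))
            pieces.append(literal)
        results.append("".join(pieces))
    return results

# 4 nodes x 4 drives -> 16 drive endpoints
endpoints = expand("http://node{1...4}/data{1...4}")
```

Each endpoint names one drive on one node; MinIO derives the erasure set layout from this list, which is why every server in the deployment must be started with the identical argument.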
Every MinIO node is a peer—there's no designated master. Any node can serve any request. This eliminates single points of failure and simplifies operations. Use a load balancer for distributing client traffic, but any node can handle any operation.
MinIO uses erasure coding (not replication) as its primary durability mechanism. This provides space efficiency comparable to RAID while tolerating multiple drive failures.
How Erasure Coding Works in MinIO:
When an object is written, MinIO divides it into data shards and computes parity shards using Reed-Solomon encoding:
Object: photo.jpg (8 MB)
Erasure set: 8 drives (default configuration)
Data shards: 4 (50% of drives)
Parity shards: 4 (50% of drives)

Distribution:
Drive 1: photo.jpg.part.1 (data shard 1, 2MB)
Drive 2: photo.jpg.part.2 (data shard 2, 2MB)
Drive 3: photo.jpg.part.3 (data shard 3, 2MB)
Drive 4: photo.jpg.part.4 (data shard 4, 2MB)
Drive 5: photo.jpg.part.5 (parity shard 1, 2MB)
Drive 6: photo.jpg.part.6 (parity shard 2, 2MB)
Drive 7: photo.jpg.part.7 (parity shard 3, 2MB)
Drive 8: photo.jpg.part.8 (parity shard 4, 2MB)

Storage used: 16 MB total (2x raw size)
Can tolerate: loss of any 4 drives
Reading: need any 4 shards (data or parity) to reconstruct

Erasure Set Sizing:
The number of drives per erasure set determines fault tolerance and storage efficiency:
| Drives | Data | Parity | Efficiency | Fault Tolerance |
|---|---|---|---|---|
| 4 | 2 | 2 | 50% | 2 drives |
| 8 | 4 | 4 | 50% | 4 drives |
| 16 | 8 | 8 | 50% | 8 drives |
| 16 | 12 | 4 | 75% | 4 drives |
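The table's arithmetic is simple enough to script. A sketch of the trade-off (the shard-size rounding is a simplification; MinIO's real accounting includes metadata overhead):

```python
from dataclasses import dataclass
from math import ceil

@dataclass
class ErasureConfig:
    data: int    # data shards per object
    parity: int  # parity shards per object

    @property
    def drives(self) -> int:
        return self.data + self.parity

    @property
    def efficiency(self) -> float:
        # usable fraction of raw capacity
        return self.data / self.drives

    @property
    def fault_tolerance(self) -> int:
        # any `parity` drives can fail without data loss
        return self.parity

    def raw_bytes(self, object_size: int) -> int:
        # each shard holds object_size / data bytes (rounded up),
        # and one shard lands on every drive in the set
        shard = ceil(object_size / self.data)
        return shard * self.drives

cfg = ErasureConfig(data=12, parity=4)   # the 16-drive, 4-parity row
print(cfg.efficiency)                    # 0.75
```

Raising parity buys fault tolerance at the cost of both capacity (lower efficiency) and write amplification (more raw bytes per object).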
MinIO automatically calculates erasure set size based on total drives. For example, 16 drives across 4 nodes creates one erasure set of 16 drives.
Inline vs. Multi-Part:
Small objects (default <128KB) are written inline—all shards in a single write. Larger objects use multi-part upload where each part is independently striped.
You cannot change erasure set configuration after deployment. Adding drives requires adding complete expansion pools, not individual drives. Plan your initial configuration carefully—mistakes require data migration to fix.
MinIO stores objects in a straightforward directory structure on each drive. Understanding this format helps with troubleshooting and recovery.
On-Disk Layout:
/data1/                      # Drive mount point
├── .minio.sys/              # System metadata
│   ├── config/              # Cluster configuration
│   ├── buckets/             # Bucket metadata
│   └── pool.bin             # Pool layout
├── bucket-name/             # User bucket
│   ├── object-name/         # Object directory
│   │   ├── xl.meta          # Object metadata
│   │   └── part.1           # Data/parity shard
│   └── prefix/
│       └── nested-object/
│           ├── xl.meta
│           └── part.1
└── another-bucket/
    └── ...

xl.meta File:
The xl.meta file stores the object's metadata; recent MinIO releases serialize it in a compact binary (msgpack) encoding, equivalent to this JSON:
{
  "version": "1.0.0",
  "checksum": {
    "algorithm": "highwayhash256S",
    "sum": "..."
  },
  "erasure": {
    "algorithm": "ReedSolomon",
    "data": 4,
    "parity": 4
  },
  "meta": {
    "content-type": "image/jpeg",
    "x-amz-meta-custom": "value"
  },
  "parts": [...]
}
Key Points:
Because objects are stored as regular files with self-describing metadata, disaster recovery is straightforward. If MinIO won't start, you can still browse the data directory and recover files manually. Compare this to Ceph's OSD format, which requires Ceph tools to access.
MinIO's primary interface is the S3 API. Compatibility is extensive—most S3 SDKs and tools work without modification.
| Category | Feature | Support |
|---|---|---|
| Basic Operations | GetObject, PutObject, DeleteObject | ✓ Full |
| Bucket Operations | Create, Delete, List, Policy | ✓ Full |
| Multipart Upload | Initiate, Upload, Complete, Abort | ✓ Full |
| Versioning | Bucket versioning, version listing | ✓ Full |
| Object Lock | Governance/Compliance mode, retention | ✓ Full |
| Lifecycle | Expiration, transition rules | ✓ Full |
| Server-Side Encryption | SSE-S3, SSE-KMS, SSE-C | ✓ Full |
| Notifications | Bucket events to Kafka, AMQP, Webhook | ✓ Full |
| Select API | S3 Select for CSV/JSON/Parquet | ✓ Full |
| Access Control | Bucket policies, IAM policies | ✓ Full |
| Presigned URLs | Temporary access URLs | ✓ Full |
Using Standard S3 Tools:
# AWS CLI with MinIO
aws --endpoint-url http://minio:9000 s3 ls
aws --endpoint-url http://minio:9000 s3 cp file.txt s3://bucket/
# Python boto3
import boto3
s3 = boto3.client(
    's3',
    endpoint_url='http://minio:9000',
    aws_access_key_id='minioadmin',
    aws_secret_access_key='minioadmin',
)
s3.upload_file('file.txt', 'bucket', 'file.txt')
# mc (MinIO Client - specialized tool)
mc alias set myminio http://minio:9000 minioadmin minioadmin
mc cp file.txt myminio/bucket/
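The presigned URLs from the compatibility table are nothing exotic: they are ordinary AWS Signature Version 4 query-string signatures, which is why MinIO can honor URLs generated by any S3 SDK. A stdlib-only sketch of the mechanism (deliberately simplified: unsigned payload, host-only signed headers, URL-safe key assumed; in practice use boto3's `generate_presigned_url`):

```python
import hashlib
import hmac
import urllib.parse
from datetime import datetime, timezone

def presign_get(endpoint, bucket, key, access_key, secret_key,
                expires=3600, region="us-east-1", now=None):
    """Build an S3 presigned GET URL via SigV4 query signing (stdlib only)."""
    now = now or datetime.now(timezone.utc)
    amz_date = now.strftime("%Y%m%dT%H%M%SZ")
    scope = f"{now:%Y%m%d}/{region}/s3/aws4_request"
    host = urllib.parse.urlparse(endpoint).netloc
    path = f"/{bucket}/{key}"

    params = {
        "X-Amz-Algorithm": "AWS4-HMAC-SHA256",
        "X-Amz-Credential": f"{access_key}/{scope}",
        "X-Amz-Date": amz_date,
        "X-Amz-Expires": str(expires),
        "X-Amz-SignedHeaders": "host",
    }
    canonical_query = "&".join(
        f"{urllib.parse.quote(k, safe='')}={urllib.parse.quote(v, safe='')}"
        for k, v in sorted(params.items())
    )
    # Canonical request: method, URI, query, headers (trailing \n), signed headers, payload hash
    canonical_request = "\n".join([
        "GET", path, canonical_query,
        f"host:{host}\n", "host", "UNSIGNED-PAYLOAD",
    ])
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256", amz_date, scope,
        hashlib.sha256(canonical_request.encode()).hexdigest(),
    ])
    # Derive the signing key: date -> region -> service -> "aws4_request"
    signing_key = f"AWS4{secret_key}".encode()
    for piece in (f"{now:%Y%m%d}", region, "s3", "aws4_request"):
        signing_key = hmac.new(signing_key, piece.encode(), hashlib.sha256).digest()
    signature = hmac.new(signing_key, string_to_sign.encode(), hashlib.sha256).hexdigest()
    return f"{endpoint}{path}?{canonical_query}&X-Amz-Signature={signature}"
```

Anyone holding the URL can fetch the object until the expiry passes, without ever seeing the secret key—the server recomputes the same HMAC chain and compares signatures.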
MinIO Client (mc):
While standard S3 tools work, MinIO provides mc—a feature-rich CLI for advanced operations:
# Mirror local directory to MinIO
mc mirror /local/data myminio/bucket

# Watch bucket for changes
mc watch myminio/bucket

# Find large objects
mc find myminio/bucket --larger-than 100MB

# Server-side copy between buckets
mc cp --recursive myminio/source/ myminio/dest/

# Diff between local and remote
mc diff /local/data myminio/bucket

# Admin operations
mc admin info myminio
mc admin heal myminio
mc admin update myminio

While MinIO targets 100% S3 compatibility, some edge cases differ. Pre-signed URLs may have different default expiration. Some newer S3 features may lag. Always test your specific use case, especially if migrating from AWS S3.
MinIO provides full S3-compatible versioning and object lock capabilities, essential for regulatory compliance and data protection.
Bucket Versioning:
When enabled, every write creates a new version instead of overwriting:
# Enable versioning
mc version enable myminio/bucket

# Upload creates version
mc cp file.txt myminio/bucket/   # Creates version V1

# Modify and re-upload
echo "modified" >> file.txt
mc cp file.txt myminio/bucket/   # Creates version V2

# Original still accessible
mc ls --versions myminio/bucket/file.txt
# file.txt (v2) 2024-01-15 10:30:00
# file.txt (v1) 2024-01-15 10:00:00

# Restore a specific version
mc cp --version-id v1 myminio/bucket/file.txt restored.txt

Object Lock (WORM):
Object Lock prevents objects from being deleted or overwritten, even by administrators:
| Mode | Description | Can Override |
|---|---|---|
| Governance | Protected but can be bypassed with special permission | Yes, with s3:BypassGovernanceRetention |
| Compliance | Cannot be bypassed by anyone, including root | No, wait for retention to expire |
Use Cases: regulatory record retention (e.g., financial or healthcare archives), tamper-proof audit logs, and ransomware-resistant backups.
# Create bucket with object lock enabled
mc mb --with-lock myminio/secure-bucket

# Set default retention (30 days governance)
mc retention set --default GOVERNANCE 30d myminio/secure-bucket

# Upload with specific retention
mc cp file.txt myminio/secure-bucket/ \
  --retention-mode compliance \
  --retention-until 2025-01-01

# Apply legal hold
mc legalhold set myminio/secure-bucket/file.txt

# Attempting delete fails
mc rm myminio/secure-bucket/file.txt
# ERROR: Object is locked

Once an object is in Compliance mode, NOTHING can delete it until retention expires—not even MinIO administrators with root credentials. Test thoroughly in development before applying in production. Misconfigured retention periods are extremely costly.
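The governance/compliance distinction boils down to a small decision rule. A hypothetical sketch (field names are illustrative, not MinIO's internal model):

```python
from datetime import date

def delete_allowed(mode: str, retain_until: date, today: date,
                   has_bypass_permission: bool = False,
                   legal_hold: bool = False) -> bool:
    """Can this object version be deleted right now?"""
    if legal_hold:
        return False                   # legal hold blocks deletion in any mode
    if today >= retain_until:
        return True                    # retention period has expired
    if mode == "GOVERNANCE":
        return has_bypass_permission   # requires s3:BypassGovernanceRetention
    return False                       # COMPLIANCE: nobody can delete; wait it out
```

Note the asymmetry: legal hold has no expiry date at all and must be explicitly released, whereas retention always runs out eventually.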
MinIO supports bucket-level replication between clusters for disaster recovery, geo-distribution, and data synchronization.
Replication Modes:
| Mode | Description | Use Case |
|---|---|---|
| Server-Side Replication | Automatic, near-real-time replication | DR, multi-datacenter |
| Active-Active | Bidirectional replication between sites | Global active-active |
| Active-Passive | One-way replication to standby site | Traditional DR |
| Batch Replication | Scheduled bulk sync | Initial seeding, catch-up |
Configuring Bucket Replication:
# Set up remote target
mc admin bucket remote add myminio/source-bucket \
  http://accesskey:secretkey@remote-minio:9000/target-bucket \
  --service replication

# Enable replication
mc replicate add myminio/source-bucket \
  --remote-bucket target-bucket \
  --replicate "delete,delete-marker,existing-objects"
# Check replication status
mc replicate status myminio/source-bucket
What Gets Replicated: object data and user metadata, tags, and—per the --replicate flags above—delete operations, delete markers, and objects that existed before replication was enabled.
MinIO exposes replication metrics via Prometheus. Monitor minio_cluster_replication_pending_bytes and minio_cluster_replication_failed_operations to catch replication issues before they cause data loss during site failures.
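The metrics named above arrive in Prometheus exposition format. A minimal stdlib parser sketch for spot checks (the sample values are made up; a real deployment would scrape MinIO's metrics endpoint with Prometheus itself):

```python
def parse_metrics(text: str) -> dict[str, float]:
    """Parse simple Prometheus exposition lines into {metric_name: value}.
    Ignores comments and labels for brevity (last labeled sample wins)."""
    out = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name_part, _, value = line.rpartition(" ")
        name = name_part.split("{", 1)[0]   # strip {label="..."} if present
        out[name] = float(value)
    return out

sample = """
# HELP minio_cluster_replication_pending_bytes Bytes awaiting replication
minio_cluster_replication_pending_bytes{server="node1"} 1048576
minio_cluster_replication_failed_operations{server="node1"} 3
"""

metrics = parse_metrics(sample)
if metrics.get("minio_cluster_replication_failed_operations", 0) > 0:
    print("replication falling behind; investigate before relying on failover")
```

Alerting on a sustained rise in pending bytes catches the dangerous case: a link that is up but slower than the write rate, silently widening the window of data that would be lost in a site failure.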
MinIO is designed for high performance on modern hardware. Understanding its performance characteristics helps optimize deployments.
Benchmark Results (MinIO published):
| Metric | Performance |
|---|---|
| GET (read) | 325 GiB/s aggregate |
| PUT (write) | 165 GiB/s aggregate |
| Single object | 10+ GB/s for large objects |
| Small objects | 100k+ objects/second |
Results from 32-node cluster with NVMe drives and 100GbE networking
Performance Factors: drive media (NVMe vs. SSD vs. HDD), inter-node network bandwidth, erasure set configuration (more parity means more computation and raw I/O per object), object size distribution, and client concurrency.
Small Object Optimization:
Small objects (<128KB by default) are handled specially: their data is stored inline within the xl.meta file alongside the metadata, so a read or write touches one file per drive instead of two, roughly halving metadata I/O for small-object workloads.
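A sketch of the threshold decision and its consequences (the 128 KiB cutoff is the default cited above; the I/O cost model is illustrative, not MinIO's internal accounting):

```python
INLINE_THRESHOLD = 128 * 1024  # default inline cutoff, in bytes

def write_strategy(object_size: int, data: int = 4, parity: int = 4) -> dict:
    """Rough model: inline objects piggyback on the per-drive metadata write;
    larger objects are striped into data+parity shard files."""
    if object_size < INLINE_THRESHOLD:
        # data embedded in xl.meta: one file written per drive in the set
        return {"mode": "inline", "files_per_drive": 1}
    shard = -(-object_size // data)  # ceil division
    return {
        "mode": "striped",
        "files_per_drive": 2,        # xl.meta plus a part file
        "shard_bytes": shard,
        "raw_bytes": shard * (data + parity),
    }
```

This is why object size distribution matters so much for sizing: a workload of millions of 10 KB objects is metadata-bound, while a workload of 1 GB objects is bandwidth-bound.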
Performance Tuning:
# Set GOMAXPROCS explicitly if needed (Go defaults to all available cores)
export GOMAXPROCS=
minio server ...
# Use multiple drives with high queue depth
minio server /mnt/disk{1...12}
# Run with explicit config and certificate directories
minio server --config-dir=/config --certs-dir=/certs /data{1...4}
Published benchmarks use optimal conditions (NVMe, 100GbE, large objects). Real-world performance depends on your hardware, object size distribution, and access patterns. Always benchmark with your actual workload before sizing production deployments.
MinIO is built for cloud-native environments, with first-class Kubernetes support through the MinIO Operator.
MinIO Operator:
The operator manages MinIO tenants (clusters) on Kubernetes:
apiVersion: minio.min.io/v2
kind: Tenant
metadata:
  name: minio-tenant
  namespace: minio
spec:
  image: minio/minio:latest
  pools:
    - servers: 4
      volumesPerServer: 4
      volumeClaimTemplate:
        spec:
          storageClassName: local-nvme
          resources:
            requests:
              storage: 1Ti
  requestAutoCert: true
  s3:
    bucketDNS: true
Key Kubernetes Features:
| Feature | Description |
|---|---|
| Operator | Manages tenant lifecycle, upgrades, configuration |
| Console | Web UI for administration, metrics visualization |
| CSI Driver | Use MinIO as PVC backend for other workloads |
| Auto-TLS | Automatic certificate generation and rotation |
| Prometheus Integration | Native metrics export for monitoring |
| Horizontal Scaling | Add pools via tenant spec updates |
| StatefulSet Per Pool | Predictable pod naming and storage binding |
# Install MinIO Operator
helm repo add minio-operator https://operator.min.io
helm install minio-operator minio-operator/operator

# Create tenant
kubectl apply -f tenant.yaml

# Access MinIO Console
kubectl port-forward svc/minio-tenant-console 9443:9443

# Get credentials
kubectl get secret minio-tenant-user-1 -o jsonpath='{.data.CONSOLE_ACCESS_KEY}' | base64 -d

For best performance, use local NVMe drives (local persistent volumes) rather than network storage. MinIO provides its own redundancy through erasure coding—you don't need the storage layer to be redundant too. Replicating already-erasure-coded data through a SAN is wasteful.
We've explored MinIO's approach to self-hosted object storage—achieving S3 compatibility and high performance through radical simplicity. The key concepts: a single Go binary with all nodes as equal peers; Reed-Solomon erasure coding as the sole durability mechanism; extensive S3 API coverage including versioning, object lock, and replication; and first-class Kubernetes support through the MinIO Operator.
What's Next:
We've now explored four major distributed storage systems: HDFS, Ceph, GlusterFS, and MinIO. Each has distinct strengths and ideal use cases. In the final page of this module, we'll synthesize this knowledge with a framework for choosing distributed storage—helping you select the right system for your specific requirements.
You now understand MinIO's architecture at a level sufficient for designing and deploying self-hosted object storage solutions. You can explain erasure coding trade-offs, S3 compatibility, and operational characteristics.