In 2018, security researchers discovered that Tesla's Kubernetes dashboard was exposed to the internet without authentication. Attackers had been using Tesla's cloud infrastructure to mine cryptocurrency. The vulnerability wasn't in Kubernetes itself; it was an access control misconfiguration. This incident illustrates a recurring lesson: identity and access management (IAM) is the most critical security control in cloud computing.
Every interaction with cloud resources—every API call, every service invocation, every data access—is mediated by IAM. A single overly permissive policy can expose your entire infrastructure. A single misconfigured role can give attackers the keys to everything. Conversely, a well-designed IAM architecture creates defense in depth that can contain breaches and limit blast radius.
IAM is simultaneously the most powerful security tool and the most common source of security failures in cloud environments. Mastering it is non-negotiable for any cloud architect.
By the end of this page, you will understand IAM architectures across major cloud providers, the principle of least privilege and how to implement it, identity federation and single sign-on, service identities and machine-to-machine authentication, and common IAM anti-patterns that lead to breaches.
Before diving into cloud-specific implementations, we must establish clear definitions of the two distinct functions IAM provides:
Authentication: Proving who (or what) you are
Authorization: Determining what you can do
These two functions are often conflated, but they're fundamentally different. A strong password doesn't matter if the authenticated user has permission to do things they shouldn't. Precise permissions don't matter if anyone can claim any identity.
| Aspect | Authentication | Authorization |
|---|---|---|
| Question | Who are you? | What can you do? |
| Mechanism | Credentials, MFA, certificates | Policies, roles, permissions |
| Failure mode | Impersonation, credential theft | Privilege escalation, over-permission |
| Point in flow | Before authorization | After authentication |
| Cloud examples | Login, assume role, federated identity | IAM policies, resource policies, ACLs |
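The split summarized in the table can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the user store, the permission table, and all function names are hypothetical, and the password handling is deliberately simplified (a real system would use a salted, slow password hash plus MFA).

```python
import hashlib
import hmac

# Hypothetical in-memory stores, for illustration only.
USERS = {"alice": hashlib.sha256(b"correct horse").hexdigest()}
PERMISSIONS = {"alice": {"reports:read"}}


def authenticate(username: str, password: str) -> bool:
    """Authentication: is the caller who they claim to be?"""
    expected = USERS.get(username)
    if expected is None:
        return False
    supplied = hashlib.sha256(password.encode()).hexdigest()
    # Constant-time comparison avoids leaking how much of the digest matched.
    return hmac.compare_digest(expected, supplied)


def authorize(username: str, action: str) -> bool:
    """Authorization: may this (already authenticated) identity do this?"""
    return action in PERMISSIONS.get(username, set())


def handle_request(username: str, password: str, action: str) -> str:
    if not authenticate(username, password):
        return "401 unauthenticated"
    if not authorize(username, action):
        return "403 forbidden"
    return "200 ok"
```

Note that a correct password alone never produces `200 ok`; the action must also be in the identity's permission set, which is the point of keeping the two checks separate.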
Identity Types in Cloud Environments:
Cloud IAM systems must handle multiple types of identities, and each type requires different handling. Human users need MFA and password policies. Service accounts need tight scoping and rotation. Federated identities need trust relationships and attribute mapping. A mature IAM architecture addresses all of these.
Every cloud account has a 'root' or 'owner' account with unlimited privileges. This account should be protected with MFA, used only for initial setup and emergency recovery, and never for day-to-day operations. Many breaches escalate because root credentials were compromised or left exposed.
The Principle of Least Privilege (PoLP) is the foundational concept of IAM security: grant only the minimum permissions necessary to perform a task, and nothing more. This principle is simple to state but challenging to implement—it requires ongoing effort, tooling, and organizational commitment.
Why Least Privilege Matters:
Limits Blast Radius — If credentials are compromised, attackers can only do what those credentials allow. An over-privileged service account with admin access gives attackers everything. A narrowly-scoped account limits damage.
Prevents Lateral Movement — Attackers who gain initial access try to move laterally to higher-value targets. Least privilege makes each hop harder, creating multiple opportunities to detect and stop intrusions.
Reduces Accidental Damage — Overly broad permissions enable costly mistakes. A developer with production delete permissions can accidentally destroy data. Least privilege protects against human error.
Simplifies Auditing — Narrow permissions create clear audit trails. When each identity has specific, documented permissions, anomalies stand out. Broad permissions make normal and malicious activity indistinguishable.
Avoid wildcards: instead of * for resources, specify exact resources or use tagging-based conditions to limit scope. Instead of s3:*, grant s3:GetObject.
The Permission Creep Problem:
In practice, permissions tend to expand over time, a phenomenon called permission creep.
Without active management, identities accumulate far more permissions than they need. Organizations must counter this with ongoing effort and tooling: review granted permissions regularly and remove those that are no longer used.
A practical approach: when defining permissions for a new service or role, write down exactly what operations it needs to perform. Then grant those specific permissions. If the service fails with 'access denied,' add only the specific missing permission—don't broaden to fix problems.
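One way to keep "broadening to fix problems" out of a codebase is to lint policies for wildcards before they are deployed. The following is a minimal sketch, assuming policies are available as parsed JSON in the AWS-style `Statement` / `Effect` / `Action` / `Resource` shape; the function names are hypothetical and the checks are intentionally simple.

```python
from typing import Any


def as_list(value: Any) -> list:
    """Policy fields may hold a single value or a list of values."""
    return value if isinstance(value, list) else [value]


def find_wildcards(policy: dict) -> list[str]:
    """Return human-readable findings for overly broad Allow statements."""
    findings = []
    for i, stmt in enumerate(as_list(policy.get("Statement", []))):
        if stmt.get("Effect") != "Allow":
            continue
        label = stmt.get("Sid", f"statement {i}")
        for action in as_list(stmt.get("Action", [])):
            # Flags "*" as well as service-wide grants such as "s3:*".
            if action == "*" or action.endswith(":*"):
                findings.append(f"{label}: broad action {action!r}")
        for resource in as_list(stmt.get("Resource", [])):
            if resource == "*":
                findings.append(f"{label}: resource '*'")
    return findings
```

A check like this could run in CI so that a wildcard grant requires an explicit, reviewed exception rather than slipping in as a quick fix for an "access denied" error.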
Amazon Web Services implements one of the most comprehensive IAM systems in cloud computing. Understanding AWS IAM in depth provides a model that transfers to other clouds, as the concepts are universal even if terminology differs.
Core AWS IAM Components:
AWS IAM Policy Structure:
AWS policies use a JSON structure that, once understood, applies to nearly all AWS authorization decisions:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowS3Read",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::my-bucket",
        "arn:aws:s3:::my-bucket/*"
      ],
      "Condition": {
        "IpAddress": {
          "aws:SourceIp": "192.168.1.0/24"
        }
      }
    }
  ]
}
Policy Elements Explained:
| Element | Purpose | Best Practice |
|---|---|---|
| Effect | Allow or Deny | Explicit denies override allows |
| Action | API operations | Specify exact actions, avoid wildcards |
| Resource | Target ARNs | Use specific ARNs or tag conditions |
| Condition | Contextual constraints | Add IP, time, MFA, tag conditions |
| Principal | Who the policy applies to | Used in resource policies |
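To make the Condition element concrete, here is a minimal Python sketch of how an `IpAddress` condition on `aws:SourceIp` could be checked against a request's source address. It is an illustration of the idea only: the function name is hypothetical, it handles just this one operator and key, and it is not how AWS implements condition evaluation.

```python
import ipaddress


def ip_condition_matches(condition: dict, source_ip: str) -> bool:
    """Evaluate a simplified IpAddress condition block against a request IP."""
    block = condition.get("IpAddress", {})
    cidrs = block.get("aws:SourceIp")
    if cidrs is None:
        return True  # No IP constraint present in this sketch.
    if isinstance(cidrs, str):
        cidrs = [cidrs]
    addr = ipaddress.ip_address(source_ip)
    # The condition holds if the address falls inside any listed CIDR range.
    return any(addr in ipaddress.ip_network(cidr) for cidr in cidrs)
```

With the condition from the example policy, a request from `192.168.1.42` would satisfy the constraint, while one from `10.0.0.1` would not.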
Policy Evaluation Logic:
AWS evaluates every policy that applies to a request, and an explicit Deny in any of them overrides any Allow.
If no policy explicitly allows an action, it's denied by default. This default-deny approach means permissions must be explicitly granted.
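The two rules that matter most (explicit deny wins, and everything else is denied unless allowed) can be sketched as follows. This is a deliberately reduced model: it takes a flat list of statements and ignores conditions, resource policies, permissions boundaries, session policies, and organization-level controls, all of which take part in real AWS evaluation. The function names are hypothetical, and `fnmatchcase` is only a rough stand-in for IAM's wildcard matching.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    return value if isinstance(value, list) else [value]


def _matches(patterns, value: str) -> bool:
    # Simple "*" / "?" wildcard matching against each pattern.
    return any(fnmatchcase(value, pattern) for pattern in _as_list(patterns))


def evaluate(statements: list[dict], action: str, resource: str) -> str:
    """Return 'Allow' or 'Deny' for one request against a list of statements."""
    allowed = False
    for stmt in statements:
        applies = _matches(stmt["Action"], action) and _matches(stmt["Resource"], resource)
        if not applies:
            continue
        if stmt["Effect"] == "Deny":
            return "Deny"  # An explicit deny always wins.
        if stmt["Effect"] == "Allow":
            allowed = True
    # Default deny: without a matching Allow, the request is rejected.
    return "Allow" if allowed else "Deny"
```

The early return on `Deny` is what lets a narrow deny statement carve an exception out of a broader allow, regardless of statement order.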
For EC2 instances, Lambda functions, and other AWS services, always use IAM roles rather than IAM user credentials. Roles provide temporary credentials that rotate automatically, eliminating the risk of long-lived access keys being exposed or forgotten in code.
While AWS IAM provides a comprehensive reference model, Azure (Microsoft Entra ID, formerly Azure Active Directory, together with Azure RBAC) and Google Cloud IAM implement similar concepts with different abstractions. Understanding these differences is essential for multi-cloud and hybrid architectures.
| Concept | AWS | Azure | GCP |
|---|---|---|---|
| Machine identity | IAM Role | Service Principal / Managed Identity | Service Account |
| Human identity | IAM User | Entra ID User | Cloud Identity User |
| Permission grouping | IAM Group, Role | Entra ID Group, RBAC Role | Group, Role |
| Policy attachment | Identity or Resource | Scope (hierarchical) | Resource (hierarchical) |
| Federation | SAML, OIDC Providers | B2B, B2C, SAML | Workforce Identity Federation |
| Cross-account access | Assume Role | Lighthouse, B2B | Cross-project IAM |
Key Architectural Differences:
AWS: Policy-centric. Permissions are defined in JSON policy documents attached to identities or resources. Very flexible but can become complex with many policies interacting.
Azure: RBAC-centric. Built around predefined roles assigned at scopes. Integrates deeply with Entra ID (formerly Azure AD), making it natural for organizations already using Microsoft identity.
GCP: Resource-hierarchy-centric. Permissions inherit down the organization → folder → project → resource hierarchy. Policies are simple bindings of members to roles at each level.
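The inheritance model described for GCP can be illustrated with a small Python sketch. The hierarchy, the bindings, and the member names below are all hypothetical, and the code models only downward inheritance of role bindings; it is not a representation of the actual Google Cloud API.

```python
# Hypothetical hierarchy: each node maps to its parent (None for the root).
PARENT = {
    "org/acme": None,
    "folder/engineering": "org/acme",
    "project/web-prod": "folder/engineering",
}

# Hypothetical bindings: node -> role -> set of members.
BINDINGS = {
    "org/acme": {"roles/viewer": {"group:auditors"}},
    "folder/engineering": {"roles/editor": {"group:engineers"}},
    "project/web-prod": {"roles/owner": {"user:lead"}},
}


def effective_roles(node: str, member: str) -> set[str]:
    """Collect roles granted on the node itself and on every ancestor."""
    roles = set()
    current = node
    while current is not None:
        for role, members in BINDINGS.get(current, {}).items():
            if member in members:
                roles.add(role)
        current = PARENT[current]  # Walk up toward the organization root.
    return roles
```

The practical consequence is that a binding made high in the hierarchy reaches everything beneath it, so broad roles at the organization or folder level deserve the most scrutiny.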
Multi-Cloud IAM Strategies:
When operating across clouds, organizations typically:
Federate from a central IdP — Use one identity provider (corporate AD, Okta, etc.) and federate to all clouds. Users authenticate once and access any cloud.
Standardize role definitions — Create equivalent roles across clouds: 'Developer' in AWS ≈ 'Contributor' in Azure ≈ custom role in GCP with similar permissions.
Use cloud-native tools for each — Accept that tools differ but apply consistent principles: least privilege, MFA, regular review.
Centralize monitoring — Aggregate IAM audit logs from all clouds to detect suspicious patterns.
Identity Federation allows users to authenticate with one system (the Identity Provider, or IdP) and access resources in another system (the Service Provider, or SP) without maintaining separate credentials. For cloud security, federation is essential—it keeps identity management centralized while enabling access to distributed cloud resources.
Why Federation Matters:
Federation Architecture:
┌─────────────────┐        ┌────────────────────┐        ┌─────────────────┐
│                 │        │                    │        │                 │
│  User Browser   │ ──1──▸ │   Cloud Console    │ ──2──▸ │  Corporate IdP  │
│                 │        │ (Service Provider) │        │  (e.g., Okta)   │
│                 │ ◂──5── │                    │ ◂──4── │                 │
│                 │        │                    │        │                 │
└─────────────────┘        └────────────────────┘        └─────────────────┘
                                                                  │
                                                                  │ 3
                                                                  ▼
                                                         ┌─────────────────┐
                                                         │ Corporate AD /  │
                                                         │ User Directory  │
                                                         └─────────────────┘
Flow:
1. User accesses cloud console
2. Console redirects to IdP for authentication
3. IdP authenticates user against corporate directory
4. IdP returns signed assertion (SAML) or ID token (OIDC)
5. Console grants access based on assertion/token claims
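Steps 4 and 5 of the flow above hinge on the service provider being able to verify that the assertion really came from the IdP and has not been altered. The sketch below shows that issue-and-verify shape in Python. It makes a significant simplifying assumption: it signs with a shared-secret HMAC from the standard library, whereas real SAML assertions and OIDC ID tokens are signed with the IdP's private key and verified with its public key. The token format and all names are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

# Assumption for this sketch only: IdP and service provider share one key.
SHARED_KEY = b"demo-only-key"


def issue_assertion(subject: str, groups: list[str], ttl_seconds: int = 300) -> str:
    """IdP side (step 4): return '<payload>.<signature>' for a user."""
    claims = {"sub": subject, "groups": groups, "exp": int(time.time()) + ttl_seconds}
    payload = base64.urlsafe_b64encode(json.dumps(claims).encode())
    signature = hmac.new(SHARED_KEY, payload, hashlib.sha256).hexdigest()
    return f"{payload.decode()}.{signature}"


def verify_assertion(token: str) -> Optional[dict]:
    """Service provider side (step 5): return the claims if valid, else None."""
    try:
        payload, signature = token.rsplit(".", 1)
    except ValueError:
        return None  # Not in '<payload>.<signature>' form.
    expected = hmac.new(SHARED_KEY, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, signature):
        return None  # Tampered with, or signed with a different key.
    claims = json.loads(base64.urlsafe_b64decode(payload))
    if claims["exp"] < time.time():
        return None  # Expired assertion.
    return claims
```

The signature check comes before the payload is parsed, so claims are never trusted (or even decoded) unless they are authentic.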
Attribute Mapping:
The IdP includes attributes (claims) in the assertion that the cloud provider uses for authorization:
| IdP Attribute | Cloud Mapping | Authorization Use |
|---|---|---|
| User identifier | Unique user identity | |
| Groups | IAM roles/groups | Permission assignment |
| Department | Tags/labels | Cost allocation, policies |
| Manager | Audit trails | Access approval workflows |
| MFA status | Session conditions | Require MFA for sensitive ops |
Proper attribute mapping is critical—it determines what roles users assume and what resources they can access.
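As a minimal sketch of such a mapping, the Python below turns group claims into role identifiers and gates sensitive operations on an MFA claim. The group names, the role ARNs (which use the placeholder account ID 123456789012), the `mfa` claim name, and both function names are hypothetical; real providers each define their own claim names and mapping configuration.

```python
# Hypothetical mapping from IdP group names to cloud role identifiers.
GROUP_TO_ROLE = {
    "cloud-admins": "arn:aws:iam::123456789012:role/Admin",
    "developers": "arn:aws:iam::123456789012:role/Developer",
    "auditors": "arn:aws:iam::123456789012:role/ReadOnly",
}


def roles_for_claims(claims: dict) -> list[str]:
    """Map the assertion's group claims to roles; unknown groups grant nothing."""
    roles = {GROUP_TO_ROLE[g] for g in claims.get("groups", []) if g in GROUP_TO_ROLE}
    return sorted(roles)


def session_allows_sensitive_ops(claims: dict) -> bool:
    """Only sessions whose assertion reports MFA may do sensitive operations."""
    return claims.get("mfa") is True
```

Treating unmapped groups as granting nothing keeps the mapping default-deny: a new group in the directory has no cloud access until someone deliberately adds it.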
With SCIM (System for Cross-domain Identity Management), user accounts can be automatically provisioned and deprovisioned in cloud systems as they're added or removed from the IdP. This eliminates manual account management and ensures immediate offboarding.
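The core of automated provisioning is a reconciliation step: compare who exists in the IdP with who exists in the cloud system, then create or deactivate accounts to close the gap. The Python sketch below shows only that comparison, with a hypothetical function name and plain usernames; it does not implement the SCIM protocol itself, which carries these changes as HTTP requests.

```python
def plan_provisioning(idp_users: set[str], cloud_users: set[str]) -> dict[str, list[str]]:
    """Compute which accounts to create and which to deactivate."""
    return {
        # Present in the IdP but missing in the cloud system: provision.
        "create": sorted(idp_users - cloud_users),
        # Present in the cloud system but gone from the IdP: offboard.
        "deactivate": sorted(cloud_users - idp_users),
    }
```

For example, with IdP users `{"alice", "bob"}` and cloud users `{"bob", "carol"}`, the plan creates `alice` and deactivates `carol`.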
While human IAM gets significant attention, service identities—the machine-to-machine authentication mechanisms—are equally critical. In modern cloud architectures, the majority of API calls come from automated systems, not humans. Securing these non-human identities requires specific patterns.
The Service Identity Challenge:
Unlike humans, machines can't:
But machines also have advantages:
AWS: Instance Profiles and IAM Roles
┌────────────────────────────────────────────┐
│                EC2 Instance                │
│                                            │
│ ┌────────────────────────────────────────┐ │
│ │ Application requests credentials from  │ │
│ │ Instance Metadata Service (IMDS):      │ │
│ │ http://169.254.169.254/...             │ │
│ └────────────────────────────────────────┘ │
│                     │                      │
│                     ▼                      │
│ ┌────────────────────────────────────────┐ │
│ │ IMDS returns temporary credentials:    │ │
│ │ - Access Key ID                        │ │
│ │ - Secret Access Key                    │ │
│ │ - Session Token                        │ │
│ │ - Expiration (typically 6 hours)       │ │
│ └────────────────────────────────────────┘ │
└────────────────────────────────────────────┘
                      │
                      ▼
┌────────────────────────────────────────────┐
│      AWS STS (Security Token Service)      │
│   Manages temporary credential issuance    │
│       based on IAM role trust policy       │
└────────────────────────────────────────────┘
Security Best Practices for Service Identities:
| Practice | Rationale | Implementation |
|---|---|---|
| Use instance/managed identities | No credentials in code or config | Attach roles to compute resources |
| Implement IMDSv2 | Prevents SSRF credential theft | Require session tokens for metadata |
| Rotate static credentials | Limits window of compromise | Automate rotation via secrets manager |
| Least privilege scoping | Limits blast radius | Grant only required permissions |
| Audit credential usage | Detect anomalies | CloudTrail, Cloud Audit Logs |
| Use condition keys | Enforce context requirements | Source IP, VPC, encryption context |
Server-Side Request Forgery (SSRF) attacks can steal instance credentials by making the server request its own metadata service. The Capital One breach exploited this. Mitigate by using IMDSv2, implementing WAF rules, and never trusting user-provided URLs for server-side requests.
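IMDSv2 helps because credentials can no longer be fetched with a single plain GET: the caller must first obtain a session token with a PUT request carrying a custom header, which a typical SSRF bug that only controls a URL cannot produce. The Python sketch below builds the two requests using only the standard library. The helper names are hypothetical; the paths and header names are the documented IMDSv2 ones, and `fetch_text` can only succeed when run on an EC2 instance.

```python
import urllib.request

IMDS_BASE = "http://169.254.169.254"
TOKEN_PATH = "/latest/api/token"
ROLE_PATH = "/latest/meta-data/iam/security-credentials/"


def build_token_request(ttl_seconds: int = 21600) -> urllib.request.Request:
    """Step 1: a PUT request that asks IMDSv2 for a session token."""
    return urllib.request.Request(
        IMDS_BASE + TOKEN_PATH,
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def build_credentials_request(token: str, role_name: str = "") -> urllib.request.Request:
    """Step 2: a GET request that presents the token to read role credentials."""
    return urllib.request.Request(
        IMDS_BASE + ROLE_PATH + role_name,
        method="GET",
        headers={"X-aws-ec2-metadata-token": token},
    )


def fetch_text(request: urllib.request.Request, timeout: float = 2.0) -> str:
    """Send a request and return the body; works only from inside an instance."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode()
```

On an instance, `fetch_text(build_token_request())` would return the token to pass to `build_credentials_request`. In practice the AWS SDKs perform this exchange automatically, so application code rarely needs to call the metadata service directly.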
Understanding what NOT to do is as important as knowing best practices. These anti-patterns appear repeatedly in breach reports and audit findings.
"Action": "*" or "Resource": "*" grants far more than intended. Attackers dream of finding these overly permissive policies.Real-World Breach Case Studies:
Case 1: Uber (2016) Developers committed AWS credentials to a private GitHub repository. Attackers who gained access to that repository found the credentials and used them to reach an S3 bucket containing records on 57 million riders and drivers. The fix: never commit credentials, use instance roles, scan repos for secrets.
Case 2: Capital One (2019) A misconfigured WAF role had excessive permissions. An SSRF vulnerability allowed an attacker to steal instance credentials via the metadata service. Those credentials had access to S3 buckets with sensitive data. The fix: least privilege, IMDSv2, validation that roles can't access data they shouldn't.
Case 3: Twitch (2021) An exposed server configuration allowed access to internal Git repositories and AWS credentials. Overly broad internal access meant one exposure led to complete code and data exfiltration. The fix: network segmentation, credential scoping, zero trust internal architecture.
Common Themes:
Implement automated scanning for credentials in code repos, container images, and configuration files. Tools like git-secrets, trufflehog, and cloud provider secret scanners can catch exposed credentials before attackers do. This is not optional—it's essential.
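To show what such a scanner looks for, here is a minimal Python sketch that flags strings shaped like AWS access key IDs. It is far less thorough than the dedicated tools named above: it checks one credential format with one regular expression, the function name is hypothetical, and a match is only a suspicion that needs human or automated follow-up.

```python
import re

# Long-term AWS access key IDs start with "AKIA" and temporary (STS) ones
# with "ASIA", followed by 16 uppercase letters or digits.
ACCESS_KEY_ID = re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b")


def scan_text(text: str) -> list[tuple[int, str]]:
    """Return (line_number, matched_key_id) pairs for suspected key IDs."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in ACCESS_KEY_ID.finditer(line):
            hits.append((line_number, match.group()))
    return hits
```

Running a check like this as a pre-commit hook stops a credential before it enters history, which matters because removing a secret from a later commit does not remove it from earlier ones.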
Identity and Access Management is the cornerstone of cloud security—the mechanism through which every action is authenticated and authorized. Getting IAM right prevents breaches; getting it wrong enables them.
What's Next:
With IAM controlling who can access cloud resources, we next examine network security—the controls that determine what traffic can reach those resources in the first place. VPCs, security groups, network ACLs, and related controls form the network perimeter that complements IAM's identity perimeter.
You now understand cloud Identity and Access Management deeply—from fundamentals through provider-specific implementations to federation and service identities. This knowledge is essential for designing secure cloud architectures that protect resources while enabling legitimate access.