Implementing Azure RBAC for AKS Cluster Access Control
Hands-on proof of concept demonstrating Azure AD integration with AKS for granular RBAC, including role assignments, permission validation, and troubleshooting steps.
Overview
In this hands-on proof of concept, I’ll walk you through implementing Azure Role-Based Access Control (RBAC) for Azure Kubernetes Service (AKS) cluster access. We’ll integrate Azure Active Directory (Azure AD) for authentication and authorization, giving you centralized identity management, granular permissions, and proper audit trails—all the things you need for production-ready cluster access.
What You’ll Learn:
- Enable Azure AD integration for AKS
- Assign Azure RBAC roles to users and groups
- Validate permissions using kubectl auth can-i
- Implement least-privilege access patterns
- Troubleshoot common RBAC issues (and believe me, there are a few)
Prerequisites:
- Azure subscription with permissions to create AKS clusters
- Azure AD with user/group management permissions
- Azure CLI 2.50+ installed
- kubectl 1.28+ installed
- Terraform 1.5+ (optional, for IaC approach)
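The version floors above can be checked mechanically before you start. This is a small sketch; the helper name `version_ge` is my own, and the commented `az version` query assumes the CLI's standard JSON output (it prints an "azure-cli" key):

```shell
# version_ge A B: succeeds when dot-separated version A is >= version B.
# Relies on `sort -V` (version sort), available in GNU coreutils and modern BSD sort.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# Example (uncomment once the Azure CLI is installed):
# version_ge "$(az version --query '"azure-cli"' -o tsv)" 2.50 \
#   || echo "Azure CLI is older than 2.50 - upgrade before continuing"
```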
Architecture Overview
graph TD
subgraph AAD["Azure Active Directory"]
Users["Users"]
Groups["Groups"]
SP["Service
Principals"]
end
AAD -->|"OIDC Authentication"| API
subgraph AKS["AKS Cluster"]
API["Kubernetes API Server"]
subgraph RBAC["Azure RBAC Authorization"]
BuiltIn["Built-in Roles
• Cluster Admin
• Cluster User
• Reader"]
Custom["Custom Roles
• Namespace Developer
• Read-Only"]
end
API --> RBAC
RBAC --> Resources["Kubernetes Resources
• Pods
• Deployments
• Services
• Namespaces"]
end
Users -.->|"kubectl"| API
Groups -.->|"kubectl"| API
SP -.->|"API calls"| API
style AAD fill:#fff3cd
style AKS fill:#e1f5ff
style RBAC fill:#d4edda
style Resources fill:#f8d7da
Part 1: Cluster Setup with Azure AD Integration
Option 1: Azure CLI
# Set variables
RESOURCE_GROUP="rbac-poc-rg"
CLUSTER_NAME="rbac-aks-cluster"
LOCATION="eastus"
# Create resource group
az group create \
--name $RESOURCE_GROUP \
--location $LOCATION
# Create AKS cluster with Azure AD and Azure RBAC
az aks create \
--resource-group $RESOURCE_GROUP \
--name $CLUSTER_NAME \
--kubernetes-version 1.28.5 \
--node-count 2 \
--node-vm-size Standard_D2s_v5 \
--enable-aad \
--enable-azure-rbac \
--generate-ssh-keys
# Get cluster credentials (admin access for setup)
az aks get-credentials \
--resource-group $RESOURCE_GROUP \
--name $CLUSTER_NAME \
--admin # Use --admin for initial setup only
# Verify cluster access
kubectl get nodes
Option 2: Terraform (Recommended for Production)
# terraform/main.tf
terraform {
required_version = ">= 1.5.0"
required_providers {
azurerm = {
source = "hashicorp/azurerm"
version = "~> 3.80"
}
azuread = {
source = "hashicorp/azuread"
version = "~> 2.45"
}
}
}
provider "azurerm" {
features {}
}
provider "azuread" {}
# Data sources
data "azurerm_client_config" "current" {}
data "azuread_client_config" "current" {}
# Resource group
resource "azurerm_resource_group" "aks" {
name = "rbac-poc-rg"
location = "eastus"
tags = {
Environment = "POC"
Purpose = "RBAC-Demo"
}
}
# AKS cluster with Azure AD and Azure RBAC
resource "azurerm_kubernetes_cluster" "aks" {
name = "rbac-aks-cluster"
location = azurerm_resource_group.aks.location
resource_group_name = azurerm_resource_group.aks.name
dns_prefix = "rbac-aks"
kubernetes_version = "1.28.5"
default_node_pool {
name = "default"
node_count = 2
vm_size = "Standard_D2s_v5"
upgrade_settings {
max_surge = "33%"
}
}
identity {
type = "SystemAssigned"
}
# Azure AD integration
azure_active_directory_role_based_access_control {
managed = true
azure_rbac_enabled = true
tenant_id = data.azurerm_client_config.current.tenant_id
}
network_profile {
network_plugin = "azure"
network_policy = "calico"
}
tags = {
Environment = "POC"
}
}
# Outputs
output "cluster_id" {
value = azurerm_kubernetes_cluster.aks.id
}
output "cluster_fqdn" {
value = azurerm_kubernetes_cluster.aks.fqdn
}
output "kube_config_command" {
value = "az aks get-credentials --resource-group ${azurerm_resource_group.aks.name} --name ${azurerm_kubernetes_cluster.aks.name}"
}
Apply Terraform configuration:
cd terraform
terraform init
terraform plan
terraform apply -auto-approve
# Get cluster credentials
eval $(terraform output -raw kube_config_command)
Part 2: Azure AD User and Group Setup
Create Azure AD Groups
Here’s where we set up the identity foundation. I always recommend using groups instead of individual user assignments—it makes life so much easier as your team grows.
# Create Azure AD groups for different access levels
az ad group create \
--display-name "AKS-Cluster-Admins" \
--mail-nickname "aks-cluster-admins" \
--description "Full administrative access to AKS cluster"
az ad group create \
--display-name "AKS-Developers" \
--mail-nickname "aks-developers" \
--description "Developer access to development namespaces"
az ad group create \
--display-name "AKS-Viewers" \
--mail-nickname "aks-viewers" \
--description "Read-only access to AKS cluster"
# Get group object IDs
ADMIN_GROUP_ID=$(az ad group show \
--group "AKS-Cluster-Admins" \
--query id -o tsv)
DEVELOPER_GROUP_ID=$(az ad group show \
--group "AKS-Developers" \
--query id -o tsv)
VIEWER_GROUP_ID=$(az ad group show \
--group "AKS-Viewers" \
--query id -o tsv)
echo "Admin Group ID: $ADMIN_GROUP_ID"
echo "Developer Group ID: $DEVELOPER_GROUP_ID"
echo "Viewer Group ID: $VIEWER_GROUP_ID"
Terraform Implementation
# terraform/aad-groups.tf
resource "azuread_group" "aks_admins" {
display_name = "AKS-Cluster-Admins"
mail_nickname = "aks-cluster-admins"
security_enabled = true
description = "Full administrative access to AKS cluster"
}
resource "azuread_group" "aks_developers" {
display_name = "AKS-Developers"
mail_nickname = "aks-developers"
security_enabled = true
description = "Developer access to development namespaces"
}
resource "azuread_group" "aks_viewers" {
display_name = "AKS-Viewers"
mail_nickname = "aks-viewers"
security_enabled = true
description = "Read-only access to AKS cluster"
}
output "admin_group_id" {
value = azuread_group.aks_admins.object_id
}
output "developer_group_id" {
value = azuread_group.aks_developers.object_id
}
output "viewer_group_id" {
value = azuread_group.aks_viewers.object_id
}
Add Test Users to Groups
# Get current user's object ID
USER_ID=$(az ad signed-in-user show --query id -o tsv)
# Add yourself to admin group for testing
az ad group member add \
--group "AKS-Cluster-Admins" \
--member-id $USER_ID
# Create test users (optional)
az ad user create \
--display-name "Jane Developer" \
--user-principal-name jane.developer@yourdomain.onmicrosoft.com \
--password "TempPassword123!" \
--force-change-password-next-sign-in true
JANE_ID=$(az ad user show \
--id jane.developer@yourdomain.onmicrosoft.com \
--query id -o tsv)
az ad group member add \
--group "AKS-Developers" \
--member-id $JANE_ID
Part 3: Azure RBAC Role Assignments
Built-in AKS Roles
Let’s talk about the roles Azure gives you out of the box. There are four built-in roles for AKS, and they cover most use cases:
| Role | Scope | Permissions |
|---|---|---|
| Azure Kubernetes Service RBAC Cluster Admin | Cluster-wide | Full access to all resources |
| Azure Kubernetes Service RBAC Admin | Namespace or cluster | Admin access, can manage roles |
| Azure Kubernetes Service RBAC Writer | Namespace or cluster | Read/write access to most resources |
| Azure Kubernetes Service RBAC Reader | Namespace or cluster | Read-only access |
Assign Cluster-Wide Roles
# Get AKS cluster resource ID
CLUSTER_ID=$(az aks show \
--resource-group $RESOURCE_GROUP \
--name $CLUSTER_NAME \
--query id -o tsv)
# Assign cluster admin role to admin group
az role assignment create \
--role "Azure Kubernetes Service RBAC Cluster Admin" \
--assignee $ADMIN_GROUP_ID \
--scope $CLUSTER_ID
# Assign reader role to viewer group
az role assignment create \
--role "Azure Kubernetes Service RBAC Reader" \
--assignee $VIEWER_GROUP_ID \
--scope $CLUSTER_ID
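The two assignments above repeat the same command with different arguments. One way I like to keep group/role pairs in a single table is a small loop; `plan_assignments` is a sketch of mine, not an az feature:

```shell
# plan_assignments: reads "group|role" lines and prints what would be
# assigned; swap the echo for the commented az call to apply for real.
plan_assignments() {
  while IFS='|' read -r group role; do
    [ -z "$group" ] && continue
    echo "assign '$role' to group '$group'"
    # az role assignment create --role "$role" \
    #   --assignee "$(az ad group show --group "$group" --query id -o tsv)" \
    #   --scope "$CLUSTER_ID"
  done
}

plan_assignments <<'EOF'
AKS-Cluster-Admins|Azure Kubernetes Service RBAC Cluster Admin
AKS-Viewers|Azure Kubernetes Service RBAC Reader
EOF
```

Adding a group then becomes a one-line change instead of another copy of the command.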
Terraform Implementation
# terraform/rbac-assignments.tf
# Cluster admin role for admin group
resource "azurerm_role_assignment" "aks_cluster_admin" {
scope = azurerm_kubernetes_cluster.aks.id
role_definition_name = "Azure Kubernetes Service RBAC Cluster Admin"
principal_id = azuread_group.aks_admins.object_id
}
# Reader role for viewer group
resource "azurerm_role_assignment" "aks_reader" {
scope = azurerm_kubernetes_cluster.aks.id
role_definition_name = "Azure Kubernetes Service RBAC Reader"
principal_id = azuread_group.aks_viewers.object_id
}
Namespace-Scoped Role Assignments
Now here’s where it gets powerful. You can scope roles to specific namespaces, which is perfect for multi-tenant scenarios:
# Create namespaces
kubectl create namespace development
kubectl create namespace staging
kubectl create namespace production
# Assign developer group as RBAC Writer to development namespace
az role assignment create \
--role "Azure Kubernetes Service RBAC Writer" \
--assignee $DEVELOPER_GROUP_ID \
--scope "$CLUSTER_ID/namespaces/development"
# Assign developer group as RBAC Reader to staging
az role assignment create \
--role "Azure Kubernetes Service RBAC Reader" \
--assignee $DEVELOPER_GROUP_ID \
--scope "$CLUSTER_ID/namespaces/staging"
# No access to production namespace for developers
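Every namespace-scoped assignment uses the same scope pattern: the cluster resource ID followed by `/namespaces/<name>`. A tiny helper (hypothetical name) avoids typos when you repeat this across namespaces:

```shell
# namespace_scope CLUSTER_ID NAMESPACE: prints the Azure RBAC scope string
# for a namespace-level role assignment.
namespace_scope() {
  printf '%s/namespaces/%s\n' "$1" "$2"
}

# Example: grant the developer group writer access in several namespaces
# ("sandbox" here is just an illustrative extra namespace):
# for ns in development sandbox; do
#   az role assignment create \
#     --role "Azure Kubernetes Service RBAC Writer" \
#     --assignee "$DEVELOPER_GROUP_ID" \
#     --scope "$(namespace_scope "$CLUSTER_ID" "$ns")"
# done
```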
Terraform: Namespace-Scoped Assignments
# terraform/namespaces.tf
# Create namespaces
resource "kubernetes_namespace" "development" {
metadata {
name = "development"
labels = {
environment = "dev"
}
}
}
resource "kubernetes_namespace" "staging" {
metadata {
name = "staging"
labels = {
environment = "staging"
}
}
}
resource "kubernetes_namespace" "production" {
metadata {
name = "production"
labels = {
environment = "prod"
}
}
}
# Namespace-scoped role assignments
resource "azurerm_role_assignment" "dev_writer" {
scope = "${azurerm_kubernetes_cluster.aks.id}/namespaces/${kubernetes_namespace.development.metadata[0].name}"
role_definition_name = "Azure Kubernetes Service RBAC Writer"
principal_id = azuread_group.aks_developers.object_id
}
resource "azurerm_role_assignment" "staging_reader" {
scope = "${azurerm_kubernetes_cluster.aks.id}/namespaces/${kubernetes_namespace.staging.metadata[0].name}"
role_definition_name = "Azure Kubernetes Service RBAC Reader"
principal_id = azuread_group.aks_developers.object_id
}
Part 4: Permission Validation
Test Cluster Admin Access
Time to validate that everything works. This is the fun part—seeing your RBAC setup in action.
# Get non-admin credentials
az aks get-credentials \
--resource-group $RESOURCE_GROUP \
--name $CLUSTER_NAME \
--overwrite-existing
# Test cluster-wide access (as admin group member)
kubectl auth can-i get nodes
# Expected: yes
kubectl auth can-i create namespace
# Expected: yes
kubectl auth can-i delete deployment -n production
# Expected: yes (cluster admin has all permissions)
Test Developer Access
Let’s simulate what a developer (Jane) can do. This is where you’ll see the power of namespace-scoped permissions:
# Get credentials as Jane
az login --username jane.developer@yourdomain.onmicrosoft.com
az aks get-credentials \
--resource-group $RESOURCE_GROUP \
--name $CLUSTER_NAME \
--overwrite-existing
# Test development namespace access (writer role)
kubectl auth can-i create deployment -n development
# Expected: yes
kubectl auth can-i get pods -n development
# Expected: yes
kubectl auth can-i delete pod -n development
# Expected: yes
# Test staging namespace access (reader role)
kubectl auth can-i get deployments -n staging
# Expected: yes
kubectl auth can-i create deployment -n staging
# Expected: no
# Test production namespace access (no role)
kubectl auth can-i get pods -n production
# Expected: no
# Test cluster-wide operations
kubectl auth can-i get nodes
# Expected: no
kubectl auth can-i create namespace
# Expected: no
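Rather than eyeballing each "Expected" comment, the checks above can be wrapped in a small pass/fail harness. The `expect_access` name is mine, not a kubectl feature:

```shell
# expect_access DESC EXPECTED CMD...: runs CMD (normally `kubectl auth can-i`),
# compares its stdout to EXPECTED ("yes" or "no"), and prints PASS or FAIL.
# `kubectl auth can-i` exits nonzero on "no", so we compare output, not status.
expect_access() {
  local desc=$1 expected=$2 actual
  shift 2
  actual=$("$@" 2>/dev/null)
  if [ "$actual" = "$expected" ]; then
    echo "PASS: $desc"
  else
    echo "FAIL: $desc (expected '$expected', got '${actual:-error}')"
    return 1
  fi
}

# Jane's checks, as assertions:
# expect_access "dev: create deployment"     yes kubectl auth can-i create deployment -n development
# expect_access "staging: create deployment" no  kubectl auth can-i create deployment -n staging
# expect_access "prod: get pods"             no  kubectl auth can-i get pods -n production
```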
Test Viewer Access
# As viewer group member
kubectl auth can-i get pods --all-namespaces
# Expected: yes
kubectl auth can-i get deployments -n development
# Expected: yes
kubectl auth can-i create deployment -n development
# Expected: no
kubectl auth can-i delete service -n staging
# Expected: no
Comprehensive Permission Audit Script
#!/bin/bash
# audit-permissions.sh
NAMESPACES=("development" "staging" "production")
RESOURCES=("pods" "deployments" "services" "configmaps" "secrets")
VERBS=("get" "list" "create" "update" "delete")
echo "=== RBAC Permission Audit ==="
echo "User: $(az account show --query user.name -o tsv)"
echo ""
for ns in "${NAMESPACES[@]}"; do
echo "Namespace: $ns"
echo "----------------------------------------"
for resource in "${RESOURCES[@]}"; do
echo -n "$resource: "
for verb in "${VERBS[@]}"; do
result=$(kubectl auth can-i $verb $resource -n $ns 2>/dev/null)
if [ "$result" == "yes" ]; then
echo -n "$verb "
fi
done
echo ""
done
echo ""
done
echo "Cluster-wide permissions:"
echo "----------------------------------------"
kubectl auth can-i get nodes && echo "✓ Get nodes" || echo "✗ Get nodes"
kubectl auth can-i create namespace && echo "✓ Create namespace" || echo "✗ Create namespace"
kubectl auth can-i get clusterroles && echo "✓ Get cluster roles" || echo "✗ Get cluster roles"
Run audit:
chmod +x audit-permissions.sh
./audit-permissions.sh
Part 5: Custom Role Definitions
Sometimes the built-in roles don’t quite fit your needs. That’s where custom roles come in. I’ve found this especially useful when you need to deny access to secrets while allowing everything else.
Custom Role Permissions Comparison
| Permission Category | Built-in Admin | Built-in Writer | Built-in Reader | Custom Namespace Developer |
|---|---|---|---|---|
| Pods | ✅ All | ✅ Read/Write | ✅ Read only | ✅ Read + Logs |
| Deployments | ✅ All | ✅ Read/Write | ✅ Read only | ✅ Full control |
| Services | ✅ All | ✅ Read/Write | ✅ Read only | ✅ Full control |
| ConfigMaps | ✅ All | ✅ Read/Write | ✅ Read only | ✅ Full control |
| Secrets | ✅ All | ✅ Read/Write | ✅ Read only | ❌ No access |
| Namespaces | ✅ All | ❌ No | ✅ Read only | ❌ No access |
| Nodes | ✅ All | ❌ No | ✅ Read only | ❌ No access |
| RBAC/Roles | ✅ All | ❌ No | ✅ Read only | ❌ No access |
| PVs/PVCs | ✅ All | ✅ Read/Write | ✅ Read only | ❌ No access |
| Scope | Cluster-wide | Namespace or Cluster | Namespace or Cluster | Namespace only |
Custom Role Benefits:
Why bother with custom roles? Here’s what I’ve found they’re good for:
- Fine-grained permissions tailored to your specific team workflows
- Explicitly denying access to sensitive resources (like secrets)
- Restricting cluster-wide operations
- Meeting specific audit and compliance requirements
Custom Role: Namespace Developer
// custom-role-namespace-developer.json
{
"Name": "AKS Namespace Developer",
"Description": "Can manage deployments, services, and configmaps in assigned namespaces",
"Actions": [],
"NotActions": [],
"DataActions": [
"Microsoft.ContainerService/managedClusters/apps/deployments/read",
"Microsoft.ContainerService/managedClusters/apps/deployments/write",
"Microsoft.ContainerService/managedClusters/apps/deployments/delete",
"Microsoft.ContainerService/managedClusters/services/read",
"Microsoft.ContainerService/managedClusters/services/write",
"Microsoft.ContainerService/managedClusters/services/delete",
"Microsoft.ContainerService/managedClusters/configmaps/read",
"Microsoft.ContainerService/managedClusters/configmaps/write",
"Microsoft.ContainerService/managedClusters/pods/read",
"Microsoft.ContainerService/managedClusters/pods/log/read"
],
"NotDataActions": [
"Microsoft.ContainerService/managedClusters/secrets/*"
],
"AssignableScopes": [
"/subscriptions/{subscription-id}"
]
}
Create custom role:
az role definition create \
--role-definition custom-role-namespace-developer.json
# Get custom role ID
CUSTOM_ROLE_ID=$(az role definition list \
--name "AKS Namespace Developer" \
--query "[0].id" -o tsv)
# Assign custom role
az role assignment create \
--role "AKS Namespace Developer" \
--assignee $DEVELOPER_GROUP_ID \
--scope "$CLUSTER_ID/namespaces/development"
Terraform: Custom Role
# terraform/custom-roles.tf
# Declare the subscription data source referenced by scope/assignable_scopes
data "azurerm_subscription" "current" {}
resource "azurerm_role_definition" "namespace_developer" {
name = "AKS Namespace Developer"
scope = data.azurerm_subscription.current.id
description = "Can manage deployments, services, and configmaps in assigned namespaces"
permissions {
actions = []
not_actions = []
data_actions = [
"Microsoft.ContainerService/managedClusters/apps/deployments/read",
"Microsoft.ContainerService/managedClusters/apps/deployments/write",
"Microsoft.ContainerService/managedClusters/apps/deployments/delete",
"Microsoft.ContainerService/managedClusters/services/read",
"Microsoft.ContainerService/managedClusters/services/write",
"Microsoft.ContainerService/managedClusters/services/delete",
"Microsoft.ContainerService/managedClusters/configmaps/read",
"Microsoft.ContainerService/managedClusters/configmaps/write",
"Microsoft.ContainerService/managedClusters/pods/read",
"Microsoft.ContainerService/managedClusters/pods/log/read",
]
not_data_actions = [
"Microsoft.ContainerService/managedClusters/secrets/*",
]
}
assignable_scopes = [
data.azurerm_subscription.current.id
]
}
Part 6: Troubleshooting
I’ve debugged these issues more times than I can count. Let me save you some pain.
Issue 1: “Error: You must be logged in to the server (Unauthorized)”
Cause: You’re either not authenticated with Azure AD, or your credentials expired.
Fix:
# Re-authenticate
az login
# Get fresh credentials
az aks get-credentials \
--resource-group $RESOURCE_GROUP \
--name $CLUSTER_NAME \
--overwrite-existing
# Verify authentication
kubectl get nodes
Issue 2: “Error: pods is forbidden: User cannot list resource”
Cause: This one’s straightforward—you don’t have the right Azure RBAC permissions.
Diagnosis:
# Check your role assignments
az role assignment list \
--assignee $(az ad signed-in-user show --query id -o tsv) \
--scope $CLUSTER_ID \
--output table
# Check what you can do
kubectl auth can-i --list
Fix: Assign appropriate role or add user to correct Azure AD group.
Issue 3: Role Propagation Delay
Symptom: You just assigned a role, but permissions aren’t working yet. Sound familiar?
Fix: Here’s the thing—Azure RBAC can take up to 5 minutes to propagate. I know it’s frustrating, but patience is key here.
# Clear kubectl cache
rm -rf ~/.kube/cache
# Wait 5 minutes, then test again
kubectl auth can-i get pods -n development
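If you'd rather not guess when propagation has finished, poll until the permission appears or a deadline passes. This is a sketch; `wait_for_permission` is a made-up helper, with the check command passed as trailing arguments:

```shell
# wait_for_permission TIMEOUT INTERVAL CMD...: re-runs CMD every INTERVAL
# seconds until it prints "yes" or TIMEOUT seconds have elapsed.
wait_for_permission() {
  local timeout=$1 interval=$2 elapsed=0
  shift 2
  while [ "$elapsed" -le "$timeout" ]; do
    if [ "$("$@" 2>/dev/null)" = "yes" ]; then
      echo "granted after ${elapsed}s"
      return 0
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  echo "still denied after ${timeout}s"
  return 1
}

# wait_for_permission 300 15 kubectl auth can-i get pods -n development
```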
Issue 4: Cluster Admin vs. Local Admin Confusion
Understanding:
This one trips up a lot of people. Here’s the key difference:
- The --admin flag bypasses Azure RBAC entirely (it uses the local cluster admin certificate)
- Regular credentials use Azure AD authentication plus Azure RBAC authorization
Best Practice:
# For normal operations (use Azure RBAC)
az aks get-credentials \
--resource-group $RESOURCE_GROUP \
--name $CLUSTER_NAME
# For emergency/troubleshooting only
az aks get-credentials \
--resource-group $RESOURCE_GROUP \
--name $CLUSTER_NAME \
--admin
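It also helps to make the difference visible locally: `az aks get-credentials --admin` merges a kubeconfig context named `<cluster>-admin`, so a quick guard can warn you before you operate outside Azure RBAC (the helper name is hypothetical):

```shell
# check_context NAME: warns when the kubeconfig context looks like a local
# admin context (az names it "<cluster>-admin" when --admin is used).
check_context() {
  case "$1" in
    *-admin) echo "WARNING: local admin context - Azure RBAC is bypassed" ;;
    *)       echo "OK: Azure AD context - Azure RBAC applies" ;;
  esac
}

# check_context "$(kubectl config current-context)"
```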
Debugging Tools
# View kubeconfig contexts
kubectl config get-contexts
# View current context details
kubectl config view --minify
# Check Azure AD token
kubectl get --raw /api/v1/namespaces/default/pods --v=9 | grep Authorization
# Validate role assignments
az role assignment list \
--scope $CLUSTER_ID \
--include-inherited \
--output table
Part 7: Best Practices
Let me share some best practices I’ve learned the hard way.
1. Least Privilege Principle
This is security 101, but I still see people getting it wrong:
# Bad: Granting cluster admin to all developers
az role assignment create \
--role "Azure Kubernetes Service RBAC Cluster Admin" \
--assignee $DEVELOPER_GROUP_ID \
--scope $CLUSTER_ID
# Good: Namespace-scoped writer role
az role assignment create \
--role "Azure Kubernetes Service RBAC Writer" \
--assignee $DEVELOPER_GROUP_ID \
--scope "$CLUSTER_ID/namespaces/development"
2. Use Azure AD Groups, Not Individual Users
I can’t stress this enough—always use groups, never assign roles to individual users:
# Bad: Assigning roles to individual users
az role assignment create \
--role "Azure Kubernetes Service RBAC Reader" \
--assignee jane.developer@example.com \
--scope $CLUSTER_ID
# Good: Assigning roles to groups
az role assignment create \
--role "Azure Kubernetes Service RBAC Reader" \
--assignee $VIEWER_GROUP_ID \
--scope $CLUSTER_ID
3. Separate Production Access
# Create dedicated production admin group
resource "azuread_group" "prod_admins" {
display_name = "AKS-Production-Admins"
security_enabled = true
}
# Require conditional access policy for production
# (configured in Azure AD portal)
4. Enable Audit Logging
# Enable diagnostic settings for AKS
az monitor diagnostic-settings create \
--name aks-audit-logs \
--resource $CLUSTER_ID \
--logs '[
{
"category": "kube-audit",
"enabled": true
},
{
"category": "kube-audit-admin",
"enabled": true
}
]' \
--workspace /subscriptions/{sub}/resourceGroups/{rg}/providers/Microsoft.OperationalInsights/workspaces/aks-logs
5. Regular Access Reviews
Don’t set it and forget it. I recommend quarterly access reviews at minimum:
#!/bin/bash
# review-rbac-assignments.sh
echo "=== AKS RBAC Access Review ==="
echo "Date: $(date)"
echo ""
az role assignment list \
--scope $CLUSTER_ID \
--include-inherited \
--query "[].{Principal:principalName, Role:roleDefinitionName, Scope:scope}" \
--output table
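One thing worth flagging automatically during a review is roles assigned directly to users rather than groups (see Best Practice 2). This filter is a sketch: it expects tab-separated principalName, principalType, and roleDefinitionName columns, such as the commented az query below would produce:

```shell
# flag_user_assignments: prints assignments whose principal type is "User".
flag_user_assignments() {
  awk -F'\t' '$2 == "User" { print $1 " -> " $3 }'
}

# az role assignment list --scope "$CLUSTER_ID" --include-inherited \
#   --query "[].[principalName, principalType, roleDefinitionName]" -o tsv \
#   | flag_user_assignments
```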
Cleanup
# Delete role assignments
az role assignment delete \
--role "Azure Kubernetes Service RBAC Cluster Admin" \
--assignee $ADMIN_GROUP_ID \
--scope $CLUSTER_ID
# Delete AKS cluster
az aks delete \
--resource-group $RESOURCE_GROUP \
--name $CLUSTER_NAME \
--yes --no-wait
# Delete resource group (includes all resources)
az group delete \
--name $RESOURCE_GROUP \
--yes --no-wait
# Or with Terraform
terraform destroy -auto-approve
Conclusion
If you’ve followed along, you now have a solid foundation for Azure RBAC in AKS. This POC covered:
- Centralized identity management via Azure AD
- Granular permissions at cluster and namespace levels
- Built-in and custom role definitions
- Validation and troubleshooting techniques (the real-world stuff)
- Production-ready best practices
Next Steps:
Don’t stop here. Once you’ve got the basics working, I recommend:
- Implementing conditional access policies for production clusters
- Integrating with Azure PIM for just-in-time access
- Setting up automated RBAC reviews and alerts
- Extending this RBAC model to multi-cluster environments
The setup might seem like a lot of work upfront, but trust me—having proper access control from day one is so much easier than trying to retrofit it later. I’ve seen both scenarios, and you don’t want to be in the second one.
Additional Resources
- AKS-managed Azure AD integration
- Azure RBAC for Kubernetes Authorization
- Azure built-in roles for AKS
About StriveNimbus: We specialize in Kubernetes security architecture, zero-trust implementations, and compliance automation for Azure environments. Our team helps organizations implement secure, scalable access control patterns. Contact us for security assessment and implementation support.