Implementing Azure RBAC for AKS Cluster Access Control

Hands-on proof of concept demonstrating Azure AD integration with AKS for granular RBAC, including role assignments, permission validation, and troubleshooting steps.

Overview

In this hands-on proof of concept, I’ll walk you through implementing Azure Role-Based Access Control (RBAC) for Azure Kubernetes Service (AKS) cluster access. We’ll integrate Azure Active Directory (Azure AD) for authentication and authorization, giving you centralized identity management, granular permissions, and proper audit trails—all the things you need for production-ready cluster access.

What You’ll Learn:

  • Enable Azure AD integration for AKS
  • Assign Azure RBAC roles to users and groups
  • Validate permissions using kubectl auth can-i
  • Implement least-privilege access patterns
  • Troubleshoot common RBAC issues (and believe me, there are a few)

Prerequisites:

  • Azure subscription with permissions to create AKS clusters
  • Azure AD with user/group management permissions
  • Azure CLI 2.50+ installed
  • kubectl 1.28+ installed
  • Terraform 1.5+ (optional, for IaC approach)

Architecture Overview

graph TD
    subgraph AAD["Azure Active Directory"]
        Users["Users"]
        Groups["Groups"]
        SP["Service Principals"]
    end
    AAD -->|"OIDC Authentication"| API
    subgraph AKS["AKS Cluster"]
        API["Kubernetes API Server"]
        subgraph RBAC["Azure RBAC Authorization"]
            BuiltIn["Built-in Roles<br/>• Cluster Admin<br/>• Cluster User<br/>• Reader"]
            Custom["Custom Roles<br/>• Namespace Developer<br/>• Read-Only"]
        end
        API --> RBAC
        RBAC --> Resources["Kubernetes Resources<br/>• Pods<br/>• Deployments<br/>• Services<br/>• Namespaces"]
    end
    Users -.->|"kubectl"| API
    Groups -.->|"kubectl"| API
    SP -.->|"API calls"| API
    style AAD fill:#fff3cd
    style AKS fill:#e1f5ff
    style RBAC fill:#d4edda
    style Resources fill:#f8d7da

Part 1: Cluster Setup with Azure AD Integration

Option 1: Azure CLI

# Set variables
RESOURCE_GROUP="rbac-poc-rg"
CLUSTER_NAME="rbac-aks-cluster"
LOCATION="eastus"

# Create resource group
az group create \
  --name $RESOURCE_GROUP \
  --location $LOCATION

# Create AKS cluster with Azure AD and Azure RBAC
az aks create \
  --resource-group $RESOURCE_GROUP \
  --name $CLUSTER_NAME \
  --kubernetes-version 1.28.5 \
  --node-count 2 \
  --node-vm-size Standard_D2s_v5 \
  --enable-aad \
  --enable-azure-rbac \
  --generate-ssh-keys

# Get cluster credentials (admin access for setup)
az aks get-credentials \
  --resource-group $RESOURCE_GROUP \
  --name $CLUSTER_NAME \
  --admin  # Use --admin for initial setup only

# Verify cluster access
kubectl get nodes

Option 2: Terraform

# terraform/main.tf
terraform {
  required_version = ">= 1.5.0"

  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 3.80"
    }
    azuread = {
      source  = "hashicorp/azuread"
      version = "~> 2.45"
    }
  }
}

provider "azurerm" {
  features {}
}

provider "azuread" {}

# Data sources
data "azurerm_client_config" "current" {}

data "azuread_client_config" "current" {}

# Resource group
resource "azurerm_resource_group" "aks" {
  name     = "rbac-poc-rg"
  location = "eastus"

  tags = {
    Environment = "POC"
    Purpose     = "RBAC-Demo"
  }
}

# AKS cluster with Azure AD and Azure RBAC
resource "azurerm_kubernetes_cluster" "aks" {
  name                = "rbac-aks-cluster"
  location            = azurerm_resource_group.aks.location
  resource_group_name = azurerm_resource_group.aks.name
  dns_prefix          = "rbac-aks"
  kubernetes_version  = "1.28.5"

  default_node_pool {
    name       = "default"
    node_count = 2
    vm_size    = "Standard_D2s_v5"

    upgrade_settings {
      max_surge = "33%"
    }
  }

  identity {
    type = "SystemAssigned"
  }

  # Azure AD integration
  azure_active_directory_role_based_access_control {
    managed                = true
    azure_rbac_enabled     = true
    tenant_id              = data.azurerm_client_config.current.tenant_id
  }

  network_profile {
    network_plugin = "azure"
    network_policy = "calico"
  }

  tags = {
    Environment = "POC"
  }
}

# Outputs
output "cluster_id" {
  value = azurerm_kubernetes_cluster.aks.id
}

output "cluster_fqdn" {
  value = azurerm_kubernetes_cluster.aks.fqdn
}

output "kube_config_command" {
  value = "az aks get-credentials --resource-group ${azurerm_resource_group.aks.name} --name ${azurerm_kubernetes_cluster.aks.name}"
}

Apply Terraform configuration:

cd terraform
terraform init
terraform plan
terraform apply -auto-approve

# Get cluster credentials
eval $(terraform output -raw kube_config_command)

Part 2: Azure AD User and Group Setup

Create Azure AD Groups

Here’s where we set up the identity foundation. I always recommend using groups instead of individual user assignments—it makes life so much easier as your team grows.

# Create Azure AD groups for different access levels
az ad group create \
  --display-name "AKS-Cluster-Admins" \
  --mail-nickname "aks-cluster-admins" \
  --description "Full administrative access to AKS cluster"

az ad group create \
  --display-name "AKS-Developers" \
  --mail-nickname "aks-developers" \
  --description "Developer access to development namespaces"

az ad group create \
  --display-name "AKS-Viewers" \
  --mail-nickname "aks-viewers" \
  --description "Read-only access to AKS cluster"

# Get group object IDs
ADMIN_GROUP_ID=$(az ad group show \
  --group "AKS-Cluster-Admins" \
  --query id -o tsv)

DEVELOPER_GROUP_ID=$(az ad group show \
  --group "AKS-Developers" \
  --query id -o tsv)

VIEWER_GROUP_ID=$(az ad group show \
  --group "AKS-Viewers" \
  --query id -o tsv)

echo "Admin Group ID: $ADMIN_GROUP_ID"
echo "Developer Group ID: $DEVELOPER_GROUP_ID"
echo "Viewer Group ID: $VIEWER_GROUP_ID"

Terraform Implementation

# terraform/aad-groups.tf
resource "azuread_group" "aks_admins" {
  display_name     = "AKS-Cluster-Admins"
  mail_nickname    = "aks-cluster-admins"
  security_enabled = true
  description      = "Full administrative access to AKS cluster"
}

resource "azuread_group" "aks_developers" {
  display_name     = "AKS-Developers"
  mail_nickname    = "aks-developers"
  security_enabled = true
  description      = "Developer access to development namespaces"
}

resource "azuread_group" "aks_viewers" {
  display_name     = "AKS-Viewers"
  mail_nickname    = "aks-viewers"
  security_enabled = true
  description      = "Read-only access to AKS cluster"
}

output "admin_group_id" {
  value = azuread_group.aks_admins.object_id
}

output "developer_group_id" {
  value = azuread_group.aks_developers.object_id
}

output "viewer_group_id" {
  value = azuread_group.aks_viewers.object_id
}

Add Test Users to Groups

# Get current user's object ID
USER_ID=$(az ad signed-in-user show --query id -o tsv)

# Add yourself to admin group for testing
az ad group member add \
  --group "AKS-Cluster-Admins" \
  --member-id $USER_ID

# Create test users (optional)
az ad user create \
  --display-name "Jane Developer" \
  --user-principal-name jane.developer@yourdomain.onmicrosoft.com \
  --password "TempPassword123!" \
  --force-change-password-next-sign-in true

JANE_ID=$(az ad user show \
  --id jane.developer@yourdomain.onmicrosoft.com \
  --query id -o tsv)

az ad group member add \
  --group "AKS-Developers" \
  --member-id $JANE_ID
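
The same memberships can be managed in Terraform. This is a sketch: it assumes the group resources from aad-groups.tf and the `azuread_client_config` data source from main.tf, and adds the identity running Terraform to the admin group (Jane would additionally need an `azuread_user` resource or data source).

```hcl
# terraform/aad-membership.tf (file name is illustrative)
# Adds the identity running Terraform to the admin group.
resource "azuread_group_member" "self_as_admin" {
  group_object_id  = azuread_group.aks_admins.object_id
  member_object_id = data.azuread_client_config.current.object_id
}
```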

Part 3: Azure RBAC Role Assignments

Built-in AKS Roles

Let’s talk about the roles Azure gives you out of the box. There are four built-in roles for AKS, and they cover most use cases:

| Role | Scope | Permissions |
|------|-------|-------------|
| Azure Kubernetes Service RBAC Cluster Admin | Cluster-wide | Full access to all resources |
| Azure Kubernetes Service RBAC Admin | Namespace or cluster | Admin access, can manage roles |
| Azure Kubernetes Service RBAC Writer | Namespace or cluster | Read/write access to most resources |
| Azure Kubernetes Service RBAC Reader | Namespace or cluster | Read-only access |

Assign Cluster-Wide Roles

# Get AKS cluster resource ID
CLUSTER_ID=$(az aks show \
  --resource-group $RESOURCE_GROUP \
  --name $CLUSTER_NAME \
  --query id -o tsv)

# Assign cluster admin role to admin group
az role assignment create \
  --role "Azure Kubernetes Service RBAC Cluster Admin" \
  --assignee $ADMIN_GROUP_ID \
  --scope $CLUSTER_ID

# Assign reader role to viewer group
az role assignment create \
  --role "Azure Kubernetes Service RBAC Reader" \
  --assignee $VIEWER_GROUP_ID \
  --scope $CLUSTER_ID
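
As the number of groups grows, it can help to drive these assignments from a simple map instead of repeating the command. This sketch only prints the commands it would run; the cluster ID is a placeholder, and the `<...-object-id>` values stand in for the real group object IDs.

```shell
# Placeholder cluster resource ID (the real one comes from `az aks show --query id`)
CLUSTER_ID="/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/rbac-poc-rg/providers/Microsoft.ContainerService/managedClusters/rbac-aks-cluster"

# Map each group to the role it should receive at cluster scope
declare -A ROLE_FOR_GROUP=(
  ["AKS-Cluster-Admins"]="Azure Kubernetes Service RBAC Cluster Admin"
  ["AKS-Viewers"]="Azure Kubernetes Service RBAC Reader"
)

# Print the commands this would run (swap `echo` for the real call when ready)
for group in "${!ROLE_FOR_GROUP[@]}"; do
  echo "az role assignment create --role '${ROLE_FOR_GROUP[$group]}' --assignee <${group}-object-id> --scope $CLUSTER_ID"
done
```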

Terraform Implementation

# terraform/rbac-assignments.tf

# Cluster admin role for admin group
resource "azurerm_role_assignment" "aks_cluster_admin" {
  scope                = azurerm_kubernetes_cluster.aks.id
  role_definition_name = "Azure Kubernetes Service RBAC Cluster Admin"
  principal_id         = azuread_group.aks_admins.object_id
}

# Reader role for viewer group
resource "azurerm_role_assignment" "aks_reader" {
  scope                = azurerm_kubernetes_cluster.aks.id
  role_definition_name = "Azure Kubernetes Service RBAC Reader"
  principal_id         = azuread_group.aks_viewers.object_id
}

Namespace-Scoped Role Assignments

Now here’s where it gets powerful. You can scope roles to specific namespaces, which is perfect for multi-tenant scenarios:

# Create namespaces
kubectl create namespace development
kubectl create namespace staging
kubectl create namespace production

# Assign developer group as RBAC Writer to development namespace
az role assignment create \
  --role "Azure Kubernetes Service RBAC Writer" \
  --assignee $DEVELOPER_GROUP_ID \
  --scope "$CLUSTER_ID/namespaces/development"

# Assign developer group as RBAC Reader to staging
az role assignment create \
  --role "Azure Kubernetes Service RBAC Reader" \
  --assignee $DEVELOPER_GROUP_ID \
  --scope "$CLUSTER_ID/namespaces/staging"

# No access to production namespace for developers
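
The namespace scope is nothing magic: it's the cluster's ARM resource ID with `/namespaces/<name>` appended. A minimal sketch, using a placeholder cluster ID:

```shell
# Placeholder cluster resource ID (the real one comes from `az aks show --query id`)
CLUSTER_ID="/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/rbac-poc-rg/providers/Microsoft.ContainerService/managedClusters/rbac-aks-cluster"

# A namespace scope is the cluster ID plus "/namespaces/<name>"
SCOPES=()
for ns in development staging; do
  SCOPES+=("${CLUSTER_ID}/namespaces/${ns}")
done
printf '%s\n' "${SCOPES[@]}"
```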

Terraform: Namespace-Scoped Assignments

# terraform/namespaces.tf
# Note: these namespace resources require the hashicorp/kubernetes provider,
# configured against the cluster, in addition to azurerm and azuread.

# Create namespaces
resource "kubernetes_namespace" "development" {
  metadata {
    name = "development"
    labels = {
      environment = "dev"
    }
  }
}

resource "kubernetes_namespace" "staging" {
  metadata {
    name = "staging"
    labels = {
      environment = "staging"
    }
  }
}

resource "kubernetes_namespace" "production" {
  metadata {
    name = "production"
    labels = {
      environment = "prod"
    }
  }
}

# Namespace-scoped role assignments
resource "azurerm_role_assignment" "dev_writer" {
  scope                = "${azurerm_kubernetes_cluster.aks.id}/namespaces/${kubernetes_namespace.development.metadata[0].name}"
  role_definition_name = "Azure Kubernetes Service RBAC Writer"
  principal_id         = azuread_group.aks_developers.object_id
}

resource "azurerm_role_assignment" "staging_reader" {
  scope                = "${azurerm_kubernetes_cluster.aks.id}/namespaces/${kubernetes_namespace.staging.metadata[0].name}"
  role_definition_name = "Azure Kubernetes Service RBAC Reader"
  principal_id         = azuread_group.aks_developers.object_id
}

Part 4: Permission Validation

Test Cluster Admin Access

Time to validate that everything works. This is the fun part—seeing your RBAC setup in action.

# Get non-admin credentials
az aks get-credentials \
  --resource-group $RESOURCE_GROUP \
  --name $CLUSTER_NAME \
  --overwrite-existing

# Test cluster-wide access (as admin group member)
kubectl auth can-i get nodes
# Expected: yes

kubectl auth can-i create namespace
# Expected: yes

kubectl auth can-i delete deployment -n production
# Expected: yes (cluster admin has all permissions)

Test Developer Access

Let’s simulate what a developer (Jane) can do. This is where you’ll see the power of namespace-scoped permissions:

# Get credentials as Jane
az login --username jane.developer@yourdomain.onmicrosoft.com

az aks get-credentials \
  --resource-group $RESOURCE_GROUP \
  --name $CLUSTER_NAME \
  --overwrite-existing

# Test development namespace access (writer role)
kubectl auth can-i create deployment -n development
# Expected: yes

kubectl auth can-i get pods -n development
# Expected: yes

kubectl auth can-i delete pod -n development
# Expected: yes

# Test staging namespace access (reader role)
kubectl auth can-i get deployments -n staging
# Expected: yes

kubectl auth can-i create deployment -n staging
# Expected: no

# Test production namespace access (no role)
kubectl auth can-i get pods -n production
# Expected: no

# Test cluster-wide operations
kubectl auth can-i get nodes
# Expected: no

kubectl auth can-i create namespace
# Expected: no

Test Viewer Access

# As viewer group member
kubectl auth can-i get pods --all-namespaces
# Expected: yes

kubectl auth can-i get deployments -n development
# Expected: yes

kubectl auth can-i create deployment -n development
# Expected: no

kubectl auth can-i delete service -n staging
# Expected: no

Comprehensive Permission Audit Script

#!/bin/bash
# audit-permissions.sh

NAMESPACES=("development" "staging" "production")
RESOURCES=("pods" "deployments" "services" "configmaps" "secrets")
VERBS=("get" "list" "create" "update" "delete")

echo "=== RBAC Permission Audit ==="
echo "User: $(az account show --query user.name -o tsv)"
echo ""

for ns in "${NAMESPACES[@]}"; do
  echo "Namespace: $ns"
  echo "----------------------------------------"

  for resource in "${RESOURCES[@]}"; do
    echo -n "$resource: "
    for verb in "${VERBS[@]}"; do
      result=$(kubectl auth can-i $verb $resource -n $ns 2>/dev/null)
      if [ "$result" == "yes" ]; then
        echo -n "$verb "
      fi
    done
    echo ""
  done
  echo ""
done

echo "Cluster-wide permissions:"
echo "----------------------------------------"
kubectl auth can-i get nodes && echo "✓ Get nodes" || echo "✗ Get nodes"
kubectl auth can-i create namespace && echo "✓ Create namespace" || echo "✗ Create namespace"
kubectl auth can-i get clusterroles && echo "✓ Get cluster roles" || echo "✗ Get cluster roles"

Run audit:

chmod +x audit-permissions.sh
./audit-permissions.sh

Part 5: Custom Role Definitions

Sometimes the built-in roles don’t quite fit your needs. That’s where custom roles come in. I’ve found this especially useful when you need to deny access to secrets while allowing everything else.

Custom Role Permissions Comparison

| Permission Category | Built-in Admin | Built-in Writer | Built-in Reader | Custom Namespace Developer |
|---------------------|----------------|-----------------|-----------------|----------------------------|
| Pods | ✅ All | ✅ Read/Write | ✅ Read only | ✅ Read + Logs |
| Deployments | ✅ All | ✅ Read/Write | ✅ Read only | ✅ Full control |
| Services | ✅ All | ✅ Read/Write | ✅ Read only | ✅ Full control |
| ConfigMaps | ✅ All | ✅ Read/Write | ✅ Read only | ✅ Full control |
| Secrets | ✅ All | ✅ Read/Write | ✅ Read only | ❌ No access |
| Namespaces | ✅ All | ❌ No | ✅ Read only | ❌ No access |
| Nodes | ✅ All | ❌ No | ✅ Read only | ❌ No access |
| RBAC/Roles | ✅ All | ❌ No | ✅ Read only | ❌ No access |
| PVs/PVCs | ✅ All | ✅ Read/Write | ✅ Read only | ❌ No access |
| Scope | Cluster-wide | Namespace or cluster | Namespace or cluster | Namespace only |

Custom Role Benefits:

Why bother with custom roles? Here’s what I’ve found they’re good for:

  • Fine-grained permissions tailored to your specific team workflows
  • Explicitly denying access to sensitive resources (like secrets)
  • Restricting cluster-wide operations
  • Meeting specific audit and compliance requirements

Custom Role: Namespace Developer

// custom-role-namespace-developer.json
{
  "Name": "AKS Namespace Developer",
  "Description": "Can manage deployments, services, and configmaps in assigned namespaces",
  "Actions": [],
  "NotActions": [],
  "DataActions": [
    "Microsoft.ContainerService/managedClusters/apps/deployments/read",
    "Microsoft.ContainerService/managedClusters/apps/deployments/write",
    "Microsoft.ContainerService/managedClusters/apps/deployments/delete",
    "Microsoft.ContainerService/managedClusters/services/read",
    "Microsoft.ContainerService/managedClusters/services/write",
    "Microsoft.ContainerService/managedClusters/services/delete",
    "Microsoft.ContainerService/managedClusters/configmaps/read",
    "Microsoft.ContainerService/managedClusters/configmaps/write",
    "Microsoft.ContainerService/managedClusters/pods/read",
    "Microsoft.ContainerService/managedClusters/pods/log/read"
  ],
  "NotDataActions": [
    "Microsoft.ContainerService/managedClusters/secrets/*"
  ],
  "AssignableScopes": [
    "/subscriptions/{subscription-id}"
  ]
}

Create custom role:

az role definition create \
  --role-definition custom-role-namespace-developer.json

# Get custom role ID
CUSTOM_ROLE_ID=$(az role definition list \
  --name "AKS Namespace Developer" \
  --query [0].id -o tsv)

# Assign custom role
az role assignment create \
  --role "AKS Namespace Developer" \
  --assignee $DEVELOPER_GROUP_ID \
  --scope "$CLUSTER_ID/namespaces/development"

Terraform: Custom Role

# terraform/custom-roles.tf

# Subscription data source used for the role's scope
data "azurerm_subscription" "current" {}

resource "azurerm_role_definition" "namespace_developer" {
  name        = "AKS Namespace Developer"
  scope       = data.azurerm_subscription.current.id
  description = "Can manage deployments, services, and configmaps in assigned namespaces"

  permissions {
    actions     = []
    not_actions = []

    data_actions = [
      "Microsoft.ContainerService/managedClusters/apps/deployments/read",
      "Microsoft.ContainerService/managedClusters/apps/deployments/write",
      "Microsoft.ContainerService/managedClusters/apps/deployments/delete",
      "Microsoft.ContainerService/managedClusters/services/read",
      "Microsoft.ContainerService/managedClusters/services/write",
      "Microsoft.ContainerService/managedClusters/services/delete",
      "Microsoft.ContainerService/managedClusters/configmaps/read",
      "Microsoft.ContainerService/managedClusters/configmaps/write",
      "Microsoft.ContainerService/managedClusters/pods/read",
      "Microsoft.ContainerService/managedClusters/pods/log/read",
    ]

    not_data_actions = [
      "Microsoft.ContainerService/managedClusters/secrets/*",
    ]
  }

  assignable_scopes = [
    data.azurerm_subscription.current.id
  ]
}

Part 6: Troubleshooting

I’ve debugged these issues more times than I can count. Let me save you some pain.

Issue 1: “Error: You must be logged in to the server (Unauthorized)”

Cause: You’re either not authenticated with Azure AD, or your credentials expired.

Fix:

# Re-authenticate
az login

# Get fresh credentials
az aks get-credentials \
  --resource-group $RESOURCE_GROUP \
  --name $CLUSTER_NAME \
  --overwrite-existing

# Verify authentication
kubectl get nodes

Issue 2: “Error: pods is forbidden: User cannot list resource”

Cause: This one’s straightforward—you don’t have the right Azure RBAC permissions.

Diagnosis:

# Check your role assignments
az role assignment list \
  --assignee $(az ad signed-in-user show --query id -o tsv) \
  --scope $CLUSTER_ID \
  --output table

# Check what you can do
kubectl auth can-i --list

Fix: Assign appropriate role or add user to correct Azure AD group.
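
When reading the `kubectl auth can-i --list` output, a quick grep tells you whether a resource appears at all. The sample output below is illustrative, not captured from a real cluster:

```shell
# Illustrative `kubectl auth can-i --list` output (yours will differ)
CAN_I_LIST='Resources     Verbs
pods          [get list watch]
configmaps    [get list]'

# Does the current identity have any verbs on pods?
if printf '%s\n' "$CAN_I_LIST" | grep -q '^pods'; then
  POD_ACCESS="some access to pods"
else
  POD_ACCESS="no access to pods"
fi
echo "$POD_ACCESS"
```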

Issue 3: Role Propagation Delay

Symptom: You just assigned a role, but permissions aren’t working yet. Sound familiar?

Fix: Here’s the thing—Azure RBAC can take up to 5 minutes to propagate. I know it’s frustrating, but patience is key here.

# Clear kubectl cache
rm -rf ~/.kube/cache

# Wait 5 minutes, then test again
kubectl auth can-i get pods -n development
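
Instead of guessing when propagation finishes, you can poll. This is a generic sketch (the function name and arguments are my own); in practice you'd call it with the real kubectl command, e.g. `wait_for_permission 10 30 kubectl auth can-i get pods -n development`:

```shell
# Re-run a command until it prints "yes" or attempts run out.
wait_for_permission() {
  local attempts=$1 delay=$2 i
  shift 2
  for ((i = 1; i <= attempts; i++)); do
    if [ "$("$@" 2>/dev/null)" = "yes" ]; then
      echo "granted after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
  done
  echo "still denied after $attempts attempts"
  return 1
}
```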

Issue 4: Cluster Admin vs. Local Admin Confusion

Understanding:

This one trips up a lot of people. Here’s the key difference:

  • --admin flag bypasses Azure RBAC entirely (uses local cluster admin certificate)
  • Regular credentials use Azure AD + Azure RBAC

Best Practice:

# For normal operations (use Azure RBAC)
az aks get-credentials \
  --resource-group $RESOURCE_GROUP \
  --name $CLUSTER_NAME

# For emergency/troubleshooting only
az aks get-credentials \
  --resource-group $RESOURCE_GROUP \
  --name $CLUSTER_NAME \
  --admin
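
If the --admin escape hatch worries you, AKS can disable local accounts entirely. In Terraform that's a single argument on the cluster resource; treat this as a sketch and make sure you have a break-glass plan before enabling it:

```hcl
# Sketch: add to the azurerm_kubernetes_cluster resource from Part 1.
# With local accounts disabled, `az aks get-credentials --admin` stops working.
resource "azurerm_kubernetes_cluster" "aks" {
  # ... existing arguments from Part 1 ...
  local_account_disabled = true
}
```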

Debugging Tools

# View kubeconfig contexts
kubectl config get-contexts

# View current context details
kubectl config view --minify

# Check Azure AD token
kubectl get --raw /api/v1/namespaces/default/pods --v=9 | grep Authorization

# Validate role assignments
az role assignment list \
  --scope $CLUSTER_ID \
  --include-inherited \
  --output table

Part 7: Best Practices

Let me share some best practices I’ve learned the hard way.

1. Least Privilege Principle

This is security 101, but I still see people getting it wrong:

# Bad: Granting cluster admin to all developers
az role assignment create \
  --role "Azure Kubernetes Service RBAC Cluster Admin" \
  --assignee $DEVELOPER_GROUP_ID \
  --scope $CLUSTER_ID

# Good: Namespace-scoped writer role
az role assignment create \
  --role "Azure Kubernetes Service RBAC Writer" \
  --assignee $DEVELOPER_GROUP_ID \
  --scope "$CLUSTER_ID/namespaces/development"

2. Use Azure AD Groups, Not Individual Users

I can’t stress this enough—always use groups, never assign roles to individual users:

# Bad: Assigning roles to individual users
az role assignment create \
  --role "Azure Kubernetes Service RBAC Reader" \
  --assignee jane.developer@example.com \
  --scope $CLUSTER_ID

# Good: Assigning roles to groups
az role assignment create \
  --role "Azure Kubernetes Service RBAC Reader" \
  --assignee $VIEWER_GROUP_ID \
  --scope $CLUSTER_ID

3. Separate Production Access

# Create dedicated production admin group
resource "azuread_group" "prod_admins" {
  display_name     = "AKS-Production-Admins"
  security_enabled = true
}

# Require conditional access policy for production
# (configured in Azure AD portal)

4. Enable Audit Logging

# Enable diagnostic settings for AKS
az monitor diagnostic-settings create \
  --name aks-audit-logs \
  --resource $CLUSTER_ID \
  --logs '[
    {
      "category": "kube-audit",
      "enabled": true
    },
    {
      "category": "kube-audit-admin",
      "enabled": true
    }
  ]' \
  --workspace /subscriptions/{sub}/resourceGroups/{rg}/providers/Microsoft.OperationalInsights/workspaces/aks-logs

5. Regular Access Reviews

Don’t set it and forget it. I recommend quarterly access reviews at minimum:

# Script: review-rbac-assignments.sh
#!/bin/bash

echo "=== AKS RBAC Access Review ==="
echo "Date: $(date)"
echo ""

az role assignment list \
  --scope $CLUSTER_ID \
  --include-inherited \
  --query "[].{Principal:principalName, Role:roleDefinitionName, Scope:scope}" \
  --output table

Cleanup

# Delete role assignments
az role assignment delete \
  --role "Azure Kubernetes Service RBAC Cluster Admin" \
  --assignee $ADMIN_GROUP_ID \
  --scope $CLUSTER_ID

# Delete AKS cluster
az aks delete \
  --resource-group $RESOURCE_GROUP \
  --name $CLUSTER_NAME \
  --yes --no-wait

# Delete resource group (includes all resources)
az group delete \
  --name $RESOURCE_GROUP \
  --yes --no-wait

# Or with Terraform
terraform destroy -auto-approve

Conclusion

If you’ve followed along, you now have a solid foundation for Azure RBAC in AKS. This POC covered:

  • Centralized identity management via Azure AD
  • Granular permissions at cluster and namespace levels
  • Built-in and custom role definitions
  • Validation and troubleshooting techniques (the real-world stuff)
  • Production-ready best practices

Next Steps:

Don’t stop here. Once you’ve got the basics working, I recommend:

  • Implementing conditional access policies for production clusters
  • Integrating with Azure PIM for just-in-time access
  • Setting up automated RBAC reviews and alerts
  • Extending this RBAC model to multi-cluster environments

The setup might seem like a lot of work upfront, but trust me—having proper access control from day one is so much easier than trying to retrofit it later. I’ve seen both scenarios, and you don’t want to be in the second one.



About StriveNimbus: We specialize in Kubernetes security architecture, zero-trust implementations, and compliance automation for Azure environments. Our team helps organizations implement secure, scalable access control patterns. Contact us for security assessment and implementation support.