Reducing AKS Compute Costs by 25% with Karpenter and Node Auto Provisioning
How StriveNimbus helped a financial services client cut AKS compute costs by 25% by replacing static node pools with Karpenter-based dynamic provisioning and spot instances.
Executive Summary
A mid-sized financial services company came to us with a problem I see all the time: they were running critical applications on Azure Kubernetes Service (AKS) and hemorrhaging money due to over-provisioned static node pools. We implemented Karpenter with Azure’s Node Auto Provisioning (NAP) capabilities and helped them achieve a 25% reduction in monthly compute costs—that’s roughly $18,000/month in savings—while actually improving their scaling performance and resource utilization.
Key Outcomes:
- 25% reduction in AKS compute costs ($72K → $54K monthly)
- Average CPU utilization improved from 38% to 67%
- Node scaling time reduced from 5-8 minutes to 45-90 seconds
- Zero application disruption during migration
- Established foundation for spot instance adoption (additional 60-70% savings on eligible workloads)
Client Background
Industry: Financial Services
AKS Workload: Trading analytics platform, risk management systems, customer portals
Infrastructure Scale:
- 3 AKS clusters (dev, staging, production)
- 45-60 nodes across production
- ~800 pods during peak hours
- Mixed workload types (batch jobs, APIs, data processing)
Pain Points:
The classic over-provisioning trap:
- Static node pools sized for peak loads (running 24/7)
- Low average CPU utilization (35-40%)
- Scaling events taking 5-8 minutes (way too slow)
- Monthly AKS compute spend: $72,000
- Capacity planning required manual intervention and guesswork
Baseline Architecture
Initial Setup: Static Node Pools
When we first looked at their setup, it was textbook over-provisioning. Their architecture relied entirely on manually configured node pools:
# Terraform configuration - baseline setup
resource "azurerm_kubernetes_cluster" "prod" {
name = "prod-aks-cluster"
resource_group_name = azurerm_resource_group.aks.name
location = "eastus"
kubernetes_version = "1.27.7"
default_node_pool {
name = "system"
vm_size = "Standard_D4s_v5"
node_count = 3
enable_auto_scaling = true
min_count = 3
max_count = 5
}
}
# User node pools - over-provisioned for peak capacity
resource "azurerm_kubernetes_cluster_node_pool" "apps" {
name = "apps"
kubernetes_cluster_id = azurerm_kubernetes_cluster.prod.id
vm_size = "Standard_D8s_v5"
enable_auto_scaling = true
min_count = 12 # Provisioned for peak load
max_count = 20
node_labels = {
workload = "applications"
}
}
resource "azurerm_kubernetes_cluster_node_pool" "batch" {
name = "batch"
kubernetes_cluster_id = azurerm_kubernetes_cluster.prod.id
vm_size = "Standard_F16s_v2" # Compute-optimized
enable_auto_scaling = true
min_count = 8
max_count = 15
node_labels = {
workload = "batch-processing"
}
}
Observed Metrics (Baseline - 30-day average):
| Metric | Value |
|---|---|
| Total nodes (avg) | 48 |
| Total nodes (peak) | 62 |
| CPU utilization (avg) | 38% |
| Memory utilization (avg) | 42% |
| Monthly compute cost | $72,000 |
| Scale-up time (p95) | 7.2 minutes |
| Wasted capacity | ~$27,000/month |
Solution Architecture: Karpenter with Node Auto Provisioning
Karpenter Overview
If you haven’t worked with Karpenter before, think of it as a smarter cluster autoscaler. Instead of scaling pre-defined node pools, it provisions just-in-time compute resources based on what your pods actually need.
Key Capabilities:
- Pod-driven provisioning: Launches nodes based on pending pod requirements
- Bin-packing optimization: Efficiently places pods to minimize node count
- Diverse instance selection: Chooses optimal VM sizes from allowed types
- Consolidation: Automatically removes underutilized nodes
- Fast provisioning: Directly calls Azure APIs (bypasses node pool scaling)
What surprised me most when I first used Karpenter was how much faster it provisions nodes compared to traditional cluster autoscaling. We’re talking 45-90 seconds instead of 5-8 minutes.
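A quick way to watch this loop in practice is to keep an eye on unschedulable pods and the controller's own logs (the namespace and label below assume a standard Helm install; adjust them to match your setup):
# Pods stuck in Pending are what trigger Karpenter to provision capacity
kubectl get pods -A --field-selector=status.phase=Pending
# Tail the controller to see its provisioning and consolidation decisions
kubectl logs -n karpenter -l app.kubernetes.io/name=karpenter -f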
Architecture Diagram
graph TD
subgraph AKS["AKS Cluster"]
Pods["Pending Pods
(Unschedulable)"]
Karpenter["Karpenter Controller
(Watches scheduler)"]
NodePool["NodePool CRD
• VM families
• Spot/On-demand
• Constraints"]
Pods -->|"1. Scheduling fails"| Karpenter
Karpenter -->|"2. Evaluates
requirements"| NodePool
end
NodePool -->|"3. Selects optimal
instance type"| Azure["Azure APIs
• VM provisioning
• Spot allocation"]
Azure -->|"4. Provisions"| Nodes["Provisioned Nodes
• D4s_v5, D8s_v5
• F8s_v2, F16s_v2
• Spot instances"]
Nodes -.->|"5. Pods scheduled"| Pods
Karpenter -.->|"Consolidation"| Nodes
Karpenter -.->|"Deprovisioning"| Nodes
style AKS fill:#e1f5ff
style Azure fill:#fff3cd
style Nodes fill:#d4edda
style Pods fill:#f8d7da
Implementation Approach
Phase 1: Karpenter Installation (Week 1)
Install Karpenter via Helm
# Add Karpenter Helm repository
helm repo add karpenter https://charts.karpenter.sh
helm repo update
# Create karpenter namespace
kubectl create namespace karpenter
# Install Karpenter
helm install karpenter karpenter/karpenter \
--namespace karpenter \
--set controller.clusterName=prod-aks-cluster \
--set controller.clusterEndpoint=$(az aks show \
--resource-group production-rg \
--name prod-aks-cluster \
--query fqdn -o tsv) \
--set serviceAccount.create=true \
--version 0.31.1
Terraform Implementation
# Karpenter installation via Helm
resource "helm_release" "karpenter" {
name = "karpenter"
repository = "https://charts.karpenter.sh"
chart = "karpenter"
namespace = kubernetes_namespace.karpenter.metadata[0].name
version = "0.31.1"
set {
name = "controller.clusterName"
value = azurerm_kubernetes_cluster.prod.name
}
set {
name = "controller.clusterEndpoint"
value = azurerm_kubernetes_cluster.prod.fqdn
}
set {
name = "serviceAccount.annotations.azure\\.workload\\.identity/client-id"
value = azurerm_user_assigned_identity.karpenter.client_id
}
}
# Managed identity for Karpenter
resource "azurerm_user_assigned_identity" "karpenter" {
name = "karpenter-identity"
resource_group_name = azurerm_resource_group.aks.name
location = azurerm_resource_group.aks.location
}
# Grant permissions to manage VMs
resource "azurerm_role_assignment" "karpenter_vm_contributor" {
scope = azurerm_resource_group.aks.id
role_definition_name = "Virtual Machine Contributor"
principal_id = azurerm_user_assigned_identity.karpenter.principal_id
}
Phase 2: NodePool Configuration (Week 2)
We created three Karpenter NodePools, each tuned for a different workload type:
General Purpose NodePool
# karpenter-nodepool-general.yaml
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
name: general-purpose
spec:
# Template for provisioned nodes
template:
spec:
nodeClassRef:
name: default
requirements:
- key: kubernetes.io/arch
operator: In
values: ["amd64"]
- key: karpenter.sh/capacity-type
operator: In
values: ["on-demand"] # Start with on-demand for stability
- key: node.kubernetes.io/instance-type
operator: In
values:
- Standard_D4s_v5
- Standard_D8s_v5
- Standard_D16s_v5
taints: []
labels:
workload: general
# Limits
limits:
cpu: "200"
memory: 800Gi
# Consolidation settings
disruption:
consolidationPolicy: WhenUnderutilized
consolidateAfter: 30s
expireAfter: 720h # 30 days
# Weight for prioritization
weight: 10
Compute-Optimized NodePool (Batch Workloads)
# karpenter-nodepool-compute.yaml
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
name: compute-optimized
spec:
template:
spec:
nodeClassRef:
name: default
requirements:
- key: kubernetes.io/arch
operator: In
values: ["amd64"]
- key: karpenter.sh/capacity-type
operator: In
values: ["spot"] # Use spot for batch workloads
- key: node.kubernetes.io/instance-type
operator: In
values:
- Standard_F8s_v2
- Standard_F16s_v2
- Standard_F32s_v2
taints:
- key: workload
value: batch
effect: NoSchedule
labels:
workload: batch
limits:
cpu: "500"
disruption:
consolidationPolicy: WhenEmpty
consolidateAfter: 60s
expireAfter: 168h # 7 days for short-lived batch jobs
weight: 20
Memory-Optimized NodePool
# karpenter-nodepool-memory.yaml
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
name: memory-optimized
spec:
template:
spec:
nodeClassRef:
name: default
requirements:
- key: node.kubernetes.io/instance-type
operator: In
values:
- Standard_E4s_v5
- Standard_E8s_v5
- Standard_E16s_v5
- key: karpenter.sh/capacity-type
operator: In
values: ["on-demand"]
taints:
- key: workload
value: memory-intensive
effect: NoSchedule
labels:
workload: memory
limits:
memory: 1000Gi
disruption:
consolidationPolicy: WhenUnderutilized
consolidateAfter: 120s
weight: 15
NodeClass Configuration
# karpenter-nodeclass.yaml
apiVersion: karpenter.azure.com/v1alpha2
kind: AKSNodeClass
metadata:
name: default
spec:
imageFamily: Ubuntu2204 # OS image
osDiskSizeGB: 128
# Subnet for node provisioning
subnetID: /subscriptions/{sub-id}/resourceGroups/production-rg/providers/Microsoft.Network/virtualNetworks/aks-vnet/subnets/aks-nodes
# Tags for cost allocation
tags:
ManagedBy: Karpenter
Environment: Production
CostCenter: Engineering
Phase 3: Workload Migration (Weeks 3-4)
Here’s where patience pays off. We didn’t rush this—we migrated workloads gradually to validate Karpenter behavior and catch any issues early:
Step 1: Add Node Affinity to Deployments
# Example: Migrate batch processing workload
apiVersion: apps/v1
kind: Deployment
metadata:
name: batch-processor
spec:
replicas: 10
template:
spec:
# Tolerate Karpenter-provisioned nodes
tolerations:
- key: workload
operator: Equal
value: batch
effect: NoSchedule
# Prefer Karpenter nodes
affinity:
nodeAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
preference:
matchExpressions:
- key: karpenter.sh/nodepool
operator: In
values:
- compute-optimized
containers:
- name: processor
image: batch-processor:v2.1.0
resources:
requests:
cpu: "2"
memory: 4Gi
limits:
cpu: "4"
memory: 8Gi
Step 2: Cordon Static Node Pool
# Cordon the static node pool so new pods can no longer schedule onto it
kubectl cordon -l agentpool=batch
# Monitor pod migration to Karpenter nodes
watch kubectl get pods -o wide --field-selector=status.phase=Running
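Cordoning only stops new pods from landing on the old pool; to actively move the running pods we drained the cordoned nodes in small batches (the flags below are the usual safety options; respect your PodDisruptionBudgets and adjust as needed):
# Drain cordoned nodes so their pods reschedule onto Karpenter-provisioned capacity
kubectl drain -l agentpool=batch --ignore-daemonsets --delete-emptydir-data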
Step 3: Validate and Scale Down
# After successful migration, reduce static node pool min count
az aks nodepool update \
--resource-group production-rg \
--cluster-name prod-aks-cluster \
--name batch \
--update-cluster-autoscaler \
--min-count 0 \
--max-count 5
# Eventually delete static pool
az aks nodepool delete \
--resource-group production-rg \
--cluster-name prod-aks-cluster \
--name batch
Phase 4: Spot Instance Integration (Week 5)
Once we had Karpenter running smoothly, we took it a step further by enabling spot instances for fault-tolerant workloads:
# Update NodePool to allow spot
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
name: compute-optimized
spec:
template:
spec:
requirements:
- key: karpenter.sh/capacity-type
operator: In
values: ["spot", "on-demand"] # Allow both
Spot Instance Adoption Strategy:
# Add toleration for spot interruption
apiVersion: apps/v1
kind: Deployment
metadata:
name: batch-processor-spot
spec:
template:
spec:
tolerations:
- key: karpenter.sh/capacity-type
operator: Equal
value: spot
effect: NoSchedule
# Handle spot interruptions gracefully
terminationGracePeriodSeconds: 120
containers:
- name: processor
lifecycle:
preStop:
exec:
command: ["/bin/sh", "-c", "save-checkpoint.sh"]
Results and Impact
Cost Savings Breakdown
Let’s talk numbers. The results were better than we initially projected:
Before Karpenter:
- Base node pools: 48 nodes × $0.192/hour blended per-node rate = $9.22/hour
- Peak capacity: 62 nodes × $0.192/hour = $11.90/hour
- Average monthly cost: $72,000
After Karpenter (On-Demand):
- Average nodes: 35 (better bin-packing)
- Average hourly cost: $6.72/hour (smaller instance mix)
- Monthly cost: $54,000
- Savings: $18,000/month (25%)
With Spot Instances (35% of workload):
- Spot discount: ~70% on eligible workloads
- Additional savings: $8,500/month
- Total monthly cost: $45,500
- Total savings: $26,500/month (37%)
That’s real money that went straight back into the business.
Performance Metrics
| Metric | Before Karpenter | After Karpenter | Improvement |
|---|---|---|---|
| Avg CPU utilization | 38% | 67% | +76% efficiency |
| Avg memory utilization | 42% | 71% | +69% efficiency |
| Node count (avg) | 48 | 35 | -27% nodes |
| Scale-up time (p95) | 7.2 min | 1.5 min | 79% faster |
| Monthly cost | $72,000 | $54,000 | -25% cost |
| Wasted capacity | $27,000/month | $8,000/month | -70% waste |
Observability Improvements
We implemented comprehensive monitoring using Prometheus and Azure Monitor:
# Prometheus ServiceMonitor for Karpenter metrics
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: karpenter
namespace: karpenter
spec:
selector:
matchLabels:
app.kubernetes.io/name: karpenter
endpoints:
- port: http-metrics
interval: 30s
Key Metrics Tracked:
# Node provisioning latency
histogram_quantile(0.95,
rate(karpenter_provisioner_scheduling_duration_seconds_bucket[5m])
)
# Consolidation savings
sum(karpenter_deprovisioning_actions_performed_total)
# Pending pod count
sum(karpenter_provisioner_pending_pods_total)
# Cost per workload (custom metric)
sum by (workload) (
node_cpu_hourly_cost * on(node) group_left(workload)
kube_node_labels{label_workload!=""}
)
Azure Monitor Dashboard:
# Query node utilization metrics that feed the dashboard
az monitor metrics list \
--resource /subscriptions/{sub}/resourceGroups/production-rg/providers/Microsoft.ContainerService/managedClusters/prod-aks-cluster \
--metric "node_cpu_usage_percentage" "node_memory_usage_percentage"
Lessons Learned
Let me share some hard-won lessons from this implementation.
1. Gradual Migration is Critical
Learning: I can’t emphasize this enough—switching all workloads to Karpenter at once is asking for trouble.
Approach: We migrated 10% of workloads weekly, validated metrics, then moved on to the next batch. Slow and steady wins the race here.
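In practice each weekly batch got the same quick read-only checks before we moved on (the labels here match the v1beta1 NodePool API used above):
# Confirm new nodes carry the expected NodePool and capacity-type labels
kubectl get nodes -L karpenter.sh/nodepool -L karpenter.sh/capacity-type
# Spot-check utilization before migrating the next batch (requires metrics-server)
kubectl top nodes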
2. Right-Size Resource Requests
Challenge: Here’s something we discovered early on—over-requested CPU/memory completely defeated Karpenter’s bin-packing optimization. Garbage in, garbage out.
Solution: We analyzed actual resource usage via Vertical Pod Autoscaler (VPA) recommendations:
# Install VPA using the official install script from the kubernetes/autoscaler repo
git clone https://github.com/kubernetes/autoscaler.git
cd autoscaler/vertical-pod-autoscaler && ./hack/vpa-up.sh
# Create VPA for analysis
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  updatePolicy:
    updateMode: "Off"  # Recommendation only; VPA never evicts pods
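To read the recommendations back out, describing the VPA object (the app-vpa name from the manifest above) is enough:
# Inspect recommended CPU/memory requests produced by the recommender
kubectl describe vpa app-vpa
# Or pull just the recommendation block
kubectl get vpa app-vpa -o jsonpath='{.status.recommendation.containerRecommendations}'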
After the VPA analysis, we reduced requests by roughly 30% on average.
3. Spot Interruption Handling
Challenge: In our initial spot instance rollout, about 5% of batch jobs failed due to spot interruptions. Not ideal.
Solution: We implemented proper checkpointing and retry logic:
# Python batch job with checkpointing
import json
import signal
import sys

current_state = {}  # Updated by the main processing loop as work progresses

def save_checkpoint(state):
    with open('/checkpoint/state.json', 'w') as f:
        json.dump(state, f)

def signal_handler(sig, frame):
    print('Spot interruption detected, saving checkpoint...')
    save_checkpoint(current_state)
    sys.exit(0)

signal.signal(signal.SIGTERM, signal_handler)
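Checkpointing covers the save side; the retry side can be handled with plain Kubernetes Job semantics rather than anything Karpenter-specific. A sketch of that pattern (names like batch-checkpoints are illustrative):
# Run batch work as a Job so interrupted pods are simply re-run from the last checkpoint
apiVersion: batch/v1
kind: Job
metadata:
  name: batch-processor-job
spec:
  backoffLimit: 6              # Re-run pods killed by spot interruptions
  template:
    spec:
      restartPolicy: Never     # Let the Job controller create a fresh pod
      tolerations:
        - key: karpenter.sh/capacity-type
          operator: Equal
          value: spot
          effect: NoSchedule
      containers:
        - name: processor
          image: batch-processor:v2.1.0
          volumeMounts:
            - name: checkpoint
              mountPath: /checkpoint    # Same path the handler above writes to
      volumes:
        - name: checkpoint
          persistentVolumeClaim:
            claimName: batch-checkpoints   # Illustrative PVC holding checkpoint state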
4. NodePool Limits Prevent Runaway Costs
Learning: This one’s important—always set conservative limits on Karpenter NodePools. I’ve seen runaway scaling events that would give any CFO a heart attack.
Best Practice: Start with limits 20% above your peak observed capacity. You can always increase them later if needed.
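Concretely, this is just the limits block on each NodePool; the figures below are illustrative, not this client's exact values:
spec:
  limits:
    cpu: "600"       # ~20% above the peak vCPU this pool has ever needed
    memory: 2400Gi   # ~20% above peak observed memory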
5. Taints and Tolerations for Workload Isolation
Challenge: High-priority workloads competed for nodes with batch jobs.
Solution: Used taints/tolerations to isolate workload classes:
# Critical workload - dedicated nodes
tolerations:
- key: workload
operator: Equal
value: critical
effect: NoSchedule
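The toleration only takes effect because a matching taint exists on the node side; with Karpenter that means declaring it in the NodePool template, along the lines of this sketch (a dedicated "critical" pool, analogous to the batch pool shown earlier):
# Matching taint on the NodePool that provisions the dedicated nodes
spec:
  template:
    spec:
      taints:
        - key: workload
          value: critical
          effect: NoSchedule
      labels:
        workload: critical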
Future Optimization Roadmap
Q1 2025: Multi-Zone Provisioning
Distribute nodes across availability zones for resilience:
spec:
template:
spec:
requirements:
- key: topology.kubernetes.io/zone
operator: In
values: ["eastus-1", "eastus-2", "eastus-3"]
Q2 2025: GPU Node Provisioning
Extend Karpenter to ML workloads:
spec:
template:
spec:
requirements:
- key: node.kubernetes.io/instance-type
operator: In
values: ["Standard_NC6s_v3", "Standard_NC12s_v3"]
Q3 2025: FinOps Integration
Implement chargeback using Kubecost with Karpenter labels:
metadata:
labels:
cost-center: engineering
team: platform
project: trading-analytics
Conclusion
This project completely transformed how the client thinks about their AKS infrastructure. We reduced monthly compute spend by 25% while actually improving resource utilization and scaling performance—a rare win-win in cloud infrastructure.
The success really came down to a few key factors:
- Taking a gradual, phased migration approach (no big bang deployments)
- Comprehensive monitoring and validation at every step
- Right-sizing resource requests via VPA analysis (this was huge)
- Strategic use of spot instances for fault-tolerant workloads
- Proper workload isolation using taints/tolerations
What I’m most proud of is that the client now has a foundation that scales cost-effectively as their platform grows. They’re already talking about extending this to multi-zone provisioning and GPU workload support.
If you’re running static node pools and paying for capacity you don’t need, Karpenter is worth a serious look. The initial migration takes some work, but the ongoing savings and operational improvements make it worth every hour invested.
About StriveNimbus
StriveNimbus specializes in Kubernetes cost optimization, cloud-native architecture, and platform engineering for Azure environments. We help organizations reduce cloud spend while improving reliability and performance.
Ready to optimize your AKS costs? Contact us for a free cost assessment and optimization roadmap.
Technical Appendix
Karpenter vs. Cluster Autoscaler Comparison
| Feature | Cluster Autoscaler | Karpenter |
|---|---|---|
| Provisioning speed | 5-8 minutes | 45-90 seconds |
| Instance diversity | Pre-defined pools | Dynamic selection |
| Consolidation | Manual | Automatic |
| Spot integration | Limited | Native support |
| Bin-packing | Basic | Advanced |
Cost Calculation Methodology
# Monthly cost calculation script
def calculate_monthly_cost(node_count, vm_size, hours_per_month=730):
    # Illustrative Linux pay-as-you-go list rates (USD/hour); actual prices vary by region and agreement
    pricing = {
        'Standard_D4s_v5': 0.192,
        'Standard_D8s_v5': 0.384,
        'Standard_F8s_v2': 0.355,
        'Standard_F16s_v2': 0.710,
    }
    hourly_cost = node_count * pricing.get(vm_size, 0)
    return hourly_cost * hours_per_month

# Before Karpenter
baseline_cost = calculate_monthly_cost(48, 'Standard_D8s_v5')
print(f"Baseline: ${baseline_cost:,.2f}")

# After Karpenter
optimized_cost = calculate_monthly_cost(35, 'Standard_D8s_v5') * 0.85  # Mixed instance sizes
print(f"Optimized: ${optimized_cost:,.2f}")
print(f"Savings: ${baseline_cost - optimized_cost:,.2f}")