How to use Cluster API to programmatically scale and upgrade Kubernetes clusters

Cluster API is an open source tool for programmatically configuring, provisioning, and upgrading one or more Kubernetes clusters. That might sound simple, on its face, but Cluster API unlocks extraordinary power and flexibility for tasks like automating cluster management or provisioning and updating clusters at scale.
In our previous tutorial, we discussed why you might use Cluster API and how to deploy a simple cluster on AWS. In this installment, we’ll dig into how Cluster API can help you upgrade and scale multiple clusters at once.
This tutorial assumes that you’re comfortable with the fundamentals. If you need to brush up or familiarize yourself with Cluster API basics, read “How to Use Cluster API to Programmatically Configure and Deploy Kubernetes Clusters” first.
Why use Cluster API?
Kubernetes is a resource-aware and extensible system, and those two facts lie at the heart of its power and promise.
A resource-aware system spanning many different machines can determine whether it is over- or under-provisioned for the workloads assigned to it. Kubernetes can autoscale to meet demand out of the box, adding or removing nodes as needed.
But we don’t have to stop there. The system’s API-driven extensibility means that tools like Cluster API can interface at the cluster level and manage scaling configurations across many clusters at once—just as it can handle tasks like upgrades en masse.
All of this makes it possible to easily optimize utilization—minimizing cost and energy consumption across even a sprawling multi-cluster system.
Initializing the management cluster for Azure
This tutorial will use a local management cluster, and we will assume you have the clusterctl command-line tool installed. (If you need to install clusterctl, you can review the previous walkthrough.)
Last time we built our worker cluster on AWS. In the interest of diversifying our experience, this time we will deploy a cluster to Microsoft Azure—all ultimately using the same abstraction layer of Cluster API.
To work with Azure, you will need two prerequisites:
The Azure CLI (az) installed locally
An Azure account with the following resource providers registered:
Microsoft.Compute
Microsoft.Network
Microsoft.ContainerService
Microsoft.ManagedIdentity
Microsoft.Authorization
Note: Following this walkthrough will incur expenses on Azure.
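If any of these resource providers isn’t registered on your subscription yet, you can register it with the Azure CLI once you’ve logged in (az login, shown below). For example:
% az provider register --namespace Microsoft.Compute
% az provider register --namespace Microsoft.Network
% az provider register --namespace Microsoft.ContainerService
% az provider register --namespace Microsoft.ManagedIdentity
% az provider register --namespace Microsoft.Authorization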
First let’s set some environment variables defining very basic specifications for the worker cluster we’ll be creating:
% export CLUSTER_NAME="azure-worker"
% export WORKER_MACHINE_COUNT=3
% export KUBERNETES_VERSION="v1.24.6"
Now we need to create a new Azure Service Principal. We’ll use the Azure CLI tool to grab the subscription ID for the subscription we want to use and export it to an environment variable:
% az login
% az account list -o table
% az account set -s <SubscriptionId>
% export AZURE_SUBSCRIPTION_ID="<SubscriptionId>"
Next we’ll define environment variables for our desired region (one where our subscription has quota) and a resource group to be dedicated to our new cluster:
% export AZURE_LOCATION="eastus"
% export AZURE_RESOURCE_GROUP="${CLUSTER_NAME}"
Now we’ll actually create our Service Principal:
% az ad sp create-for-rbac --role contributor --scopes="/subscriptions/${AZURE_SUBSCRIPTION_ID}" --sdk-auth > sp.json
Next we’ll export several variables from the JSON file we created in the line above. (You may need to install the jq JSON-parsing command line tool at this point, if it’s not installed already.)
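If jq isn’t available on your machine, it’s packaged for most platforms; for example:
% brew install jq           # macOS (Homebrew)
% sudo apt-get install jq   # Debian/Ubuntu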
% export AZURE_SUBSCRIPTION_ID="$(cat sp.json | jq -r .subscriptionId | tr -d '\n')"
% export AZURE_CLIENT_SECRET="$(cat sp.json | jq -r .clientSecret | tr -d '\n')"
% export AZURE_CLIENT_ID="$(cat sp.json | jq -r .clientId | tr -d '\n')"
% export AZURE_TENANT_ID="$(cat sp.json | jq -r .tenantId | tr -d '\n')"
Now we’ll base-64 encode our credentials ...
% export AZURE_SUBSCRIPTION_ID_B64="$(echo -n "$AZURE_SUBSCRIPTION_ID" | base64 | tr -d '\n')"
% export AZURE_TENANT_ID_B64="$(echo -n "$AZURE_TENANT_ID" | base64 | tr -d '\n')"
% export AZURE_CLIENT_ID_B64="$(echo -n "$AZURE_CLIENT_ID" | base64 | tr -d '\n')"
% export AZURE_CLIENT_SECRET_B64="$(echo -n "$AZURE_CLIENT_SECRET" | base64 | tr -d '\n')"
... and then configure critical details like our machine types:
% export AZURE_CONTROL_PLANE_MACHINE_TYPE="Standard_D2s_v3"
% export AZURE_NODE_MACHINE_TYPE="Standard_D2s_v3"
% export AZURE_CLUSTER_IDENTITY_SECRET_NAME="cluster-identity-secret"
% export AZURE_CLUSTER_IDENTITY_SECRET_NAMESPACE="default"
% export CLUSTER_IDENTITY_NAME="cluster-identity"
On our local management cluster, we will create a secret to manage identity:
% kubectl create secret generic "${AZURE_CLUSTER_IDENTITY_SECRET_NAME}" --from-literal=clientSecret="${AZURE_CLIENT_SECRET}"
Finally, we’re ready to initialize the provider on our local management cluster, then create the new worker cluster on Azure:
% clusterctl init --infrastructure azure
% clusterctl generate cluster ${CLUSTER_NAME} --kubernetes-version ${KUBERNETES_VERSION} > cluster.yaml
% kubectl apply -f cluster.yaml
You can check the status of your new cluster with:
% kubectl get cluster-api -o wide
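If you prefer a tree view of the cluster and its conditions, clusterctl can describe the same resources from the management cluster; for example:
% clusterctl describe cluster ${CLUSTER_NAME}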
Our resources won’t reach a READY state until we set up a Container Network Interface (CNI). We can accomplish this by using clusterctl to download our new worker cluster’s kubeconfig…
% clusterctl get kubeconfig azure-worker > azure-worker.kubeconfig
…and then applying a Calico CNI implementation.
% kubectl --kubeconfig=./azure-worker.kubeconfig \
apply -f https://raw.githubusercontent.com/kubernetes-sigs/cluster-api-provider-azure/main/templates/addons/calico.yaml
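If you’d like to watch the worker nodes register and become Ready as Calico comes up, you can point kubectl at the downloaded kubeconfig; for example:
% kubectl --kubeconfig=./azure-worker.kubeconfig get nodes -w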
Once the Azure worker cluster is ready, you can proceed to the next step.
Understanding and using machine resources
As we saw in the previous tutorial, Cluster API treats Kubernetes clusters and their constituent machines as Kubernetes resources like any other. Just as we have resource types for pods and services, we now have a resource type for clusters.
We also have several resource types for managing the machines that underlie nodes, including:
Machine
MachineSet
MachineDeployment
You can think of these resources as existing in tiers of ownership: Machines belong to MachineSets, which in turn belong to MachineDeployments.
Machine is an abstraction for the infrastructure underlying a Kubernetes node, whether that’s a VM, bare metal, or what have you. In terms of usage, you can think of this like a Pod: it is in some sense the atomic unit of Cluster API, but it’s not a resource you would prefer to manage directly.
In the same way that a group of identical Pods may be managed by a ReplicaSet, a group of like Machines may be managed by a MachineSet. MachineSets also are not meant to be managed directly; instead, they serve as a resource manipulated by MachineDeployments.
MachineDeployments are aptly named, because they are to Machines as Deployments are to Pods: the managing abstraction for infrastructure machines. Specifications via Cluster API are intended to function as much as possible by standard Kubernetes principles: declaratively and immutably. So when we make an update to the specification—upgrading the Kubernetes version on the machines, for example—the MachineDeployment controller handles the process of gracefully replacing existing machines under its purview with new machines that meet the spec.
This is important to note: from the standpoint of Cluster API, machines metaphorically contravene the first law of thermodynamics: they are only created or destroyed, and never change form.
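To see all three tiers side by side on the management cluster, you can ask kubectl for them in a single call; for example:
% kubectl get machinedeployments,machinesets,machines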
Scaling via a MachineDeployment
Now let’s create a machine. If we run…
% kubectl get machines
…we should see that we have four machines at the moment, since we have a control plane machine and the three worker machines we specified in our initial setup.
NAME                                CLUSTER       NODENAME                          PHASE     AGE    VERSION
azure-worker-control-plane-jd1d6    azure-worker  azure-worker-control-plane-wmt26  Running   50m    v1.24.6
azure-worker-md-0-598f9c756b-5dt68  azure-worker  azure-worker-md-0-mjf4x           Running   49m    v1.24.6
azure-worker-md-0-598f9c756b-j6rrg  azure-worker  azure-worker-md-0-9mzvl           Running   49m    v1.24.6
azure-worker-md-0-598f9c756b-w2pmm  azure-worker  azure-worker-md-0-xkk2v           Running   49m    v1.24.6
Our worker machine names handily indicate that they belong to the azure-worker-md-0 MachineDeployment. So we can scale this imperatively with kubectl using a flag…
% kubectl scale machinedeployment azure-worker-md-0 --replicas=4
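If you prefer to avoid the imperative scale subcommand, a merge patch against the same MachineDeployment does the same job; for example:
% kubectl patch machinedeployment azure-worker-md-0 --type merge -p '{"spec":{"replicas":4}}'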
Now we can run kubectl get machines again…
NAME                                CLUSTER       NODENAME                          PHASE     AGE    VERSION
azure-worker-control-plane-jd1d6    azure-worker  azure-worker-control-plane-wmt26  Running   60m    v1.24.6
azure-worker-md-0-598f9c756b-5dt68  azure-worker  azure-worker-md-0-mjf4x           Running   59m    v1.24.6
azure-worker-md-0-598f9c756b-j6rrg  azure-worker  azure-worker-md-0-9mzvl           Running   59m    v1.24.6
azure-worker-md-0-598f9c756b-w2pmm  azure-worker  azure-worker-md-0-xkk2v           Running   59m    v1.24.6
azure-worker-md-0-598f9c756b-mxt59  azure-worker  azure-worker-md-0-j9fds           Running   10m    v1.24.6
We have a new worker machine. This is, of course, a little quick-and-dirty. So for our next step, we’ll go about things more properly and declaratively, defining our specification in a manifest. Before we do, try scaling back down to three machines.
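The quickest way to scale back down is to reuse the same imperative command with a lower replica count:
% kubectl scale machinedeployment azure-worker-md-0 --replicas=3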
Upgrading via MachineDeployment
Let’s consider a slightly more challenging operation: upgrading the Kubernetes version across a worker cluster.
I’ve said that from Cluster API’s perspective, our machines are immutable. So what we’re really doing here is creating new machines with the desired Kubernetes version, gracefully shifting responsibility from the old machines to the new ones, and finally deleting the originals.
Here we’re making a more consequential change, so let’s be good and do things declaratively. We’ll need to update the control plane machine first, followed by the worker machines, and the process will look similar for each.
First, we’ll download the current specification for the control plane resource with kubectl:
% kubectl get kubeadmcontrolplane azure-worker-control-plane -o yaml > control-plane.yaml
Make a copy of the file called control-plane-update.yaml and edit it so that the spec field looks like the example below. The primary changes you’ll make to the manifest are:
Deleting all of the status fields
Changing what is now the final line, spec.version, to v1.24.10
spec:
  kubeadmConfigSpec:
    clusterConfiguration:
      apiServer:
        extraArgs:
          cloud-config: /etc/kubernetes/azure.json
          cloud-provider: azure
        extraVolumes:
        - hostPath: /etc/kubernetes/azure.json
          mountPath: /etc/kubernetes/azure.json
          name: cloud-config
          readOnly: true
        timeoutForControlPlane: 20m0s
      controllerManager:
        extraArgs:
          allocate-node-cidrs: "false"
          cloud-config: /etc/kubernetes/azure.json
          cloud-provider: azure
          cluster-name: azure-worker
        extraVolumes:
        - hostPath: /etc/kubernetes/azure.json
          mountPath: /etc/kubernetes/azure.json
          name: cloud-config
          readOnly: true
      dns: {}
      etcd:
        local:
          dataDir: /var/lib/etcddisk/etcd
          extraArgs:
            quota-backend-bytes: "8589934592"
      networking: {}
      scheduler: {}
    diskSetup:
      filesystems:
      - device: /dev/disk/azure/scsi1/lun0
        extraOpts:
        - -E
        - lazy_itable_init=1,lazy_journal_init=1
        filesystem: ext4
        label: etcd_disk
      - device: ephemeral0.1
        filesystem: ext4
        label: ephemeral0
        replaceFS: ntfs
      partitions:
      - device: /dev/disk/azure/scsi1/lun0
        layout: true
        overwrite: false
        tableType: gpt
    files:
    - contentFrom:
        secret:
          key: control-plane-azure.json
          name: azure-worker-control-plane-azure-json
      owner: root:root
      path: /etc/kubernetes/azure.json
      permissions: "0644"
    format: cloud-config
    initConfiguration:
      localAPIEndpoint: {}
      nodeRegistration:
        kubeletExtraArgs:
          azure-container-registry-config: /etc/kubernetes/azure.json
          cloud-config: /etc/kubernetes/azure.json
          cloud-provider: azure
        name: '{{ ds.meta_data["local_hostname"] }}'
    joinConfiguration:
      discovery: {}
      nodeRegistration:
        kubeletExtraArgs:
          azure-container-registry-config: /etc/kubernetes/azure.json
          cloud-config: /etc/kubernetes/azure.json
          cloud-provider: azure
        name: '{{ ds.meta_data["local_hostname"] }}'
    mounts:
    - - LABEL=etcd_disk
      - /var/lib/etcddisk
  machineTemplate:
    infrastructureRef:
      apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
      kind: AzureMachineTemplate
      name: azure-worker-control-plane
      namespace: default
    metadata: {}
  replicas: 1
  rolloutStrategy:
    rollingUpdate:
      maxSurge: 1
    type: RollingUpdate
  version: v1.24.10
Now we’ll apply the manifest…
% kubectl apply -f control-plane-update.yaml
The system will begin to create a new replica of the control plane machine, then transfer responsibility to that replica once it is ready, and finally delete the old machine. The process will take a few minutes, and you can observe it with:
% kubectl get kubeadmcontrolplane azure-worker-control-plane
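If you’d rather follow the rollout live, you can add kubectl’s watch flag to the same command:
% kubectl get kubeadmcontrolplane azure-worker-control-plane -w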
At first, the output will show two replicas with one unavailable. Eventually, you will see one replica with zero unavailable—running on v1.24.10.
NAME                        CLUSTER       INITIALIZED  API SERVER AVAILABLE  REPLICAS  READY  UPDATED  UNAVAILABLE  AGE  VERSION
azure-worker-control-plane  azure-worker  true         true                  1         1      1        0            35m  v1.24.10
Now we’re going to do the same thing with the worker machines via a MachineDeployment. Download a manifest for that resource:
% kubectl get machinedeployment azure-worker-md-0 -o yaml > md-0.yaml
Make a copy of the YAML file called md-0-update.yaml; we’ll edit the new copy. In addition to the version specification, note the spec.strategy.type field, which takes one of two options: RollingUpdate or OnDelete.
RollingUpdate will facilitate the behavior I described above—after application of the manifest, the system will create new machines and begin a graceful transition before deleting the original machines.
By contrast, OnDelete will wait for an operator (whether a human being or a software agent) to delete the original machine before provisioning a replacement.
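With OnDelete, in other words, each replacement is triggered by deleting the corresponding Machine yourself. We won’t do that here, but using one of the machine names from the earlier output, such a deletion might look like:
% kubectl delete machine azure-worker-md-0-598f9c756b-5dt68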
We’ll use RollingUpdate and specify Kubernetes version 1.24.10. Update the spec to look like so by…
Deleting all of the status fields
Changing what is now the final line, spec.version, to v1.24.10
spec:
  clusterName: azure-worker
  minReadySeconds: 0
  progressDeadlineSeconds: 600
  replicas: 3
  revisionHistoryLimit: 1
  selector:
    matchLabels:
      cluster.x-k8s.io/cluster-name: azure-worker
      cluster.x-k8s.io/deployment-name: azure-worker-md-0
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
    type: RollingUpdate
  template:
    metadata:
      labels:
        cluster.x-k8s.io/cluster-name: azure-worker
        cluster.x-k8s.io/deployment-name: azure-worker-md-0
    spec:
      bootstrap:
        configRef:
          apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
          kind: KubeadmConfigTemplate
          name: azure-worker-md-0
      clusterName: azure-worker
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: AzureMachineTemplate
        name: azure-worker-md-0
      version: v1.24.10
Now we can apply the manifest to propagate our changes to the worker cluster:
% kubectl apply -f md-0-update.yaml
If you run kubectl get machines, you’ll see the system beginning to provision new machines running the new version of Kubernetes:
NAME                                CLUSTER       NODENAME                          PHASE         AGE    VERSION
azure-worker-control-plane-6btkg    azure-worker  azure-worker-control-plane-wmt26  Running       29m    v1.24.10
azure-worker-md-0-598f9c756b-5dt68  azure-worker  azure-worker-md-0-mjf4x           Running       59m    v1.24.6
azure-worker-md-0-598f9c756b-j6rrg  azure-worker  azure-worker-md-0-9mzvl           Running       59m    v1.24.6
azure-worker-md-0-598f9c756b-w2pmm  azure-worker  azure-worker-md-0-xkk2v           Running       59m    v1.24.6
azure-worker-md-0-6c64c9db4c-sc5vf  azure-worker                                    Provisioning  71s    v1.24.10
After a couple of minutes, we’ll see old worker machines starting to be replaced by the newly provisioned ones:
NAME                                CLUSTER       NODENAME                          PHASE         AGE    VERSION
azure-worker-control-plane-6btkg    azure-worker  azure-worker-control-plane-wmt26  Running       32m    v1.24.10
azure-worker-md-0-598f9c756b-j6rrg  azure-worker  azure-worker-md-0-9mzvl           Running       61m    v1.24.6
azure-worker-md-0-598f9c756b-w2pmm  azure-worker  azure-worker-md-0-xkk2v           Running       61m    v1.24.6
azure-worker-md-0-6c64c9db4c-65999  azure-worker                                    Provisioning  9s     v1.24.10
azure-worker-md-0-6c64c9db4c-sc5vf  azure-worker  azure-worker-md-0-fnkcn           Running       3m36s  v1.24.10
Above, we see that machine sc5vf, now running v1.24.10, is up and has replaced machine 5dt68. Another machine running the upgraded version has started provisioning. In a few minutes more…
NAME                                CLUSTER       NODENAME                          PHASE     AGE    VERSION
azure-worker-control-plane-6btkg    azure-worker  azure-worker-control-plane-wmt26  Running   48m    v1.24.10
azure-worker-md-0-6c64c9db4c-65999  azure-worker  azure-worker-md-0-lndqb           Running   16m    v1.24.10
azure-worker-md-0-6c64c9db4c-b975n  azure-worker  azure-worker-md-0-vb44l           Running   13m    v1.24.10
azure-worker-md-0-6c64c9db4c-sc5vf  azure-worker  azure-worker-md-0-fnkcn           Running   19m    v1.24.10
Our cluster is completely upgraded!
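If you want to confirm the upgrade from the worker cluster’s own point of view, you can list its nodes with the kubeconfig we downloaded earlier; each node should report the new version:
% kubectl --kubeconfig=./azure-worker.kubeconfig get nodes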
When you’re done playing around with your fancy new cluster, you can clean up with:
% kubectl delete cluster azure-worker
% kubectl delete azureclusteridentity cluster-identity
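Deleting the cluster should remove the Azure infrastructure that Cluster API provisioned, but it won’t touch the Service Principal we created at the start. If you no longer need it, you can remove it with the Azure CLI; for example:
% az ad sp delete --id "${AZURE_CLIENT_ID}"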
Conclusion and next steps
So far, so good! We’ve learned…
How Cluster API works
How to deploy clusters to two different infrastructure providers with Cluster API
How to perform basic tasks like scaling and upgrading machines on clusters managed by Cluster API
In the final tutorial of this series, we’ll explore how to unlock the real power of Cluster API by writing a simple application that uses the API to intelligently manage infrastructure across multiple providers.