How to use Cluster API to programmatically scale and upgrade Kubernetes clusters

Cluster API is an open source tool for programmatically configuring, provisioning, and upgrading one or more Kubernetes clusters. That might sound simple, on its face, but Cluster API unlocks extraordinary power and flexibility for tasks like automating cluster management or provisioning and updating clusters at scale.
In our previous tutorial, we discussed why you might use Cluster API and how to deploy a simple cluster on AWS. In this installment, we’ll dig into how Cluster API can help you upgrade and scale multiple clusters at once.
This tutorial assumes that you’re comfortable with the fundamentals. If you need to brush up or familiarize yourself with Cluster API basics, read “How to Use Cluster API to Programmatically Configure and Deploy Kubernetes Clusters” first.
Why use Cluster API?
Kubernetes is a resource-aware and extensible system, and those two facts lie at the heart of its power and promise.
A resource-aware system spanning many different machines can determine whether it is over- or under-provisioned for the workloads assigned to it. Kubernetes can autoscale to meet demand out of the box, adding or removing nodes as needed.
But we don’t have to stop there. The system’s API-driven extensibility means that tools like Cluster API can interface at the cluster level and manage scaling configurations across many clusters at once—just as it can handle tasks like upgrades en masse.
All of this makes it possible to easily optimize utilization—minimizing cost and energy consumption across even a sprawling multi-cluster system.
Initializing the management cluster for Azure
This tutorial will use a local management cluster, and we will assume you have the clusterctl command-line tool installed. (If you need to install clusterctl, you can review the previous walkthrough.)
Last time we built our worker cluster on AWS. In the interest of diversifying our experience, this time we will deploy a cluster to Microsoft Azure—all ultimately using the same abstraction layer of Cluster API.
To work with Azure, you will need two prerequisites:
The Azure CLI (az) installed locally
An Azure account with the following resource providers registered:
Microsoft.Compute
Microsoft.Network
Microsoft.ContainerService
Microsoft.ManagedIdentity
Microsoft.Authorization
Note: Following this walkthrough will incur expenses on Azure.
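If any of these resource providers isn’t registered on your subscription yet, you can register it with the Azure CLI once you’ve logged in (az login, shown below). For example:
% az provider register --namespace Microsoft.Compute
% az provider register --namespace Microsoft.Network
% az provider register --namespace Microsoft.ContainerService
% az provider register --namespace Microsoft.ManagedIdentity
% az provider register --namespace Microsoft.Authorization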
First let’s set some environment variables defining very basic specifications for the worker cluster we’ll be creating:
% export CLUSTER_NAME="azure-worker"
% export WORKER_MACHINE_COUNT=3
% export KUBERNETES_VERSION="v1.24.6"
Now we need to create a new Azure Service Principal. We’ll use the Azure CLI tool to grab the subscription ID for the subscription we want to use and export it to an environment variable:
% az login
% az account list -o table
% az account set -s <SubscriptionId>
% export AZURE_SUBSCRIPTION_ID="<SubscriptionId>"
Next we’ll define environment variables for our desired region (one where our subscription has quota) and a resource group to be dedicated to our new cluster:
% export AZURE_LOCATION="eastus"
% export AZURE_RESOURCE_GROUP="${CLUSTER_NAME}"
Now we’ll actually create our Service Principal:
% az ad sp create-for-rbac --role contributor --scopes="/subscriptions/${AZURE_SUBSCRIPTION_ID}" --sdk-auth > sp.json
Next we’ll export several variables from the JSON file we created in the line above. (You may need to install the jq JSON-parsing command line tool at this point, if it’s not installed already.)
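If jq isn’t available on your machine, it’s packaged for most platforms; for example:
% brew install jq           # macOS (Homebrew)
% sudo apt-get install jq   # Debian/Ubuntu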
% export AZURE_SUBSCRIPTION_ID="$(cat sp.json | jq -r .subscriptionId | tr -d '\n')"
% export AZURE_CLIENT_SECRET="$(cat sp.json | jq -r .clientSecret | tr -d '\n')"
% export AZURE_CLIENT_ID="$(cat sp.json | jq -r .clientId | tr -d '\n')"
% export AZURE_TENANT_ID="$(cat sp.json | jq -r .tenantId | tr -d '\n')"
Now we’ll base-64 encode our credentials ...
% export AZURE_SUBSCRIPTION_ID_B64="$(echo -n "$AZURE_SUBSCRIPTION_ID" | base64 | tr -d '\n')"
% export AZURE_TENANT_ID_B64="$(echo -n "$AZURE_TENANT_ID" | base64 | tr -d '\n')"
% export AZURE_CLIENT_ID_B64="$(echo -n "$AZURE_CLIENT_ID" | base64 | tr -d '\n')"
% export AZURE_CLIENT_SECRET_B64="$(echo -n "$AZURE_CLIENT_SECRET" | base64 | tr -d '\n')"
... and then configure critical details like our machine types:
% export AZURE_CONTROL_PLANE_MACHINE_TYPE="Standard_D2s_v3"
% export AZURE_NODE_MACHINE_TYPE="Standard_D2s_v3"
% export AZURE_CLUSTER_IDENTITY_SECRET_NAME="cluster-identity-secret"
% export AZURE_CLUSTER_IDENTITY_SECRET_NAMESPACE="default"
% export CLUSTER_IDENTITY_NAME="cluster-identity"
On our local management cluster, we will create a secret to manage identity:
% kubectl create secret generic "${AZURE_CLUSTER_IDENTITY_SECRET_NAME}" --from-literal=clientSecret="${AZURE_CLIENT_SECRET}"
Finally, we’re ready to initialize the provider on our local management cluster, then create the new worker cluster on Azure:
% clusterctl init --infrastructure azure
% clusterctl generate cluster ${CLUSTER_NAME} --kubernetes-version ${KUBERNETES_VERSION} > cluster.yaml
% kubectl apply -f cluster.yaml
You can check the status of your new cluster with:
% kubectl get cluster-api -o wide
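If you prefer a tree view of the cluster and its conditions, clusterctl can describe the same resources from the management cluster; for example:
% clusterctl describe cluster ${CLUSTER_NAME}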
Our resources won’t reach a READY state until we set up a Container Network Interface (CNI). We can accomplish this by using clusterctl to download our new worker cluster’s kubeconfig…
% clusterctl get kubeconfig azure-worker > azure-worker.kubeconfig
…and then applying a Calico CNI implementation.
% kubectl --kubeconfig=./azure-worker.kubeconfig \
apply -f https://raw.githubusercontent.com/kubernetes-sigs/cluster-api-provider-azure/main/templates/addons/calico.yaml
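If you’d like to watch the worker nodes register and become Ready as Calico comes up, you can point kubectl at the downloaded kubeconfig; for example:
% kubectl --kubeconfig=./azure-worker.kubeconfig get nodes -w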
Once the Azure worker cluster is ready, you can proceed to the next step.
Understanding and using machine resources
As we saw in the previous tutorial, Cluster API treats Kubernetes clusters and their constituent machines as Kubernetes resources like any other. Just as we have resource types for pods and services, we now have a resource type for clusters.
We also have several resource types for managing the machines that underlie nodes, including:
Machine
MachineSet
MachineDeployment
You can think of these resources as existing in tiers of ownership: Machines belong to MachineSets, which in turn belong to MachineDeployments.
Machine is an abstraction for the infrastructure underlying a Kubernetes node, whether that’s a VM, bare metal, or what have you. In terms of usage, you can think of this like a Pod: it is in some sense the atomic unit of Cluster API, but it’s not a resource you would prefer to manage directly.
In the same way that a group of identical Pods may be managed by a ReplicaSet, a group of like Machines may be managed by a MachineSet. MachineSets also are not meant to be managed directly; instead, they serve as a resource manipulated by MachineDeployments.
MachineDeployments are aptly named, because they are to Machines as Deployments are to Pods: the managing abstraction for infrastructure machines. Specifications via Cluster API are intended to function as much as possible by standard Kubernetes principles: declaratively and immutably. So when we make an update to the specification—upgrading the Kubernetes version on the machines, for example—the MachineDeployment controller handles the process of gracefully replacing existing machines under its purview with new machines that meet the spec.
This is important to note: from the standpoint of Cluster API, machines metaphorically contravene the first law of thermodynamics: they are only created or destroyed, and never change form.
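To see all three tiers side by side on the management cluster, you can ask kubectl for them in a single call; for example:
% kubectl get machinedeployments,machinesets,machines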
Scaling via a MachineDeployment
Now let’s create a machine. If we run…
% kubectl get machines
…we should see that we have four machines at the moment, since we have a control plane machine and the three worker machines we specified in our initial setup.
NAME                                CLUSTER       NODENAME                          PHASE     AGE    VERSION
azure-worker-control-plane-jd1d6    azure-worker  azure-worker-control-plane-wmt26  Running   50m    v1.24.6
azure-worker-md-0-598f9c756b-5dt68  azure-worker  azure-worker-md-0-mjf4x           Running   49m    v1.24.6
azure-worker-md-0-598f9c756b-j6rrg  azure-worker  azure-worker-md-0-9mzvl           Running   49m    v1.24.6
azure-worker-md-0-598f9c756b-w2pmm  azure-worker  azure-worker-md-0-xkk2v           Running   49m    v1.24.6
Our worker machine names handily indicate that they belong to the azure-worker-md-0 MachineDeployment. So we can scale this imperatively with kubectl using a flag…
% kubectl scale machinedeployment azure-worker-md-0 --replicas=4
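If you prefer to avoid the imperative scale subcommand, a merge patch against the same MachineDeployment does the same job; for example:
% kubectl patch machinedeployment azure-worker-md-0 --type merge -p '{"spec":{"replicas":4}}'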
Now we can run kubectl get machines again…
NAME                                CLUSTER       NODENAME                          PHASE     AGE    VERSION
azure-worker-control-plane-jd1d6    azure-worker  azure-worker-control-plane-wmt26  Running   60m    v1.24.6
azure-worker-md-0-598f9c756b-5dt68  azure-worker  azure-worker-md-0-mjf4x           Running   59m    v1.24.6
azure-worker-md-0-598f9c756b-j6rrg  azure-worker  azure-worker-md-0-9mzvl           Running   59m    v1.24.6
azure-worker-md-0-598f9c756b-w2pmm  azure-worker  azure-worker-md-0-xkk2v           Running   59m    v1.24.6
azure-worker-md-0-598f9c756b-mxt59  azure-worker  azure-worker-md-0-j9fds           Running   10m    v1.24.6
We have a new worker machine. This is, of course, a little quick-and-dirty. So for our next step, we’ll go about things more properly and declaratively, defining our specification in a manifest. Before we do, try scaling back down to three machines.
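The quickest way to scale back down is to reuse the same imperative command with a lower replica count:
% kubectl scale machinedeployment azure-worker-md-0 --replicas=3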
Upgrading via MachineDeployment
Let’s consider a slightly more challenging operation: upgrading the Kubernetes version across a worker cluster.
I’ve said that from Cluster API’s perspective, our machines are immutable. So what we’re really doing here is creating new machines with the desired Kubernetes version, gracefully shifting responsibility from the old machines to the new ones, and finally deleting the originals.
Here we’re making a more consequential change, so let’s be good and do things declaratively. We’ll need to update the control plane machine first, followed by the worker machines, and the process will look similar for each.
First, we’ll download the current specification for the control plane resource with kubectl:
% kubectl get kubeadmcontrolplane azure-worker-control-plane -o yaml > control-plane.yaml
Make a copy of the file called control-plane-update.yaml and edit it so that the spec field looks like the example below. The primary changes you’ll make to the manifest are:
Deleting all of the status fields
Changing what is now the final line, spec.version, to v1.24.10
spec:
  kubeadmConfigSpec:
    clusterConfiguration:
      apiServer:
        extraArgs:
          cloud-config: /etc/kubernetes/azure.json
          cloud-provider: azure
        extraVolumes:
        - hostPath: /etc/kubernetes/azure.json
          mountPath: /etc/kubernetes/azure.json
          name: cloud-config
          readOnly: true
        timeoutForControlPlane: 20m0s
      controllerManager:
        extraArgs:
          allocate-node-cidrs: "false"
          cloud-config: /etc/kubernetes/azure.json
          cloud-provider: azure
          cluster-name: azure-worker
        extraVolumes:
        - hostPath: /etc/kubernetes/azure.json
          mountPath: /etc/kubernetes/azure.json
          name: cloud-config
          readOnly: true
      dns: {}
      etcd:
        local:
          dataDir: /var/lib/etcddisk/etcd
          extraArgs:
            quota-backend-bytes: "8589934592"
      networking: {}
      scheduler: {}
    diskSetup:
      filesystems:
      - device: /dev/disk/azure/scsi1/lun0
        extraOpts:
        - -E
        - lazy_itable_init=1,lazy_journal_init=1
        filesystem: ext4
        label: etcd_disk
      - device: ephemeral0.1
        filesystem: ext4
        label: ephemeral0
        replaceFS: ntfs
      partitions:
      - device: /dev/disk/azure/scsi1/lun0
        layout: true
        overwrite: false
        tableType: gpt
    files:
    - contentFrom:
        secret:
          key: control-plane-azure.json
          name: azure-worker-control-plane-azure-json
      owner: root:root
      path: /etc/kubernetes/azure.json
      permissions: "0644"
    format: cloud-config
    initConfiguration:
      localAPIEndpoint: {}
      nodeRegistration:
        kubeletExtraArgs:
          azure-container-registry-config: /etc/kubernetes/azure.json
          cloud-config: /etc/kubernetes/azure.json
          cloud-provider: azure
        name: '{{ ds.meta_data["local_hostname"] }}'
    joinConfiguration:
      discovery: {}
      nodeRegistration:
        kubeletExtraArgs:
          azure-container-registry-config: /etc/kubernetes/azure.json
          cloud-config: /etc/kubernetes/azure.json
          cloud-provider: azure
        name: '{{ ds.meta_data["local_hostname"] }}'
    mounts:
    - - LABEL=etcd_disk
      - /var/lib/etcddisk
  machineTemplate:
    infrastructureRef:
      apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
      kind: AzureMachineTemplate
      name: azure-worker-control-plane
      namespace: default
    metadata: {}
  replicas: 1
  rolloutStrategy:
    rollingUpdate:
      maxSurge: 1
    type: RollingUpdate
  version: v1.24.10
Now we’ll apply the manifest…
% kubectl apply -f control-plane-update.yaml
The system will begin to create a new replica of the control plane machine, then transfer responsibility to that replica once it is ready, and finally delete the old machine. The process will take a few minutes, and you can observe it with:
% kubectl get kubeadmcontrolplane azure-worker-control-plane
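If you’d rather follow the rollout live, you can add kubectl’s watch flag to the same command:
% kubectl get kubeadmcontrolplane azure-worker-control-plane -w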
At first, the output will show two replicas with one unavailable. Eventually, you will see one replica with zero unavailable—running on v1.24.10.
NAME                        CLUSTER       INITIALIZED  API SERVER AVAILABLE  REPLICAS  READY  UPDATED  UNAVAILABLE  AGE  VERSION
azure-worker-control-plane  azure-worker  true         true                  1         1      1        0            35m  v1.24.10
Now we’re going to do the same thing with the worker machines via a MachineDeployment. Download a manifest for that resource:
% kubectl get machinedeployment azure-worker-md-0 -o yaml > md-0.yaml
Make a copy of the YAML file called md-0-update.yaml; we’ll edit the new copy. In addition to the version specification, note the spec.strategy.type field, which takes one of two options: RollingUpdate or OnDelete.
RollingUpdate will facilitate the behavior I described above—after application of the manifest, the system will create new machines and begin a graceful transition before deleting the original machines.
By contrast, OnDelete will wait for an operator (whether a human being or a software agent) to delete the original machine before provisioning a replacement.
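With OnDelete, in other words, each replacement is triggered by deleting the corresponding Machine yourself. We won’t do that here, but using one of the machine names from the earlier output, such a deletion might look like:
% kubectl delete machine azure-worker-md-0-598f9c756b-5dt68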
We’ll use RollingUpdate and specify Kubernetes version 1.24.10. Update the spec to look like so by…
Deleting all of the status fields
Changing what is now the final line, spec.version, to v1.24.10
spec:
  clusterName: azure-worker
  minReadySeconds: 0
  progressDeadlineSeconds: 600
  replicas: 3
  revisionHistoryLimit: 1
  selector:
    matchLabels:
      cluster.x-k8s.io/cluster-name: azure-worker
      cluster.x-k8s.io/deployment-name: azure-worker-md-0
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
    type: RollingUpdate
  template:
    metadata:
      labels:
        cluster.x-k8s.io/cluster-name: azure-worker
        cluster.x-k8s.io/deployment-name: azure-worker-md-0
    spec:
      bootstrap:
        configRef:
          apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
          kind: KubeadmConfigTemplate
          name: azure-worker-md-0
      clusterName: azure-worker
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: AzureMachineTemplate
        name: azure-worker-md-0
      version: v1.24.10
Now we can apply the manifest to propagate our changes to the worker cluster:
% kubectl apply -f md-0-update.yaml
If you run kubectl get machines, you’ll see the system beginning to provision new machines running the new version of Kubernetes:
NAME                                CLUSTER       NODENAME                          PHASE         AGE    VERSION
azure-worker-control-plane-6btkg    azure-worker  azure-worker-control-plane-wmt26  Running       29m    v1.24.10
azure-worker-md-0-598f9c756b-5dt68  azure-worker  azure-worker-md-0-mjf4x           Running       59m    v1.24.6
azure-worker-md-0-598f9c756b-j6rrg  azure-worker  azure-worker-md-0-9mzvl           Running       59m    v1.24.6
azure-worker-md-0-598f9c756b-w2pmm  azure-worker  azure-worker-md-0-xkk2v           Running       59m    v1.24.6
azure-worker-md-0-6c64c9db4c-sc5vf  azure-worker                                    Provisioning  71s    v1.24.10
After a couple of minutes, we’ll see old worker machines starting to be replaced by the newly provisioned ones:
NAME                                CLUSTER       NODENAME                          PHASE         AGE    VERSION
azure-worker-control-plane-6btkg    azure-worker  azure-worker-control-plane-wmt26  Running       32m    v1.24.10
azure-worker-md-0-598f9c756b-j6rrg  azure-worker  azure-worker-md-0-9mzvl           Running       61m    v1.24.6
azure-worker-md-0-598f9c756b-w2pmm  azure-worker  azure-worker-md-0-xkk2v           Running       61m    v1.24.6
azure-worker-md-0-6c64c9db4c-65999  azure-worker                                    Provisioning  9s     v1.24.10
azure-worker-md-0-6c64c9db4c-sc5vf  azure-worker  azure-worker-md-0-fnkcn           Running       3m36s  v1.24.10
Above, we see that machine sc5vf, now running v1.24.10, is up and has replaced machine 5dt68. Another machine running the upgraded version has started provisioning. In a few minutes more…
NAME                                CLUSTER       NODENAME                          PHASE     AGE    VERSION
azure-worker-control-plane-6btkg    azure-worker  azure-worker-control-plane-wmt26  Running   48m    v1.24.10
azure-worker-md-0-6c64c9db4c-65999  azure-worker  azure-worker-md-0-lndqb           Running   16m    v1.24.10
azure-worker-md-0-6c64c9db4c-b975n  azure-worker  azure-worker-md-0-vb44l           Running   13m    v1.24.10
azure-worker-md-0-6c64c9db4c-sc5vf  azure-worker  azure-worker-md-0-fnkcn           Running   19m    v1.24.10
Our cluster is completely upgraded!
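If you want to confirm the upgrade from the worker cluster’s own point of view, you can list its nodes with the kubeconfig we downloaded earlier; each node should report the new version:
% kubectl --kubeconfig=./azure-worker.kubeconfig get nodes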
When you’re done playing around with your fancy new cluster, you can clean up with:
% kubectl delete cluster azure-worker
% kubectl delete azureclusteridentity cluster-identity
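Deleting the cluster should remove the Azure infrastructure that Cluster API provisioned, but it won’t touch the Service Principal we created at the start. If you no longer need it, you can remove it with the Azure CLI; for example:
% az ad sp delete --id "${AZURE_CLIENT_ID}"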
Conclusion and next steps
So far, so good! We’ve learned…
How Cluster API works
How to deploy clusters to two different infrastructure providers with Cluster API
How to perform basic tasks like scaling and upgrading machines on clusters managed by Cluster API
In the final tutorial of this series, we’ll explore how to unlock the real power of Cluster API by writing a simple application that uses the API to intelligently manage infrastructure across multiple providers.