How to Use Cluster API to Programmatically Configure and Deploy Kubernetes Clusters
Kubernetes provides a portable, resilient, and resource-aware substrate for code that lives on the cloud. As Kubernetes has grown more popular, cloud infrastructure patterns have grown more complex: organizations may run hundreds or thousands of Kubernetes clusters at edge sites, and workloads may be orchestrated across multiple cloud providers or on-prem datacenters.
That complexity calls for a unified interface through which clusters may be provisioned and managed, programmatically and at scale. The Kubernetes community created just such a tool in Cluster API.
In this tutorial, we will…
Explain the fundamentals of Cluster API—what it is, how it works, and the core concepts that govern its design
Show you how to set up a local management cluster that will enable you to deploy new clusters to a cloud provider such as AWS
Walk you through the process of provisioning a workload cluster on AWS, and explain how to interact with that cluster via Lens
This primer will serve as a foundation for further tutorials on Cluster API, in which we will show you how to control clusters across multiple environments and create a management interface that responds programmatically to your needs.
Ready? Let’s get started!
What is Cluster API?
Cluster API is a tool for programmatically configuring and deploying Kubernetes clusters on a variety of different infrastructures. It is open source and maintained as a sub-project of Kubernetes; as of this writing it is in version 1.2.4 and is regarded as production-ready.
For example, Cluster API is an upstream component of Mirantis Container Cloud, which uses the API to deliver a powerful infrastructure control plane managed through a single-pane GUI, while integrating management for components like Ceph storage.
What is the purpose of Cluster API?
Cluster API begins from a simple premise:
What if we could apply the “Kubernetes way of doing things”—that is, declarative configuration and deployment via API—to the provisioning and management of clusters themselves? What if spinning up a cluster to your specifications were functionally no different from deploying a Kubernetes resource like a Pod?
The answer to that “What if?” is that clusters would be much easier to configure and deploy programmatically and at scale—the same as any other Kubernetes resource. And that’s very useful for operators who want to manage lifecycles for hundreds or even thousands of clusters: whether those are edge clusters at retail locations, individual developer clusters, or clusters dedicated to sensitive, isolated workloads.
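As a taste of what that looks like in practice, here is a minimal, illustrative sketch of a cluster declared as a Kubernetes resource. The name and network range are placeholders; we'll generate a complete, working manifest later in this tutorial.

```yaml
# A cluster declared like any other Kubernetes resource.
# All names and values below are illustrative placeholders.
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: example-cluster
  namespace: default
spec:
  clusterNetwork:
    pods:
      cidrBlocks: ["192.168.0.0/16"]
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: AWSCluster
    name: example-cluster
```

Applying a manifest like this asks a cluster's controllers to reconcile real infrastructure to match it, just as applying a Pod manifest asks Kubernetes to run a container.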
How does Cluster API work?
Cluster API not only gives us a way to manage Kubernetes clusters; it also runs on Kubernetes. Ultimately, Cluster API is simply a set of components that we install on a cluster and interact with in Kube-like style, through Kube-like tools.
This raises a sort of chicken-and-egg problem: as operators, we will need to provide a first cluster from which all of our other clusters will be initiated. This initial cluster may be temporary—a bootstrap cluster that will be discarded later—or it may serve an important and ongoing role as a management cluster.
In the conventions of Cluster API, the management cluster is one of two essential cluster types:
Management clusters: These clusters are responsible for the creation and oversight of other clusters through Cluster API. They are your agents for infrastructure management, and in that respect are a bit of an oddity—they’re probably not running application workloads like a normal Kubernetes cluster, but instead focusing entirely on provisioning, monitoring, and managing other clusters.
Workload clusters: These are the clusters that will actually handle application workloads for your users—the business-as-usual clusters that do exactly what you would expect a Kubernetes cluster to do, running microservices and handling requests.
This framework should sound familiar—it is, of course, reminiscent of the control plane and worker nodes within a given Kubernetes cluster.
In any case, to get started with Cluster API, we need only furnish a few prerequisites:
An initial cluster: This is the starting cluster we will need to create the rest, and it can live in a lot of different places—on any number of clouds or on our local machine.
kubectl: You’ll need the kubectl CLI installed on your workstation and all set to control your initial cluster.
A provider: “Provider” is Cluster API’s abstraction for the infrastructure on which your newly created clusters will run. That might mean the big public cloud providers such as AWS, Azure, and Google Cloud, but it could also refer to more specialized providers such as Equinix Metal or DigitalOcean, or a host infrastructure such as OpenStack.
With these pieces in place, we will be able to configure and create clusters at scale entirely programmatically.
Indeed, if you were so inclined, you could use the clusters you provision with Cluster API to run Cluster API components and provision yet another cluster. You can see, then, why the Cluster API logo is three turtles with Kubernetes logos on their shells: it’s Kubernetes turtles all the way down.
Deploying a cluster with Cluster API
In this walkthrough, we will use a local development cluster as our management cluster, and we will deploy a workload cluster to AWS.
You can launch a local development cluster with Lens Desktop Kubernetes (a feature of Lens Pro), the Lens for Docker Desktop extension, Minikube, or another dev cluster implementation of your choice. Go ahead and start your local cluster now.
Next we’ll install the clusterctl command line tool. Download the binary from GitHub (making sure to grab the correct binary for your system):
% curl -L https://github.com/kubernetes-sigs/cluster-api/releases/download/v1.2.4/clusterctl-darwin-amd64 -o clusterctl
From the same directory, use chmod to modify permissions so the binary is executable.
% chmod +x ./clusterctl
Put the clusterctl binary in your PATH.
% sudo mv ./clusterctl /usr/local/bin/clusterctl
The clusterctl CLI tool should be installed now. In a moment, we’re going to use it to initialize our local Kubernetes cluster as a management cluster. But before we do, we need to do some configuration for the provider to which we intend to deploy workload clusters—in this case, AWS.
Initializing the management cluster for AWS
At this point, we could configure multiple providers, and clusterctl would enable us to manage all of them, letting us create and manage hybrid/multi-cloud architecture from a single interface. For this primer, though, we'll keep things simple and stick to a single provider.
Now, in order to do our AWS configuration, we’re going to download another CLI tool called clusterawsadm that will help us generate a CloudFormation stack with appropriate IAM resources. (CloudFormation is an AWS tool for infrastructure-as-code automation.) We’ll install this tool exactly the same way we did with clusterctl—again, verifying that you have the right binary for your system:
% curl -L https://github.com/kubernetes-sigs/cluster-api-provider-aws/releases/download/v1.5.0/clusterawsadm-darwin-amd64 -o clusterawsadm
% chmod +x clusterawsadm
% sudo mv clusterawsadm /usr/local/bin
Note: The rest of this walkthrough uses AWS credentials and, once a workload cluster is deployed, will incur some costs if you follow along—proceed advisedly! Make sure to use the credentials for an IAM user with policies that make sense for you.
The clusterawsadm tool draws on a set of environment variables in order to run its configuration. Now we’ll use the export command to define those environment variables:
% export AWS_REGION=us-east-1
% export AWS_ACCESS_KEY_ID=<Your access key>
% export AWS_SECRET_ACCESS_KEY=<Your secret access key>
% export AWS_SESSION_TOKEN=<Session token>
Note: the session token is only necessary if you’re using multi-factor authentication.
With those environment variables defined, we can use clusterawsadm to generate our CloudFormation stack:
% clusterawsadm bootstrap iam create-cloudformation-stack
The system will return…
Attempting to create AWS CloudFormation stack cluster-api-provider-aws-sigs-k8s-io
…and it might take a moment. When it’s done, you’ll get a completion notification for various resources that looks like this:
AWS::IAM::Role |nodes.cluster-api-provider-aws.sigs.k8s.io |CREATE_COMPLETE
Now we’ll create another environment variable—this one with our newly-created credentials, base64-encoded.
% export AWS_B64ENCODED_CREDENTIALS=$(clusterawsadm bootstrap credentials encode-as-profile)
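That variable is nothing exotic: it's an AWS-style credentials profile, base64-encoded into a single line so the provider can consume it. As a rough illustration (with placeholder values, never real credentials), you can reproduce that kind of encoding with standard tools:

```shell
# Build a minimal AWS-style credentials profile (placeholder values only)
profile='[default]
aws_access_key_id = EXAMPLEACCESSKEY
aws_secret_access_key = EXAMPLESECRETKEY
region = us-east-1'

# Base64-encode it into a single line
encoded=$(printf '%s' "$profile" | base64 | tr -d '\n')
echo "$encoded"

# Decoding recovers the original profile
printf '%s' "$encoded" | base64 --decode
```

If you ever need to inspect what clusterawsadm produced, decoding the value of AWS_B64ENCODED_CREDENTIALS the same way will show you the profile it generated.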
And finally, we'll set three last environment variables that specify details about our new workload clusters.
For the SSH key, you'll need an EC2 key pair named default; alternatively, change the value here to the name of an existing key pair. If you need to create an SSH key pair, you can do that on the Key pairs page found under Network & Security in the EC2 menu.
% export AWS_SSH_KEY_NAME=default
% export AWS_CONTROL_PLANE_MACHINE_TYPE=t3.large
% export AWS_NODE_MACHINE_TYPE=t3.large
That’s it for our AWS-specific configuration. Now we’re ready to initialize our local cluster as a management cluster—with resources for deploying clusters to the provider AWS.
% clusterctl init --infrastructure aws
Your output should look something like this:
Fetching providers
Installing cert-manager Version="v1.9.1"
Waiting for cert-manager to be available...
Installing Provider="cluster-api" Version="v1.2.4" TargetNamespace="capi-system"
Installing Provider="bootstrap-kubeadm" Version="v1.2.4" TargetNamespace="capi-kubeadm-bootstrap-system"
Installing Provider="control-plane-kubeadm" Version="v1.2.4" TargetNamespace="capi-kubeadm-control-plane-system"
Installing Provider="infrastructure-aws" Version="v1.5.0" TargetNamespace="capa-system"

Your management cluster has been initialized successfully!
Before we move on, let's take a moment to look at what we've done. If we list our namespaces, we'll see several new ones in our management cluster dedicated to Cluster API tooling:
% kubectl get ns
NAME                                STATUS   AGE
capa-system                         Active   43h
capi-kubeadm-bootstrap-system       Active   43h
capi-kubeadm-control-plane-system   Active   43h
capi-system                         Active   43h
cert-manager                        Active   43h
We’ve also got quite a few new Custom Resource Definitions (CRDs). Here are a select handful:
% kubectl get crds
NAME                                                 CREATED AT
clusterclasses.cluster.x-k8s.io                      2022-10-17T19:19:36Z
clusterissuers.cert-manager.io                       2022-10-17T19:18:31Z
clusterresourcesetbindings.addons.cluster.x-k8s.io   2022-10-17T19:19:36Z
clusterresourcesets.addons.cluster.x-k8s.io          2022-10-17T19:19:36Z
clusters.cluster.x-k8s.io                            2022-10-17T19:19:36Z
machinedeployments.cluster.x-k8s.io                  2022-10-17T19:19:37Z
machinehealthchecks.cluster.x-k8s.io                 2022-10-17T19:19:37Z
machinepools.cluster.x-k8s.io                        2022-10-17T19:19:37Z
machines.cluster.x-k8s.io                            2022-10-17T19:19:38Z
machinesets.cluster.x-k8s.io                         2022-10-17T19:19:38Z
providers.clusterctl.cluster.x-k8s.io                2022-10-17T19:18:17Z
These CRDs are really the core mechanism of Cluster API, enabling the management cluster to handle, say, clusters and machines as Kubernetes resources.
Creating and managing a workload cluster
Now we’re ready to generate a workload cluster! We’ll use clusterctl’s generate cluster command to create a YAML manifest in our working directory—which we can use, in turn, to deploy our cluster.
% clusterctl generate cluster test-cluster --infrastructure aws --kubernetes-version v1.25.0 --control-plane-machine-count=3 --worker-machine-count=3 > test-cluster.yaml
Note that we're provisioning three control plane machines for a reason: a highly available control plane needs at least three members so that etcd can keep a quorum even if one node fails. The three worker machines simply give our application workloads room to spread out.
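The arithmetic behind high availability here is etcd's quorum rule: a cluster of n members needs floor(n/2)+1 votes to commit a write, so it can lose n minus quorum members and stay available. A quick sketch:

```shell
# etcd quorum arithmetic: a cluster of n members needs floor(n/2)+1 votes,
# so it can lose (n - quorum) members and stay available
for n in 1 3 5; do
  quorum=$(( n / 2 + 1 ))
  tolerated=$(( n - quorum ))
  echo "members=$n quorum=$quorum tolerated_failures=$tolerated"
done
```

A single-member control plane tolerates zero failures, while a three-member control plane tolerates one, which is why three is the conventional minimum for production.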
Let’s take a look at the YAML manifest we generated. Open the file with the code or text editor of your choice. You’ll see multiple manifests in this one file, defining several different kinds of resources:
These are the new Kubernetes resource types we added through CRDs: a Cluster, an AWSCluster, a KubeadmControlPlane, AWSMachineTemplates, a MachineDeployment, and a KubeadmConfigTemplate. Look through the manifests and observe how many of the details we've specified so far are rendered in the YAML. We'll zoom in on just one of those manifests here:
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: AWSCluster
metadata:
  name: test-cluster
  namespace: default
spec:
  region: us-east-1
  sshKeyName: default
Here we have an abstraction for an AWS-hosted cluster. It includes details like region and sshKeyName that we specified earlier through environment variables. And it’s linked to the resource instance of our more general Cluster object by name: test-cluster.
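The other side of that link appears in the Cluster manifest within the same file. Here's a hedged sketch of its shape; your generated file may include additional fields, such as a cluster network block:

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: test-cluster
  namespace: default
spec:
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1beta1
    kind: KubeadmControlPlane
    name: test-cluster-control-plane
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: AWSCluster
    name: test-cluster
```

The infrastructureRef ties the provider-agnostic Cluster to the AWS-specific AWSCluster, while the controlPlaneRef points at the KubeadmControlPlane that will manage our three control plane machines.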
Let’s provision our workload cluster. From the directory where test-cluster.yaml is stored:
% kubectl apply -f test-cluster.yaml
Your output should look something like this:
cluster.cluster.x-k8s.io/test-cluster created
awscluster.infrastructure.cluster.x-k8s.io/test-cluster created
kubeadmcontrolplane.controlplane.cluster.x-k8s.io/test-cluster-control-plane created
awsmachinetemplate.infrastructure.cluster.x-k8s.io/test-cluster-control-plane created
machinedeployment.cluster.x-k8s.io/test-cluster-md-0 created
awsmachinetemplate.infrastructure.cluster.x-k8s.io/test-cluster-md-0 created
kubeadmconfigtemplate.bootstrap.cluster.x-k8s.io/test-cluster-md-0 created
Now we can use kubectl to manage those custom resource types. We can run…
% kubectl get clusters
NAME           PHASE         AGE
test-cluster   Provisioned   30s
…and this gives us a view of all our clusters managed by Cluster API. We can get more granular and view all of our machines:
% kubectl get machines
NAME                               CLUSTER
test-cluster-control-plane-9cz2s   test-cluster
Or perhaps we only wish to see our workload clusters on AWS:
% kubectl get awsclusters
NAME           CLUSTER        READY
test-cluster   test-cluster   true
The clusterctl CLI tool can help us gain even more insight. Run:
% clusterctl describe cluster test-cluster
The output should look like this:
NAME                                      READY  SEVERITY  REASON  SINCE  MESSAGE
Cluster/test-cluster                      True                     28s
├─ClusterInfrastructure                   True                     5m
├─ControlPlane                            True                     28s
│ └─3 Machines...                         True                     2m
└─Workers
  └─MachineDeployment/test-cluster-md-0   False  Warning           10m
    └─3 Machines...                       True                     2m11s
(I’ve omitted some detail for legibility here—the real output will give you names for all your nodes and more informative warnings for unready machines.)
Your results may not show the control plane as ready, but give it a couple of minutes and it should get there. The worker MachineDeployment will not become ready, however, because our new cluster is missing a final ingredient: a Container Network Interface (CNI) plugin to handle cluster networking.
Once a control plane node is ready, we can communicate with it—and our first order of business is to grab the kubeconfig for the workload cluster. The clusterctl CLI makes this easy:
% clusterctl get kubeconfig test-cluster > test-cluster.kubeconfig
The command above will download a file called test-cluster.kubeconfig to our current working directory. Now we can use that kubeconfig to manage our new workload cluster, and the first thing we will do is add the open source Calico CNI plugin, which provides network connectivity between workloads:
% kubectl --kubeconfig=./test-cluster.kubeconfig \
    apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.24.1/manifests/calico.yaml
If you check out the Calico YAML, or simply skim the console output, you’ll see that we’re adding a pile of Custom Resource Definitions, a Controller, and several other resources:
poddisruptionbudget.policy/calico-kube-controllers created
serviceaccount/calico-kube-controllers created
serviceaccount/calico-node created
configmap/calico-config created
customresourcedefinition.apiextensions.k8s.io/bgpconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgppeers.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/blockaffinities.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/caliconodestatuses.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/clusterinformations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/felixconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworksets.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/hostendpoints.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamblocks.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamconfigs.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamhandles.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ippools.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipreservations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/kubecontrollersconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networksets.crd.projectcalico.org created
clusterrole.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrole.rbac.authorization.k8s.io/calico-node created
clusterrolebinding.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrolebinding.rbac.authorization.k8s.io/calico-node created
daemonset.apps/calico-node created
deployment.apps/calico-kube-controllers created
Now that we’ve added Calico, the worker nodes in our workload cluster should quickly become ready:
% clusterctl describe cluster test-cluster
The output for a fully ready cluster:
NAME                                      READY  SEVERITY  REASON  SINCE
Cluster/test-cluster                      True                     4m48s
├─ClusterInfrastructure                   True                     9m20s
├─ControlPlane                            True                     4m48s
│ └─3 Machines...                         True                     6m20s
└─Workers
  └─MachineDeployment/test-cluster-md-0   True                     78s
    └─3 Machines...                       True                     6m31s
At this point, we’re fully operational, and we can use the workload cluster’s kubeconfig to do whatever we need to do on the cluster.
For ongoing interaction with the cluster, we could continue to specify the kubeconfig with kubectl, or merge it with our local kubeconfig and switch contexts, but one of the easier ways to hop between cluster contexts is using Lens.
After downloading and starting Lens, simply click on the plus sign in the lower-right corner and select Sync kubeconfig.
Now you can easily switch between clusters and manage your workload clusters, install charts with a few clicks via Lens’ Helm interface, or just as easily set up a Prometheus monitoring stack.
In Lens, we can see all of our nodes organized for easy monitoring and management.
When we’re done with the workload cluster, the clean-up is mercifully easy:
% kubectl delete cluster test-cluster
cluster.cluster.x-k8s.io "test-cluster" deleted
That brings us to the end of this introductory primer, but it’s by no means the end of what you can do with Cluster API. In future walkthroughs, we’ll show you how to programmatically control clusters in other environments, and how to create a hybrid environment that is programmatically responsive to your needs.