One cloud to rule them all -- or is it?
It’s the typical Catch-22 situation when trying to do something on the scale of private cloud: You can’t afford to build it without paying customers, but you can’t get paying customers without a functional offering.
In the rush to break the cycle, you onboard more and more customers. You want to reach critical mass and become the de-facto choice within your organization. Maybe you even have some competition within your organization you have to edge out. Before long you end up taking anyone with money.
And who has money? In the enterprise, more often than not it's the bread and butter of the organization: the legacy workloads.
Promises are made. Assurances are given. Anything to onboard the customer. “Sure, come as you are, you won’t have to rewrite your application; there will be no/minimal impact to your legacy workloads!”
But there's a problem here. Legacy workloads -- that is, those large, vertically scaled behemoths that don't lend themselves to "cloud native" principles -- present both a risk and an opportunity when growing your private cloud, depending on how they are handled.
(Note: Virtualizing a workload does not make it "cloud native". In fact, many virtualized workloads, even those built on a service-oriented architecture (SOA), will not be cloud native. We'll talk more about classifying, categorizing and onboarding different workloads in a future article.)
"Legacy" cloud vs "Agile" cloud
The term "legacy cloud" may seem like a bit of an oxymoron, but hear me out. For years, surveys that ask people about their cloud use have had to include responses from people who considered vSphere cloud because the line between cloud and virtualization is largely irrelevant to most people.Or at least it was, when there wasn't anything else.
But now there's a clear difference. Legacy cloud is geared towards these legacy workloads, while agile cloud is geared toward more "cloud native" workloads.
Let’s consider some example distinctions between a “Legacy Cloud” and an “Agile Cloud”. The table below shows some of the design trade-offs between environments built to support legacy workloads and those built without those constraints:
| Legacy Cloud | Agile Cloud |
| --- | --- |
| No new features/updates (platform stability emphasis), or only infrequent, limited and controlled updates | Regular/continuous deployment of the latest and greatest features (platform agility emphasis) |
| Live migration support (redundancy in the platform instead of in the app); DRS (for ESXi hypervisors managed by VMware vCenter) | Highly scalable and performant local storage, plus other performance-enhancing features such as huge pages; none of the security and operational burdens of live migration |
| VRRP for Neutron L3 router redundancy | DVR for network performance and scalability; apps built to handle failure of individual nodes |
| LACP bonding for compute node network redundancy | SR-IOV for network performance; apps built to handle failure of individual nodes |
| Bring your own (specific) hardware | Shared, standard hardware ("white boxes"), defrayed with tenant chargeback policies |
| ESXi hypervisors or bare metal as a service (Ironic) to insulate the data plane, and/or separate controllers to insulate the control plane | OpenStack reference KVM deployment |
For most of these items it’s one or the other, so introducing legacy workloads into your existing cloud can conflict with other objectives, such as increasing development velocity.
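To make one of those trade-offs concrete, here's a minimal sketch of the VRRP-vs-DVR row using openstacksdk. The cloud name and router names are hypothetical, and it assumes your Neutron deployment has both the L3 HA and DVR extensions enabled, plus admin credentials (which are required to set these flags explicitly).

```python
# Illustrative sketch only: assumes openstacksdk is installed, a "mycloud"
# entry exists in clouds.yaml, and the account has admin rights -- explicitly
# setting ha/distributed on a router is admin-only on most deployments;
# regular tenants simply inherit the defaults from neutron.conf.
import openstack

conn = openstack.connect(cloud="mycloud")

# "Legacy cloud" style: a centralized router made redundant with VRRP
# (keepalived instances on multiple L3 agents), so the platform -- not the
# application -- survives the loss of a network node.
legacy_router = conn.network.create_router(
    name="legacy-edge",
    is_ha=True,
    is_distributed=False,
)

# "Agile cloud" style: a distributed virtual router (DVR) that pushes routing
# out to the compute nodes for performance and scale; the application is
# expected to tolerate the failure of any individual node.
agile_router = conn.network.create_router(
    name="agile-edge",
    is_ha=False,
    is_distributed=True,
)
```

The point isn't the specific API calls; it's that the two router types embody opposite answers to the question of where redundancy lives: in the platform, or in the application.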
So what do you do about it?
If you find yourself in this situation, you basically have three choices:
- Onboard tenants with legacy workloads and force them to potentially rewrite their entire application stack for cloud
- Onboard tenants with legacy workloads into the cloud and hope everything works
- Decline to onboard tenants/applications that are not cloud-ready
Fortunately, there's one more option: split your cloud infrastructure according to the types of workloads, and engineer a platform offering for each. Now, that doesn't necessarily mean a separate cloud.
The main idea is to architect your cloud so that you can provide a legacy-type environment for legacy workloads without compromising your vision for cloud-aware applications. There are two ways to do that:
- Set up a separate cloud with an entirely new control plane for the associated compute capacity. This option offers a complete decoupling between workloads, and allows changes/updates/upgrades to be confined to one environment without exposing the legacy workloads in the other to that risk.
- Use compute nodes such as ESXi hypervisors or bare metal (e.g., Ironic) for legacy workloads. This option maintains a single OpenStack control plane while still helping isolate workloads from OpenStack upgrades, disruptions, and maintenance activities in your cloud. For example, ESXi networking is separate from Neutron, and bare metal is your ticket out of being the bad guy who reboots hypervisors to apply kernel security updates.
Of course, each option comes with its own downsides as well; an additional control plane involves additional overhead (to build and operate), and running a mixed hypervisor environment has its own set of engineering challenges, complications, and limitations. Both options also add overhead when it comes to repurposing hardware. Below is a sketch of how, under a single control plane, host aggregates and flavor extra specs can pin legacy workloads to a dedicated pool of compute nodes.
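The sketch uses openstacksdk; the cloud name, hostnames, flavor name, and metadata key are all hypothetical, and the aggregate-metadata and flavor-extra-specs helpers assume a recent SDK release (their names have shifted between versions -- the CLI equivalents are `openstack aggregate set --property ...` and `openstack flavor set --property ...`). It also assumes AggregateInstanceExtraSpecsFilter is enabled in nova.conf.

```python
# Illustrative sketch only: assumes admin credentials, a "mycloud" entry in
# clouds.yaml, and a recent openstacksdk release.
import openstack

conn = openstack.connect(cloud="mycloud")

# Dedicate a pool of compute nodes to legacy workloads.
legacy_agg = conn.compute.create_aggregate(name="legacy-pool")

# Tag the pool so the scheduler can match against it (equivalent to:
# openstack aggregate set --property workload_class=legacy legacy-pool).
conn.compute.set_aggregate_metadata(legacy_agg, {"workload_class": "legacy"})

for host in ("legacy-compute-01", "legacy-compute-02"):  # hypothetical hostnames
    conn.compute.add_host_to_aggregate(legacy_agg, host)

# Flavors carry the matching extra spec, so instances booted from a "legacy"
# flavor can only land on the legacy pool. (For full isolation, give your
# other flavors their own workload_class value too.)
legacy_flavor = conn.compute.create_flavor(
    name="m1.legacy-large", ram=65536, vcpus=16, disk=200,
)
conn.compute.create_flavor_extra_specs(
    legacy_flavor,
    {"aggregate_instance_extra_specs:workload_class": "legacy"},
)
```

The same mechanism works whether the dedicated pool runs KVM, ESXi, or Ironic-managed bare metal; what matters is that the legacy workloads' scheduling constraints are explicit rather than tribal knowledge.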
There's no instant transition
Many organizations get caught up in the “One Cloud To Rule Them All” mentality, trying to make everything the same and work with a single architecture to achieve the needed economies of scale, but ultimately the decision should be made according to your situation. It's important to remember that no matter what you do, you will have to deal with a transition period, which means you need to provide a viable path for your legacy tenants/apps to gradually make the switch. But first, assess your situation:
- If your workloads are all of the same type, then there’s not a strong case to offer separate platforms out of the gate. Or, if you’re just getting started with cloud in your organization, it may be premature to do so; you may not yet have the required scale, or you may be happy with onboarding only those applications which are cloud ready.
- When you have different types of workloads with different needs -- for example, Telco/NFV vs Enterprise/IT vs BigData/IoT workloads -- you may want to think about different availability zones inside the same cloud, so the specific nuances of each type can be addressed inside its own zone while maintaining a single cloud from a configuration, lifecycle management and service assurance perspective, including having similar hardware. (Having similar hardware makes it easier to keep spares on hand.) A sketch of the availability-zone approach follows this list.
- If you find yourself in a situation where you want to innovate with your cloud platform, but you still need to deal with legacy workloads with conflicting requirements, then workload segmentation is highly advisable. In this case, you'll probably want to break from the “One Cloud” mentality in favor of the flexibility of multiple clouds. If you try to satisfy both your "innovation" mindset and your legacy workload holders on one cloud, you'll likely disappoint both.
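As a rough illustration of the availability-zone option, here is a minimal openstacksdk sketch. The cloud name, zone names, hostnames, and the image/flavor/network names in the boot call are all hypothetical. In Nova, an availability zone is simply a host aggregate with its availability_zone attribute set, and unlike plain aggregate metadata it is visible to (and selectable by) tenants.

```python
# Illustrative sketch only: assumes admin credentials for the aggregate calls
# and a "mycloud" entry in clouds.yaml; all names below are hypothetical.
import openstack

conn = openstack.connect(cloud="mycloud")

# One zone per workload profile, each mapped to its own pool of hosts.
zones = {
    "az-nfv": ["nfv-compute-01"],         # e.g., SR-IOV, huge pages, CPU pinning
    "az-enterprise": ["ent-compute-01"],  # e.g., live migration, shared storage
    "az-bigdata": ["bd-compute-01"],      # e.g., large local disks
}
for zone, hosts in zones.items():
    agg = conn.compute.create_aggregate(name=zone, availability_zone=zone)
    for host in hosts:
        conn.compute.add_host_to_aggregate(agg, host)

# Tenants then pick the zone that matches their workload at boot time.
server = conn.create_server(
    name="billing-db-01",
    image="rhel-7",                  # hypothetical image name
    flavor="m1.legacy-large",        # hypothetical flavor name
    network="tenant-net",            # hypothetical network name
    availability_zone="az-enterprise",
    wait=True,
)
```

Because the zones share one control plane, you keep a single pane of glass for configuration, lifecycle management and chargeback, while still honoring each workload type's hardware and operational quirks.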
Moving forward
Even if you do create a separate legacy cloud, you probably don't want to maintain it in perpetuity. Think about your transition strategy; a basic and effective carrot-and-stick approach is to limit new features and cloud-native functionality to your agile cloud, and to bill/chargeback at higher rates in your legacy cloud (which are, in any case, justified by the costs incurred to provide and support this option).

Whatever you ultimately decide, the most important thing is to make sure you've planned it out appropriately, rather than just going with the flow, so to speak. If you need to, contact a vendor such as Mirantis; they can help you do your planning and get to production as quickly as possible.