Pumphouse Deep Dive, Part 1

Oleg Gelbukh, Mirantis Blog - December 2, 2014

Welcome back to Mirantis Labs. In the introductory article for the Pumphouse series, we gave you an overview of what Pumphouse does; now we’ll go into more detail about how Pumphouse actually migrates workloads between OpenStack clouds. In this Deep Dive series we’ll talk specifically about version 1.0 of Pumphouse, and also look at any limitations and how we’ll be addressing them in future versions. In this article we’ll look specifically at the process of migrating resources from one cloud to another.

Resource Migration

The ultimate goal of Pumphouse is to be able to migrate any type of resource, provided by any OpenStack service, from one cloud to another. In version 1.0, however, the list of resources is limited to those that are most important.

As I mentioned in part 1, the smallest unit of workload that is sensible to migrate is a virtual server, or VM. However, the server itself cannot exist in a cloud without a number of other resources, which we call server dependencies, so Pumphouse must replicate all of those resources in the Destination cloud prior to moving the server itself.

At its most basic level, Pumphouse requests information about the server from Nova via the Compute API. This information includes references to the server’s dependencies. Usually, those references are IDs that allow Pumphouse to fetch the meta-data for those dependency resources and build a list of them.
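As a rough illustration of this step, the sketch below walks a server record and collects references to its dependencies. The record layout and the collect_dependencies helper are hypothetical and simplified; they are not the actual Compute API response or Pumphouse code.

```python
# Hypothetical, simplified server record; field names are illustrative only.
server = {
    "id": "server-A",
    "tenant_id": "tenant-T",
    "user_id": "user-U",
    "flavor": {"id": "flavor-F"},
    "image": {"id": "image-I"},
    "security_groups": [{"name": "default"}, {"name": "web"}],
    "addresses": {
        "private": [
            {"addr": "10.0.0.5", "type": "fixed"},
            {"addr": "172.16.0.9", "type": "floating"},
        ]
    },
}


def collect_dependencies(server):
    """Return (resource_type, reference) pairs that the server depends on."""
    deps = [
        ("tenant", server["tenant_id"]),
        ("user", server["user_id"]),
        ("flavor", server["flavor"]["id"]),
        ("image", server["image"]["id"]),
    ]
    deps += [("security_group", g["name"]) for g in server["security_groups"]]
    for network, addresses in server["addresses"].items():
        deps.append(("network", network))
        deps += [
            ("floating_ip", a["addr"]) for a in addresses if a["type"] == "floating"
        ]
    return deps


dependencies = collect_dependencies(server)
```

Each reference in the resulting list would then be used to fetch the full meta-data of the corresponding resource.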

Once the list has been prepared, Pumphouse builds an unordered flow of tasks that replicate the resources in question in the Destination cloud. Due to how Taskflow handles the unordered flow pattern, these tasks are executed in parallel.
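To illustrate the idea of independent tasks running in parallel, here is a minimal sketch that uses Python's standard concurrent.futures module as a stand-in for Taskflow's unordered flow. The replicate function and the resource names are hypothetical; the real tasks would call the Source and Destination cloud APIs.

```python
from concurrent.futures import ThreadPoolExecutor


def replicate(resource):
    """Pretend to create a copy of one resource in the Destination cloud."""
    kind, source_id = resource
    # A real task would call the Destination API here; we only derive a fake ID.
    return source_id, f"dst-{source_id}"


# The tasks do not depend on each other, so they can run in any order.
resources = [("flavor", "F"), ("image", "I"), ("security_group", "S")]

with ThreadPoolExecutor(max_workers=4) as pool:
    # Map each Source ID to the ID of its replica in the Destination cloud.
    id_map = dict(pool.map(replicate, resources))
```

Because no task consumes the output of another, the order of execution does not matter, which is what makes the unordered pattern a natural fit here.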

Dependency Resources

In version 1.0, Pumphouse supports the following dependency resources, or meta-resources, for virtual servers:

  • A flavor defines the resources allocated to a server instance in the cloud.
  • An image is required to boot the server if the ‘image’ migration strategy is chosen. (We’ll discuss the various strategies below.)
  • Security groups define permissions for network access to the server. A single server may have more than one security group assigned to it.
  • Networks connect the virtual server instance to the outside world.
  • Floating IPs are used to access the server from external networks.
  • Identity is a combination of authentication/authorization resources that allows the original owner of the server (and other resources as well) to manage it after migration.

Most of these resources are replicated simply by passing attributes fetched from the Source cloud APIs to the Destination cloud APIs. Pumphouse maintains a mapping between the identifiers (usually UUIDs) of corresponding resources in the Source and Destination clouds within the Taskflow store. Later, when migrating the server itself, it fetches the IDs of the dependency resources created in the Destination cloud from that store and uses them in the creation of the new server.
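The mapping can be pictured as a dictionary keyed by resource type. The following sketch, with a hypothetical store layout and a hypothetical translate_server helper, shows how Source IDs in a server's meta-data might be swapped for their Destination counterparts; the real Taskflow store is organized differently.

```python
# Hypothetical mapping of Source IDs to Destination IDs, grouped by type.
store = {
    "flavor": {"F-src": "F-dst"},
    "image": {"I-src": "I-dst"},
    "network": {"N-src": "N-dst"},
}


def translate_server(server, store):
    """Build boot parameters that reference Destination cloud resources."""
    return {
        "name": server["name"],
        "flavor": store["flavor"][server["flavor"]],
        "image": store["image"][server["image"]],
        "networks": [store["network"][n] for n in server["networks"]],
    }


source_server = {
    "name": "web-1",
    "flavor": "F-src",
    "image": "I-src",
    "networks": ["N-src"],
}
boot_args = translate_server(source_server, store)
```

A missing key in the mapping raises KeyError in this sketch, which signals that a dependency was not replicated before the server itself.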

Identity Migration

Identity resources include tenants, roles and users from Keystone. Additionally, they include user-role assignments and user credentials such as password hashes.

The main difficulty in terms of identity migration is that Pumphouse must preserve ownership of resources in the Destination cloud. To achieve that, it has to talk to APIs on behalf of users who own resources it is trying to replicate. Version 1.0 of Pumphouse achieves this feat by generating a password for the user in the Destination cloud when migration begins, performing all operations for that user under this login, and then setting the password to the original value from the Source cloud via the database.
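The three steps described above can be sketched as follows. Everything here is a hypothetical in-memory model: dest_db stands in for the Destination cloud's credential storage, and act_as_user and create_resource are illustrative names, not Pumphouse or Keystone functions.

```python
import secrets


def act_as_user(user, source_hash, dest_db, operations):
    """Run operations as `user` in the Destination cloud, then restore the hash."""
    temp_password = secrets.token_urlsafe(16)
    dest_db[user] = temp_password  # 1. set a generated, temporary password
    try:
        # 2. perform every operation under the user's own login
        results = [op(user, temp_password) for op in operations]
    finally:
        # 3. write the original password hash back, bypassing the API
        dest_db[user] = source_hash
    return results


def create_resource(user, password):
    """Stand-in for an API call authenticated with the temporary password."""
    return f"resource-for-{user}"


dest_db = {}
results = act_as_user("alice", "$6$hypothetical-hash", dest_db, [create_resource])
```

The try/finally block is a design choice of this sketch: it makes sure the original hash is restored even if one of the operations fails.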

Server Migration Strategies

Pumphouse supports two options for migrating virtual servers in version 1.0: image- and snapshot-based migration. Although these strategies are similar, there are important differences.

Image-based migration

With image-based migration, Pumphouse suspends the server chosen for migration in the Source cloud, translates its meta-data based on the information about the migration of dependency resources from the Taskflow store, and issues a servers.boot request to the Destination cloud’s Compute API with the modified meta-data. The server boots from the Destination Glance image that is a copy of the image from which the server was instantiated in the Source cloud.

This is a simplified representation of the flow that implements server migration. Bold arrows represent direct links between tasks in a linear flow. Dashed arrows are indirect dependencies that result when the output of one task (arrow source) serves as input to another task (arrow target). UUIDs of resources have been replaced with capital letters for readability. Most meta-resources were excluded from this diagram.

If the boot request succeeds and the server becomes ‘active’ in the Destination cloud within the expected time, Pumphouse terminates the original instance of the virtual server in the Source cloud. This migration is fairly quick; however, it is only suitable for servers that run completely stateless applications, because no data from the server’s ephemeral disk in the Source cloud will be available to the server’s instance in the Destination cloud.
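The sequence of suspend, boot, wait and terminate can be sketched with plain dictionaries standing in for the two clouds. The function names and the timeout value are assumptions made for illustration. The article does not say what Pumphouse does when the new server fails to become active, so this sketch simply leaves the original server suspended in that case.

```python
import time


def wait_until_active(get_status, timeout, interval=0.01):
    """Poll get_status() until it returns 'active' or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        if get_status() == "active":
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def migrate_image_based(source, destination, server_id, boot_args, timeout=1.0):
    """Move one server; return the new server ID, or None on failure."""
    source[server_id]["status"] = "suspended"  # suspend in the Source cloud
    new_id = f"dst-{server_id}"
    # Stand-in for the boot request; this fake server becomes active at once.
    destination[new_id] = {"status": "active", **boot_args}
    if wait_until_active(lambda: destination[new_id]["status"], timeout):
        del source[server_id]  # terminate the original instance
        return new_id
    return None  # assumption: the original stays suspended


source = {"A": {"status": "active"}}
destination = {}
new_id = migrate_image_based(
    source, destination, "A", {"flavor": "F-dst", "image": "I-dst"}
)
```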

Snapshot-based migration

The snapshot-based migration scenario includes an additional step after the suspension of the server in the Source cloud. Pumphouse creates a snapshot of the server in the Glance store of the Source cloud and copies that snapshot to the Destination cloud as an ordinary image. It then boots the virtual server in the Destination cloud from that snapshot instead of the copy of the original image.

This graph depicts a flow that migrates the server by making a snapshot of it. Pumphouse copies the snapshot to the Destination cloud and uses it to boot the migrated instance of the server. Note that the graph was simplified for readability by excluding all meta-resources.

Snapshot migration allows you to preserve all data written to the virtual server’s ephemeral disk by applications and users. However, with snapshot-based migration, the server will be unavailable for the entire time needed to take the snapshot and transfer it between the Source and Destination Glance services, which, depending on the flavor of the instance, can be significant.
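To get a feel for the scale of that downtime, a back-of-the-envelope lower bound for the transfer alone can be computed from the snapshot size and the link speed. The numbers below are hypothetical and are not taken from any measurement; the estimate ignores the time to create the snapshot and any Glance processing overhead.

```python
def transfer_seconds(size_gb, link_gbit_per_s):
    """Lower bound on the time to move size_gb gigabytes over the link."""
    # 1 gigabyte is 8 gigabits, so divide the total gigabits by the link speed.
    return size_gb * 8 / link_gbit_per_s


# Hypothetical case: a 40 GB snapshot over a 1 Gbit/s link.
estimate = transfer_seconds(40, 1)
```

In this hypothetical case the transfer alone takes more than five minutes, during which the server stays suspended.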

How fast is it?

Resource migration with images is relatively fast in terms of user-visible downtime: it takes only about half a dozen HTTP requests plus the time required to actually boot the virtual server instance.

This diagram shows the results of the migration benchmark. We migrated 15 ‘medium’ virtual servers, each with 2 vCPUs, 4 GB of RAM, and 40 GB of ephemeral storage. Every server was assigned a floating IP address, which was used for ICMP and HTTP probing. All 15 servers use the same image. The time to migrate the first server includes the time to cache the image on the Compute node. In this graph, time is noted in seconds.

The Taskflow parallel engine allows Pumphouse to run multiple migrations at one time. Our benchmarks suggest that it usually takes about 30 to 60 seconds to boot a medium or large flavor virtual server. The flavor itself does not affect the time to boot the server; image size has an effect only in cases when Compute needs to cache the image.

In Conclusion…

This part of the Pumphouse Deep Dive series described the concepts behind the migration of virtual resources between OpenStack clouds. The most difficult challenge we’ve faced with virtual resources is the complex structure of dependencies between them, which we solved by leveraging Taskflow and its explicit graph-oriented structure of tasks.

In the following posts of Deep Dive we’ll see how Pumphouse manages bare metal to upgrade hypervisor hosts and reassign them to the target cloud. We’ll also cover limitations of the current version in a separate article. Stay tuned for updates from Mirantis Labs!
