Ceph is a de facto standard for building robust distributed storage systems. It enables users to get a reliable, highly available, and easily scalable storage cluster using commodity hardware. Ceph is also becoming the storage basis for production OpenStack clusters.
There are several ways of managing Ceph clusters, including:
- Using the ceph-deploy tool
- Using custom in-house or open source manifests for configuration management software such as Puppet or Ansible
- Using standalone solutions such as 01.org VSM or Fuel
Another solution in that third bucket is Decapod, a standalone solution that simplifies deployment of clusters and management of their lifecycles.
In this article, we’ll compare the different means for deploying Ceph.
Deployment using ceph-deploy
The ceph-deploy tool is available with Ceph itself. According to the official documentation:
The ceph-deploy tool is a way to deploy Ceph relying only upon SSH access to the servers, sudo, and some Python. It runs on your workstation, and does not require servers, databases, or any other tools. If you set up and tear down Ceph clusters a lot, and want minimal extra bureaucracy, ceph-deploy is an ideal tool. The ceph-deploy tool is not a generic deployment system. It was designed exclusively for Ceph users who want to get Ceph up and running quickly with sensible initial configuration settings without the overhead of installing Chef, Puppet or Juju. Users who want fine-control over security settings, partitions or directory locations should use a tool such as Juju, Puppet, Chef or Crowbar.
As described, ceph-deploy is mostly limited to quick cluster deployment. That makes it a good fit for a test environment, but a production deployment still requires thorough configuration using external tools.
Deployment using manifests for configuration management tools
Configuration management tools enable you to deploy Ceph clusters while retaining fine-grained control over how the cluster is tuned. It is also possible to scale or shrink these clusters using the same code base.
The main problem here is the steep learning curve of such solutions: you need to know every configuration option in detail, and you need to read the source code of the manifests, playbooks, or formulas to understand how they work.
Also, in most cases these manifests focus on a single use case: cluster deployment. They do not provide enough possibilities to manage the cluster after it is up and running. When you operate the cluster, if you need to extend it with new machines, disable existing machines to do maintenance, reconfigure hosts to add new storage pools or hardware, and so on, you will need to create and debug new manifests by yourself.
Deployment using standalone solutions
Decapod and 01.org VSM are examples of standalone configuration tools. They provide you with a unified view of the whole storage system, eliminating the need to understand the low-level details of cluster management. They integrate with a monitoring system, and they simplify operations on the cluster. Both have a gentle learning curve and offer best management practices through a simple interface.
Unfortunately, VSM has some flaws, including the following:
- It has tightly coupled business and automation logic, which makes it hard to extend the tool, or even customize some deployment steps
- By design, it is limited in scale. It works great for small clusters, but at a bigger scale the software itself becomes a bottleneck
- It lacks community support
- It has an overcomplicated design
Decapod takes a slightly different approach: it separates provisioning and management logic from the start, using an official community project, ceph-ansible. Decapod uses Ansible to do all remote management work, and uses its proven ability to create scalable deployments.
The Decapod architecture
Since Decapod uses Ansible to manage remote nodes, it does not need a complex architecture. Moreover, we’ve been trying to keep it as simple as possible. The architecture looks like this:
As you can see, Decapod has two main services: API and controller.
The API service is responsible for managing entities and handling HTTP requests. If you request execution of an action on a Ceph node, the API service creates a task in the database for the controller. Each subsequent request for that task returns its status.
The controller listens for new tasks in the database, prepares Ansible for execution (generates the Ansible inventory, injects variables for playbooks) and tracks the progress of execution. Every step of the execution is trackable in the UI. You can also download the whole log afterwards.
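To make the hand-off between the two services more concrete, here is a minimal sketch of the pattern in Python. It is not Decapod's actual code: the `tasks` table, the status names, and every function name below are hypothetical, and an in-memory SQLite database stands in for whatever database Decapod really uses.

```python
import sqlite3


def open_db():
    """Create an in-memory database with a hypothetical task table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE tasks ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " action TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'created')"
    )
    return db


def api_create_task(db, action):
    """API side: store the requested action and return the new task id."""
    cursor = db.execute("INSERT INTO tasks (action) VALUES (?)", (action,))
    db.commit()
    return cursor.lastrowid


def api_task_status(db, task_id):
    """API side: report the current status of a task, or None if unknown."""
    row = db.execute(
        "SELECT status FROM tasks WHERE id = ?", (task_id,)
    ).fetchone()
    return row[0] if row else None


def controller_step(db, run_action):
    """Controller side: pick up the oldest new task, run it, record the outcome.

    `run_action` is a stand-in for "prepare and run Ansible".
    Returns the processed task id, or None if there was nothing to do.
    """
    row = db.execute(
        "SELECT id, action FROM tasks WHERE status = 'created' ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    task_id, action = row
    db.execute("UPDATE tasks SET status = 'started' WHERE id = ?", (task_id,))
    db.commit()
    try:
        run_action(action)
        status = "completed"
    except Exception:
        status = "failed"
    db.execute("UPDATE tasks SET status = ? WHERE id = ?", (status, task_id))
    db.commit()
    return task_id
```

In this sketch, calling `api_create_task(db, "deploy_cluster")` leaves the task in the `created` state until `controller_step` runs, after which `api_task_status` reports `completed` or `failed`. The point is only that the API and the controller never talk to each other directly; the database is the single point of contact.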
Decapod performs every management action using a plugin, including cluster deployment and purging object storage daemons from hosts. Basically, a plugin is a playbook to execute, and a Python class used to generate the correct variables and dynamic inventory for Ansible based on the incoming class. Installation is dynamically extendable, so there is no need to redeploy Decapod with another set of plugins. Also, each plugin provides a set of sensible settings for your current setup, but if you want, you may modify every aspect and each setting.
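As a rough illustration of the plugin idea, the sketch below pairs a playbook name with a Python class that merges default settings with user overrides and emits an inventory in the JSON layout that Ansible expects from dynamic inventory sources (groups with a `hosts` list, plus a `_meta.hostvars` section). The class name, the playbook file name, the group names, and the default values are all assumptions made for this example; Decapod's real plugin interface is not shown in this article.

```python
import json


class DeployClusterPlugin:
    """Hypothetical plugin: a playbook plus the logic that feeds it."""

    # Hypothetical playbook file name.
    playbook = "deploy_cluster.yaml"

    # Illustrative defaults only; not the values any real tool ships with.
    defaults = {"journal_size": 512, "public_network": "10.0.0.0/24"}

    def make_vars(self, overrides=None):
        """Merge the plugin defaults with user-supplied overrides."""
        merged = dict(self.defaults)
        merged.update(overrides or {})
        return merged

    def make_inventory(self, servers, overrides=None):
        """Build a dynamic inventory dict.

        `servers` maps a group name to a list of host addresses.
        Every host receives the same merged variables in this sketch.
        """
        variables = self.make_vars(overrides)
        inventory = {"_meta": {"hostvars": {}}}
        for group, hosts in servers.items():
            inventory[group] = {"hosts": list(hosts)}
            for host in hosts:
                inventory["_meta"]["hostvars"].setdefault(host, dict(variables))
        return inventory


if __name__ == "__main__":
    plugin = DeployClusterPlugin()
    inventory = plugin.make_inventory(
        {"mons": ["10.0.0.10"], "osds": ["10.0.0.20", "10.0.0.21"]},
        overrides={"journal_size": 1024},
    )
    print(json.dumps(inventory, indent=2, sort_keys=True))
```

Keeping the defaults on the class and applying overrides at call time mirrors the behavior described above: the plugin proposes sensible settings, and the user can still change any of them before execution.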
Decapod provides a rich CLI and UI for managing clusters. We paid a lot of attention to the UI because we believe that a good interface helps users accomplish their goals without worrying about low-level details. If you want to do operational work on a cluster, Decapod will try to help you with the most sensible settings possible.
Another important feature of Decapod is its ability to audit changes. Every action or operation on the cluster is trackable, and it is always possible to check the history of modifications for every entity, from the history of executions on a certain cluster to changes in the name of a user.
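One common way to get this kind of history is to never overwrite an entity and instead append a new version on every change. The sketch below shows that idea in Python; it is an assumption about how such auditing can be modeled, not a description of Decapod's storage, and the `Version` and `AuditedEntity` names are hypothetical.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    """One immutable snapshot of an entity."""
    number: int
    data: dict
    changed_by: str
    timestamp: float


class AuditedEntity:
    """Hypothetical entity whose every change is kept as a new version."""

    def __init__(self, data, user):
        self._versions = []
        self._append(data, user)

    def _append(self, data, user):
        # Copy the data so later changes cannot alter stored snapshots.
        number = len(self._versions) + 1
        self._versions.append(Version(number, dict(data), user, time.time()))

    @property
    def current(self):
        """The latest version of the entity."""
        return self._versions[-1]

    def update(self, user, **changes):
        """Record a change as a new version instead of editing in place."""
        new_data = dict(self.current.data)
        new_data.update(changes)
        self._append(new_data, user)

    def history(self):
        """All versions, oldest first."""
        return list(self._versions)
```

For example, creating `AuditedEntity({"login": "alice"}, user="root")` and then calling `update("root", login="alice.smith")` leaves two versions in `history()`, so the earlier name and the author of the change remain available.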
The Decapod workflow is rather simple, and involves a traditional user/role permission-based model of access. To deploy a cluster, you need to create it, providing a name for the deployed cluster. After that, you select the management action you want to perform and the required servers, and Decapod generates sensible defaults for that action. If you're satisfied, you can execute the action with a single click. If not, you can tune these defaults first.
You may find more information about using Decapod in our demo:
So what do you think? What are you using to deploy Ceph now, and how do you think Decapod will affect your workflow? Leave us a comment and let us know.