Mistral: Workflow-as-a-Service for OpenStack
December 5, 2013
This is the first in a series of posts about the Mistral project. Here we present a simple Mistral workflow, while future posts will dive into some proof-of-concept details, development tasks, and production-stage use cases.
In mid-October, Mirantis kicked off a new project codenamed Mistral. This project, also known as OpenStack Workflow-as-a-Service, is based on the ideas from the Convection proposal and the TaskFlow library developed by Yahoo!’s Joshua Harlow and his team. From the very beginning, Joshua has been positive about Mistral and has contributed many interesting ideas that we put on our roadmap, along with some constructive criticism (which is sometimes even more valuable!).
Mistral was first widely presented at the Hong Kong Design Summit where it garnered a serious endorsement from major OpenStack community players. In this first post about Mistral, we’ll explain what Mistral is and how it’s going to be helpful for OpenStack users. We’ll intentionally avoid diving too deeply into technical details for now and rather concentrate on a use case.
As we’ve already mentioned, Mistral grew from the Convection proposal document. Basically, it addresses several important needs typically occurring in a distributed/cloud environment. To make a long story short, their shared essence is running sequences of cloud tasks in a unified manner, utilizing the same reusable mechanism. That simply means we can define such a sequence using a special syntax, or Domain Specific Language (DSL), whereby we specify all of the tasks, what they should do, and when, and have a special service coordinate the execution of the tasks according to this definition. This service would provide all of the necessary control points to manage the execution of the tasks (suspend/resume) and observe their state, for example, to find out whether a particular task or a sequence of tasks has successfully finished or has failed.
So it’s not that difficult to see that Mistral was designed to be that kind of service in the OpenStack infrastructure.
Does it all sound complicated? No worries, it’ll become clear as you read on.
Cloud Cron Use Case
The use case described here is referred to as “Cloud Cron.” The word cloud here is important. It means that we can combine cloud-meaningful tasks like “Create a VM” or “Upload an image into Glance” as well as low-level VM tasks like cleaning up log files into a single workflow and run it periodically or on demand. Or, alternatively, tasks can just be webhooks called on a specific schedule, which may be useful for a number of cases.
To better understand Mistral’s main purpose, imagine the following situation. You’re a system administrator managing an OpenStack tenant, and you have hundreds of virtual machines under your control. To complete the picture, assume that they run an online store application. In order to function properly, all of these virtual machines need some periodic jobs to be run on them. Say, every night at 1:00 am, you need to clean up excessively large log files located on the VMs. Usually, things like that are well addressed by using Cron, which is the most popular tool in the Unix world for running periodic jobs. However, let’s keep in mind that we’re talking about hundreds of machines! Well, experienced people may say, “What’s the big deal? We can just use pre-built VM images where we have all the Cron jobs configured as required.” This is totally fine until it turns out that 1:00 am is not a good maintenance time for your online store because people in a different time zone are active at exactly this time. Now we need to reconfigure all of the VMs to run the log cleanup jobs at 3:00 am.
This is the first important aspect where Mistral comes into play. Let’s see how it can be useful here.
Figure 1: Mistral provides a single point to configure periodic cloud jobs.
Initiating the cleanup on each of the hundreds of virtual machines can be implemented with Mistral. Basically, Mistral becomes a coordinator (or orchestrator) of the cloud tasks and provides a single place where all of these tasks are configured. Being a coordinator means that Mistral decides when and where an individual task will be run. It just signals to the VMs to run something when needed, and upon completion the VMs notify Mistral about the task’s success or failure.
Circling back to our example, it’s now fairly easy to change the schedule so that the cleanup jobs run every night at 3:00 am: the schedule is configured in one place instead of separately on each of hundreds of individual VMs.
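To make the "one place" idea concrete, here is a minimal Python sketch. It is not Mistral code and does not use Mistral's API; the `NightlyJob` class, its fields, and the VM names are all hypothetical and exist only for illustration. The point is simply that the schedule lives in a single record, so changing one value reschedules the job for every target VM.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class NightlyJob:
    """Hypothetical centrally stored job: one schedule shared by many VMs."""
    name: str
    hour: int           # hour of day (0-23) at which the job should fire
    targets: List[str]  # names of the VMs the job applies to

    def next_run(self, now: datetime) -> datetime:
        """Return the next firing time strictly after `now`."""
        candidate = now.replace(hour=self.hour, minute=0, second=0, microsecond=0)
        if candidate <= now:
            # Today's slot has already passed, so fire tomorrow.
            candidate += timedelta(days=1)
        return candidate


# One job definition covers all (hypothetical) VMs.
job = NightlyJob("cleanup_logs", hour=1, targets=[f"vm-{i}" for i in range(300)])

# Moving the maintenance window from 1:00 am to 3:00 am is a single change.
job.hour = 3
```

With per-VM Cron entries, the same change would have to be repeated on every machine or baked into a new image.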
Additionally, Mistral is going to provide a user-friendly interface to manage cloud tasks so that you don’t need to be a Unix expert to start using it.
Mistral as OpenStack Workflow-as-a-Service
A discerning reader may have noticed that so far we haven’t really covered why we also refer to Mistral as OpenStack Workflow-as-a-Service. For illustration, we will use our log file cleanup example and expand on it a bit.
Imagine that after all of the VM log files are cleaned up you’d like to get a text message about VMs whose free disk space is under 20 percent. Say that you also want to get an SMS in case something goes wrong with the file cleanup. As you can see, this is not just a plain set of tasks but rather a workflow that may go one way or another. And this is another important aspect of Mistral. Using it we can define not only individual independent tasks that should be run according to some schedule, but also sets of tasks that depend on each other. That simply means that we can tell Mistral, “After successfully cleaning up the log files, send an SMS about the disk usage; if something goes wrong with the log files, send an SMS with a description of the failure(s).” Graphically, such a setup may look like the following diagram.
Figure 2: State transitions in our sample Mistral workflow
NOTE: Mistral is not responsible for executing specific tasks such as “Send a failure SMS.” Mistral only works with the generic task descriptions it is fed. The core Mistral engine is completely agnostic when it comes to the specifics of any particular business domain. We’ll dig into the details in future blog posts.
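The success and failure branches described above can be sketched in a few lines of Python. This is only an illustration under our own assumptions, not Mistral's DSL or engine: the `WORKFLOW` dictionary, the `on_success`/`on_error` keys, the task names, and the `run_workflow` function are all hypothetical. In keeping with the note above, the sketch knows nothing about what a task does; the caller supplies the actions as plain callables.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical workflow description: each task names the task to run next
# on success and on failure. A missing key means the workflow ends there.
WORKFLOW: Dict[str, Dict[str, str]] = {
    "cleanup_logs": {
        "on_success": "send_disk_usage_sms",
        "on_error": "send_failure_sms",
    },
    "send_disk_usage_sms": {},
    "send_failure_sms": {},
}


def run_workflow(
    workflow: Dict[str, Dict[str, str]],
    start: str,
    actions: Dict[str, Callable[[], None]],
) -> List[str]:
    """Run tasks starting at `start`, following success/failure transitions.

    Returns the names of the tasks that were attempted, in order.
    """
    attempted: List[str] = []
    current: Optional[str] = start
    while current is not None:
        attempted.append(current)
        try:
            actions[current]()   # the engine does not care what this does
            transition = "on_success"
        except Exception:
            transition = "on_error"
        current = workflow[current].get(transition)
    return attempted
```

If the `cleanup_logs` action raises an exception, the sketch follows the `on_error` transition and attempts `send_failure_sms`; otherwise it moves on to `send_disk_usage_sms`.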
Technically speaking, Mistral processes graphs of tasks and figures out the correct order of execution in each individual case. A user just needs to specify what each task depends on and upload this description to Mistral using the REST API.
If a workflow has a lot of tasks and dependencies and you don’t have Mistral, it’s hard to figure out which tasks should run first and which should follow. Therefore, one of the interesting technical problems Mistral addresses is task dependency resolution: an application can offload this responsibility to Mistral instead of working out the execution order itself.
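To give a feel for what dependency resolution involves, here is a small sketch that uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 and later). It is not how Mistral is implemented; the `execution_order` helper and the task names, including the intermediate `check_disk_usage` step, are hypothetical. The user only states what each task depends on, and the order is computed from that.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List, Set


def execution_order(dependencies: Dict[str, Set[str]]) -> List[str]:
    """Return an order in which every task comes after its dependencies.

    `dependencies` maps a task name to the set of tasks it depends on.
    Raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical task graph: each entry lists what must finish first.
deps = {
    "send_disk_usage_sms": {"check_disk_usage"},
    "check_disk_usage": {"cleanup_logs"},
    "cleanup_logs": set(),
}

order = execution_order(deps)
```

The tasks are deliberately listed out of order in `deps`; the sorter still places `cleanup_logs` before `check_disk_usage`, and `check_disk_usage` before `send_disk_usage_sms`. A circular definition is rejected with `CycleError` rather than silently producing a wrong order.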
The Mistral team is now actively working on the implementation, and Cloud Cron is considered one of the most important use cases that Mistral intends to support. At this point, we’re also considering other use cases, such as cloud environment deployment and live migration, which we may also cover separately. As Mistral evolves, we’ll be coming up with lots of new interesting applications.
What about your story? Are you using any workflow systems? If so, how? Since Mistral is very young, there are plenty of opportunities to contribute, even by just providing use cases, and in many ways determine its further evolution!