I’ve got two teenage boys who love reading survival manuals. On weekends they can’t wait to go out on camping excursions with the bare minimum of equipment, relying on ingenuity and skill to overcome every challenge. They’re not doing it because they like to be miserable; they’re doing it because they like the challenge of making sure they have the right tools, knowledge and level of preparedness, and that they make the right decisions. In other words, they enjoy getting right all of the factors that make the difference between a great experience and a poor one when you are trying to accomplish your objectives in the most efficient way.
Things don’t change when you grow up and go into the office instead of the woods. In IT operations, keeping the infrastructure and operational environments that support application deployment running smoothly also takes both skills and the right set of tools, so you don’t spend your weekends and evenings fixing things that didn’t have to break in the first place.
Just as a Swiss army knife, a flashlight, sunscreen, a hammock, a raincoat or a fishing rod can make all the difference in the woods, at work you also need the right tools at your disposal. You’ve trained hard, you are a hard-core professional, and you deserve the best tools you can get for the job, which brings me to today’s topic: Mirantis Cloud Platform.
Here at Mirantis, we’re pretty excited about what’s emerging to help IT professionals like you support all the application needs of your customers, internal and external.
Today I want to cover a few new options from Mirantis that you may want to consider if you are looking to expand your capabilities, including capacity monitoring, increased robustness, orchestration and DevOps, and overall cloud health.
Mirantis StackLight, our 100% open-source Operations Support System (OSS) for continuous monitoring and maximum availability, now includes a new DevOps Portal that provides a holistic view of your Mirantis Cloud Platform (MCP) environment.
New DevOps Portal
This new aggregated toolset significantly reduces the complexity of Day 2 cloud operations through services and dashboards that combine a high degree of automation with availability statistics, resource and capacity utilization, continuous testing, logs, metrics, and notifications. What’s more, the new DevOps Portal enables cloud operators to manage larger clouds with greater uptime without having to convert their entire staff into open source developers.
The web UI offers services that include cloud intelligence, capacity management, and a subset of the tools made available within Simian Army.
Included services within the DevOps portal
Let’s take a look at each one of the components in detail.
MCP enables you to ensure available capacity by providing a live look at what’s going on inside your cloud using the Cloud Intelligence Service and Cloud Capacity Management.
Cloud Intelligence Service: This service collects and stores data from MCP services such as OpenStack, Kubernetes, bare metal, and so on. You can then query the data as part of use cases such as cost visibility, business insights, cost comparison, chargeback/showback, cloud efficiency optimization, and IT benchmarking. Operators can interact with the resource data using a wide range of queries, such as searching for the last VM rebooted, total memory consumed by the cloud, number of containers that are operational, and so on.
Cloud Capacity Management: This dashboard provides point-in-time resource consumption data for OpenStack by displaying parameters such as total CPU utilization, memory utilization, disk utilization, and number of hypervisors. This dashboard is based on data collected by the Cloud Intelligence Service, and can be used for cloud capacity management and other business optimization aspects.
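To make the kinds of queries described above a little more concrete, here is a minimal Python sketch. It does not use the Cloud Intelligence Service API; the `VmRecord` and `Hypervisor` shapes, the function names, and the sample values are all hypothetical, and the "utilization" here is simply allocated capacity divided by total capacity, which is an assumption rather than how the dashboard computes its figures.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class VmRecord:
    # Hypothetical record shape, not the real Cloud Intelligence Service schema.
    name: str
    memory_mb: int
    vcpus: int
    last_reboot: Optional[datetime]


@dataclass
class Hypervisor:
    # Hypothetical record shape for a compute host.
    name: str
    total_memory_mb: int
    total_vcpus: int


def last_rebooted(vms):
    """Return the VM with the most recent reboot, or None if none has rebooted."""
    rebooted = [vm for vm in vms if vm.last_reboot is not None]
    return max(rebooted, key=lambda vm: vm.last_reboot, default=None)


def total_memory_mb(vms):
    """Total memory allocated to all VMs, in MB."""
    return sum(vm.memory_mb for vm in vms)


def utilization(vms, hypervisors):
    """Point-in-time allocated share of memory and vCPUs, as fractions."""
    mem_cap = sum(h.total_memory_mb for h in hypervisors)
    cpu_cap = sum(h.total_vcpus for h in hypervisors)
    return {
        "hypervisors": len(hypervisors),
        "memory": total_memory_mb(vms) / mem_cap if mem_cap else 0.0,
        "vcpus": sum(vm.vcpus for vm in vms) / cpu_cap if cpu_cap else 0.0,
    }


# Made-up sample inventory, for illustration only.
VMS = [
    VmRecord("web-1", 4096, 2, datetime(2017, 9, 1, 8, 0)),
    VmRecord("db-1", 8192, 4, datetime(2017, 9, 3, 22, 15)),
    VmRecord("batch-1", 4096, 2, None),
]
HYPERVISORS = [Hypervisor("cmp-1", 32768, 16), Hypervisor("cmp-2", 32768, 16)]

print(last_rebooted(VMS).name)        # most recently rebooted VM
print(total_memory_mb(VMS))           # memory allocated across the cloud
print(utilization(VMS, HYPERVISORS))  # capacity snapshot
```

The point of the sketch is only that once resource data is collected in one place, questions like "which VM rebooted last?" and "how full is the cloud?" become simple queries over the same records.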
With this module you can evaluate security and improve utilization.
Security Monkey & Janitor Monkey: In this release, MCP includes Security Monkey and Janitor Monkey (and their respective dashboards), two of the tools that make up Simian Army. Simian Army is a growing set of open source tools originally created by Netflix to run continuous tenant-level tests on a production cloud to make it more antifragile. The closest traditional IT analogy to the Simian Army is online diagnostics. Security Monkey runs tests that track and evaluate security-related tenant changes and configurations. Janitor Monkey constantly looks to reclaim unused tenant resources for improved cloud utilization.
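The idea of reclaiming unused tenant resources can be illustrated with a small Python sketch. This is not Janitor Monkey's actual rule engine; the `Volume` shape, the `find_reclaimable` function, and the 30-day idle threshold are assumptions chosen purely for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Volume:
    # Hypothetical resource shape, not a real OpenStack or Simian Army type.
    volume_id: str
    attached_to: Optional[str]  # instance ID, or None when detached
    last_used: datetime


def find_reclaimable(volumes, now, max_idle=timedelta(days=30)):
    """Return IDs of detached volumes that have been idle longer than max_idle."""
    return [
        v.volume_id
        for v in volumes
        if v.attached_to is None and now - v.last_used > max_idle
    ]


# Made-up sample data, for illustration only.
NOW = datetime(2017, 10, 1)
VOLUMES = [
    Volume("vol-a", None, datetime(2017, 8, 1)),       # detached, idle ~2 months
    Volume("vol-b", "vm-7", datetime(2017, 7, 1)),     # still attached, so kept
    Volume("vol-c", None, datetime(2017, 9, 20)),      # detached, but recently used
]

print(find_reclaimable(VOLUMES, NOW))
```

A real cleanup tool would typically flag candidates and notify the owner before deleting anything; the sketch stops at identifying candidates.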
Orchestration and DevOps
Here we have a great set of tools to help you, among other things, automate workflows of jobs in response to specific events.
Runbooks Automation: Clouds are simply too complex to be managed using traditional manual processes. Instead, they require a high degree of automation in which events or time durations trigger the execution of specific jobs. The Runbooks Automation service, based on Rundeck, accomplishes this by enabling operators to create a workflow of jobs that get executed at specific time intervals or in response to specific events (such as policy-driven events). For example, operators can now automate periodic backups, weekly report creation, specific actions in response to a failed Cinder volume, and so on. Note, however, that Runbooks Automation is not a lifecycle management tool; it’s not appropriate for reconfiguring, scaling, or updating MCP itself. (LCM for an MCP cloud is performed exclusively with DriveTrain; see below.)
DriveTrain: This toolchain provides access to relevant CI/CD LCM tooling such as Git, Gerrit, Jenkins, Artifactory, etc., to automate the delivery of change controls to the infrastructure and its services. This includes scaling the cloud, patching software packages, and full environment upgrades.
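The event-driven model behind Runbooks Automation, where an event triggers an ordered set of jobs, can be sketched in a few lines of Python. This is not Rundeck's API; `RunbookRegistry`, the event name `cinder.volume.failed`, and the two jobs are hypothetical and exist only to show the pattern.

```python
from collections import defaultdict


class RunbookRegistry:
    """Hypothetical registry mapping event types to an ordered list of jobs."""

    def __init__(self):
        self._jobs = defaultdict(list)

    def on(self, event_type):
        """Decorator that registers a job to run when event_type occurs."""
        def register(job):
            self._jobs[event_type].append(job)
            return job
        return register

    def dispatch(self, event_type, payload):
        """Run every job registered for event_type, in registration order."""
        return [job(payload) for job in self._jobs.get(event_type, [])]


registry = RunbookRegistry()


@registry.on("cinder.volume.failed")  # assumed event name, for illustration
def collect_logs(payload):
    return f"collect logs for {payload['volume_id']}"


@registry.on("cinder.volume.failed")
def notify_operator(payload):
    return f"notify operator about {payload['volume_id']}"


results = registry.dispatch("cinder.volume.failed", {"volume_id": "vol-42"})
print(results)
```

Time-based jobs such as periodic backups follow the same shape, with a scheduler rather than an event source calling `dispatch`.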
You can find another great set of tools to gain broader monitoring capabilities, additional metrics and a higher level of alerts and notifications in the Cloud Health section.
Cloud Health Service: This service collects availability results for all OpenStack services and failed customer (tenant) interactions (FCI) for a subset of those services. These metrics are displayed so that operators can see both point-in-time health status and trends over time.
Metrics: All metrics collected by Prometheus (see below) are visualized through Grafana dashboards.
Logs: Logs for various MCP services are aggregated in Elasticsearch and visualized through Kibana dashboards.
Additionally, StackLight now expands monitoring coverage for Kubernetes, containers and Ceph, and adds deeper Kubernetes log processing. The architecture has undergone a major evolution with the inclusion of a monitoring and alerting solution built using the open source Prometheus project. Prometheus is a mature open source monitoring system, now maintained as an initiative of the Cloud Native Computing Foundation (CNCF). It approaches the age-old monitoring and alerting problem with a web-scale architecture that uses a dimensional data model, a powerful query engine, Grafana visualization integration, efficient storage, and precise alerting. Prometheus is also easy to operate and provides numerous third-party integrations. StackLight has evolved to use Telegraf to collect metrics and Prometheus Alertmanager for notifications/alerts. StackLight also provides InfluxDB, using it for long-term, resilient metrics storage and as a back-end for Ceilometer to enable Heat-based auto-scaling.
Notifications Service: A notifications dashboard displays all alerts/notifications generated by Prometheus Alertmanager. This screen replaces the previous Nagios tool in StackLight. Alertmanager enables MCP customers to configure where alerts are going to be sent — support is provided for many kinds of endpoints, including email, SMS, PagerDuty, and others.
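To show what the dimensional data model mentioned above looks like in practice, here is a small Python sketch that builds a URL for the Prometheus HTTP API instant-query endpoint (`/api/v1/query`) and parses a response in the documented JSON shape. The host names and the sample payload are made up, and the helper names `build_query_url` and `parse_instant_vector` are hypothetical; no network call is made.

```python
import json
from urllib.parse import urlencode


def build_query_url(base_url, expr):
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url}/api/v1/query?{urlencode({'query': expr})}"


def parse_instant_vector(payload):
    """Turn an instant-vector response into (labels, value) pairs."""
    if payload.get("status") != "success":
        raise ValueError("query did not succeed")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError("expected an instant vector")
    # Each sample carries its labels (the dimensions) and a [timestamp, "value"] pair.
    return [(s["metric"], float(s["value"][1])) for s in data["result"]]


# Made-up response for the built-in `up` metric; host names are placeholders.
SAMPLE = json.loads("""
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {"metric": {"__name__": "up", "job": "node", "instance": "ctl01:9100"},
       "value": [1506816000, "1"]},
      {"metric": {"__name__": "up", "job": "node", "instance": "ctl02:9100"},
       "value": [1506816000, "0"]}
    ]
  }
}
""")

url = build_query_url("http://prometheus.example:9090", "up")
down = [labels["instance"] for labels, value in parse_instant_vector(SAMPLE) if value == 0.0]
print(url)
print(down)  # instances whose last scrape failed
```

Because every sample is identified by its labels, the same query result can be sliced by job, instance, or any other dimension without defining a new metric for each combination.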
All of these new integrations and capabilities are specifically designed for MCP to provide a view into the open cloud with optimized collectors, dashboards, alarms, faults and event correlation.
In other words, even though open cloud may feel like the Wild, Wild West, there’s no need to rough it when supporting the challenging needs of your business. Once you gain better insights that enable you to minimize unpredictability and better manage your work environment, you will be able to leave unplanned surprises to your weekend outings. Here at Mirantis, we aim to provide peace of mind with the right set of tools in support of your application deployment needs, leaving it up to you to decide how to spend your weekends.