The Road to Hong Kong—OpenStack Summit Speakers #3: Savanna, Elastic Hadoop on OpenStack

Sergey Lukjanov - October 30, 2013

This is the third in a series of posts by speakers at the Hong Kong OpenStack Summit. Today we feature the agenda for Ilya Elterman, Sergey Lukjanov, and Matthew Farrellee’s talk about provisioning and managing Hadoop clusters on OpenStack using Savanna, scheduled for November 7, from 2:40 pm to 3:20 pm.

Savanna supports two key use cases: on-demand cluster provisioning and on-demand Hadoop job execution through Elastic Data Processing (EDP). This presentation will focus on the general project vision and EDP functionality. It will provide an introduction to the Savanna project, review the features implemented in version 0.3, and discuss future plans. It will also cover the key architectural aspects and include a live demo. The demo will show how users can execute Hadoop jobs in a single click, using a pre-configured Hadoop cluster template, on data stored in Swift.

The latest release of Savanna, an open-source project recently accepted into OpenStack Incubation, now provides Elastic Data Processing (EDP), a feature that allows single-click MapReduce job creation and launch. We’ll be talking about the full scope and features in version 0.3 next Thursday in Hong Kong, and we hope you will come to check it out. Here we share a few points that we will discuss there and give you an overview of a live EDP demo we will include in our talk. We can’t wait to share it.

1. Introduction to Savanna: What is it?

Savanna provides two management APIs: one for Hadoop clusters and one for jobs. In this part of our presentation, we will cover use cases, a general project overview, and the direction in which the project is going.

We’ll talk about the following use cases:

  • Fast cluster provisioning for development and QA

  • Dedicated clusters for each tenant, which resolves security and isolation issues for Hadoop multi-tenancy

  • EDP–Utilization of the unused compute capacity for bursty workloads

  • EDP–Running Hadoop workloads in a few clicks without expertise in Hadoop ops

  • Centralized cluster management and monitoring for administrators

2. The current state of EDP–Provisioning

When we discuss the current state of EDP, we will cover:

  • The REST API for executing MapReduce jobs without exposing the details of the underlying infrastructure (similar to AWS Elastic MapReduce [EMR]), including the integration of:

    • Pluggable data sources: Swift

    • Pig and Hive job types

    • Oozie for workflow management

  • A user-friendly UI for ad-hoc analytics queries based on Pig or Hive

3. Live demo–Provisioning, EDP, and transient clusters

We’re especially excited to share a live demo that will feature a one-click MapReduce job execution flow on data located in Swift.

4. Roadmap for the Icehouse release cycle

In this part, we’ll talk about our future plans and our short-term and long-term roadmaps. The scope for Icehouse, the future of the Savanna architecture, and integration with other OpenStack projects will be defined in separate Design Summit sessions held on Friday, November 8, from 1:30 pm to 4:50 pm. Please check out our Design Summit schedule and join us.

In essence, the project team is working toward integration with the OpenStack ecosystem, particularly Heat, Ceilometer, Tempest, and DevStack. Other plans for the future include code hardening, EDP enhancements that incorporate external Hadoop Distributed File System (HDFS) and relational database management system (RDBMS) data sources, and performance testing.

5. Interesting stats

We will also share some interesting facts about our contributors, code, reviews, and community at large. Savanna’s reviewer team has grown over the last three months from 14 to 24 active members from five companies–Mirantis, Red Hat, Hortonworks, Rackspace, and IBM.

Ilya Elterman runs the Cloud Platform Engineering organization at Mirantis, responsible for creating elastic platform services on top of OpenStack IaaS. Sergey Lukjanov is the Project Technical Leader of the Savanna project, and his main responsibilities are architecture design and community-related work in Savanna. Matthew Farrellee is a Principal Software Engineer and Engineering Manager at Red Hat, with over a decade of experience in distributed and computational system development and management.
