Austin Summit Preview: Architecture and Best Practices to Deploy Hadoop and Spark Clusters with Sahara

As the Austin OpenStack Summit approaches, we’ll be bringing you previews of some of the sessions we’re participating in. First up: our session on deploying big data solutions with Hadoop and OpenStack.

Hadoop is the world standard for big data platforms, built with performance and data consistency in mind. While the Hadoop ecosystem today is used by major enterprises all over the world, that doesn’t mean it’s easy to deploy. Hadoop has traditionally been considered a bare-metal solution, and building huge clusters requires a lot of patience and expertise from DevOps teams.

Deploying Hadoop to an OpenStack private cloud instead of bare metal raises questions and concerns, including:

  • Can Hadoop keep its performance on VMs?
  • How reliable is virtualized storage for HDFS data?
  • Will the cloud reduce deployment complexity, or will it bring its own caveats?

The simple answer is yes, you can run Hadoop on OpenStack. To get the details, join us for Architecture and Best Practices to Deploy Hadoop and Spark Clusters with Sahara at the OpenStack Summit in Austin on Wednesday, April 27. I’ll be speaking, along with Sergey Lukjanov, the Sahara PTL, and Paul Work, who manages the Cluster Infrastructure Lab in Intel’s Cloud Products Group. We’ll share what we’ve learned from performance tests run in the Intel high-performance lab, where Mirantis OpenStack is installed. We’ll compare bare-metal Hadoop with a virtualized cluster installed by the OpenStack Sahara service, and share the Hadoop settings and configurations that make the best use of CPU and I/O resources, along with best practices for selecting storage settings, and more. We hope to see you there!

Add this session to your summit calendar: Architecture and Best Practices to Deploy Hadoop and Spark Clusters with Sahara

If you haven’t registered for the OpenStack Summit yet, register here.
