Event Details

Please note: All times listed below are in the Central Time Zone.

Architecture and Best Practices to Deploy Hadoop and Spark Clusters with Sahara

Big Data

Using Big Data environments as analytics engines to gain useful insight from massive amounts of data in a manageable and timely manner can be a complex challenge. Using Sahara as the orchestration tool helps cloud architects provide high-performance production Hadoop and Spark environments, enabling private cloud deployments that are easy for data scientists to manage and consume. We will present a reference architecture for a Hadoop cluster built on Intel-based systems and Sahara. It focuses on performance and ease of use, and addresses questions about security, storage, and support for future releases without reinstalling.

Attendees will learn:

- Hadoop configurations to best utilize CPU and I/O resources

- Storage best practices

- New ways to orchestrate a cluster with various Hadoop and Spark workloads

- Published performance benchmark results

- Life cycle management

What can I expect to learn?

Attendees are expected to have an understanding of analytics workloads, some understanding of the Hadoop framework, and knowledge of storage solutions for OpenStack. Attendees with just this knowledge will be able to make use of the information provided, mainly around return on investment, but some understanding of OpenStack internals is recommended.

Attendees will receive a complete overview of:

- Hadoop settings and configuration to best utilize CPU and I/O resources

- Best practices for selection of storage settings

- New ways to submit jobs and orchestrate clusters with various Hadoop distributions

- Spark workload orchestration

- Published results on different benchmarks used to validate cluster performance

- Life cycle management of deployments

Wednesday, April 27, 4:30pm-5:10pm (9:30pm - 10:10pm UTC)

Austin Convention Center - Level 4 - Ballroom F

Difficulty Level: Intermediate

Tags: Architect Ops Telecom Enterprise Project Technical Lead (PTL) Upstream Dev Cinder Sahara

Sergey Lukjanov

Mirantis

Sergey is the head of the Mirantis Application Platform (MAP, Cloud Native Continuous Delivery based on Spinnaker) initiative at Mirantis. He is responsible for the architecture, design, and execution of MAP. Previously, Sergey ran the OpenStack Containerized Control Plane project at Mirantis and was the Project Technical Lead of the OpenStack Data Processing program ("Sahara") and OpenStack...

Paul Work

Cluster Infrastructure Manager at Intel Corporation

Paul Work has been with Intel Corporation for 22 years, beginning as an application engineer in the HPC market segment, optimizing software and supporting customers at Wright Labs, Sandia National Labs, and Lawrence Livermore National Labs. He now manages the Cluster Infrastructure Lab in Intel’s Cloud Products Group, supporting software research and development into applications...

Nikita Konovalov

Mirantis

Nikita has worked on OpenStack Data Processing (Sahara) since the early days of the project. He implemented the initial version of the Sahara UI, which was later accepted into the main OpenStack Dashboard codebase. He added Sahara benchmarking support to the Rally project and is now responsible for Sahara scale testing activities at Mirantis. Nikita has also been participating in...
