Big Data Training Chennai

Big Data Hadoop Training:

Big Data Training Chennai provides an introduction to Hadoop. The course is designed for those who are new to Big Data and Hadoop, and gives a high-level overview of the technology used to build complex and powerful data processing applications.

Course Outline:

  • Hadoop Overview
  • Why Hadoop
  • Hadoop Basic Concepts
  • Hadoop Ecosystem – MapReduce, Hadoop Streaming, Hive, Pig, Flume, Sqoop, HBase, Oozie, Mahout
  • Where Hadoop fits in the Enterprise
  • Review use cases

Hadoop Fundamentals and Architecture

  • Why Hadoop, Hadoop Basics and Hadoop Architecture
  • HDFS and MapReduce

MapReduce Programming

  • Fundamentals
  • Anatomy of MapReduce Job Run
  • Job Monitoring, Scheduling
  • Sample Code Walk Through
  • Hadoop API Walk Through
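
As a rough illustration of the kind of job covered in the sample code walk through, the sketch below imitates the map, shuffle, and reduce phases of a word count in plain Python. It does not use the Hadoop API; the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group emitted values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reducer: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence on an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "big hadoop"]))
```

In a real cluster the mapper and reducer run on different nodes and the shuffle is handled by the framework; here everything runs in a single process purely to show the data flow.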

MapReduce Features

  • Counters, Exercise
  • Map Side Join, Exercise
  • Reduce Side Join, Exercise
  • Sorting, Exercise
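
To give a feel for the reduce side join exercise, here is a minimal single-process sketch in Python, assuming two small in-memory datasets keyed by a user id. The record layouts and the names map_users, map_orders, reduce_join, and reduce_side_join are hypothetical and are not part of any Hadoop API.

```python
from collections import defaultdict


def map_users(record):
    """Mapper for the users dataset: tag each value with its source."""
    user_id, name = record
    return user_id, ("user", name)


def map_orders(record):
    """Mapper for the orders dataset: tag each value with its source."""
    user_id, item = record
    return user_id, ("order", item)


def reduce_join(key, tagged_values):
    """Reducer: pair every user name with every order that shares the key."""
    names = [value for tag, value in tagged_values if tag == "user"]
    items = [value for tag, value in tagged_values if tag == "order"]
    return [(key, name, item) for name in names for item in items]


def reduce_side_join(users, orders):
    """Group tagged records by key (the shuffle), then join within each group."""
    grouped = defaultdict(list)
    tagged = [map_users(u) for u in users] + [map_orders(o) for o in orders]
    for key, value in tagged:
        grouped[key].append(value)
    joined = []
    for key in sorted(grouped):
        joined.extend(reduce_join(key, grouped[key]))
    return joined
```

Keys that appear in only one dataset produce no output, so this behaves like an inner join. A map side join avoids the shuffle by loading the smaller dataset into memory on each mapper instead.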

Hadoop Ecosystem Overview

  • Hive
  • HBase
  • ZooKeeper
  • Pig
  • Mahout
  • Flume
  • Sqoop
  • Oozie

Hive Introduction

  • Why Hive?
  • Hive compared with SQL
  • Use Cases

Hive Architecture – Building Blocks

  • Hive CLI and Language (Exercise)
  • HDFS Shell
  • Hive CLI
  • Data Types
  • Hive Cheat-Sheet
  • Data Definition Statements
  • Data Manipulation Statements
  • Select, Views, GroupBy, SortBy/DistributeBy/ClusterBy/OrderBy, Joins
  • Built-in Functions
  • Union, Sub Queries, Sampling, Explain

Pig Introduction

  • Position Pig in Hadoop ecosystem
  • Why Pig and not MapReduce
  • Simple example (slides) comparing Pig and MapReduce
  • Who is using Pig now and what are the main use cases

Pig Architecture

  • Discuss high level components of Pig

Pig Grunt – How to Start and Use

Pig Latin Programming

  • Data Types
  • Cheat sheet
  • Schema
  • Expressions
  • Commands and Exercise
  • Load, Store, Dump, Relational Operations, Foreach, Filter, Group, Order By, Distinct, Join, Cogroup, Union, Cross, Limit, Sample, Parallel

HBase Introduction

  • What it is, what it is not, its history and common use cases
  • HBase Client – Shell, exercise

HBase Architecture

  • Building Components
  • Storage, B+ tree, Log Structured Merge Trees
  • Region Lifecycle
  • Read/Write Path

HBase Schema Design

  • Introduction to HBase schema
  • Column Family, Rows, Cells, Cell timestamp
  • Deletes
  • Exercise – build a schema, load data, query data

HBase Operations and Cluster Management

Performance Tuning

Advanced Features

NoSQL Introduction

  • Traditional RDBMS approach
  • NoSQL introduction
  • Hadoop & HBase positioning

Course Prerequisites:

Familiarity with data warehouses, database systems, and distributed systems is assumed. Existing knowledge of Hadoop is not required.

 
