Big Data Hadoop Developer Training Chennai

Hadoop – The Scalable Data Framework

The Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using a simple programming model. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high availability, the library itself is designed to detect and handle failures at the application layer, thereby delivering a highly available service on top of a cluster of computers, each of which may be prone to failures.
Gartner defines Big Data as high-volume, high-velocity and high-variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making. According to IBM, 80% of data captured today is unstructured: it comes from sensors used to gather climate information, posts to social media sites, digital pictures and videos, purchase transaction records, and cell phone GPS signals, to name a few. All of this unstructured data is Big Data. Organizations are discovering that important predictions can be made by structuring and analyzing this data. The amount of information in the world is now measured in zettabytes.
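
To give a feel for the "simple programming model" mentioned above, here is a minimal single-process sketch of the map, shuffle and reduce steps, using word count as the example. It is plain Python and does not use the Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are illustrative assumptions, and a real Hadoop job would run these steps in parallel across many machines.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reducer: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in a single process (illustration only)."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    sample = ["big data needs big clusters", "hadoop scales data processing"]
    print(word_count(sample))
```

The mapper and reducer never share state; they only exchange key/value pairs. That separation is what lets a framework spread the same logic over many nodes.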

What will participants learn?

Attendees will learn the following topics through lectures and hands-on exercises:
– Understand Big Data & Hadoop Ecosystem
– Hadoop Distributed File System – HDFS
– Use the MapReduce API and write common algorithms
– Best practices for developing and debugging MapReduce programs
– Advanced MapReduce Concepts & Algorithms
– Hadoop Best Practices, Tips and Techniques
– Managing and Monitoring a Hadoop Cluster
– Importing and exporting data using Sqoop
– Leverage Hive & Pig for analysis

Intended Audience: Architects and developers who wish to write, build and maintain Apache Hadoop jobs.

Course Prerequisites: Participants should have a basic understanding of Linux.

  • Big Data Characteristics, Challenges with traditional systems

Hadoop Overview & its Ecosystem

  • Anatomy of Hadoop Cluster, Installing and Configuring Hadoop

  • Hands-On Exercise

HDFS – Hadoop Distributed File System

  • HDFS Architecture, Name Nodes, Data Nodes and Secondary Name Node

  • Hands-On Exercise

MapReduce Anatomy

  • How MapReduce Works

  • The Mapper & Reducer, Data Types, Input & Output Formats

Developing MapReduce Programs

  • Setting up Eclipse Development Environment, Creating MapReduce Projects, Debugging and Unit Testing

  • Developing a MapReduce algorithm for a real-world scenario

  • Hands On Exercises

Advanced MapReduce Concepts

  • Combiner, Partitioner, Counter, Compression, Setup and teardown, Speculative Execution, Zero Reducer and Distributed Cache

Advanced MapReduce Algorithms

  • Sorting, Searching, Multiple Inputs, Chaining multiple jobs

  • Joins, Handling Binary & Unstructured data

 

Advanced Tips & Techniques

  • Determining optimal number of reducers, skipping bad records

  • Partitioning into multiple output files & Passing parameters to tasks

  • Hadoop Cluster sizing and capacity planning

Monitoring & Management of Hadoop

  • Managing HDFS with tools like fsck and dfsadmin

  • Using HDFS & JobTracker Web UI

  • Routine Administration Procedures

  • Hands On Exercises

Sqoop

  • Importing and exporting data from and to an RDBMS

  • Hands On Exercises – Import and Export

Hive

  • Hive Basics, Internal & External Tables, Partitioning, Buckets

  • Writing queries – Joins, Union, Dynamic partitioning, Sampling

  • Hands On Exercise – Structured data analysis

Pig

  • Pig Basics, Loading data files

  • Writing queries – SPLIT, FILTER, JOIN, GROUP, SAMPLE, ILLUSTRATE etc.

  • Hands On Exercise – Semi-structured Data Analysis

Setting up a Hadoop Cluster (2 Nodes)

  • Demo by Instructor

Hadoop Best Practices

Register Now!

Learn Big Data from Big Data Solutions Architects!

Reach us to Enroll! 100% Placements

Key Features

  • Cloud Server Access
  • Training = Enterprise Scale
  • Advanced Technology Coverage + PoC Project Work
  • 24/7 Technical Support

Call: +91 99627 74612

 

 
