Big Data & Hadoop

Duration: 60 Hrs
Prerequisites: Basic Linux and Java knowledge
Recommended Next Course:

Introduction to Hadoop and HDFS (Hadoop Distributed File System)

  • Design of HDFS
  • HDFS Concepts
  • Command Line Interface
  • Hadoop File Systems
  • Java Interface
  • Data Flow (Anatomy of a File Read, Anatomy of a File Write, Coherency Model)
  • Parallel Copying with DistCp
  • Hadoop Archives

Understanding the Pseudo-Cluster Environment

  • Cluster Specification
  • Hadoop Configuration (Configuration Management, Environment Settings, Important Hadoop Daemon Properties, Hadoop Daemon Addresses and Ports, Other Hadoop Properties)
  • Basic Linux and HDFS commands

Understanding MapReduce Basics and MapReduce Types and Formats

  • Hadoop Data Types
  • Functional - Concept of Mappers
  • Functional - Concept of Reducers
  • The Execution Framework
  • Concept of Partitioners
  • Functional - Concept of Combiners
  • Distributed File System
  • Hadoop Cluster Architecture
  • MapReduce Types
  • Input Formats (Input Splits and Records, Text Input, Binary Input, Multiple Inputs)
  • Output Formats (Text Output, Binary Output, Multiple Outputs)


Pig

  • Installing and Running Pig
  • Grunt
  • Pig's Data Model
  • Pig Latin
  • Developing & Testing Pig Latin Scripts
  • Writing Evaluation, Filter, and Load & Store Functions


Hive

  • Hive Architecture
  • Running Hive
  • Comparison with Traditional Databases (Schema on Read Versus Schema on Write, Updates)
  • HiveQL (Data Types, Operators and Functions)
  • Tables (Managed Tables and External Tables, Partitions and Buckets, Storage Formats, Importing Data, Altering Tables, Dropping Tables)
  • Querying Data (Sorting and Aggregating, MapReduce Scripts, Joins, Subqueries and Views, Map-Side and Reduce-Side Joins to Optimize Queries)
  • User-Defined Functions
  • Appending Data to an Existing Hive Table
  • Custom Map/Reduce in Hive


HBase

  • Introduction
  • Client API - Basics
  • Client API - Advanced Features
  • Client API - Administrative Features
  • Available Clients
  • Architecture
  • MapReduce Integration
  • Advanced Usage
  • Advanced Indexing

ZooKeeper

  • The ZooKeeper Service (Data Model, Operations, Implementation, Consistency, Sessions, States)
  • Building Applications with ZooKeeper (ZooKeeper in Production)

Sqoop

  • Database Imports
  • Working with Imported Data
  • Importing Large Objects
  • Performing Exports
  • Exports - A Deeper Look

Live Project

  • Application Development
  • Running Search Queries