Hadoop Developer/Admin Training - Course Content

Hadoop Architecture
  • Introduction to Hadoop
  • Parallel Computing vs. Distributed Computing
  • How to install Hadoop on your system
  • How to install Hadoop cluster on multiple machines
  • Hadoop Daemons introduction: NameNode, DataNode, JobTracker, TaskTracker
  • Exploring HDFS (Hadoop Distributed File System)
  • Exploring the HDFS Apache Web UI
  • NameNode architecture (EditLog, FsImage, location of replicas)
  • Secondary NameNode architecture
  • DataNode architecture
MapReduce Architecture
  • Exploring JobTracker/TaskTracker
  • How a client submits a MapReduce job
  • Exploring Mapper/Reducer/Combiner
  • Shuffle: Sort & Partition
  • Input/output formats
  • Job Scheduling (FIFO, Fair Scheduler, Capacity Scheduler)
  • Exploring the Apache MapReduce Web UI
Hadoop Developer Tasks
  • Writing a MapReduce program
  • Reading and writing data using Java
  • Hadoop Eclipse integration
  • Mapper in detail
  • Reducer in detail
  • Using Combiners
  • Reducing Intermediate Data with Combiners
  • Writing Partitioners for Better Load Balancing
  • Sorting in HDFS
  • Searching in HDFS
  • Indexing in HDFS
  • Hands-On Exercise
Hadoop Administrative Tasks
  • Routine Administrative Procedures
  • Understanding dfsadmin and mradmin
  • Block Scanner, Balancer
  • Health Check & Safe mode
  • DataNode commissioning/decommissioning
  • Monitoring and Debugging on a production cluster
  • NameNode Backup and Recovery
  • ACL (Access Control List)
  • Upgrading Hadoop
HBase Architecture
  • Introduction to HBase
  • HBase vs. RDBMS
  • Exploring HBase Master & RegionServer
  • Column Families and Regions
  • Basic HBase shell commands
Hive Architecture
  • Introduction to Hive
  • HBase vs Hive
  • Installation of Hive
  • HQL (Hive Query Language)
  • Basic Hive commands
Pig Architecture
  • Introduction to Pig
  • Installation of Pig on your system
  • Basic Pig commands
  • Hands-On Exercise
Sqoop Architecture
  • Introduction to Sqoop
  • Installation of Sqoop on your system
  • Import/Export data between RDBMS and HDFS
  • Import/Export data between RDBMS and HBase
  • Import/Export data between RDBMS and Hive
  • Hands-On Exercise
Mini Project / POC (Proof of Concept)
  • Facebook-Hive POC
  • Uses of Hadoop/Hive at Facebook
  • Static & dynamic partitioning
  • UDF (User-Defined Functions)