- Hadoop Distributed File System (HDFS) - DataNode, NameNode, Secondary NameNode
- MapReduce - Map phase, Reduce phase
- YARN - NodeManager, ResourceManager
- Hive - Metastore, Driver, Query Compiler, HiveServer
- Pig
- HBase - HMaster, RegionServer
- Mahout
- ZooKeeper
- Oozie
- Sqoop
- Flume - Source, Channel, Sink
- Ambari
- Apache Drill
- Apache Spark
- Solr and Lucene
- Scala
- Presto
Scikit-Learn and Introduction to Hadoop
- Introduction to Scikit-Learn
- Inbuilt Algorithms for Use
- What Hadoop Is and Why It Is Popular
- Distributed Computation and Functional Programming
- Understanding the MapReduce Framework
- Sample MapReduce Job Run
Hadoop and Python
- Pig and Hive Basics
- Streaming Feature in Hadoop
- MapReduce Job Run Using Python
- Writing a Pig UDF in Python
- Writing a Hive UDF in Python
- Pydoop and MRjob Basics
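
As a rough illustration of the streaming topics above, here is a minimal word-count sketch in Python. It assumes the usual Hadoop Streaming contract: the mapper emits tab-separated key/value lines, and the framework sorts them by key before they reach the reducer. The function names `mapper` and `reducer` are hypothetical; in a real job each would typically live in its own script that reads `sys.stdin` and prints to standard output.

```python
from itertools import groupby


def mapper(lines):
    """Emit one 'word<TAB>1' record per word, as a streaming mapper would."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reducer(sorted_records):
    """Sum the counts for each word; input must already be sorted by key."""
    parsed = (record.rstrip("\n").split("\t", 1) for record in sorted_records)
    for word, group in groupby(parsed, key=lambda kv: kv[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


def run_locally(lines):
    """Simulate map -> sort (shuffle) -> reduce in a single process."""
    return list(reducer(sorted(mapper(lines))))
```

Calling `run_locally(["Hadoop streams text", "hadoop sorts text"])` simulates the whole pipeline without a cluster. On a cluster, scripts like these would be passed to the Hadoop Streaming jar through its `-mapper` and `-reducer` options, alongside `-input` and `-output`.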
HADOOP
- Big Data and Hadoop Introduction
- What is Big Data and Hadoop?
- Challenges of Big Data
- Traditional Approach vs. Hadoop
- Hadoop Architecture
- Distributed Model
- Block-Structured File System
- Technologies supporting Big Data
- Replication
- Fault Tolerance
- Why Hadoop?
- Hadoop Ecosystem
- Use cases of Hadoop
- Fundamental Design Principles of Hadoop
- Comparison of Hadoop vs. RDBMS
Hadoop Cluster Architecture
- Hadoop Cluster and Architecture
- 5 Daemons
- Hands-On Exercise
- Typical Workflow
- Hands-On Exercise
- Writing Files to HDFS
- Hands-On Exercise
- Reading Files from HDFS
- Hands-On Exercise
- Rack Awareness
- Before MapReduce
Module-9
- Joins and Subqueries, Views
- Integration, Data manipulation with Hive
- User Defined Functions
- Appending Data to an Existing Hive Table
- Static Partitioning vs. Dynamic Partitioning