- Hadoop Ecosystem: Consists of the Hadoop core project (HDFS, MapReduce) and other open source projects (Hive, Pig, Impala, Sqoop, etc.) that work on top of Hadoop to help with data analysis
- Pig: High-level language for analyzing large data sets; internally it uses MapReduce (MR)
- Hive: Offers a SQL-like language on top of MR
- Impala: Developed as a way to query data in Hadoop with SQL without using MR, with lower latency than Hive
- Sqoop: To migrate data from an RDBMS to Hadoop
- Flume: To move data from external sources (logs, etc.) to Hadoop
- HBase: Real-time database on HDFS
- Hue: GUI front end to the cluster
- Oozie: Workflow management
- Mahout: Machine learning library
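
Several of the tools above sit on top of MR, so it may help to see the shape of a MapReduce job. The following is only a minimal sketch in plain Python, not Hadoop code: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for illustration, and everything runs in a single process rather than on a cluster.

```python
from collections import defaultdict


def map_phase(line):
    # Map step: emit a (word, 1) pair for every word in one input line.
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    # Shuffle step: group all values by key, which the framework
    # would normally do between the map and reduce steps.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    # Reduce step: collapse the list of counts for one word into a total.
    return key, sum(values)


def word_count(lines):
    # Run map over every line, then shuffle, then reduce each group.
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["Hive runs on Hadoop", "Pig runs on Hadoop"]))
```

A Pig script or Hive query that counts words would roughly be translated into map and reduce steps of this kind, which is why the developer does not have to write them by hand.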
Wednesday, April 15, 2015
Hadoop Terminology