Hadoop

Hadoop is a framework that provides a set of tools to distribute and process data across clusters.

The main tools are HDFS (Hadoop Distributed File System), MapReduce, and YARN.
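To make the MapReduce idea concrete, here is a minimal sketch of its programming model in plain Python, using a word count as the example. It does not use the Hadoop API and runs in a single process; the function names (map_phase, shuffle_phase, reduce_phase, word_count) are hypothetical and chosen only for illustration. In a real cluster the framework would run the map and reduce steps on many nodes and do the grouping in between.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle step: group all values by key (done by the framework in Hadoop)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over a list of text lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, for illustration only.
    print(word_count(["hadoop stores data", "hadoop processes data"]))
```

The map and reduce functions each look at only one line or one key at a time, which is what would let a framework run them in parallel on different nodes.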

Installing Hadoop

Single-node cluster

See Install Hadoop ecosystem (single mode)

Hadoop Technology stack

See more at http://incubator.apache.org/

Data Access

Hive, Pig

Data Storage

HBase, Cassandra

Visualization

HCatalog, Lucene, Hama, Crunch

Data serialization

Avro, Thrift

Data Intelligence

Drill, Mahout

Data Integration

Sqoop, Flume, Chukwa

Management, Monitoring

Ambari, ZooKeeper, Oozie

More

HDT, Knox, Spark

Use-cases