Differences
This shows you the differences between two versions of the page.
| Next revision | Previous revision | ||
| learn:bigdata:hadoop [2014/08/07 12:23] – created yehuda | learn:bigdata:hadoop [2022/01/03 16:03] (current) – external edit 127.0.0.1 | ||
|---|---|---|---|
| Line 1: | Line 1: | ||
| ====== Hadoop ====== | ====== Hadoop ====== | ||
| - | Hadoop is a framework that has a set of tools to distribute and process data over | + | Hadoop is a framework that has a set of tools to distribute and process data over clusters. |
| - | main tools are [[HDFS]] (Hadoop Distributed File System) and [[MapReduce]] | + | |
| + | * Scalable | ||
| + | * Flexible | ||
| + | * Fault-tolerant | ||
| + | * Intelligent | ||
| + | |||
| + | The main tools are [[HDFS]] (Hadoop Distributed File System), [[MapReduce]] and [[YARN]] | ||
| + | |||
| + | ===== Installing Hadoop ===== | ||
| + | Single-node [[Cluster]] | ||
| + | |||
| + | * Standalone mode - all Hadoop components run in a single [[JVM]] | ||
| + | * Pseudo-distributed mode - each daemon runs in its own [[JVM]] on one machine | ||
| + | * Fully distributed mode - the daemons run on separate machines | ||
| + | |||
| + | |||
| + | see [[Install Hadoop eco system (single mode)]] | ||
| + | |||
| + | ===== Hadoop Technology stack ===== | ||
| + | see more at http:// | ||
| + | ==== Data Access ==== | ||
| + | [[Hive]], [[Pig]] | ||
| + | |||
| + | ==== Data Storage ==== | ||
| + | [[Hbase]], [[Cassandra]] | ||
| + | |||
| + | ==== Visualization ==== | ||
| + | [[HCatalog]], | ||
| + | |||
| + | ==== Data serialization ==== | ||
| + | [[Avro]], [[Thrift]] | ||
| + | |||
| + | ==== Data Intelligence ==== | ||
| + | [[Drill]], [[Mahout]] | ||
| + | |||
| + | ==== Data Integration ==== | ||
| + | [[Sqoop]], [[Flume]], [[Chukwa]] | ||
| + | |||
| + | ==== Management, Monitoring ==== | ||
| + | [[Ambari]], [[Zookeeper]], | ||
| + | |||
| + | ==== More ==== | ||
| + | [[HDT]], [[Knox]], [[Spark]] | ||
| + | ===== Use-cases ===== | ||
| + | * The New York Times - wanted to convert 4 TB of articles to PDF. They did it on AWS in less than 24 hours, and it cost them about $240. | ||
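
The page names MapReduce as one of Hadoop's main tools without showing what the model looks like. The following is a minimal sketch of the map, shuffle and reduce steps as a word count, written in plain Python and run in a single process. It does not use Hadoop or any Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(document: str) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one input document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group values by key, the step a framework performs between map and reduce."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Collapse all counts for one word into a single total."""
    return key, sum(values)


def word_count(documents: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence inside one process."""
    mapped = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "big cluster"]))
```

On a real cluster the map and reduce calls would run in parallel on different nodes, with the input read from HDFS; this sketch only illustrates the data flow between the phases.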