Differences

This shows you the differences between two versions of the page.

learn:bigdata:hadoop [2014/08/07 14:17] – [Hadoop] yehuda
learn:bigdata:hadoop [2022/01/03 16:03] (current) – external edit 127.0.0.1
Line 7:
   * Intelligent
  
-Main tools are [[HDFS]] (HaDoop File System) and [[MapReduce]]
+Main tools are [[HDFS]] (Hadoop Distributed File System), [[MapReduce]] and [[YARN]]
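
To make the MapReduce model named above concrete, here is a minimal in-memory sketch of a word count. It is not Hadoop's API: the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical and only mimic the map, group-by-key and reduce steps that the framework would run across a cluster.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse the values of one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
```

Calling word_count on a list of strings returns a dictionary that maps each lower-cased word to its number of occurrences. In a real job the map and reduce steps would run as separate tasks, and the framework would perform the grouping between them.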
  
-===== Eco system =====
+===== Installing Hadoop =====
 +Single node [[Cluser|cluster]]
 + 
 +  * Standalone mode - all Hadoop components run in a single [[JVM]]
 +  * Pseudo-distributed mode - each daemon runs in a separate [[JVM]]
 +  * Fully distributed mode - each daemon runs on a separate machine
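
As a rough illustration of the pseudo-distributed mode in the list above, the sketch below renders two settings in the layout of Hadoop's *-site.xml files. The helper site_xml is hypothetical. The property names fs.defaultFS and dfs.replication and the address hdfs://localhost:9000 are assumptions based on the usual single-node setup, so check them against the documentation for your Hadoop version.

```python
import xml.etree.ElementTree as ET


def site_xml(properties):
    """Render a dict of settings as a Hadoop-style <configuration> document."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")


# Assumed values: the default filesystem points at a local HDFS daemon,
# and blocks are kept in one copy because there is only one node.
core_site = site_xml({"fs.defaultFS": "hdfs://localhost:9000"})
hdfs_site = site_xml({"dfs.replication": 1})
```

The two strings would correspond to core-site.xml and hdfs-site.xml. Standalone mode typically needs neither, since everything runs in one JVM against the local filesystem.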
 + 
 + 
 +see [[Install Hadoop eco system (single mode)]] 
 + 
 +===== Hadoop Technology stack ===== 
 +see more at http://incubator.apache.org/ 
 +==== Data Access ====
 [[Hive]], [[Pig]]
 +
 +==== Data Storage ====
 +[[Hbase]], [[Cassandra]]
 +
 +==== Visualization ====
 +[[HCatalog]], [[Lucene]], [[Hama]], [[Crunch]]
 +
 +==== Data serialization ====
 +[[Avro]], [[Thrift]]
 +
 +==== Data Intelligence ====
 +[[Drill]], [[Mahout]]
 +
 +==== Data Integration ====
 +[[Sqoop]], [[Flume]], [[Chukwa]]
 +
 +==== Management, Monitoring ====
 +[[Ambari]], [[Zookeeper]], [[Oozie]]
 +
 +==== More ====
 +[[HDT]], [[Konx|Knox]], [[Spark]]
 +===== Use-cases =====
 +  * New York Times - wanted to convert 4 TB of articles to PDF. They did it on AWS in less than 24 hours, and it cost them about $240.
learn/bigdata/hadoop.1407421044.txt.gz · Last modified: (external edit)