Differences
This shows you the differences between two versions of the page.
learn:bigdata:hdfs [2014/08/07 12:36] – yehuda | learn:bigdata:hdfs [2022/01/03 16:03] (current) – external edit 127.0.0.1 | ||
---|---|---|---|
Line 1: | Line 1: | ||
====== HDFS (Hadoop Distributed File System) ====== | ====== HDFS (Hadoop Distributed File System) ====== | ||
HDFS is one of the components included in the [[Hadoop]] framework. | HDFS is one of the components included in the [[Hadoop]] framework. | ||
- | * HDFS can handle very large files (e.g. 1 PB): it splits each file into [[Data blocks]] and distributes them across the [[Cluster]]. | + | * HDFS can handle very large files (e.g. 1 PB): it splits each file into [[Data block|Data blocks]] and distributes them across the [[Cluster]]. |
- | * HDFS provides [[Fault tolerance]] - [[HDFS]] replicates each [[Data block]] on X [[Data node|Data nodes]], where X is the [[Replication factor]]. | + | * HDFS provides [[wp>Fault tolerance]] - [[HDFS]] replicates each [[Data block]] on X [[Data node|Data nodes]], where X is the [[Replication factor]]. |
+ | * HDFS uses a Master / Slaves architecture | ||
+ | |||
+ | ===== HDFS Features ===== | ||
+ | * Rack awareness - what happens if an entire rack is lost? Replicas are placed on more than one rack, so the data survives. | ||
+ | * Reliable storage | ||
+ | * High throughput | ||
+ | ===== Tools ===== | ||
+ | * [[dfsadmin]] | ||
+ | * [[fs shell]] | ||
+ | * Web-UIs | ||
+ | |||
+ | ===== Nodes ===== | ||
+ | ==== Data world ==== | ||
+ | * [[Name node]] - FS Ops, Block Mapping | ||
+ | * [[Secondary name node]] - Checkpoint Ops | ||
+ | * [[Data node]] - Block Ops, Replications | ||
+ | ==== MapReduce world ==== | ||
+ | See: [[MapReduce]] | ||
+ | * Master: [[Job tracker]] - controller of the [[Task tracker|Task trackers]] | ||
+ | * Slaves: [[Task tracker]] - runs the tasks assigned by the [[Job tracker]] |
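
The block splitting, replication and rack awareness described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the 128 MiB block size, the replication factor of 3, the node and rack names, and every function name are hypothetical choices made for illustration, and the placement rule is a simplified toy, not the placement policy HDFS itself implements.

```python
MIB = 1024 * 1024
BLOCK_SIZE = 128 * MIB      # assumed block size; configurable in a real cluster
REPLICATION_FACTOR = 3      # assumed replication factor (the "X" in the text)


def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return (offset, length) pairs that cover a file of `file_size` bytes."""
    blocks = []
    offset = 0
    while offset < file_size:
        length = min(block_size, file_size - offset)
        blocks.append((offset, length))
        offset += length
    return blocks


def place_replicas(block_index, nodes_by_rack, replication=REPLICATION_FACTOR):
    """Choose `replication` distinct data nodes for one block.

    Toy policy: the second replica goes to a different rack than the first
    whenever another rack exists, so losing one rack cannot lose the block.
    """
    racks = sorted(nodes_by_rack)
    all_nodes = [node for rack in racks for node in nodes_by_rack[rack]]
    if replication > len(all_nodes):
        raise ValueError("not enough data nodes for this replication factor")
    rack_of = {node: rack for rack in racks for node in nodes_by_rack[rack]}

    first = all_nodes[block_index % len(all_nodes)]
    chosen = [first]
    other_rack_nodes = [n for n in all_nodes if rack_of[n] != rack_of[first]]
    if replication > 1 and other_rack_nodes:
        chosen.append(other_rack_nodes[block_index % len(other_rack_nodes)])
    # Fill the remaining replicas with any data nodes not used yet.
    for node in all_nodes:
        if len(chosen) >= replication:
            break
        if node not in chosen:
            chosen.append(node)
    return chosen


def build_block_map(file_size, nodes_by_rack,
                    block_size=BLOCK_SIZE, replication=REPLICATION_FACTOR):
    """Map block index -> data nodes, similar in spirit to the name node's block mapping."""
    blocks = split_into_blocks(file_size, block_size)
    return {i: place_replicas(i, nodes_by_rack, replication)
            for i in range(len(blocks))}


def surviving_blocks(block_map, nodes_by_rack, lost_rack):
    """Return the indices of blocks that still have a replica after a rack is lost."""
    lost = set(nodes_by_rack[lost_rack])
    return [i for i, nodes in block_map.items()
            if any(node not in lost for node in nodes)]


if __name__ == "__main__":
    cluster = {"rack1": ["dn1", "dn2"], "rack2": ["dn3", "dn4"]}  # hypothetical cluster
    block_map = build_block_map(300 * MIB, cluster)
    for index, nodes in block_map.items():
        print("block", index, "->", nodes)
    print("blocks surviving loss of rack1:", surviving_blocks(block_map, cluster, "rack1"))
```

Keeping the block map separate from the placement rule mirrors the split of responsibilities listed under Nodes: the name node only records which data nodes hold each block, while the data nodes store the blocks themselves.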