====== HDFS (Hadoop Distributed File System) ======

HDFS is one of the components included in the [[Hadoop]] framework.

  * HDFS can handle very large files (e.g. 1 PB): it splits each file into [[Data block|Data blocks]] and distributes them across the [[Cluster]].
  * HDFS provides [[wp>Fault tolerance]]: it replicates each [[Data block]] on X [[Data node|Data nodes]], where X is the [[Replication factor]].
  * HDFS uses a master/slave architecture.

===== HDFS Features =====

  * Rack awareness - what happens if an entire rack is lost?
  * Reliable storage
  * High throughput

===== Tools =====

  * [[dfsadmin]]
  * [[fs shell]]
  * Web UIs

===== Nodes =====

==== Data world ====

  * [[Name node]] - FS operations, block mapping
  * [[Secondary name node]] - checkpoint operations
  * [[Data node]] - block operations, replication

==== MapReduce world ====

See: [[MapReduce]]

  * Master: [[Job tracker]] - controller of the [[Task tracker|Task trackers]]
  * [[Task tracker]]
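
===== Example: block splitting and replication =====

The block splitting and replication described at the top of this page can be illustrated with a small simulation. This is only a sketch under stated assumptions, not how HDFS is implemented: the function names (''split_into_blocks'', ''place_replicas'', ''surviving_replicas''), the node names ''dn1'' to ''dn4'', and the round-robin placement are all hypothetical. In a real cluster the name node chooses replica locations with a rack-aware policy, and the block size and replication factor come from the cluster configuration.

```python
def split_into_blocks(file_size, block_size):
    """Return the sizes of the blocks a file of `file_size` is cut into.

    Every block is `block_size` long except possibly the last one.
    """
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    full_blocks, remainder = divmod(file_size, block_size)
    return [block_size] * full_blocks + ([remainder] if remainder else [])


def place_replicas(num_blocks, data_nodes, replication_factor):
    """Assign each block to `replication_factor` distinct data nodes.

    Assumption: plain round-robin placement, chosen only to keep the
    example short. It ignores racks, free space and node load.
    """
    if replication_factor > len(data_nodes):
        raise ValueError("not enough data nodes for the replication factor")
    placement = {}
    for block_id in range(num_blocks):
        start = block_id % len(data_nodes)
        placement[block_id] = [
            data_nodes[(start + i) % len(data_nodes)]
            for i in range(replication_factor)
        ]
    return placement


def surviving_replicas(placement, failed_nodes):
    """Return, per block, the replicas that remain after some nodes fail."""
    failed = set(failed_nodes)
    return {
        block_id: [node for node in nodes if node not in failed]
        for block_id, nodes in placement.items()
    }


if __name__ == "__main__":
    # Illustrative numbers: a 300-unit file with a 128-unit block size.
    blocks = split_into_blocks(300, 128)
    print(blocks)  # [128, 128, 44]

    nodes = ["dn1", "dn2", "dn3", "dn4"]  # hypothetical data nodes
    placement = place_replicas(len(blocks), nodes, replication_factor=3)
    print(placement)

    # Losing one data node still leaves at least two copies of every block.
    print(surviving_replicas(placement, ["dn1"]))
```

With a replication factor of 3, every block stays readable after any single data node fails, which is the fault tolerance property described above. Surviving the loss of an entire rack additionally requires that the replicas of a block are spread over more than one rack, which this round-robin sketch does not model.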