Table of Contents
HDFS (Hadoop Distributed File System)
HDFS Features
Tools
Nodes
Data world
MapReduce world
HDFS (Hadoop Distributed File System)
HDFS is one of the components included in the Hadoop framework.
HDFS can handle very large files (e.g. 1 PB): it splits each file into data blocks and distributes them across the cluster.
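The splitting step comes down to simple arithmetic. The sketch below is a toy model, not HDFS code. The 128 MiB block size is an assumption (the real value is configurable per cluster), and the function name split_into_blocks is hypothetical.

```python
BLOCK_SIZE = 128 * 1024 * 1024  # assumed block size: 128 MiB


def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return (offset, length) pairs that cover a file of file_size bytes."""
    blocks = []
    offset = 0
    while offset < file_size:
        # The last block may be shorter than block_size.
        length = min(block_size, file_size - offset)
        blocks.append((offset, length))
        offset += length
    return blocks


one_gib = 1024 ** 3
blocks = split_into_blocks(one_gib + 1)
print(len(blocks))   # 9: eight full blocks plus a 1-byte tail
print(blocks[-1])    # (1073741824, 1)
```

Only the last block is allowed to be smaller than the block size, which is why a file one byte over 1 GiB needs a ninth block.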
HDFS is fault tolerant: it replicates each data block on X data nodes, where X is the replication factor.
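A minimal sketch of what a replication factor buys, assuming a simple round-robin placement. This is not the policy HDFS actually uses, and place_replicas and surviving_replicas are hypothetical helper names.

```python
def place_replicas(block_id, data_nodes, replication_factor):
    """Pick replication_factor distinct data nodes for one block (toy round-robin)."""
    if replication_factor > len(data_nodes):
        raise ValueError("not enough data nodes for the requested replication factor")
    start = block_id % len(data_nodes)
    return [data_nodes[(start + i) % len(data_nodes)]
            for i in range(replication_factor)]


def surviving_replicas(placement, failed_nodes):
    """Return the replicas that are still reachable after some nodes fail."""
    return [node for node in placement if node not in failed_nodes]


nodes = ["dn1", "dn2", "dn3", "dn4"]
placement = place_replicas(block_id=5, data_nodes=nodes, replication_factor=3)
print(placement)                                # ['dn2', 'dn3', 'dn4']
print(surviving_replicas(placement, {"dn2"}))   # ['dn3', 'dn4']
```

With X = 3 the block stays readable as long as at least one of its three data nodes is alive.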
HDFS uses a master/slave architecture: the name node is the master and the data nodes are the slaves.
HDFS Features
Rack awareness - what happens if an entire rack is lost? Replicas of a block are spread over more than one rack, so a copy survives.
Reliable storage
High throughput
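The rack-awareness question in the list above can be explored with a small simulation. This is a simplified sketch that assumes a replication factor of 3, with the first replica on the writer's node and the other two on two nodes of one other rack. The topology, node names, and the function rack_aware_placement are all hypothetical.

```python
def rack_aware_placement(topology, local_node):
    """Place three replicas over two racks.

    topology maps a rack name to its list of data nodes.
    """
    local_rack = None
    for rack, rack_nodes in topology.items():
        if local_node in rack_nodes:
            local_rack = rack
            break
    if local_rack is None:
        raise ValueError("local node is not part of the topology")

    # Pick another rack that can hold the two remaining replicas.
    for rack, rack_nodes in topology.items():
        if rack != local_rack and len(rack_nodes) >= 2:
            return [local_node, rack_nodes[0], rack_nodes[1]]
    raise ValueError("no second rack with at least two data nodes")


topology = {"rack1": ["dn1", "dn2"], "rack2": ["dn3", "dn4"]}
replicas = rack_aware_placement(topology, "dn1")
print(replicas)  # ['dn1', 'dn3', 'dn4']

# Simulate the loss of each rack in turn: a replica always survives.
for rack, rack_nodes in topology.items():
    survivors = [node for node in replicas if node not in rack_nodes]
    print(rack, "lost ->", survivors)
```

Because the replicas never all sit in one rack, losing either rack still leaves at least one copy of the block.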
Tools
dfsadmin
fs shell
Web-UIs
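As a rough illustration of the two command-line tools, the sketch below builds one fs shell call and one dfsadmin call from Python. It assumes the hdfs launcher script is on PATH (older installations use the hadoop launcher instead). The helpers hdfs_command and run are hypothetical, and nothing is executed when the launcher is missing.

```python
import shutil
import subprocess


def hdfs_command(tool, *args):
    """Build the argument list for an `hdfs <tool> ...` call."""
    return ["hdfs", tool, *args]


def run(cmd):
    """Run cmd if its launcher is on PATH and return stdout, else return None."""
    if shutil.which(cmd[0]) is None:
        return None
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.stdout


list_root = hdfs_command("dfs", "-ls", "/")    # fs shell: list the root directory
report = hdfs_command("dfsadmin", "-report")   # dfsadmin: capacity and data node status

for cmd in (list_root, report):
    print(" ".join(cmd))
    output = run(cmd)
    if output is not None:
        print(output)
```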
Nodes
Data world
Name node - FS Ops, Block Mapping
Secondary name node - Checkpoint Ops
Data node - Block Ops, Replications
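The name node's block mapping can be pictured as two lookup tables. The class below is a toy model under that assumption, not the real name node data structure. ToyNameNode, its method names, and the sample path and block ids are hypothetical.

```python
class ToyNameNode:
    """Keeps file -> blocks and block -> data nodes mappings in memory."""

    def __init__(self):
        self.file_blocks = {}      # path -> ordered list of block ids
        self.block_locations = {}  # block id -> set of data node names

    def add_file(self, path, block_ids):
        """FS op: register a file and the blocks it consists of."""
        self.file_blocks[path] = list(block_ids)
        for block_id in block_ids:
            self.block_locations.setdefault(block_id, set())

    def report_block(self, data_node, block_id):
        """Record that a data node holds a replica of a block."""
        self.block_locations.setdefault(block_id, set()).add(data_node)

    def locate(self, path):
        """Block mapping: list each block of a file with the nodes holding it."""
        return [(block_id, sorted(self.block_locations[block_id]))
                for block_id in self.file_blocks[path]]


nn = ToyNameNode()
nn.add_file("/logs/a.txt", ["blk_1", "blk_2"])
nn.report_block("dn1", "blk_1")
nn.report_block("dn2", "blk_1")
nn.report_block("dn2", "blk_2")
print(nn.locate("/logs/a.txt"))
# [('blk_1', ['dn1', 'dn2']), ('blk_2', ['dn2'])]
```

In this model a client asks the name node where the blocks are and then reads the block contents from the data nodes themselves.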
MapReduce world
See: MapReduce
Master: Job tracker - controller of Task trackers
Task tracker