Hadoop Distributed File System: It is a high throughput, fault-tolerant distributed file system designed to run on commodity machines. HDFS has been designed to be easily portable from one platform to another. This facilitates widespread adoption of HDFS as a platform of choice for a large set of applications. A typical file in HDFS is gigabytes to terabytes in size. It supports tens of millions of files in a single instance.
HDFS is designed to support very large files. Applications dealing with large data sets write their data only once, but they read it one or more times and require these reads to be satisfied at streaming speeds. HDFS supports write-once-read-many semantics on files.
HDFS has a master/slave architecture. An HDFS cluster usually consists of a single NameNode or a master server and a number of DataNodes or slave nodes.
HDFS Block: A block is the smallest unit of data that can be stored or retrieved from the disk. Hadoop distributed file system stores the data in terms of fixed sized blocks. A typical block size used by HDFS is 64 MB. Internally, a large file is split/divided into one or more blocks and these blocks are stored in a set of DataNodes. Thus, an HDFS file is chopped up into 64 MB chunks, and if possible, each chunk will be distributed on multiple DataNodes in a Hadoop cluster. The NameNode maintains the metadata of all the blocks. Blocks are easy to replicate between the DataNodes and thus provide fault tolerance and high availability. Hadoop framework replicates each block across multiple nodes. The default replication factor is 3. In case of any node failure or block corruption, the same block can be read from another node.
NameNode: NameNode is the centrepiece of the hadoop system. There is only one NameNode in and it coordinates everything in a hadoop cluster. The NameNode manages the file system namespace operations like opening, closing, and renaming files and directories. It stores the metadata information of the data blocks. It also determines the mapping of blocks to DataNodes. This metadata is stored permanently on to local disk in the form of namespace image and edit log file. However the NameNode does not store this information persistently. The NameNode creates the block to DataNode mapping when it is restarted. The NameNode is a Single Point of Failure for the HDFS Cluster. If the NameNode crashes, then the entire hadoop system goes down. Client applications talk to the NameNode whenever they wish to locate a file, or when they want to add/copy/move/delete a file. The NameNode responds the successful requests by returning a list of relevant DataNode servers where the data resides.
DataNode: It stores the blocks of data and retrieves them. The DataNodes are responsible for serving read and write requests from the file system’s clients. The DataNodes also perform block creation, deletion, and replication upon instruction from the NameNode. DataNode instances can talk to each other, which is what they do when they are replicating data. The DataNodes also reports the blocks information to the NameNode periodically. Client applications can also talk directly to a DataNode, once the NameNode has provided the location of the data.
Secondary NameNode: This is optional node hosted on a separate machine and is responsible to copy and merge the namespace image and edit log from the NameNode periodically. In case if the name node crashes, then the namespace image stored in Secondary NameNode can be used to restart the NameNode.
Above the file systems comes the MapReduce engine, which consists of one JobTracker, to which client applications submit MapReduce jobs. The JobTracker assigns work to available Task Tracker nodes in the cluster, striving to keep the work as close to the data as possible (data locality). With a rack-aware file system, the JobTracker knows which node contains the data, and which other machines are nearby. If the work cannot be hosted on the actual node where the data resides, priority is given to nodes in the same rack. This minimizes network congestion and increases the overall throughput of the system.
JobTracker: It is responsible to handle the Job requests, submitted by the client applications. It schedules the MapReduce tasks to specific TaskTrackers (DataNodes) nodes in the cluster, ideally the nodes that have the data, or at least are in the same rack. Job Tracker also checks for any failed tasks or time out and reschedules the failed tasks to a different TaskTracker. The JobTracker is a point of failure for the Hadoop MapReduce service. If it goes down, all running jobs are halted. JobTracker can be configured to run on the NameNode machine or on a separate node.
TaskTracker: TaskTracker runs on the DataNodes. TaskTrackers are responsible to run the map, reduce or shuffle tasks assigned by the NameNode. It also reports back the status of the tasks to the NameNode. The TaskTracker spawns a separate JVM processes to do the actual work, to prevent the TaskTracker itself from failing if the running job crashes the JVM. The TaskTracker monitors these spawned processes and captures the output and exit codes. When the process finishes, the TaskTracker notifies the JobTracker about the success or failure of the job. The TaskTrackers also send out heartbeat messages to the JobTracker every few minutes, to reassure the JobTracker that it is still alive.
It is recommended not to host the DataNode, JobTracker or TaskTracker services on the same system as that of the NameNode.