Month: February 2018

Hadoop – Multi Node Cluster

This article explains how to set up a Hadoop multi-node cluster in a distributed environment. Because a full-scale cluster cannot be demonstrated here, we illustrate the setup with three systems (one master and two slaves); their IP addresses are given below. Hadoop Master: (hadoop-master) Hadoop Slave: (hadoop-slave-1) Hadoop Slave: (hadoop-slave-2) Follow …

An introduction to the Hadoop Distributed File System

Introduction: HDFS is an Apache Software Foundation project and a subproject of the Apache Hadoop project. Hadoop is well suited to storing very large amounts of data, on the order of terabytes and petabytes, and uses HDFS as its storage system. HDFS connects nodes (commodity personal computers) into clusters across which data files are distributed. You can …