
Apache Hadoop

Apache Hadoop Wordcount Example

In this example, we will demonstrate the Word Count example in Hadoop. Word count is the basic example for understanding the Hadoop MapReduce paradigm: we count the number of occurrences of each word in an input file and output the list of words along with the count for each word. 1. Introduction Hadoop ...


Apache Hadoop Distributed File System Explained

In this example, we will discuss the Apache Hadoop Distributed File System (HDFS), its components, and its architecture in detail. HDFS is also one of the core components of the Apache Hadoop ecosystem. Table Of Contents 1. Introduction 2. HDFS Design 2.1 System failures 2.2 Can handle large amount of data 2.3 Coherency ...


How to Install Apache Hadoop on Ubuntu

In this example, we will see in detail how to install Apache Hadoop on an Ubuntu system. We will go through all the required steps, starting with the prerequisites of Apache Hadoop, followed by how to configure Hadoop, and we will finish this example by learning how to insert data into Hadoop and how to run an example ...


Apache Hadoop FS Commands Example

In this example, we will go through the most important commands you may need to know to handle the Hadoop File System (FS). We assume previous knowledge of what Hadoop is, what it can do, how it works in a distributed fashion, and what the Hadoop Distributed File System (HDFS) is, so that we can go ahead and check some examples of how ...


Apache Hadoop Zookeeper Example

In this example, we will explore Apache Zookeeper, starting with an introduction, followed by the steps to set up Zookeeper and get it up and running. 1. Introduction Apache Zookeeper is a building block of distributed systems. When a distributed system is designed, there is always a need to develop and deploy something which can coordinate through ...


Apache Hadoop Cluster Setup Example (with Virtual Machines)

Table Of Contents 1. Introduction 2. Requirements 3. Preparing Virtual Machine 3.1 Creating VM and Installing Guest OS 3.2 Installing Guest Additions 4. Creating Cluster of Virtual Machines 4.1 VM Network settings 4.2 Cloning the Virtual Machine 4.3 Testing the network IPs assigned to VMs 4.4 Converting to Static IPs for VMs 5. Hadoop prerequisite settings 5.1 Creating User 5.2 ...


Apache Hadoop Distcp Example

In this example, we are going to show you how to copy large files in an inter/intra-cluster setup of Hadoop using the distributed copy tool. 1. Introduction DistCP is short for Distributed Copy in the context of Apache Hadoop. It is a tool which can be used when we need to copy large amounts of data/files in an inter/intra-cluster setup. In ...


Hadoop Hello World Example

1. Introduction In this post, we feature a comprehensive Hadoop Hello World Example. Hadoop is an Apache Software Foundation project. It is an open source implementation inspired by Google MapReduce and the Google File System. It is designed for distributed processing of large data sets across a cluster of systems, often running on standard commodity hardware. Hadoop is designed with an assumption ...
