
Apache Hadoop

Hadoop Getmerge Example

In this example, we will look at merging several files in HDFS (Hadoop Distributed File System), the file system of Apache Hadoop, into one file, specifically with the getmerge command. 1. Introduction Merging is a task that comes up often in Hadoop, and most of the time the number of files is large or the size of files ...


Is Hadoop a database?

In this article, we will try to address one of the questions most frequently asked by beginners in the Apache Hadoop and Big Data ecosystem: Is Hadoop a database? Or, more specifically, is Hadoop a relational database? 1. Is Hadoop a database? No, Hadoop is not a database. To understand the difference ...


Difference Between Bigdata and Hadoop

In this article, we will address a very basic question that beginners in the field of Big Data have: What is the difference between Big Data and Apache Hadoop? 1. Introduction The difference between Big Data and Apache Hadoop is distinct and quite fundamental. But most of the people ...


Apache Hadoop RecordReader Example

In this example, we will look at the RecordReader component of Apache Hadoop. Before digging into the example code, we would like to look at the theory behind the InputStream and RecordReader to better understand the concept. 1. Introduction To better understand RecordReader, we have to ...


Hadoop Sequence File Example

In this article, we will look at the Hadoop Sequence File format. Hadoop Sequence Files are one of the Apache Hadoop-specific file formats and store data as serialized key-value pairs. We will look into the details of Hadoop Sequence Files in the subsequent sections. 1. Introduction Apache Hadoop supports text files, which are quite commonly used for storing the ...


The Best Hadoop Analytics Solutions

Data analytics using Hadoop is one of the most important requirements in businesses today, due to the amount of data being generated and the value businesses can derive from this data. We will look into some of the best Hadoop Analytics Solutions available in the market which can be used for data analysis. ...


How Does Hadoop Work

Apache Hadoop is open source software for distributed computing that can process large amounts of data and deliver results faster using a reliable and scalable architecture. Apache Hadoop runs on top of a commodity hardware cluster consisting of multiple systems, which can range from a couple of systems to thousands of systems. This cluster and involvement of multiple systems ...


The Hadoop Ecosystem Explained

In this article, we will go through the Hadoop Ecosystem and see what it consists of and what the different projects are able to do. 1. Introduction Apache Hadoop is an open source platform managed by the Apache Software Foundation. It is written in Java and is able to process large amounts of data (generally called Big Data) in distributed ...


Big Data Hadoop Tutorial for Beginners

This tutorial is for beginners who want to start learning about Big Data and the Apache Hadoop Ecosystem. It introduces the different concepts of Big Data and Apache Hadoop, which will set the foundation for further learning. Table Of Contents 1. Introduction 2. Big Data? 2.1 Examples of Big Data. 3. Characteristics of Big Data 3.1 ...


Prerequisites for Learning Hadoop

In this article, we will dig deep to understand the prerequisites for learning and working with Hadoop. We will see what is required and what the industry-standard suggestions are for things to know before you start learning Hadoop. 1. Introduction Apache Hadoop is the entry point or ...
