Hadoop HBase Maven Example

In this article, we will learn how to use Maven to include HBase in Apache Hadoop related applications, and how Maven's repositories make it easy to write Java HBase applications.

1. Introduction

HBase is the NoSQL database of the Hadoop ecosystem. Like the rest of the Hadoop ecosystem, HBase is open source, and it is used when database capabilities are needed to store large volumes of data on top of HDFS. It is written in Java and modeled on Google's Bigtable, so it is distributed by design and provides fault tolerance.

Maven is a software project management and comprehension tool which enables developers to build software without worrying about manually downloading the dependencies for the project.

In this example article, we will go through the process of creating an HBase project for Hadoop using Maven.

2. Setting Up Maven HBase Project

There are two ways to create an HBase Java application: download the HBase client library and add it to the CLASSPATH, or use Maven to manage the dependencies. As mentioned above, we will take the second approach.

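The first step is to create an HBase Maven project with the following command (recent versions of the Maven Archetype Plugin use the generate goal; the older create goal has been removed):

mvn archetype:generate -DgroupId=com.javacodegeeks.examples -DartifactId=maven-hbase-example -DarchetypeArtifactId=maven-archetype-quickstart -DinteractiveMode=false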

Alternatively, you can use your favorite IDE to create a Maven project. I use IntelliJ IDEA for my projects, and the setup there is as follows.

    1. Go to the IDE and create a new project.
    2. Select Maven as the project type, as shown in the screenshot, and click Next.

      Choose Maven Project while creating new project
    3. Next, enter the groupId and the artifactId for the project. Let us set the groupId to com.javacodegeeks.examples and the artifactId to maven-hbase-example.

      GroupId and ArtifactId
    4. Select the name of the project folder in this step. We will use the same name as the artifactId, i.e. maven-hbase-example.

      Select the name of the project
    5. We now have a new Maven Java project to which we can add HBase as a dependency from the Maven repository.

      The project pom.xml file

3. Setting up Maven POM

After setting up the project, the first thing to do is add the hbase-client Maven dependency to the pom.xml file. The following is the basic pom.xml file:

pom.xml

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.javacodegeeks.examples</groupId>
    <artifactId>maven-hbase-example</artifactId>
    <version>1.0-SNAPSHOT</version>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.apache.hbase</groupId>
            <artifactId>hbase-client</artifactId>
            <version>1.2.4</version>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>2.0.2</version>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                </configuration>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-jar-plugin</artifactId>
                <configuration>
                    <archive>
                        <manifest>
                            <addClasspath>true</addClasspath>
                            <classpathPrefix>lib/</classpathPrefix>
                            <mainClass>com.javacodegeeks.examples.MavenHbase</mainClass>
                        </manifest>
                    </archive>
                </configuration>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-dependency-plugin</artifactId>
                <executions>
                    <execution>
                        <id>copy</id>
                        <phase>install</phase>
                        <goals>
                            <goal>copy-dependencies</goal>
                        </goals>
                        <configuration>
                            <outputDirectory>${project.build.directory}/lib</outputDirectory>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>

The POM file contains a few important parts worth mentioning:

      1. The most important part is the HBase dependency, which makes the hbase-client library available to the code.
        <dependency>
            <groupId>org.apache.hbase</groupId>
            <artifactId>hbase-client</artifactId>
            <version>1.2.4</version>
        </dependency>
        
      2. Next are the Maven plugins required for creating the Java package. maven-jar-plugin defines the manifest properties of the resulting jar. In our example, com.javacodegeeks.examples.MavenHbase is declared as the class containing the main() method, which runs when the jar is executed. The addClasspath and classpathPrefix settings add a Class-Path entry to the manifest that points to the dependency jars under lib/. The following plugin configuration defines the jar manifest properties:
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-jar-plugin</artifactId>
            <configuration>
                <archive>
                    <manifest>
                        <addClasspath>true</addClasspath>
                        <classpathPrefix>lib/</classpathPrefix>
                        <mainClass>com.javacodegeeks.examples.MavenHbase</mainClass>
                    </manifest>
                </archive>
            </configuration>
        </plugin>
        
      3. Next is maven-dependency-plugin, which defines what to do with the dependencies during a Maven build. For example, the following configuration makes sure that all the dependencies are copied to the lib folder under the build directory (target/lib by default), next to the jar, when the Maven install phase runs:
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-dependency-plugin</artifactId>
            <executions>
                <execution>
                    <id>copy</id>
                    <phase>install</phase>
                    <goals>
                        <goal>copy-dependencies</goal>
                    </goals>
                    <configuration>
                        <outputDirectory>${project.build.directory}/lib</outputDirectory>
                    </configuration>
                 </execution>
            </executions>
        </plugin>
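
As an optional refinement of the dependency declaration in item 1, the HBase version can be kept in a POM property, so that any other HBase artifacts added later share a single version number. This is a minimal sketch rather than part of the original project, and the property name hbase.version is an arbitrary choice for illustration:

```xml
<properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <!-- Hypothetical property name; any name works as long as the reference below matches. -->
    <hbase.version>1.2.4</hbase.version>
</properties>

<dependencies>
    <dependency>
        <groupId>org.apache.hbase</groupId>
        <artifactId>hbase-client</artifactId>
        <!-- Maven resolves this from the property defined above. -->
        <version>${hbase.version}</version>
    </dependency>
</dependencies>
```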
        

4. Packaging the Project

Once the project is finished and ready for deployment, we can package the jar file using the Maven command:

mvn clean compile install

Building Maven Package

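This creates the jar file containing the project's compiled code and copies all the dependencies into the lib folder next to it. The jar's manifest references those libraries through its Class-Path entry, so the jar together with the lib folder forms a self-contained package. Note that this is not a fat jar (uber-jar) in the strict sense, since the dependencies are not bundled inside the jar itself.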
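
If a single jar with the dependency classes bundled inside it is preferred over a jar plus a lib folder, the Maven Shade Plugin is one common option. The fragment below is a minimal sketch under stated assumptions, not part of the original project: the plugin version shown is an assumption, and the fragment would go inside the <build><plugins> section of the pom.xml, typically in place of the maven-dependency-plugin configuration.

```xml
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <!-- Assumed version; pick a release that suits your Maven setup. -->
    <version>3.2.4</version>
    <executions>
        <execution>
            <!-- Build the shaded (fat) jar during the package phase. -->
            <phase>package</phase>
            <goals>
                <goal>shade</goal>
            </goals>
            <configuration>
                <transformers>
                    <!-- Writes Main-Class into the shaded jar's manifest. -->
                    <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                        <mainClass>com.javacodegeeks.examples.MavenHbase</mainClass>
                    </transformer>
                    <!-- Merges META-INF/services files from the bundled dependencies. -->
                    <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
                </transformers>
            </configuration>
        </execution>
    </executions>
</plugin>
```

With a configuration along these lines, the resulting jar should be runnable with java -jar on its own, without a lib folder beside it.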

The package is named after the project, followed by -1.0-SNAPSHOT.jar, and looks like this:

jar file with dependencies included.

The packaged jar can then be executed using the java command, run from the directory that contains both the jar and the lib folder (target by default):

java -jar maven-hbase-example-1.0-SNAPSHOT.jar

We do not need to pass the main class to the java command, as it is already declared in the jar manifest through the pom file.

5. Summary

In this example article, we discussed how to set up an HBase project using Maven repositories and dependencies. We covered the pom.xml file, which is the most important part of using Maven. Finally, we saw how to build the Maven package together with its dependencies and how to execute the resulting jar.

Raman Jhajj

Ramaninder graduated from the Department of Computer Science and Mathematics of Georg-August University, Germany, and currently works with a Big Data Research Center in Austria. He holds an M.Sc. in Applied Computer Science with a specialization in Applied Systems Engineering and a minor in Business Informatics. He is also a Microsoft Certified Professional with more than 5 years of experience in Java, C#, Web development and related technologies. Currently, his main interests are in the Big Data ecosystem, including batch and stream processing systems, Machine Learning and Web Applications.
