
About Sripriya Venkatesan

Sripriya is a Computer Science engineering graduate who topped her graduating class and was a gold medalist. She has about 15 years of work experience and currently works as a technical architect and technical manager for large-scale enterprise applications, mainly around Java and database technologies, spanning different clients, geographies and domains. She has traveled to multiple countries and strives for work-life balance. She is passionate about programming, design and architecture, and enjoys working on new technologies.

Apache Lucene Hello World Example

1. Introduction

In this example, I will show you how to get started with Apache Lucene by writing a simple Hello World program. Apache Lucene is an open source library that provides full-text indexing and search. The Apache Lucene site documents the features thoroughly and includes examples, but those examples are quite detailed. Here, I present a simple example to get you started with this technology.

This example uses the following technologies, frameworks and IDE:

a. JDK 1.8
b. Apache Lucene 6.5.1
c. Eclipse Neon (You can use any IDE of your choice, or run it via command line)

2. Getting started

Let’s get going by first getting the relevant jars. At the time of writing this article, the latest Apache Lucene version is 6.5.1, which can be downloaded from the Apache site. Extract the downloaded file and locate the main jars (lucene-core-6.5.1.jar, lucene-queryparser-6.5.1.jar, lucene-analyzers-common-6.5.1.jar).

Tip
You may use any IDE of your choice or run code via command line.

Next, create a new Eclipse project (I named it JCG).

[Figure: New Eclipse Project]

Choose a name for the project and save.

[Figure: Save Project]

Add the downloaded jars to the project build path. Although this example only needs the lucene-core and lucene-queryparser jars, it is recommended to add all three jars for Lucene projects.

[Figure: Adding Lucene jars in classpath]
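If you want to confirm that the jars are actually visible before running the examples, a small check like the one below can help. This is only a sketch: the class name LuceneClasspathCheck and the helper isOnClasspath are my own illustrative names, and the class names it looks for are taken from the imports used in the listings later in this article. It uses nothing but the JDK, so it compiles and runs even when the Lucene jars are missing, in which case it simply reports them as missing.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Lucene: reports whether the classes
// used by the examples in this article can be found on the classpath.
class LuceneClasspathCheck {

    // Class names copied from the imports of the listings in this article.
    static final List<String> REQUIRED = Arrays.asList(
            "org.apache.lucene.index.IndexWriter",                   // from lucene-core
            "org.apache.lucene.queryparser.classic.QueryParser",     // from lucene-queryparser
            "org.apache.lucene.analysis.standard.StandardAnalyzer"); // analyzer used by the examples

    // Returns true if the named class can be located by this class's class loader.
    static boolean isOnClasspath(String className) {
        try {
            // initialize = false: only locate the class, do not run its static initializers
            Class.forName(className, false, LuceneClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        for (String name : REQUIRED) {
            System.out.println((isOnClasspath(name) ? "found    " : "MISSING  ") + name);
        }
    }
}
```

Run it with the same classpath (or Eclipse build path) you intend to use for the Lucene examples. Any line marked MISSING points to a jar that still needs to be added.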

3. What the code needs to accomplish

We start by building a simple index with the IndexWriter class, which builds and maintains an index. We create a couple of Document objects and add them to the IndexWriter instance. For the purpose of illustration, we use RAMDirectory to back the IndexWriter. Please note that RAMDirectory is a memory-resident Directory implementation that may not work very well with big indexes. However, it is sufficient to illustrate the Directory functionality needed for our program.

Once the documents have been added and indexed, we use IndexReader to access the index and IndexSearcher to search it. A QueryParser instance is created with the name of the default field to search and the analyzer, and it parses the search text into a Query. Passing the Query to the IndexSearcher returns a TopDocs value, which in turn gives the number of hits.

The Java code listed below performs the search and prints the number of hits. A search on a value present in the index should return the number of matching documents, while a search for text that was not indexed should return 0.
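Before looking at the Lucene code, it may help to see the same index-then-search idea in plain Java. The sketch below is a toy and not how Lucene is implemented: ToyIndex, analyze, addDocument and countHits are names I made up for illustration. It lowercases and splits text (a very rough stand-in for an analyzer), records which documents contain each term, and counts the documents that match any term of the query, which is how the classic query parser treats multiple terms by default.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Toy inverted index for illustration only; Lucene's real index is far more sophisticated.
class ToyIndex {

    // term -> ids of the documents that contain the term
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    private int nextDocId = 0;

    // Rough stand-in for an analyzer: lowercase, then split on anything
    // that is not a letter or a digit.
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                terms.add(token);
            }
        }
        return terms;
    }

    // Indexes the content as a new document and returns its id.
    int addDocument(String content) {
        int docId = nextDocId++;
        for (String term : analyze(content)) {
            postings.computeIfAbsent(term, t -> new HashSet<>()).add(docId);
        }
        return docId;
    }

    // Counts the documents that contain at least one term of the query.
    int countHits(String queryText) {
        Set<Integer> matches = new HashSet<>();
        for (String term : analyze(queryText)) {
            matches.addAll(postings.getOrDefault(term, Collections.<Integer>emptySet()));
        }
        return matches.size();
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.addDocument("Hello World");
        index.addDocument("Hello people");
        System.out.println("Hits for Hello -->" + index.countHits("Hello"));
        System.out.println("Hits for hello -->" + index.countHits("hello"));
        System.out.println("Hits for Hi there -->" + index.countHits("Hi there"));
    }
}
```

Because the same analyze step is applied to both the documents and the query, "Hello" and "hello" each match both documents, while "Hi there" matches none. The Lucene program follows the same pattern, with the analyzer, the index and the searcher provided by the library.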

3.1 Java Code

Let’s look at the code now.

LuceneHelloWorld.java

package com.javacodegeeks.lucene;

import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;

public class LuceneHelloWorld {

 public static void main(String[] args) throws IOException, ParseException {
 //New index
 StandardAnalyzer standardAnalyzer = new StandardAnalyzer();
 Directory directory = new RAMDirectory();
 IndexWriterConfig config = new IndexWriterConfig(standardAnalyzer); 
 //Create a writer
 IndexWriter writer = new IndexWriter(directory, config);
 //Each indexed document needs its own Document instance
 Document document = new Document();
 //In a real world example, content would be the actual content that needs to be indexed.
 //Setting content to Hello World as an example.
 document.add(new TextField("content", "Hello World", Field.Store.YES));
 writer.addDocument(document);
 //Use a fresh instance so the first document's field is not carried over into the second
 document = new Document();
 document.add(new TextField("content", "Hello people", Field.Store.YES));
 writer.addDocument(document);
 writer.close();
 
 //Now let's try to search for Hello
 IndexReader reader = DirectoryReader.open(directory);
 IndexSearcher searcher = new IndexSearcher (reader);
 QueryParser parser = new QueryParser ("content", standardAnalyzer);
 Query query = parser.parse("Hello");
 TopDocs results = searcher.search(query, 5);
 System.out.println("Hits for Hello -->" + results.totalHits);

 //case insensitive search
 query = parser.parse("hello");
 results = searcher.search(query, 5);
 System.out.println("Hits for hello -->" + results.totalHits);

 //search for a value not indexed
 query = parser.parse("Hi there");
 results = searcher.search(query, 5);
 System.out.println("Hits for Hi there -->" + results.totalHits);
 //Release the index reader once searching is done
 reader.close();
 }
}

3.2 Code output

The above code queries the index using “Hello” and “hello” as search terms. Both searches return the same total hits because StandardAnalyzer lowercases terms both when indexing and when parsing the query. Searching for a value not present in the index, e.g. “Hi there”, returns 0 total hits, as expected.

Hits for Hello -->2
Hits for hello -->2
Hits for Hi there -->0

3.3 Java code reading file contents and writing the index to a folder

We will now modify the code listed in section 3.1 to read the content from a file and write the index to a folder. Let’s look at the code:

LuceneHelloWorldReadFromFile.java

package com.javacodegeeks.lucene;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.nio.file.Paths;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.IndexWriterConfig.OpenMode;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class LuceneHelloWorldReadFromFile {

	public static void main(String[] args) throws IOException, ParseException {
		// New index
		StandardAnalyzer standardAnalyzer = new StandardAnalyzer();
		String inputFilePath = "C:\\priya\\workspace\\JCG\\src\\com\\javacodegeeks\\lucene\\input.txt";
		String outputDir = "C:\\priya\\workspace\\JCG\\src\\com\\javacodegeeks\\lucene\\output";

		Directory directory = FSDirectory.open(Paths.get(outputDir));
		IndexWriterConfig config = new IndexWriterConfig(standardAnalyzer);
		config.setOpenMode(OpenMode.CREATE);
		// Create a writer
		IndexWriter writer = new IndexWriter(directory, config);

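		Document document = new Document();
		// try-with-resources closes the reader once the document has been indexed
		try (BufferedReader br = new BufferedReader(new FileReader(inputFilePath))) {

			// A Reader-backed TextField is tokenized and indexed, but its value is not stored
			document.add(new TextField("content", br));
			writer.addDocument(document);
			// Closing the writer commits the index so that it can be opened for searching
			writer.close();

		} catch (IOException e) {
			e.printStackTrace();
		}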

		// Now let's try to search for Hello
		IndexReader reader = DirectoryReader.open(directory);
		IndexSearcher searcher = new IndexSearcher(reader);
		QueryParser parser = new QueryParser("content", standardAnalyzer);
		Query query = parser.parse("Hello");
		TopDocs results = searcher.search(query, 5);
		System.out.println("Hits for Hello -->" + results.totalHits);

		// case insensitive search
		query = parser.parse("hello");
		results = searcher.search(query, 5);
		System.out.println("Hits for hello -->" + results.totalHits);

		// search for a value not indexed
		query = parser.parse("Hi there");
		results = searcher.search(query, 5);
		System.out.println("Hits for Hi there -->" + results.totalHits);
	}
}

3.4 Code Output

In the code presented in section 3.3, the change is that the content to be indexed is read from a file, input.txt, and the index is written to the outputDir directory:

Directory directory = FSDirectory.open(Paths.get(outputDir));
IndexWriterConfig config = new IndexWriterConfig(standardAnalyzer);
config.setOpenMode(OpenMode.CREATE);

Document document = new Document();
try (BufferedReader br = new BufferedReader(new FileReader(inputFilePath))) {

	document.add(new TextField("content", br));
	writer.addDocument(document);
	writer.close();

} catch (IOException e) {
	e.printStackTrace();
}

Also, the IndexWriter in this code creates the index in the directory named by the outputDir variable. You can inspect the result by opening the output folder. See a sample output below:

[Figure: Indexed files]
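If you prefer to inspect the folder from code rather than from a file browser, a sketch along these lines lists what was written. It uses only the JDK; the class name ListIndexFiles, the helper fileNames and the default folder name "output" are illustrative assumptions of mine, so pass your own outputDir path as the first argument.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Lucene: lists the files in an index folder.
class ListIndexFiles {

    // Returns the sorted names of the regular files directly inside dir.
    static List<String> fileNames(Path dir) throws IOException {
        // Files.list returns a stream that holds an open directory handle, so close it
        try (Stream<Path> entries = Files.list(dir)) {
            return entries.filter(Files::isRegularFile)
                          .map(p -> p.getFileName().toString())
                          .sorted()
                          .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // "output" is only a placeholder default; pass the real index folder as an argument
        Path dir = Paths.get(args.length > 0 ? args[0] : "output");
        for (String name : fileNames(dir)) {
            System.out.println(name);
        }
    }
}
```

The exact file names depend on the Lucene version and on how many times the index has been written. What you should see is a small set of index files rather than a copy of input.txt.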

A sample input.txt and the corresponding output of the Java code in section 3.3 are listed below:

input.txt:
Hello world

Output:

Hits for Hello -->1
Hits for hello -->1
Hits for Hi there -->0
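One detail worth knowing about the file-reading code: on JDK 1.8, FileReader decodes the file with the platform default character set, so the same input.txt can be tokenized differently on machines with different defaults if it contains non-ASCII text. A possible alternative, sketched below with only the JDK, is to open the reader with an explicit charset. The class name ReadInputUtf8 and the helper openUtf8 are illustrative names of mine, and UTF-8 is an assumption about how your input file is encoded.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: opens the input file with an explicit charset.
class ReadInputUtf8 {

    // Decodes the file as UTF-8 (an assumption about the input) instead of
    // relying on the platform default charset.
    static BufferedReader openUtf8(Path path) throws IOException {
        return Files.newBufferedReader(path, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // "input.txt" is only a placeholder default; pass the real path as an argument
        Path input = Paths.get(args.length > 0 ? args[0] : "input.txt");
        try (BufferedReader br = openUtf8(input)) {
            String line;
            while ((line = br.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}
```

The BufferedReader returned here is an ordinary Reader, so it could be passed to the TextField constructor in place of the FileReader-based one used in section 3.3.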

4. Apache Lucene Hello World – Summary

In this example, we learnt how to get started with Lucene by getting the relevant jars, adding them to an Eclipse project and running a Lucene Hello World program using two different approaches to indexing.

I hope you enjoyed this tutorial. It should serve as a starting point for working with this rich open source technology. Enjoy and happy programming!

5. Download the Eclipse Project

This was an Apache Lucene Hello World example with Eclipse.

Download
You can download the full source code of this example here: lucene hello world
