Apache Solr in Java: Using Apache SolrJ

In this example, we show how to use Apache SolrJ to index data in Solr and to query it.

1. Introduction

Apache Solr is a popular open-source search platform built on Apache Lucene. From a 30,000-foot view, Solr is a web application, and HTTP is the fundamental protocol used between client applications and Solr. The client sends a request to Solr; Solr does some work and returns a response.

Besides the HTTP API, Solr offers SolrJ, a Java API that encapsulates much of the work of sending requests and parsing responses. It is highly configurable and makes it much easier for applications written in Java to talk to Solr.

2. Technologies Used

The steps and commands described in this example are for Apache Solr 8.5 on Windows 10 (the commands and paths below use the 8.5.2 release). The JDK version we use to run Solr in this example is OpenJDK 13. Before we start, please make sure your computer meets the system requirements. Also, please download the binary release of Apache Solr 8.5. Apache Maven 3.6.3 is used as the build system.

3. Using Apache SolrJ

3.1 Basics

SolrJ provides a few simple interfaces for us to connect to and communicate with Solr. The most important one is the SolrClient which sends requests in the form of SolrRequests and returns responses as SolrResponses. There are several SolrClient implementations and we list some commonly used ones in the table below:

Client | Description
HttpSolrClient | A general-purpose SolrClient implementation that talks directly to a single Solr server via HTTP. It is better suited for query-centric workloads.
LBHttpSolrClient | A load-balancing wrapper around HttpSolrClient. Do NOT use it for indexing in Master/Slave scenarios, because it does not route documents to the correct master.
CloudSolrClient | A SolrClient implementation that talks to SolrCloud. It communicates with ZooKeeper to discover Solr endpoints for SolrCloud collections, and then uses the LBHttpSolrClient to issue requests.
ConcurrentUpdateSolrClient | A thread-safe SolrClient implementation which buffers all added documents and writes them into open HTTP connections. It is better suited for indexing-centric workloads.
Table 1. SolrClient Implementations

Before we jump into the SolrJ coding part, we need to get a couple of things ready by following the steps in the sections below.

3.2 Adding dependencies

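The SolrJ API ships with Solr, so a simple way to add the SolrJ dependencies when running your Java application is to add solr-solrj-8.5.2.jar and its dependencies to the classpath. The command below uses Unix-style syntax; on Windows, use ; as the classpath separator and %SOLR_HOME% instead of $SOLR_HOME: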

java -cp .:$SOLR_HOME/dist/solrj-lib/*:$SOLR_HOME/dist/solr-solrj-8.5.2.jar ...

To manage dependencies easily in this example, we use Apache Maven 3.6.3 as the build system. The following dependency declaration needs to be put in pom.xml:

<dependency>
  <groupId>org.apache.solr</groupId>
  <artifactId>solr-solrj</artifactId>
  <version>8.5.2</version>
</dependency>

3.3 Starting Solr Instance

For simplicity, instead of setting up a SolrCloud on the local machine as demonstrated in Apache Solr Clustering Example, we run a single Solr instance locally. Before starting it, download jcg_example_configs.zip attached to this article and extract it to the directory ${solr.install.dir}\server\solr\configsets\jcg_example_configs\conf. It contains all the configurations and schema definitions required by this example. Then run the command below to start the Solr instance:

bin\solr.cmd start

The output would be:

D:\Java\solr-8.5.2>bin\solr.cmd start
Waiting up to 30 to see Solr running on port 8983
Started Solr server on port 8983. Happy searching!

In addition, we need to create a new core named jcg_example_core with the jcg_example_configs configSet on the local machine. For example, we can do it via the CoreAdmin API:

curl -G http://localhost:8983/solr/admin/cores --data-urlencode action=CREATE --data-urlencode name=jcg_example_core --data-urlencode configSet=jcg_example_configs

The output would be:

D:\Java\solr-8.5.2>curl -G http://localhost:8983/solr/admin/cores --data-urlencode action=CREATE --data-urlencode name=jcg_example_core --data-urlencode configSet=jcg_example_configs
{
  "responseHeader":{
    "status":0,
    "QTime":641},
  "core":"jcg_example_core"}

If jcg_example_core already exists, you can remove it via the CoreAdmin API as shown below and start over:

curl -G http://localhost:8983/solr/admin/cores --data-urlencode action=UNLOAD --data-urlencode core=jcg_example_core --data-urlencode deleteInstanceDir=true

The output would be:

D:\Java\solr-8.5.2>curl -G http://localhost:8983/solr/admin/cores --data-urlencode action=UNLOAD --data-urlencode core=jcg_example_core --data-urlencode deleteInstanceDir=true
{
  "responseHeader":{
    "status":0,
    "QTime":37}}

3.4 Indexing Using SolrJ

3.4.1 Building a SolrClient

First of all, we need to build a SolrClient instance. SolrClient implementations provide builders with fluent interfaces which are very easy to use. The builder is also a good place to configure SolrClient parameters such as the Solr base URL and timeouts. The static method below builds an HttpSolrClient connecting to the Solr instance running on localhost with a 5-second connection timeout and a 3-second read timeout (both values are given in milliseconds).

Note that we define a static SolrClient instance in this example and reuse it everywhere, instead of building a new one every time, for performance reasons.

/**
 * The Solr instance URL running on localhost
 */
private static final String SOLR_CORE_URL = "http://localhost:8983/solr/jcg_example_core";

/**
 * The static solrClient instance.
 */
private static final SolrClient solrClient = getSolrClient();

/**
 * Configures SolrClient parameters and returns a SolrClient instance.
 * 
 * @return a SolrClient instance
 */
private static SolrClient getSolrClient() {
    return new HttpSolrClient.Builder(SOLR_CORE_URL).withConnectionTimeout(5000).withSocketTimeout(3000).build();
}

3.4.2 Indexing Articles by Using SolrInputDocument

SolrClient provides a straightforward API to add documents to be indexed. The org.apache.solr.common.SolrInputDocument class represents the field-value information needed to construct and index a Lucene Document. The field names and value types should match those defined in the schema file (managed-schema) of the jcg_example_configs configSet. In the method below, a list of SolrInputDocument instances is created from a list of sample articles. Fields are explicitly added to each document.

Note that many SolrClient implementations have drastically slower indexing performance when documents are added individually. So in the method below, document batching is used: we send a collection of documents to Solr in one request and then commit them. This generally leads to better indexing performance and should be used whenever possible.

/**
 * Indexing articles by using SolrInputDocument.
 */
public void indexingByUsingSolrInputDocument() {
    // create a list of SolrInputDocument
    List<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
    for (Article article : getArticles()) {
        final SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", article.getId());
        doc.addField("category", article.getCategory());
        doc.addField("title", article.getTitle());
        doc.addField("author", article.getAuthor());
        doc.addField("published", article.isPublished());
        docs.add(doc);
    }

    System.out.printf("Indexing %d articles...\n", docs.size());

    try {
        // send the documents to Solr
        solrClient.add(docs);

        // explicit commit pending documents for indexing
        solrClient.commit();

        System.out.printf("%d articles indexed.\n", docs.size());
    } catch (SolrServerException | IOException e) {
        System.err.printf("\nFailed to indexing articles: %s", e.getMessage());
    }
}
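When the data set is too large to send in a single request, the same batching idea can be applied in fixed-size chunks. The following is only a minimal sketch using the plain JDK: BatchingSketch, partition and the batch size of 2 are illustrative names and values of our own, not part of SolrJ or of the example source code. In a real indexer, each batch would be handed to solrClient.add(...), followed by one commit at the end.

```java
import java.util.ArrayList;
import java.util.List;

class BatchingSketch {

    /** Splits items into consecutive batches of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // copy the sub-list so each batch is independent of the source list
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> ids = List.of("a1", "a2", "a3", "a4", "a5");
        for (List<String> batch : partition(ids, 2)) {
            // assumption: real code would call solrClient.add(...) with a batch of documents here
            System.out.println("would send batch: " + batch);
        }
        // assumption: a single solrClient.commit() would follow once all batches are sent
    }
}
```

Committing once after the last batch, rather than after every batch, keeps the number of commits low.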

3.4.3 Indexing Articles by Using Java Object Binding

Remembering all the fields and adding them one by one is tedious and error-prone. SolrJ lets us work with domain objects directly by implicitly converting documents to and from any class whose fields are marked with the @Field annotation.

The fields of the Article class below are annotated with @Field. An annotated field is mapped to a corresponding Solr field. By default, the variable name is used as the field name in Solr. However, this can be overridden by giving the annotation an explicit field name, for example @Field("cat").

/**
 * The article POJO.
 */
class Article {
    @Field
    private String id;

    @Field
    private String category;

    @Field
    private String title;

    @Field
    private String author;

    @Field
    private boolean published;

    // constructors
    // getters and setters
}
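To give a feel for how annotation-driven binding of this kind can work, here is a minimal sketch based on plain Java reflection. It is not SolrJ's implementation: SketchField is a hypothetical stand-in for @Field, and Book and toFieldMap are names we made up for illustration only. It shows the two naming rules described above: the variable name is used by default, and an explicit name overrides it.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.LinkedHashMap;
import java.util.Map;

class BindingSketch {

    /** Hypothetical stand-in for SolrJ's @Field annotation. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface SketchField {
        String value() default ""; // empty means "use the variable name"
    }

    /** Hypothetical bean used only in this sketch. */
    static class Book {
        @SketchField
        private final String id;

        @SketchField("book_title") // explicit name overrides the variable name
        private final String title;

        private final String internalNote; // not annotated, so it is never mapped

        Book(String id, String title, String internalNote) {
            this.id = id;
            this.title = title;
            this.internalNote = internalNote;
        }
    }

    /** Collects the annotated fields of a bean into a name-to-value map. */
    static Map<String, Object> toFieldMap(Object bean) {
        Map<String, Object> fields = new LinkedHashMap<>();
        for (Field field : bean.getClass().getDeclaredFields()) {
            SketchField marker = field.getAnnotation(SketchField.class);
            if (marker == null) {
                continue; // skip fields without the annotation
            }
            String name = marker.value().isEmpty() ? field.getName() : marker.value();
            field.setAccessible(true); // the bean's fields are private
            try {
                fields.put(name, field.get(bean));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("Cannot read field " + field.getName(), e);
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(toFieldMap(new Book("42", "Solr Basics", "draft")));
    }
}
```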

Then in the method below, we can simply send a list of articles to Solr for indexing without worrying about the field mapping.

/**
 * Indexing articles by using Java object binding.
 */
public void indexingByUsingJavaObjectBinding() {
    try {
        List<Article> articles = getArticles();
        System.out.printf("Indexing %d articles...\n", articles.size());
        // send articles to Solr
        solrClient.addBeans(articles);

        // explicit commit pending documents for indexing
        solrClient.commit();

        System.out.printf("%d articles indexed.\n", articles.size());
    } catch (SolrServerException | IOException e) {
        System.err.printf("\nFailed to indexing articles: %s", e.getMessage());
    }
}

3.5 Querying Using SolrJ

SolrClient has several query() methods accepting SolrParams, which allow us to send a search request to the Solr instance. SolrParams is designed to hold the parameters sent to Solr; it is basically a MultiMap of String keys to one or more String values. In the method below, we use a MapSolrParams instance to hold the query parameters and search for articles written by Kevin Yang. Once the response is returned, we print the search results to the standard output.

/**
 * Querying articles by using SolrParams.
 */
public void queryingByUsingSolrParams() {
    // constructs a MapSolrParams instance
    final Map<String, String> queryParamMap = new HashMap<String, String>();
    queryParamMap.put("q", "author:Kevin"); // search articles written by Kevin Yang
    queryParamMap.put("fl", "id, title, author");
    queryParamMap.put("sort", "id asc");
    MapSolrParams queryParams = new MapSolrParams(queryParamMap);

    // sends search request and gets the response
    QueryResponse response = null;
    try {
        response = solrClient.query(queryParams);
    } catch (SolrServerException | IOException e) {
        System.err.printf("Failed to search articles: %s", e.getMessage());
    }

    // print results to stdout
    if (response != null) {
        printResults(response.getResults());
    }
}
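Since HTTP is the underlying protocol, these parameters end up as ordinary request parameters. The sketch below, which uses only the JDK, illustrates the "String keys to one or more String values" idea by encoding a multi-valued map into a query string. QueryStringSketch and toQueryString are hypothetical names, the fq filter values are made up for illustration, and we assume the default /select request handler; SolrJ does this work for us internally, and not necessarily in this way.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

class QueryStringSketch {

    /** Encodes multi-valued parameters as key=value pairs joined by '&'. */
    static String toQueryString(Map<String, List<String>> params) {
        StringJoiner joiner = new StringJoiner("&");
        for (Map.Entry<String, List<String>> entry : params.entrySet()) {
            // a key with several values is simply repeated once per value
            for (String value : entry.getValue()) {
                joiner.add(encode(entry.getKey()) + "=" + encode(value));
            }
        }
        return joiner.toString();
    }

    private static String encode(String text) {
        return URLEncoder.encode(text, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Map<String, List<String>> params = new LinkedHashMap<>();
        params.put("q", List.of("author:Kevin"));
        params.put("sort", List.of("id asc"));
        // assumption: two illustrative filter queries to show a multi-valued key
        params.put("fq", List.of("published:true", "category:java"));

        // assumption: the core's default /select request handler
        System.out.println("http://localhost:8983/solr/jcg_example_core/select?" + toQueryString(params));
    }
}
```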

SolrQuery, a subclass of SolrParams, provides several convenient methods to set the query parameters as shown in the following method:

/**
 * Querying articles by using SolrQuery (a subclass of SolrParams).
 */
public void queryingByUsingSolrQuery() {
    // constructs a SolrQuery instance
    final SolrQuery solrQuery = new SolrQuery("author:Kevin");
    solrQuery.addField("id");
    solrQuery.addField("title");
    solrQuery.addField("author");
    solrQuery.setSort("id", ORDER.asc);
    solrQuery.setRows(10);

    // sends search request and gets the response
    QueryResponse response = null;
    try {
        response = solrClient.query(solrQuery);
    } catch (SolrServerException | IOException e) {
        System.err.printf("Failed to search articles: %s", e.getMessage());
    }

    // print results to stdout
    if (response != null) {
        printResults(response.getResults());
    }
}
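The printResults helper called in the methods above is not listed in this article. As a rough idea of what such a helper might do, the sketch below formats each hit as a comma-separated list of name=value pairs, similar to the sample output in section 3.6. It is an assumption on our part, not the code from the example project: plain Map instances stand in for Solr documents so that the sketch runs without Solr, and PrintResultsSketch and formatDocument are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class PrintResultsSketch {

    /** Formats one document as "name=value, name=value, ...". */
    static String formatDocument(Map<String, Object> doc) {
        return doc.entrySet().stream()
                .map(e -> e.getKey() + "=" + e.getValue())
                .collect(Collectors.joining(", "));
    }

    /** Prints the number of documents followed by one line per document. */
    static void printResults(List<Map<String, Object>> docs) {
        System.out.printf("Found %d documents%n", docs.size());
        for (Map<String, Object> doc : docs) {
            System.out.println(formatDocument(doc));
        }
    }

    public static void main(String[] args) {
        // a LinkedHashMap keeps the fields in insertion order
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", "0221234283");
        doc.put("title", "Java ArrayList 101");
        doc.put("author", "Kevin Yang");
        printResults(List.of(doc));
    }
}
```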

Similar to using Java object binding when indexing, we can directly convert search results into domain objects as shown in the method below:

/**
 * Querying articles by using SolrQuery and converting results into domain
 * objects with Java object binding.
 */
public void queryingByUsingSolrQueryAndJavaObjectBinding() {
    // constructs a SolrQuery instance
    final SolrQuery solrQuery = new SolrQuery("author:Kevin");
    solrQuery.addField("id");
    solrQuery.addField("title");
    solrQuery.addField("author");
    solrQuery.setSort("id", ORDER.asc);
    solrQuery.setRows(10);

    // sends search request and gets the response
    QueryResponse response = null;
    try {
        response = solrClient.query(solrQuery);
    } catch (SolrServerException | IOException e) {
        System.err.printf("Failed to search articles: %s", e.getMessage());
    }

    // converts to domain objects and prints to standard output
    if (response != null) {
        List<Article> articles = response.getBeans(Article.class);
        for (Article article : articles) {
            System.out.println(article.toString());
        }
    }
}

3.6 Running the Example

Assuming the Solr instance is already running locally, we can run the example and verify the results. Download the example source code and execute the following command to run SolrJExample:

mvn clean compile exec:exec

In case your Solr instance is not running, you will see the following error messages in the output:

======== SolrJ Example ========
Indexing 12 articles...

Failed to indexing articles: Server refused connection at: http://localhost:8983/solr/jcg_example_core
Failed to search articles: Server refused connection at: http://localhost:8983/solr/jcg_example_core

If everything is working fine, you should see the following output:

======== SolrJ Example ========
Indexing 12 articles...
12 articles indexed.
Querying by using SolrParams...
Found 6 documents
id=0221234283, title=Java ArrayList 101, author=Kevin Yang
id=0553573333, title=Java Array Example, author=Kevin Yang
id=055357342Y, title=Java StringTokenizer Example, author=Kevin Yang
id=0553579908, title=Java Remote Method Invocation Example, author=Kevin Yang
id=0626166238, title=Java Arrays Showcases, author=Kevin Yang
id=0818231712, title=Apache SolrCloud Example, author=Kevin Yang
Querying by using SolrQuery...
Found 6 documents
id=0221234283, title=Java ArrayList 101, author=Kevin Yang
id=0553573333, title=Java Array Example, author=Kevin Yang
id=055357342Y, title=Java StringTokenizer Example, author=Kevin Yang
id=0553579908, title=Java Remote Method Invocation Example, author=Kevin Yang
id=0626166238, title=Java Arrays Showcases, author=Kevin Yang
id=0818231712, title=Apache SolrCloud Example, author=Kevin Yang
Querying by using SolrQuery and Java object binding...
Found 6 articles
Article [id=0221234283, title=Java ArrayList 101, author=Kevin Yang]
Article [id=0553573333, title=Java Array Example, author=Kevin Yang]
Article [id=055357342Y, title=Java StringTokenizer Example, author=Kevin Yang]
Article [id=0553579908, title=Java Remote Method Invocation Example, author=Kevin Yang]
Article [id=0626166238, title=Java Arrays Showcases, author=Kevin Yang]
Article [id=0818231712, title=Apache SolrCloud Example, author=Kevin Yang]

4. Download the Source Code

You can download the full source code of this example here: Apache Solr in Java: Using Apache SolrJ

Kevin Yang

A software design and development professional with seventeen years’ experience in the IT industry, especially with Java EE and .NET, I have worked for software companies, scientific research institutes and websites.