Apache Solr in Java: Using Apache SolrJ
In this example, we are going to show you how to use Apache SolrJ to index data in Solr and query from Solr.
1. Introduction
Apache Solr is a popular open-source search platform built on Apache Lucene. From a 30,000-foot view, Solr is a web application, and HTTP is the fundamental protocol used between client applications and Solr: the client sends a request, Solr does some work and returns a response.
Besides the HTTP API, SolrJ offers a Java API that encapsulates much of the work of sending requests and parsing responses. It is highly configurable and makes it much easier for applications written in Java to talk to Solr.
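To make the request/response exchange concrete, the following is a minimal sketch of what a search request looks like at the HTTP level, written with only the JDK. It assumes a core reachable at http://localhost:8983/solr/jcg_example_core (the core created later in this example) and the commonly configured /select request handler. The class name SolrSelectUrl and its build method are hypothetical helpers, not part of Solr or SolrJ, and the sketch only constructs the request URL without sending it.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper that builds the URL of a Solr search request. */
final class SolrSelectUrl {

    /** Appends the URL-encoded parameters to the core's /select handler (assumed path). */
    static String build(String coreUrl, Map<String, String> params) {
        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            query.add(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return coreUrl + "/select?" + query;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the parameters in insertion order
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "author:Kevin");
        params.put("fl", "id,title,author");
        params.put("sort", "id asc");
        System.out.println(build("http://localhost:8983/solr/jcg_example_core", params));
    }
}
```

Sending a GET request to the resulting URL, for example with curl, would return the matching documents. SolrJ takes care of building such requests and of parsing the responses.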
2. Technologies Used
The steps and commands described in this example are for Apache Solr 8.5 on Windows 10. The JDK version we use to run Solr in this example is OpenJDK 13. Before we start, please make sure your computer meets the system requirements. Also, please download the binary release of Apache Solr 8.5. Apache Maven 3.6.3 is used as the build system.
3. Using Apache SolrJ
3.1 Basics
SolrJ provides a few simple interfaces for us to connect to and communicate with Solr. The most important one is the SolrClient, which sends requests in the form of SolrRequests and returns responses as SolrResponses. There are several SolrClient implementations, and we list some commonly used ones in the table below:
Client | Description |
---|---|
HttpSolrClient | A general-purpose SolrClient implementation that talks directly to a single Solr server via HTTP. It is better suited for query-centric workloads. |
LBHttpSolrClient | A load-balancing wrapper around HttpSolrClient. Do NOT use it for indexing in master/slave scenarios. |
CloudSolrClient | A SolrClient implementation that talks to SolrCloud. It communicates with ZooKeeper to discover Solr endpoints for SolrCloud collections, and then uses the LBHttpSolrClient to issue requests. |
ConcurrentUpdateSolrClient | A thread-safe SolrClient implementation which buffers all added documents and writes them into open HTTP connections. It is better suited for indexing-centric workloads. |
Before we jump into the SolrJ coding part, we need to get a couple of things ready by following the steps in the sections below.
3.2 Adding dependencies
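The SolrJ API ships with Solr, so a simple way to add the SolrJ dependencies when running your Java application is to add solr-solrj-8.5.2.jar and its dependencies to the classpath as below. The command uses Unix shell syntax; on Windows, the classpath separator is ; rather than :.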
java -cp .:$SOLR_HOME/dist/solrj-lib/*:$SOLR_HOME/dist/solr-solrj-8.5.2.jar ...
To manage dependencies easily in this example, we use Apache Maven 3.6.3 as the build system. The following dependency declaration needs to be put in pom.xml:

    <dependency>
        <groupId>org.apache.solr</groupId>
        <artifactId>solr-solrj</artifactId>
        <version>8.5.2</version>
    </dependency>
3.3 Starting Solr Instance
For simplicity, instead of setting up a SolrCloud on your local machine as demonstrated in Apache Solr Clustering Example, we run a single Solr instance on our local machine. Before starting, download jcg_example_configs.zip attached to this article and extract it to the directory ${solr.install.dir}\server\solr\configsets\jcg_example_configs\conf. It contains all the configuration and schema definitions required by this example. Then run the command below to start the Solr instance:
bin\solr.cmd start
The output would be:
    D:\Java\solr-8.5.2>bin\solr.cmd start
    Waiting up to 30 to see Solr running on port 8983
    Started Solr server on port 8983. Happy searching!
In addition, we need to create a new core named jcg_example_core with the jcg_example_configs configSet on the local machine. For example, we can do it via the CoreAdmin API:
curl -G http://localhost:8983/solr/admin/cores --data-urlencode action=CREATE --data-urlencode name=jcg_example_core --data-urlencode configSet=jcg_example_configs
The output would be:
    D:\Java\solr-8.5.2>curl -G http://localhost:8983/solr/admin/cores --data-urlencode action=CREATE --data-urlencode name=jcg_example_core --data-urlencode configSet=jcg_example_configs
    {
      "responseHeader":{
        "status":0,
        "QTime":641},
      "core":"jcg_example_core"}
If jcg_example_core already exists, you can remove it via the CoreAdmin API as below and start over:
curl -G http://localhost:8983/solr/admin/cores --data-urlencode action=UNLOAD --data-urlencode core=jcg_example_core --data-urlencode deleteInstanceDir=true
The output would be:
    D:\Java\solr-8.5.2>curl -G http://localhost:8983/solr/admin/cores --data-urlencode action=UNLOAD --data-urlencode core=jcg_example_core --data-urlencode deleteInstanceDir=true
    {
      "responseHeader":{
        "status":0,
        "QTime":37}}
3.4 Indexing Using SolrJ
3.4.1 Building a SolrClient
First of all, we need to build a SolrClient instance. SolrClient implementations provide builders with fluent interfaces, which are very easy to use. The builder is also a good place to configure SolrClient parameters such as the Solr base URL and timeouts. The static method below builds an HttpSolrClient connecting to the Solr instance running on localhost, with a 5-second connection timeout and a 3-second read timeout.
Note that we define a static SolrClient instance in this example and reuse it everywhere, instead of building a new one every time, for performance reasons.
    import org.apache.solr.client.solrj.SolrClient;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;

    /**
     * The Solr instance URL running on localhost
     */
    private static final String SOLR_CORE_URL = "http://localhost:8983/solr/jcg_example_core";

    /**
     * The static solrClient instance.
     */
    private static final SolrClient solrClient = getSolrClient();

    /**
     * Configures SolrClient parameters and returns a SolrClient instance.
     *
     * @return a SolrClient instance
     */
    private static SolrClient getSolrClient() {
        return new HttpSolrClient.Builder(SOLR_CORE_URL)
                .withConnectionTimeout(5000) // connection timeout in milliseconds
                .withSocketTimeout(3000)     // read timeout in milliseconds
                .build();
    }
3.4.2 Indexing Articles by Using SolrInputDocument
SolrClient provides a straightforward API to add documents to be indexed. The org.apache.solr.common.SolrInputDocument class is used; it represents the field-value information needed to construct and index a Lucene Document. The field values should match those specified in managed-schema.xml. In the method below, a list of SolrInputDocument objects is created from a list of sample articles. Fields are explicitly added to each document.
Note that many SolrClient implementations have drastically slower indexing performance when documents are added individually. So in the method below, document batching is used: a collection of documents is sent to Solr and then committed for indexing. This generally leads to better indexing performance and should be used whenever possible.
    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;
    import org.apache.solr.client.solrj.SolrServerException;
    import org.apache.solr.common.SolrInputDocument;

    /**
     * Indexing articles by using SolrInputDocument.
     */
    public void indexingByUsingSolrInputDocument() {
        // create a list of SolrInputDocument
        List<SolrInputDocument> docs = new ArrayList<>();
        for (Article article : getArticles()) {
            final SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", article.getId());
            doc.addField("category", article.getCategory());
            doc.addField("title", article.getTitle());
            doc.addField("author", article.getAuthor());
            doc.addField("published", article.isPublished());
            docs.add(doc);
        }
        System.out.printf("Indexing %d articles...\n", docs.size());

        try {
            // send the documents to Solr
            solrClient.add(docs);
            // explicitly commit pending documents for indexing
            solrClient.commit();
            System.out.printf("%d articles indexed.\n", docs.size());
        } catch (SolrServerException | IOException e) {
            System.err.printf("\nFailed to indexing articles: %s", e.getMessage());
        }
    }
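When the number of documents is large, sending everything in a single request may not be practical. One common approach is to split the documents into fixed-size batches, pass each batch to solrClient.add(...), and commit once at the end. The sketch below shows only the splitting step with plain JDK collections. The class name Batches, the partition method and the batch size of 5 are illustrative assumptions, not part of SolrJ.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that splits a list into fixed-size batches. */
final class Batches {

    /** Returns consecutive sublists of at most batchSize elements each. */
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int from = 0; from < items.size(); from += batchSize) {
            int to = Math.min(from + batchSize, items.size());
            // copy the view so each batch is independent of the source list
            batches.add(new ArrayList<>(items.subList(from, to)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> docs = new ArrayList<>();
        for (int i = 1; i <= 12; i++) {
            docs.add(i); // stand-ins for SolrInputDocument instances
        }
        for (List<Integer> batch : partition(docs, 5)) {
            // in the real example, each batch would be passed to solrClient.add(batch)
            System.out.println("batch of " + batch.size() + ": " + batch);
        }
    }
}
```

With the 12 sample articles used in this example, a single batch is enough; splitting only starts to matter for much larger inputs.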
3.4.3 Indexing Articles by Using Java Object Binding
Remembering all the fields and adding them one by one can be tedious and error-prone. SolrJ lets us work with domain objects directly by implicitly converting documents to and from any class that has been specially marked with the @Field annotation.
The fields of the Article class below are annotated with @Field. An annotated field is mapped to a corresponding Solr field. The variable name is used as the field name in Solr by default. However, this can be overridden by providing the annotation with an explicit field name.
    import org.apache.solr.client.solrj.beans.Field;

    /**
     * The article POJO.
     */
    class Article {
        @Field
        private String id;

        @Field
        private String category;

        @Field
        private String title;

        @Field
        private String author;

        @Field
        private boolean published;

        // constructors
        // getters and setters
    }
Then in the method below, we can simply send a list of articles to Solr for indexing without worrying about the field mapping.
    /**
     * Indexing articles by using Java object binding.
     */
    public void indexingByUsingJavaObjectBinding() {
        try {
            List<Article> articles = getArticles();
            System.out.printf("Indexing %d articles...\n", articles.size());
            // send articles to Solr
            solrClient.addBeans(articles);
            // explicitly commit pending documents for indexing
            solrClient.commit();
            System.out.printf("%d articles indexed.\n", articles.size());
        } catch (SolrServerException | IOException e) {
            System.err.printf("\nFailed to indexing articles: %s", e.getMessage());
        }
    }
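To illustrate the naming rule described above (the variable name by default, an explicit name when one is given), here is a small reflection-based sketch using only the JDK. It is not how SolrJ implements its binder. The annotation IndexField, the classes BindingSketch and SampleArticle, and the field name "cat" are all hypothetical and exist only for this illustration.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.Map;
import java.util.TreeMap;

final class BindingSketch {

    /** Collects annotated fields into a map keyed by the resolved field name. */
    static Map<String, Object> toFieldMap(Object bean) {
        Map<String, Object> result = new TreeMap<>();
        for (Field f : bean.getClass().getDeclaredFields()) {
            IndexField ann = f.getAnnotation(IndexField.class);
            if (ann == null) {
                continue; // fields without the annotation are not mapped
            }
            f.setAccessible(true); // the bean's fields are private
            String name = ann.value().isEmpty() ? f.getName() : ann.value();
            try {
                result.put(name, f.get(bean));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("Cannot read field " + f.getName(), e);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(toFieldMap(new SampleArticle("Java Array Example", "java")));
    }
}

/** Hypothetical stand-in for an annotation such as SolrJ's @Field. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface IndexField {
    String value() default ""; // empty means "use the variable name"
}

/** Sample bean: one field keeps its variable name, one overrides it. */
class SampleArticle {
    @IndexField
    private final String title;

    @IndexField("cat")
    private final String category;

    private final String notIndexed = "ignored"; // no annotation, so skipped

    SampleArticle(String title, String category) {
        this.title = title;
        this.category = category;
    }
}
```

In this sketch the title field is stored under its own variable name, while category is stored under the explicit name cat, which mirrors the override behaviour described for @Field.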
3.5 Querying Using SolrJ
SolrClient has several query() methods accepting SolrParams, which allow us to send a search request to the Solr instance. SolrParams is designed to hold parameters to Solr; it is basically a MultiMap of String keys to one or more String values. In the method below, we use a MapSolrParams instance to hold query parameters and search for articles written by Kevin Yang. Once the response is returned, we print the search results to the standard output.
    import java.io.IOException;
    import java.util.HashMap;
    import java.util.Map;
    import org.apache.solr.client.solrj.SolrServerException;
    import org.apache.solr.client.solrj.response.QueryResponse;
    import org.apache.solr.common.params.MapSolrParams;

    /**
     * Querying articles by using SolrParams.
     */
    public void queryingByUsingSolrParams() {
        // constructs a MapSolrParams instance
        final Map<String, String> queryParamMap = new HashMap<>();
        queryParamMap.put("q", "author:Kevin"); // search articles written by Kevin Yang
        queryParamMap.put("fl", "id, title, author");
        queryParamMap.put("sort", "id asc");
        MapSolrParams queryParams = new MapSolrParams(queryParamMap);

        // sends search request and gets the response
        QueryResponse response = null;
        try {
            response = solrClient.query(queryParams);
        } catch (SolrServerException | IOException e) {
            System.err.printf("Failed to search articles: %s", e.getMessage());
        }

        // print results to stdout
        if (response != null) {
            printResults(response.getResults());
        }
    }
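The "one or more values per key" shape matters for parameters that may legitimately repeat, such as the filter query parameter fq. The sketch below models that shape with plain JDK collections. The class MultiValueParams and its methods are hypothetical and only mirror the idea, not SolrJ's actual classes, and the filter values are illustrative.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical multimap of String keys to one or more String values. */
final class MultiValueParams {

    private final Map<String, List<String>> values = new LinkedHashMap<>();

    /** Appends values to a key, keeping any that were added earlier. */
    MultiValueParams add(String name, String... vals) {
        List<String> list = values.computeIfAbsent(name, k -> new ArrayList<>());
        Collections.addAll(list, vals);
        return this;
    }

    /** Returns the first value for the key, or null if the key is absent. */
    String get(String name) {
        List<String> list = values.get(name);
        return (list == null || list.isEmpty()) ? null : list.get(0);
    }

    /** Returns every value for the key in insertion order (empty if absent). */
    List<String> getAll(String name) {
        return Collections.unmodifiableList(
                values.getOrDefault(name, Collections.emptyList()));
    }

    public static void main(String[] args) {
        MultiValueParams params = new MultiValueParams()
                .add("q", "author:Kevin")
                .add("fq", "category:java")   // illustrative filter values
                .add("fq", "published:true");
        System.out.println(params.get("q"));
        System.out.println(params.getAll("fq"));
    }
}
```

A plain Map<String, String>, like the one wrapped by MapSolrParams in the method above, can hold only one value per key, which is sufficient for the q, fl and sort parameters used in this example.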
SolrQuery, a subclass of SolrParams, provides several convenient methods to set the query parameters as shown in the following method:
    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.SolrQuery.ORDER;

    /**
     * Querying articles by using SolrQuery (a subclass of SolrParams).
     */
    public void queryingByUsingSolrQuery() {
        // constructs a SolrQuery instance
        final SolrQuery solrQuery = new SolrQuery("author:Kevin");
        solrQuery.addField("id");
        solrQuery.addField("title");
        solrQuery.addField("author");
        solrQuery.setSort("id", ORDER.asc);
        solrQuery.setRows(10);

        // sends search request and gets the response
        QueryResponse response = null;
        try {
            response = solrClient.query(solrQuery);
        } catch (SolrServerException | IOException e) {
            System.err.printf("Failed to search articles: %s", e.getMessage());
        }

        // print results to stdout
        if (response != null) {
            printResults(response.getResults());
        }
    }
Similar to using Java object binding when indexing, we can directly convert search results into domain objects as shown in the method below:
    /**
     * Querying articles by using SolrQuery and converting results into domain
     * objects with Java object binding.
     */
    public void queryingByUsingSolrQueryAndJavaObjectBinding() {
        // constructs a SolrQuery instance
        final SolrQuery solrQuery = new SolrQuery("author:Kevin");
        solrQuery.addField("id");
        solrQuery.addField("title");
        solrQuery.addField("author");
        solrQuery.setSort("id", ORDER.asc);
        solrQuery.setRows(10);

        // sends search request and gets the response
        QueryResponse response = null;
        try {
            response = solrClient.query(solrQuery);
        } catch (SolrServerException | IOException e) {
            System.err.printf("Failed to search articles: %s", e.getMessage());
        }

        // converts to domain objects and prints to standard output
        if (response != null) {
            List<Article> articles = response.getBeans(Article.class);
            for (Article article : articles) {
                System.out.println(article.toString());
            }
        }
    }
3.6 Running the Example
Assuming the Solr instance is already running locally, we can run the example and verify the results. Download the example source code and run the following command to run the SolrJExample:
mvn clean compile exec:exec
In case your Solr instance is not running, you will see the following error messages in the output:
    ======== SolrJ Example ========
    Indexing 12 articles...
    Failed to indexing articles: Server refused connection at: http://localhost:8983/solr/jcg_example_core
    Failed to search articles: Server refused connection at: http://localhost:8983/solr/jcg_example_core
If everything is working fine, you should be able to see the output as below:
    ======== SolrJ Example ========
    Indexing 12 articles...
    12 articles indexed.
    Querying by using SolrParams...
    Found 6 documents
    id=0221234283, title=Java ArrayList 101, author=Kevin Yang
    id=0553573333, title=Java Array Example, author=Kevin Yang
    id=055357342Y, title=Java StringTokenizer Example, author=Kevin Yang
    id=0553579908, title=Java Remote Method Invocation Example, author=Kevin Yang
    id=0626166238, title=Java Arrays Showcases, author=Kevin Yang
    id=0818231712, title=Apache SolrCloud Example, author=Kevin Yang
    Querying by using SolrQuery...
    Found 6 documents
    id=0221234283, title=Java ArrayList 101, author=Kevin Yang
    id=0553573333, title=Java Array Example, author=Kevin Yang
    id=055357342Y, title=Java StringTokenizer Example, author=Kevin Yang
    id=0553579908, title=Java Remote Method Invocation Example, author=Kevin Yang
    id=0626166238, title=Java Arrays Showcases, author=Kevin Yang
    id=0818231712, title=Apache SolrCloud Example, author=Kevin Yang
    Querying by using SolrQuery and Java object binding...
    Found 6 articles
    Article [id=0221234283, title=Java ArrayList 101, author=Kevin Yang]
    Article [id=0553573333, title=Java Array Example, author=Kevin Yang]
    Article [id=055357342Y, title=Java StringTokenizer Example, author=Kevin Yang]
    Article [id=0553579908, title=Java Remote Method Invocation Example, author=Kevin Yang]
    Article [id=0626166238, title=Java Arrays Showcases, author=Kevin Yang]
    Article [id=0818231712, title=Apache SolrCloud Example, author=Kevin Yang]
4. Download the Source Code
You can download the full source code of this example here: Apache Solr in Java: Using Apache SolrJ