
Solr Dismax Example

In this example of Solr Dismax, we will discuss how to use the Dismax query parser to provide a better search experience to the user. We will show you how to use the boost factor and boost query parameters provided by Solr to obtain the desired results.

To demonstrate the Solr Dismax usage, we will install Solr and start it with techproducts, one of the pre-configured cores that ship with Solr. Our preferred environment for this example is solr-5.3.0. Before you begin the Solr installation, make sure you have a JDK installed and JAVA_HOME is set appropriately.

1. Install Apache Solr

To begin with, let’s download the latest version of Apache Solr from the following location:

http://lucene.apache.org/solr/downloads.html

Apache Solr has gone through various changes from 4.x.x to 5.0.0, so if you have a different version of Solr, you need to download a 5.x.x version to follow this example.

Once the Solr zip file is downloaded, unzip it into a folder. The extracted folder will look like this:

Figure: Solr folders

The bin folder contains the scripts to start and stop the server. The example folder contains a few example files. We will be using one of them to demonstrate how Solr indexes the data. The server folder contains the logs folder where all the Solr logs are written. It is helpful to check the logs for any errors during indexing. The solr folder under server holds the different collections or cores. The configuration and data for each core or collection are stored in its own folder.

Apache Solr comes with a built-in Jetty server. Before we start the Solr instance, we must validate that JAVA_HOME is set on the machine.
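The JAVA_HOME check can be scripted. The following is a minimal sketch, not part of the Solr distribution: the function name check_java_home is hypothetical, and the assumption that a usable JDK has a bin directory under JAVA_HOME is a common convention rather than something Solr itself verifies this way.

```python
import os
from pathlib import Path


def check_java_home(env=None):
    """Return a short status message about the JAVA_HOME setting.

    `env` defaults to the current process environment; passing a plain
    dict makes the function easy to try out without changing the machine.
    """
    env = os.environ if env is None else env
    java_home = env.get("JAVA_HOME")
    if not java_home:
        return "JAVA_HOME is not set"
    # Assumption: a JDK installation has a bin directory under JAVA_HOME.
    if not (Path(java_home) / "bin").is_dir():
        return f"JAVA_HOME points to {java_home}, but no bin directory was found there"
    return f"JAVA_HOME looks usable: {java_home}"


print(check_java_home())
```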

2. Start Apache Solr

We can start the server using the command line script. Let’s go to the bin directory from the command prompt and issue the following command:

solr start -e techproducts

This will start the Solr server on the default port 8983.

We can now open the following URL in the browser and validate that our Solr instance is running. The specifics of the Solr admin tool are beyond the scope of this example. You can see that the example documents are indexed and stored in Solr.

http://localhost:8983/solr/#/techproducts

Figure: Solr Admin Console

3. Dismax Query parser

DisMax stands for Disjunction Max. A Dismax query produces the union of the documents matched by its sub-queries and scores each document with the maximum score it received from any sub-query (plus an optional tie-breaking increment for the other matching sub-queries). In general, the DisMax query parser’s interface is more like that of Google than the interface of the standard Solr request handler. This similarity makes DisMax the appropriate query parser for many consumer applications.
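To make the scoring idea concrete, here is a small sketch of how a disjunction-max score can be combined from per-field scores. It follows the rule documented for Lucene's DisjunctionMaxQuery (the best sub-query score plus a tie-breaker multiplied by the remaining scores). The function name dismax_score and the sample scores are made up for illustration; real Solr scores come from the index and are not computed like this in client code.

```python
def dismax_score(sub_scores, tie=0.0):
    """Combine per-sub-query scores the way a disjunction-max query does.

    sub_scores: scores a single document received from each sub-query
                (for example, one score per searched field).
    tie:        tie-breaker multiplier; 0.0 means only the best score counts.
    """
    matching = [s for s in sub_scores if s > 0.0]
    if not matching:
        return 0.0
    best = max(matching)
    # The best sub-query dominates; the others contribute only through `tie`.
    return best + tie * (sum(matching) - best)


# Hypothetical scores for one document: a match in name and one in features.
print(dismax_score([1.2, 0.5]))           # only the best field counts
print(dismax_score([1.2, 0.5], tie=0.1))  # the weaker field adds a small bonus
```

With tie left at 0.0, a document that matches the term in several fields scores no higher than one that matches it in its single best field, which is why field boosts matter so much for ranking.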

The commonly used query parameters are:

  • q – Defines the raw input string for the query.
  • qf – Query Fields: specifies the fields in the index on which to perform the query, optionally with a boost factor per field. If absent, defaults to df.
  • bq – Boost Query: specifies an additional, optional query clause that is added to the user’s main query to influence the score.

Now open the following URL in the browser. The Dismax query searches for the term video in the documents and orders the results by score. The fl parameter limits the output to the product name and the score of each document.

http://localhost:8983/solr/techproducts/select?defType=dismax&q=video&fl=name,score
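The same request can be assembled programmatically instead of being typed by hand. The sketch below uses only the Python standard library and assumes the Solr instance from this example is reachable at localhost:8983; the helper names dismax_url and run_query are hypothetical, and run_query additionally assumes that asking for wt=json is acceptable for your setup.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

SOLR_SELECT = "http://localhost:8983/solr/techproducts/select"


def dismax_url(query, fields=("name", "score"), base=SOLR_SELECT):
    """Build a select URL that runs `query` through the Dismax parser."""
    params = {"defType": "dismax", "q": query, "fl": ",".join(fields)}
    # safe="," keeps the comma in fl readable; other characters are encoded.
    return base + "?" + urlencode(params, safe=",")


def run_query(query):
    """Fetch the matching documents as a list of dicts (needs a running Solr)."""
    with urlopen(dismax_url(query) + "&wt=json") as response:
        return json.load(response)["response"]["docs"]


print(dismax_url("video"))
```

Calling run_query("video") against the running techproducts core should return the same documents that the browser shows, as dictionaries holding the name and score of each hit.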

Figure: Solr Dismax output

4. Using Query fields with boost factor

In the previous result, the product ASUS Extreme N7800GTX/2DHTV (256 MB) obtained the same score as the ATI Radeon X1900 XTX 512 MB PCIE Video Card. Note that even though the search term video is present in the name field of the video card, it did not get a higher score. Dismax provides an option to boost the score for matches in specific search fields by assigning a numeric boost factor to each field.

Open the following URL in the browser. The query boosts the score of documents that contain the search term in the name field (boost 1.0) relative to matches in the features field (boost 0.3). As a result, the video card gets a higher score and appears at the top of the results.

http://localhost:8983/solr/techproducts/select?defType=dismax&q=video&fl=name,score&qf=name^1.0+features^0.3
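When the list of fields and boosts comes from configuration, the qf value can be generated rather than written by hand. This is a minimal sketch; the helper name qf_value is hypothetical, and the field names and boosts are the ones used in this example. Note that urlencode percent-encodes the caret as %5E and the space between fields as +, which Solr decodes back to the same qf value.

```python
from urllib.parse import urlencode


def qf_value(boosts):
    """Turn {"field": boost, ...} into a Dismax qf string: 'field^boost ...'."""
    return " ".join(f"{field}^{boost}" for field, boost in boosts.items())


params = {
    "defType": "dismax",
    "q": "video",
    "fl": "name,score",
    # name matches count fully, features matches are weighted down.
    "qf": qf_value({"name": 1.0, "features": 0.3}),
}
query_string = urlencode(params)

print(params["qf"])
print(query_string)
```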

Figure: Solr boost factor

5. Using Boost query parameter

The Boost query or bq parameter specifies an additional, optional query clause that will be added to the user’s main query to influence the score. Continuing with the boost factor query above, we will add one more boost for products whose category (cat) is graphics card.

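Open the following URL in the browser. The query boosts the score of the graphics card products. The category value is wrapped in quotes (%22) so that the field and the boost apply to the whole phrase graphics card; without the quotes, cat: would apply only to graphics and the boost only to card. You can check the result set to see the impact of the parameter.

http://localhost:8983/solr/techproducts/select?defType=dismax&q=video&fl=name,score&qf=name^1.0+features^0.3&bq=cat:%22graphics%20card%22^5.0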
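A boost query is easy to get subtly wrong when the value contains a space, because in the standard query syntax a field prefix and a boost attach only to the term next to them. The sketch below is one way to build the bq value with the phrase quoted; the helper name boost_query is hypothetical, and it assumes you want the boost to apply to the complete category value rather than to individual words.

```python
from urllib.parse import urlencode


def boost_query(field, value, boost):
    """Build a bq clause such as cat:"graphics card"^5.0 with the value quoted."""
    # Escape backslashes and double quotes so the value stays one phrase.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"^{boost}'


bq = boost_query("cat", "graphics card", 5.0)
print(bq)
# Percent-encoded form, ready to append to the select URL.
print(urlencode({"bq": bq}))
```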

Figure: Dismax Boost Query

Veeramani Kalyanasundaram

Veera is a Software Architect working in the telecom domain with rich experience in Java Middleware Technologies. He is an OOAD practitioner and is interested in Performance Engineering.