Solr Dismax Example
In this Solr Dismax example, we will discuss how to use the Dismax query parser to provide a better search experience to the user. We will show you how to use the boost factor and boost query parameters provided by Solr to obtain the desired results.
To demonstrate Dismax usage, we will install Solr and start it with techproducts, one of the pre-configured cores that ship with Solr. Our preferred environment for this example is solr-5.3.0. Before you begin the Solr installation, make sure you have a JDK installed and JAVA_HOME is set appropriately.
1. Install Apache Solr
To begin with, let's download the latest version of Apache Solr from the following location:
http://lucene.apache.org/solr/downloads.html
Apache Solr changed considerably between 4.x.x and 5.0.0, so if you have a different version of Solr, you need to download a 5.x.x version to follow this example.
Once the Solr zip file is downloaded, unzip it into a folder. The extracted folder contains the following subfolders:
The bin folder contains the scripts to start and stop the server. The example folder contains a few example files; we will use one of them to demonstrate how Solr indexes data. The server folder contains the logs folder, where all the Solr logs are written. It is helpful to check the logs for errors during indexing. The solr folder under server holds the different collections or cores. The configuration and data for each core or collection are stored in its own folder.
Apache Solr comes with an embedded Jetty server. Before we start the Solr instance, we must verify that JAVA_HOME is set on the machine.
2. Start Apache Solr
We can start the server using the command-line script. Let's go to the bin directory from the command prompt and issue the following command:
solr start -e techproducts
This will start the Solr server on the default port 8983.
We can now open the following URL in the browser to validate that our Solr instance is running. The specifics of the Solr admin tool are beyond the scope of this example. You can see that the example documents are indexed and stored in Solr.
http://localhost:8983/solr/#/techproducts
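The same check can be scripted instead of done through the admin UI. The following is a minimal sketch, assuming the core is reachable at the host and port used in this example; the helper names count_url and num_found are hypothetical and not part of Solr. It builds a match-all query that returns no rows and reads the hit count from the JSON response.

```python
import json
from urllib.parse import urlencode

# Select handler of the techproducts core, as used elsewhere in this example.
SOLR_SELECT = "http://localhost:8983/solr/techproducts/select"


def count_url(base=SOLR_SELECT):
    """Build a match-all query that asks for zero rows, only the hit count."""
    params = {"q": "*:*", "rows": 0, "wt": "json"}
    return base + "?" + urlencode(params)


def num_found(response_text):
    """Read the total number of matching documents from a Solr JSON response."""
    return json.loads(response_text)["response"]["numFound"]


print(count_url())
```

With the server running, the response body can be fetched with urllib.request.urlopen(count_url()).read().decode("utf-8") and passed to num_found. A count greater than zero indicates that the example documents were indexed.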
3. Dismax Query parser
A Dismax (Disjunction Max) query returns the union of the documents matched by its sub-queries and scores each document by the maximum score that any of those sub-queries gives it. In general, the DisMax query parser’s interface is more like that of Google than the interface of the standard Solr request handler. This similarity makes DisMax an appropriate query parser for many consumer applications.
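To make the scoring rule concrete, here is a small sketch of how a disjunction-max score is combined from per-field scores. The function name dismax_score and the input scores are made up for illustration. The tie argument mirrors the tie-breaker parameter that DisMax offers; it is not used elsewhere in this example and defaults to 0.0 here, which gives the pure maximum.

```python
def dismax_score(sub_scores, tie=0.0):
    """Combine per-sub-query scores the disjunction-max way.

    The best sub-query score dominates; `tie` (between 0.0 and 1.0) controls
    how much the remaining sub-query scores add on top of it.
    """
    if not sub_scores:
        return 0.0
    best = max(sub_scores)
    return best + tie * (sum(sub_scores) - best)


# Hypothetical scores for one document that matched in two fields.
print(dismax_score([2.0, 1.0]))           # 2.0, only the best field counts
print(dismax_score([2.0, 1.0], tie=0.5))  # 2.5, the other field adds half its score
```

With tie=1.0 the result is the plain sum of the sub-query scores, so the parameter moves the scoring between "best field wins" and "all fields add up".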
The commonly used query parameters are:
- q – Defines the raw input strings for the query.
- qf – Query Fields: specifies the fields in the index on which to perform the query. If absent, defaults to df.
- bq – Boost Query: specifies a factor by which a term or phrase should be “boosted” in importance when considering a match.
Now open the following URL in the browser. The Dismax query will search for the term video in the documents and order the results by score. We have selected only the product name and the score of each document.
http://localhost:8983/solr/techproducts/select?defType=dismax&q=video&fl=name,score
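The same request can be issued from a script. The sketch below assumes the host, port, and core from this example; the helper names dismax_url and names_and_scores are hypothetical. It adds wt=json, which the URL above does not use, so that the response is easy to parse.

```python
import json
from urllib.parse import urlencode

SOLR_SELECT = "http://localhost:8983/solr/techproducts/select"


def dismax_url(query, base=SOLR_SELECT):
    """Build a DisMax request that returns only the name and score fields."""
    params = {"defType": "dismax", "q": query, "fl": "name,score", "wt": "json"}
    return base + "?" + urlencode(params)


def names_and_scores(response_text):
    """Turn a Solr JSON response into a list of (name, score) pairs."""
    docs = json.loads(response_text)["response"]["docs"]
    return [(doc.get("name"), doc.get("score")) for doc in docs]


print(dismax_url("video"))
```

Passing the response text of dismax_url("video") to names_and_scores yields the pairs in the order Solr returned them, which is by descending score by default.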
4. Using Query fields with boost factor
In the results of the previous query, the product ASUS Extreme N7800GTX/2DHTV (256 MB) obtained the same score as the ATI Radeon X1900 XTX 512 MB PCIE Video Card. Note that even though the search term video was present in the name field of the Video Card, it did not get a higher score. Dismax provides an option to boost the score for matches in specific search fields, based on the numeric value assigned to each field.
Open the following URL in the browser. The query boosts the score of documents in which the search term is present in the name field. As a result, the Video Card gets a higher score and appears at the top of the results.
http://localhost:8983/solr/techproducts/select?defType=dismax&q=video&fl=name,score&qf=name^1.0+features^0.3
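When the field weights come from configuration rather than a hand-typed URL, the qf value can be assembled programmatically. This is a sketch only; qf_value and boosted_url are hypothetical helper names, and the weights are the ones used in the URL above. Note that the encoder writes the caret as %5E and the separating space as +, which is equivalent to the hand-written form.

```python
from urllib.parse import urlencode

SOLR_SELECT = "http://localhost:8983/solr/techproducts/select"


def qf_value(field_boosts):
    """Render {"name": 1.0, "features": 0.3} as 'name^1.0 features^0.3'."""
    return " ".join(f"{field}^{boost}" for field, boost in field_boosts.items())


def boosted_url(query, field_boosts, base=SOLR_SELECT):
    """Build a DisMax request whose qf parameter carries per-field boosts."""
    params = {
        "defType": "dismax",
        "q": query,
        "fl": "name,score",
        "qf": qf_value(field_boosts),
    }
    return base + "?" + urlencode(params)


print(boosted_url("video", {"name": 1.0, "features": 0.3}))
```

Raising the weight of name relative to features increases the advantage of documents that contain the term in their name.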
5. Using Boost query parameter
The Boost query or bq parameter specifies an additional, optional query clause that will be added to the user’s main query to influence the score. Continuing with the boost-factor query above, we will add one more boost for products whose category (cat) is graphics card.
Open the following URL in the browser. The query will boost the score of the graphics card products. You can check the result set to see the impact of the parameter.
http://localhost:8983/solr/techproducts/select?defType=dismax&q=video&fl=name,score&qf=name^1.0+features^0.3&bq=cat:%22graphics%20card%22^5.0
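A boost query for a multi-word value is easy to get wrong by hand, so the sketch below builds it in code. It rests on one assumption: the category value is wrapped in double quotes so that both words are matched against the cat field rather than only the first. The helper names boost_clause and bq_url are hypothetical.

```python
from urllib.parse import urlencode

SOLR_SELECT = "http://localhost:8983/solr/techproducts/select"


def boost_clause(field, value, boost):
    """Build a quoted, boosted clause such as cat:"graphics card"^5.0."""
    # Escape backslashes and double quotes so the value stays one phrase.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"^{boost}'


def bq_url(query, qf, bq, base=SOLR_SELECT):
    """Build a DisMax request with both field boosts (qf) and a boost query (bq)."""
    params = {"defType": "dismax", "q": query, "fl": "name,score", "qf": qf, "bq": bq}
    return base + "?" + urlencode(params)


clause = boost_clause("cat", "graphics card", 5.0)
print(clause)
print(bq_url("video", "name^1.0 features^0.3", clause))
```

Because bq only adds an optional clause, documents outside the category still match the main query; they simply do not receive the extra score.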