How to Install Solr on Ubuntu

In this example we will discuss how to download and install Solr on the Ubuntu operating system. The Ubuntu desktop operating system powers millions of PCs and laptops around the world, so this example is dedicated to Ubuntu users who want to install Solr.

Along with the Solr installation, we will also show you how to create a Solr core and index an example file shipped with Solr. Our preferred environment for this example is Ubuntu 14.x and solr-5.x. Before you begin the Solr installation, make sure you have a JDK installed and JAVA_HOME is set appropriately.

1. Install Apache Solr

To begin with, let's download Apache Solr 5.3.1, the version used in this example, from the following location:

http://www.eu.apache.org/dist/lucene/solr/5.3.1/

File: solr-5.3.1.tgz

Once the file is downloaded, create a directory called solr under /opt and move the downloaded file into it. Now navigate to the directory /opt/solr and extract the archive using the following command.

sudo tar -xvf solr-5.3.1.tgz
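The download and extraction steps above can be collected into one small script. This is only a sketch: the use of wget, the /tmp staging location, and the function name install_solr are assumptions, not part of the original walkthrough. The URL is assembled from the download location and file name given above, and an old release may no longer be hosted at that address.

```shell
# Values taken from the tutorial text.
SOLR_VERSION="5.3.1"
SOLR_ARCHIVE="solr-${SOLR_VERSION}.tgz"
SOLR_URL="http://www.eu.apache.org/dist/lucene/solr/${SOLR_VERSION}/${SOLR_ARCHIVE}"
SOLR_PARENT="/opt/solr"

# Hypothetical helper: download the archive, move it under /opt/solr, extract it there.
install_solr() {
  sudo mkdir -p "$SOLR_PARENT"
  # wget is an assumption; any download tool works.
  wget -O "/tmp/$SOLR_ARCHIVE" "$SOLR_URL"
  sudo mv "/tmp/$SOLR_ARCHIVE" "$SOLR_PARENT/"
  # -C extracts into /opt/solr without changing the current directory.
  sudo tar -xvf "$SOLR_PARENT/$SOLR_ARCHIVE" -C "$SOLR_PARENT"
}
```

The script only defines the function; running install_solr afterwards performs the steps and should leave the extracted files under /opt/solr/solr-5.3.1.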

The Solr commands have to be executed from the bin directory, so navigate to the following path.

/opt/solr/solr-5.3.1/bin

The extracted directory looks like the following.

[Figure: Solr Ubuntu folders]

The bin folder contains the scripts to start and stop the server. The example folder contains a few example files; we will use one of them to demonstrate how Solr indexes data. The server folder contains the logs folder, where all the Solr logs are written. Checking these logs is helpful when an error occurs during indexing. The solr folder under server holds the cores or collections. The configuration and data for each core or collection are stored in its own folder.

Apache Solr comes with a built-in Jetty server. Before we start the Solr instance, we must validate that JAVA_HOME is set on the machine.
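One way to validate JAVA_HOME is a small shell check like the one below. It is a minimal sketch: the function name check_java_home is hypothetical, and the check only assumes that a usable JDK has an executable bin/java under JAVA_HOME.

```shell
# Hypothetical helper: report whether JAVA_HOME is set and contains bin/java.
check_java_home() {
  if [ -z "${JAVA_HOME:-}" ]; then
    echo "JAVA_HOME is not set"
    return 1
  fi
  if [ ! -x "$JAVA_HOME/bin/java" ]; then
    echo "JAVA_HOME does not contain an executable bin/java: $JAVA_HOME"
    return 1
  fi
  echo "JAVA_HOME looks valid: $JAVA_HOME"
}

# Print a reminder instead of aborting when the check fails.
check_java_home || echo "Set JAVA_HOME before starting Solr."
```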

Now use the following command to start the Solr server.

sudo ./solr start

This will start the Solr server on the default port 8983. We can now open the following URL in the browser and validate that our Solr instance is running.

http://localhost:8983/solr/#/

[Figure: Solr Ubuntu Console]
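The same check can be made from the command line instead of the browser. The sketch below assumes curl is installed and queries Solr's system information endpoint on the default host and port used in this example. The function name solr_is_up is hypothetical.

```shell
# Defaults used in this tutorial.
SOLR_HOST="localhost"
SOLR_PORT="8983"
SOLR_BASE_URL="http://${SOLR_HOST}:${SOLR_PORT}/solr"

# Hypothetical helper: succeeds when the system-info endpoint answers with HTTP success.
solr_is_up() {
  curl --silent --fail --max-time 5 --output /dev/null \
    "${SOLR_BASE_URL}/admin/info/system?wt=json"
}

if solr_is_up; then
  echo "Solr is responding at ${SOLR_BASE_URL}"
else
  echo "Solr is not responding at ${SOLR_BASE_URL}"
fi
```

Running ./solr status from the bin directory is another way to see whether an instance is running.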

2. Configure Apache Solr

When the Solr server is started in standalone mode, the configuration is called a core; when it is started in SolrCloud mode, it is called a collection. In this example we will discuss the standalone server and cores, and leave SolrCloud for a later discussion.

First, we need to create a Core for indexing the data. The Solr create command has the following options:

  • -c <name> – Name of the core or collection to create (required).
  • -d <confdir> – The configuration directory, useful in the SolrCloud mode.
  • -n <configName> – The configuration name. This defaults to the same name as the core or collection.
  • -p <port> – Port of a local Solr instance to send the create command to; by default the script tries to detect the port by looking for running Solr instances.
  • -s <shards> – Number of shards to split a collection into, default is 1.
  • -rf <replicas> – Number of copies of each document in the collection. The default is 1.

In this example we will use the -c parameter for core name and -d parameter for the configuration directory. For all other parameters we make use of default settings.

Now navigate to the solr-5.3.1/bin directory and issue the following command:

sudo ./solr create -c jcg -d basic_configs

We can see the following output in the command window.

Setup new core instance directory:
/opt/solr/solr-5.3.1/server/solr/jcg
Creating new core 'jcg' using command:
http://localhost:8983/solr/admin/cores?action=CREATE&name=jcg&instanceDir=jcg

{
"responseHeader":{
"status":0,
"QTime":5862},
"core":"jcg"}

Now edit the schema.xml file in the /server/solr/jcg/conf folder and add the following contents after the uniqueKey element.

schema.xml

<uniqueKey>id</uniqueKey>
<!-- Fields added for books.csv load-->
<field name="cat" type="text_general" indexed="true" stored="true"/>
<field name="name" type="text_general" indexed="true" stored="true"/>
<field name="price" type="tdouble" indexed="true" stored="true"/>
<field name="inStock" type="boolean" indexed="true" stored="true"/>
<field name="author" type="text_general" indexed="true" stored="true"/>
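After editing, it can be worth confirming that every field made it into the file before restarting the server. The following sketch uses a plain text search for each field declaration. The function name missing_schema_fields is hypothetical, and the schema path in the usage comment is derived from the instance directory shown in the create output above.

```shell
# Hypothetical helper: print each expected field name that has no
# <field name="..."> declaration in the given schema file.
missing_schema_fields() {
  schema_file="$1"
  shift
  for field in "$@"; do
    # -F treats the pattern as a fixed string rather than a regular expression.
    if ! grep -qF -- "<field name=\"$field\"" "$schema_file"; then
      echo "$field"
    fi
  done
}

# Usage (no output means all five fields are present):
#   missing_schema_fields /opt/solr/solr-5.3.1/server/solr/jcg/conf/schema.xml \
#     cat name price inStock author
```

This only checks that the declarations exist as text; it does not validate the XML or the field types.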

Since we have modified the configuration, we have to stop and start the server. To stop it, issue the following command from the bin directory:

sudo ./solr stop -all

The server is now stopped. To start it again, issue the following command from the bin directory:

sudo ./solr start

3. Indexing the Data

Apache Solr comes with a standalone Java program called SimplePostTool. This program is packaged as a JAR (post.jar) and is available with the installation under the example/exampledocs folder.

Now navigate to the example/exampledocs folder in the command prompt and type the following command. It lists the options the tool supports.

java -jar post.jar -h

The usage format in general is as follows:

Usage: java [SystemProperties] -jar post.jar [-h|-] [<file|folder|url|arg> [<file|folder|url|arg>...]]

As we said earlier, we will index the data in the “books.csv” file shipped with the Solr installation. Navigate to the example/exampledocs folder in the command prompt and issue the following command.

java -Dtype=text/csv -Durl=http://localhost:8983/solr/jcg/update -jar post.jar books.csv

The SystemProperties used here are:

  • -Dtype – the type of the data file.
  • -Durl – URL for the jcg core.

The command produces the following output:

SimplePostTool version 5.0.0
Posting files to [base] url http://localhost:8983/solr/jcg/update using content-type text/csv...
POSTing file books.csv to [base]
1 files indexed.
COMMITting Solr index changes to http://localhost:8983/solr/jcg/update...
Time spent: 0:00:01.149

Now the data from the example file is indexed and stored. Let’s open the following URL. The number of documents shown there should match the number of records in the example file.

http://localhost:8983/solr/#/jcg

[Figure: Solr Ubuntu Data]
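The comparison between the file and the index can also be done from the command line. The sketch below counts the data rows in the CSV file and reads the numFound value from a select query against the core. It assumes curl is installed, that the CSV has one header line and one record per line, and that the JSON response contains numFound without a space after the colon. All three function names are hypothetical.

```shell
# Hypothetical helper: count data rows in a CSV file
# (all non-empty lines minus the header line).
csv_record_count() {
  total=$(grep -c . "$1")
  echo $((total - 1))
}

# Hypothetical helper: pull the numFound value out of a Solr JSON response on stdin.
extract_num_found() {
  sed -n 's/.*"numFound":\([0-9]*\).*/\1/p'
}

# Hypothetical helper: ask a core how many documents it holds.
# rows=0 requests the count without returning any documents.
solr_doc_count() {
  curl --silent --max-time 5 \
    "http://localhost:8983/solr/$1/select?q=*:*&rows=0&wt=json" | extract_num_found
}

# Usage, from the example/exampledocs folder:
#   csv_record_count books.csv
#   solr_doc_count jcg
```

If both commands print the same number, every record in the file was indexed.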

4. Download the Schema file

You can download the schema file used in this example here: schema.xml

Veeramani Kalyanasundaram

Veera is a Software Architect working in the telecom domain with rich experience in Java Middleware Technologies. He is an OOAD practitioner and interested in Performance Engineering.