Apache Solr

Apache Solr Multilingual Search: Field-per-language Example

This is an article related to the Apache Solr Multilingual Search: Field-per-language.

1. Introduction

Apache Solr is an open-source software java search engine. It is scalable and can process a high volume of data. It is used to index the content and search a huge amount of content.

2.1 Prerequisites

Java 7 or 8 is required on the Linux, windows, or Mac operating system. Apache Solr 8.8.2 is required for this example.

2.2 Download

You can download Java 8 can be downloaded from the Oracle website. Apache Solr’s latest releases are available from the Apache Solr website.

2.3 Setup

You can set the environment variables for JAVA_HOME and PATH. They can be set as shown below:

Setup

JAVA_HOME="/desktop/jdk1.8.0_73"
export JAVA_HOME
PATH=$JAVA_HOME/bin:$PATH
export PATH

2.4 How to download and install Apache Solr

Apache Solr’s latest releases are available from the Apache Solr website. After downloading the zip file can be extracted to a folder.

To start the Apache Solr, you can use the command below:

Solr start command

bin/solr start

The output of the above command is shown below:

Solr start command output

apples-MacBook-Air:solr-8.8.2 bhagvan.kommadi$ bin/solr start
*** [WARN] *** Your open file limit is currently 2560.  
 It should be set to 65000 to avoid operational disruption. 
 If you no longer wish to see this warning, set SOLR_ULIMIT_CHECKS to false in your profile or solr.in.sh
*** [WARN] ***  Your Max Processes Limit is currently 1392. 
 It should be set to 65000 to avoid operational disruption. 
 If you no longer wish to see this warning, set SOLR_ULIMIT_CHECKS to false in your profile or solr.in.sh
Waiting up to 180 seconds to see Solr running on port 8983 [-]  
Started Solr server on port 8983 (pid=3054). Happy searching!

You can access the Solr application from the browse at : http://localhost:8983/solr/. The screenshot below shows the Solr application.

Solr Field per language - solr app
Solr Application

2.5 Apache Solr

Apache Solr merged into Lucene around 2010. Lucene was created by Doug Cutting in 1999. Solr was developed by Yonik Seeley at CNET. Solr had a cloud feature released in 4.0. Solr 6.0 supported parallel SQL queries. Solr is based on Lucene. It has REST API support. It has an inverted index feature to get documents for a query using the search word. The search word is entered by the user to link the documents to the word. Solr has the features such as support for XML/JSON/HTTP, recommendations, automatic load balancing, spell suggestions, autocompletion, geospatial search, authentication, authorization, multilingual keyword search, type ahead prediction, batch processing, streaming, machine learning models, high volume web traffic support, schema, schemaless configuration, faceted search, filtering, and cluster configuration

2.6 Apache Solr – Multilingual example

To handle multiple languages, the field per language approach can be used in Apache Solr. A field can have its own analyzer to handle multiple languages. Language is identified from the input. The language field is used for identification. Solr supports different languages. Some languages have their stemming filters and tokenizers.

Field per language approach is specifying content in separate fields per language. The keywords are duplicated for each field. The other approaches are single-core and multi-core. This approach needs to be avoided as the complexity of the query time is in the order of n (number of languages) factor. Let us look at an example with Solr fields for email:

Solr fields for email

<?xml version="1.0" encoding="UTF-8" ?>
<schema version="1.5">
  <fields>
    <field name="id" type="string" indexed="true" stored="true" required="true"/>
    <field name="address_from" type="email" indexed="true" stored="true" required="true"/>
    <dynamicField name="address_*" type="email" multiValued="true" indexed="true" stored="true" />
    <field name="subject" type="text_general" indexed="true" stored="true" />
    <field name="date" type="string" indexed="true" stored="true" required="true" />
    <field name="message" type="text_general" indexed="true" stored="true" />
    <field name="priority" type="int" indexed="true" stored="true" />
    <field name="text" type="text_general" multiValued="true" indexed="true" stored="false" />
    <copyField source="message" dest="text" />
    <copyField source="subject" dest="text" />
    <copyField source="address_*" dest="text" />
    <field name="facet_contacts" type="address_noemail" multiValued="true" indexed="true" stored="false" />
    <field name="facet_emails"   type="address_email"   multiValued="true" indexed="true" stored="false" />
    <copyField source="address_*" dest="facet_contacts" />
    <copyField source="address_*" dest="facet_emails" />
    <field name="language" type="string" stored="true" indexed="true" />
    <dynamicField name="*_en" type="text_en" stored="true" indexed="true" />
    <dynamicField name="*_ru" type="text_ru" stored="true" indexed="true" />
  </fields>
  <uniqueKey>id</uniqueKey>
  <types>
    <fieldType name="string" class="solr.StrField" />
    <fieldType name="int" class="solr.TrieIntField" precisionStep="0" positionIncrementGap="0"/>
    <fieldType name="email" class="solr.TextField">
      <analyzer>
        <tokenizer class="solr.UAX29URLEmailTokenizerFactory"/>
        <filter class="solr.ICUFoldingFilterFactory"/>
      </analyzer>
    </fieldType>
    <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.ICUFoldingFilterFactory"/>
      </analyzer>
    </fieldType>
    <fieldType name="address_noemail" class="solr.TextField">
      <analyzer type="index">
        <charFilter class="solr.PatternReplaceCharFilterFactory" pattern=" <.*" replacement="" />
        <tokenizer class="solr.KeywordTokenizerFactory" />
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.KeywordTokenizerFactory" />
      </analyzer>
    </fieldType>
    <fieldType name="address_email" class="solr.TextField">
      <analyzer>
        <tokenizer class="solr.UAX29URLEmailTokenizerFactory"/>
        <filter class="solr.TypeTokenFilterFactory" types="filter_email.txt" enablePositionIncrements="true" useWhitelist="true"/>
      </analyzer>
    </fieldType>

    <fieldType name="text_en" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true">
      <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_en.txt" enablePositionIncrements="true" />
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.EnglishPossessiveFilterFactory"/>
        <filter class="solr.PorterStemFilterFactory"/>
      </analyzer>
    </fieldType>
 
    <fieldType name="text_ru" class="solr.TextField" positionIncrementGap="100">
      <analyzer> 
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_ru.txt" format="snowball" enablePositionIncrements="true"/>
        <filter class="solr.RussianLightStemFilterFactory"/>
      </analyzer>
    </fieldType>
  </types>
</schema>

Language identifier is used for identifying the text language and putting them in the right fields. Field per language is preferred because it is simple. It has built-in support for searching through DisMax style query parser. This approach works in a single-core and sharded setup of Solr.

Because of the ease of setup and limited complexity of most multilingual search implementations, this approach is the most commonly used and best-supported way of handling multilingual search in Solr. You can implement this by copying the directory example within solr installation and rename it as fieldperlanguage. Ensure that it is a deep copy of the directory. Within the directory, remove the unused directories like example-DIH, multicore,example-schemaless and others. Remove the directories under the collection1 folder. Remove the files under the collection1 folder. You can copy the folders and files from the source code ($SOURCE_CODE) provided to collection1. You can execute the commands below from the fieldperlanguage directory

Solr start output

cd $SOLR_INSTALL
cp -R example fieldperlanguage
rm -r example-DIH
rm -r multicore
cd $SOLR_INSTALL/fieldperlanguage
cd solr/collection1
rm -r *
cp * $SOURCE_CODE/* .

Note that the core.properties is changed the name from collection to fieldperlanguage directory. You can restart Solr from fieldperlanguage folder by using the command below:

Solr start command

cd $SOLR_INSTALL/fieldperlanguage
java -jar start.jar

The output of the command is shown below:

Solr start output

apples-MacBook-Air:langauages bhagvan.kommadi$ java -jar start.jar 
0    [main] INFO  org.eclipse.jetty.server.Server  – jetty-8.1.10.v20130312
35   [main] INFO  org.eclipse.jetty.deploy.providers.ScanningAppProvider  – Deployment monitor /Users/bhagvan.kommadi/Desktop/solr-4.7.0/langauages/contexts at interval 0
48   [main] INFO  org.eclipse.jetty.deploy.DeploymentManager  – Deployable added: /Users/bhagvan.kommadi/Desktop/solr-4.7.0/langauages/contexts/solr-jetty-context.xml
1959 [main] INFO  org.eclipse.jetty.webapp.StandardDescriptorProcessor  – NO JSP Support for /solr, did not find org.apache.jasper.servlet.JspServlet
2044 [main] INFO  org.apache.solr.servlet.SolrDispatchFilter  – SolrDispatchFilter.init()
2072 [main] INFO  org.apache.solr.core.SolrResourceLoader  – JNDI not configured for solr (NoInitialContextEx)
2073 [main] INFO  org.apache.solr.core.SolrResourceLoader  – solr home defaulted to 'solr/' (could not find system property or JNDI)
2076 [main] INFO  org.apache.solr.core.SolrResourceLoader  – new SolrResourceLoader for directory: 'solr/'
2218 [main] INFO  org.apache.solr.core.ConfigSolr  – Loading container configuration from /Users/bhagvan.kommadi/Desktop/solr-4.7.0/langauages/solr/solr.xml
2442 [main] INFO  org.apache.solr.core.CoresLocator  – Config-defined core root directory: /Users/bhagvan.kommadi/Desktop/solr-4.7.0/langauages/solr
2455 [main] INFO  org.apache.solr.core.CoreContainer  – New CoreContainer 1720339
2456 [main] INFO  org.apache.solr.core.CoreContainer  – Loading cores into CoreContainer [instanceDir=solr/]
2470 [main] INFO  org.apache.solr.handler.component.HttpShardHandlerFactory  – Setting socketTimeout to: 0
2471 [main] INFO  org.apache.solr.handler.component.HttpShardHandlerFactory  – Setting urlScheme to: null
2477 [main] INFO  org.apache.solr.handler.component.HttpShardHandlerFactory  – Setting connTimeout to: 0
2477 [main] INFO  org.apache.solr.handler.component.HttpShardHandlerFactory  – Setting maxConnectionsPerHost to: 20
2480 [main] INFO  org.apache.solr.handler.component.HttpShardHandlerFactory  – Setting corePoolSize to: 0
2481 [main] INFO  org.apache.solr.handler.component.HttpShardHandlerFactory  – Setting maximumPoolSize to: 2147483647
2481 [main] INFO  org.apache.solr.handler.component.HttpShardHandlerFactory  – Setting maxThreadIdleTime to: 5
2482 [main] INFO  org.apache.solr.handler.component.HttpShardHandlerFactory  – Setting sizeOfQueue to: -1
2482 [main] INFO  org.apache.solr.handler.component.HttpShardHandlerFactory  – Setting fair
2761 [main] INFO  org.apache.solr.logging.LogWatcher  – SLF4J impl is org.slf4j.impl.Log4jLoggerFactory
2762 [main] INFO  org.apache.solr.logging.LogWatcher  – Registering Log Listener [Log4j (org.slf4j.impl.Log4jLoggerFactory)]
2764 [main] INFO  org.apache.solr.core.CoreContainer  – Host Name: 
3046 [main] INFO  org.apache.solr.core.CoresLocator  – Looking for core definitions underneath /Users/bhagvan.kommadi/Desktop/solr-4.7.0/langauages/solr
3063 [main] INFO  org.apache.solr.core.CoresLocator  – Found core collection1 in /Users/bhagvan.kommadi/Desktop/solr-4.7.0/langauages/solr/collection1/
3064 [main] INFO  org.apache.solr.core.CoresLocator  – Found 1 core definitions
3069 [coreLoadExecutor-4-thread-1] INFO  org.apache.solr.core.CoreContainer  – Creating SolrCore 'collection1' using instanceDir: /Users/bhagvan.kommadi/Desktop/solr-4.7.0/langauages/solr/collection1
3070 [coreLoadExecutor-4-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  – new SolrResourceLoader for directory: '/Users/bhagvan.kommadi/Desktop/solr-4.7.0/langauages/solr/collection1/'
3113 [coreLoadExecutor-4-thread-1] INFO  org.apache.solr.core.SolrConfig  – Adding specified lib dirs to ClassLoader
3115 [coreLoadExecutor-4-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  – Adding 'file:/Users/bhagvan.kommadi/Desktop/solr-4.7.0/contrib/analysis-extras/lucene-libs/lucene-analyzers-icu-4.7.0.jar' to classloader
3115 [coreLoadExecutor-4-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  – Adding 'file:/Users/bhagvan.kommadi/Desktop/solr-4.7.0/contrib/analysis-extras/lucene-libs/lucene-analyzers-stempel-4.7.0.jar' to classloader
3116 [coreLoadExecutor-4-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  – Adding 'file:/Users/bhagvan.kommadi/Desktop/solr-4.7.0/contrib/analysis-extras/lucene-libs/lucene-analyzers-morfologik-4.7.0.jar' to classloader
3116 [coreLoadExecutor-4-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  – Adding 'file:/Users/bhagvan.kommadi/Desktop/solr-4.7.0/contrib/analysis-extras/lucene-libs/lucene-analyzers-smartcn-4.7.0.jar' to classloader
3118 [coreLoadExecutor-4-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  – Adding 'file:/Users/bhagvan.kommadi/Desktop/solr-4.7.0/contrib/analysis-extras/lib/icu4j-52.1.jar' to classloader
3118 [coreLoadExecutor-4-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  – Adding 'file:/Users/bhagvan.kommadi/Desktop/solr-4.7.0/contrib/analysis-extras/lib/morfologik-fsa-1.7.1.jar' to classloader
3119 [coreLoadExecutor-4-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  – Adding 'file:/Users/bhagvan.kommadi/Desktop/solr-4.7.0/contrib/analysis-extras/lib/morfologik-stemming-1.7.1.jar' to classloader
3119 [coreLoadExecutor-4-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  – Adding 'file:/Users/bhagvan.kommadi/Desktop/solr-4.7.0/contrib/analysis-extras/lib/morfologik-polish-1.7.1.jar' to classloader
3122 [coreLoadExecutor-4-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  – Adding 'file:/Users/bhagvan.kommadi/Desktop/solr-4.7.0/dist/solr-langid-4.7.0.jar' to classloader
3126 [coreLoadExecutor-4-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  – Adding 'file:/Users/bhagvan.kommadi/Desktop/solr-4.7.0/contrib/langid/lib/jsonic-1.2.7.jar' to classloader
3127 [coreLoadExecutor-4-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  – Adding 'file:/Users/bhagvan.kommadi/Desktop/solr-4.7.0/contrib/langid/lib/langdetect-1.1-20120112.jar' to classloader
3196 [coreLoadExecutor-4-thread-1] INFO  org.apache.solr.core.SolrConfig  – Using Lucene MatchVersion: LUCENE_43
3333 [coreLoadExecutor-4-thread-1] INFO  org.apache.solr.core.Config  – Loaded SolrConfig: solrconfig.xml
3341 [coreLoadExecutor-4-thread-1] INFO  org.apache.solr.schema.IndexSchema  – Reading Solr Schema from schema.xml
3358 [coreLoadExecutor-4-thread-1] WARN  org.apache.solr.schema.IndexSchema  – [collection1] schema has no name!
3455 [coreLoadExecutor-4-thread-1] INFO  org.apache.solr.schema.IndexSchema  – unique key field: id
3509 [coreLoadExecutor-4-thread-1] INFO  org.apache.solr.core.SolrCore  – solr.NRTCachingDirectoryFactory
3515 [coreLoadExecutor-4-thread-1] INFO  org.apache.solr.core.SolrCore  – [collection1] Opening new SolrCore at /Users/bhagvan.kommadi/Desktop/solr-4.7.0/langauages/solr/collection1/, dataDir=/Users/bhagvan.kommadi/Desktop/solr-4.7.0/langauages/solr/collection1/data/
3516 [coreLoadExecutor-4-thread-1] INFO  org.apache.solr.core.SolrCore  – JMX monitoring not detected for core: collection1
3537 [coreLoadExecutor-4-thread-1] INFO  org.apache.solr.core.CachingDirectoryFactory  – return new directory for /Users/bhagvan.kommadi/Desktop/solr-4.7.0/langauages/solr/collection1/data
3537 [coreLoadExecutor-4-thread-1] INFO  org.apache.solr.core.SolrCore  – New index directory detected: old=null new=/Users/bhagvan.kommadi/Desktop/solr-4.7.0/langauages/solr/collection1/data/index/
3539 [coreLoadExecutor-4-thread-1] INFO  org.apache.solr.core.CachingDirectoryFactory  – return new directory for /Users/bhagvan.kommadi/Desktop/solr-4.7.0/langauages/solr/collection1/data/index
3650 [coreLoadExecutor-4-thread-1] INFO  org.apache.solr.update.processor.UpdateRequestProcessorChain  – creating updateRequestProcessorChain "languages"
4266 [coreLoadExecutor-4-thread-1] INFO  org.apache.solr.update.processor.UpdateRequestProcessorChain  – inserting DistributedUpdateProcessorFactory into updateRequestProcessorChain "languages"
4266 [coreLoadExecutor-4-thread-1] INFO  org.apache.solr.core.SolrCore  – no updateRequestProcessorChain defined as default, creating implicit default
4275 [coreLoadExecutor-4-thread-1] INFO  org.apache.solr.core.RequestHandlers  – created /select: solr.SearchHandler
4276 [coreLoadExecutor-4-thread-1] INFO  org.apache.solr.core.RequestHandlers  – created /selectEN: solr.SearchHandler
4276 [coreLoadExecutor-4-thread-1] INFO  org.apache.solr.core.RequestHandlers  – created /selectRU: solr.SearchHandler
4279 [coreLoadExecutor-4-thread-1] INFO  org.apache.solr.core.RequestHandlers  – created /update: solr.UpdateRequestHandler
4295 [coreLoadExecutor-4-thread-1] INFO  org.apache.solr.core.RequestHandlers  – created /admin: solr.admin.AdminHandlers
4295 [coreLoadExecutor-4-thread-1] INFO  org.apache.solr.core.RequestHandlers  – adding lazy requestHandler: solr.FieldAnalysisRequestHandler
4296 [coreLoadExecutor-4-thread-1] INFO  org.apache.solr.core.RequestHandlers  – created /analysis/field: solr.FieldAnalysisRequestHandler
4319 [coreLoadExecutor-4-thread-1] INFO  org.apache.solr.handler.loader.XMLLoader  – xsltCacheLifetimeSeconds=60
4334 [coreLoadExecutor-4-thread-1] INFO  org.apache.solr.core.SolrCore  – Hard AutoCommit: disabled
4335 [coreLoadExecutor-4-thread-1] INFO  org.apache.solr.core.SolrCore  – Soft AutoCommit: disabled
4427 [coreLoadExecutor-4-thread-1] INFO  org.apache.solr.core.SolrCore  – SolrDeletionPolicy.onInit: commits: num=1
	commit{dir=NRTCachingDirectory(NIOFSDirectory@/Users/bhagvan.kommadi/Desktop/solr-4.7.0/langauages/solr/collection1/data/index lockFactory=NativeFSLockFactory@/Users/bhagvan.kommadi/Desktop/solr-4.7.0/langauages/solr/collection1/data/index; maxCacheMB=48.0 maxMergeSizeMB=4.0),segFN=segments_2,generation=2}
4427 [coreLoadExecutor-4-thread-1] INFO  org.apache.solr.core.SolrCore  – newest commit generation = 2
4491 [coreLoadExecutor-4-thread-1] INFO  org.apache.solr.search.SolrIndexSearcher  – Opening Searcher@1560571d[collection1] main
4505 [coreLoadExecutor-4-thread-1] INFO  org.apache.solr.core.CoreContainer  – registering core: collection1
4506 [searcherExecutor-5-thread-1] INFO  org.apache.solr.core.SolrCore  – [collection1] Registered new searcher Searcher@1560571d[collection1] main{StandardDirectoryReader(segments_2:3:nrt _0(4.3):C4)}
4508 [main] INFO  org.apache.solr.servlet.SolrDispatchFilter  – user.dir=/Users/bhagvan.kommadi/Desktop/solr-4.7.0/langauages
4508 [main] INFO  org.apache.solr.servlet.SolrDispatchFilter  – SolrDispatchFilter.init() done
4550 [main] INFO  org.eclipse.jetty.server.AbstractConnector  – Started SocketConnector@0.0.0.0:8983
13069 [qtp519821334-17] INFO  org.apache.solr.servlet.SolrDispatchFilter  – [admin] webapp=null path=/admin/cores params={indexInfo=false&wt=json&_=1623443841576} status=0 QTime=35 
13614 [qtp519821334-17] INFO  org.apache.solr.servlet.SolrDispatchFilter  – [admin] webapp=null path=/admin/info/system params={wt=json&_=1623443841897} status=0 QTime=277 
17854 [qtp519821334-17] ERROR org.apache.solr.handler.admin.ShowFileRequestHandler  – Can not find: admin-extra.menu-top.html [/Users/bhagvan.kommadi/Desktop/solr-4.7.0/langauages/solr/collection1/conf/admin-extra.menu-top.html]
17857 [qtp519821334-17] INFO  org.apache.solr.core.SolrCore  – [collection1] webapp=/solr path=/admin/file/ params={file=admin-extra.menu-top.html&contentType=text/html;charset%3Dutf-8&_=1623443845994} status=404 QTime=8 
17875 [qtp519821334-17] ERROR org.apache.solr.handler.admin.ShowFileRequestHandler  – Can not find: admin-extra.menu-bottom.html [/Users/bhagvan.kommadi/Desktop/solr-4.7.0/langauages/solr/collection1/conf/admin-extra.menu-bottom.html]
17875 [qtp519821334-17] INFO  org.apache.solr.core.SolrCore  – [collection1] webapp=/solr path=/admin/file/ params={file=admin-extra.menu-bottom.html&contentType=text/html;charset%3Dutf-8&_=1623443845996} status=404 QTime=0 
18158 [qtp519821334-17] INFO  org.apache.solr.core.SolrCore  – [collection1] webapp=/solr path=/admin/luke params={show=index&numTerms=0&wt=json&_=1623443846611} status=0 QTime=15 
18205 [qtp519821334-12] ERROR org.apache.solr.handler.admin.ShowFileRequestHandler  – Can not find: admin-extra.html [/Users/bhagvan.kommadi/Desktop/solr-4.7.0/langauages/solr/collection1/conf/admin-extra.html]
18206 [qtp519821334-12] INFO  org.apache.solr.core.SolrCore  – [collection1] webapp=/solr path=/admin/file/ params={file=admin-extra.html&_=1623443846645} status=404 QTime=1 
18353 [qtp519821334-17] INFO  org.apache.solr.core.SolrCore  – [collection1] webapp=/solr path=/admin/system params={wt=json&_=1623443846642} status=0 QTime=177 
27181 [qtp519821334-17] INFO  org.apache.solr.core.SolrCore  – [collection1] webapp=/solr path=/select params={q=*:*&indent=true&wt=json&_=1623443855651} hits=4 status=0 QTime=88 
43723 [qtp519821334-20] INFO  org.apache.solr.core.SolrCore  – [collection1] webapp=/solr path=/admin/file params={wt=json&_=1623443872159} status=0 QTime=8 
47709 [qtp519821334-20] INFO  org.apache.solr.core.SolrCore  – [collection1] webapp=/solr path=/admin/file params={wt=json&_=1623443876274} status=0 QTime=1 
47732 [qtp519821334-20] INFO  org.apache.solr.core.SolrCore  – [collection1] webapp=/solr path=/admin/file params={file=stopwords_en.txt&contentType=text/plain;charset%3Dutf-8&_=1623443876281} status=0 QTime=2 
51864 [qtp519821334-20] INFO  org.apache.solr.core.SolrCore  – [collection1] webapp=/solr path=/admin/file params={wt=json&_=1623443880406} status=0 QTime=2 
51891 [qtp519821334-20] INFO  org.apache.solr.core.SolrCore  – [collection1] webapp=/solr path=/admin/file params={file=stopwords_ru.txt&contentType=text/plain;charset%3Dutf-8&_=1623443880408} status=0 QTime=1 
59490 [qtp519821334-20] INFO  org.apache.solr.core.SolrCore  – [collection1] webapp=/solr path=/admin/file params={wt=json&_=1623443888005} status=0 QTime=3 
59498 [qtp519821334-13] INFO  org.apache.solr.core.SolrCore  – [collection1] webapp=/solr path=/admin/file params={file=schema.xml&contentType=text/xml;charset%3Dutf-8&_=1623443888013} status=0 QTime=1 
67797 [qtp519821334-20] INFO  org.apache.solr.core.SolrCore  – [collection1] webapp=/solr path=/admin/luke params={show=schema&wt=json&_=1623443896086} status=0 QTime=33 
71034 [qtp519821334-20] INFO  org.apache.solr.core.SolrCore  – [collection1] webapp=/solr path=/admin/mbeans params={cat=QUERYHANDLER&wt=json&_=1623443899514} status=0 QTime=37 
80459 [qtp519821334-20] INFO  org.apache.solr.core.SolrCore  – [collection1] webapp=/solr path=/admin/file params={wt=json&_=1623443908965} status=0 QTime=1 
89776 [qtp519821334-20] INFO  org.apache.solr.core.SolrCore  – [collection1] webapp=/solr path=/select params={q=*:*&indent=true&wt=csv&_=1623443918270} hits=4 status=0 QTime=10 
111571 [qtp519821334-20] INFO  org.apache.solr.core.SolrCore  – [collection1] webapp=/solr path=/admin/mbeans params={cat=QUERYHANDLER&wt=json&_=1623443939994} status=0 QTime=1 
113453 [qtp519821334-20] INFO  org.apache.solr.core.SolrCore  – [collection1] webapp=/solr path=/admin/luke params={show=schema&wt=json&_=1623443942007} status=0 QTime=3 
115534 [qtp519821334-20] INFO  org.apache.solr.core.SolrCore  – [collection1] webapp=/solr path=/admin/mbeans params={cat=QUERYHANDLER&wt=json&_=1623443944046} status=0 QTime=0 
118591 [qtp519821334-20] INFO  org.apache.solr.core.SolrCore  – [collection1] webapp=/solr path=/admin/file params={wt=json&_=1623443947155} status=0 QTime=3 
120797 [qtp519821334-20] INFO  org.apache.solr.core.SolrCore  – [collection1] webapp=/solr path=/admin/mbeans params={stats=true&wt=json&_=1623443949267} status=0 QTime=55 
133255 [qtp519821334-20] INFO  org.apache.solr.core.SolrCore  – [collection1] webapp=/solr path=/admin/luke params={numTerms=0&wt=json&_=1623443961636} status=0 QTime=87 
133298 [qtp519821334-20] INFO  org.apache.solr.core.SolrCore  – [collection1] webapp=/solr path=/admin/luke params={show=schema&wt=json&_=1623443961860} status=0 QTime=1 
159007 [qtp519821334-20] INFO  org.apache.solr.core.SolrCore  – [collection1] webapp=/solr path=/admin/luke params={show=schema&wt=json&_=1623443987516} status=0 QTime=2 
160607 [qtp519821334-20] INFO  org.apache.solr.core.SolrCore  – [collection1] webapp=/solr path=/admin/mbeans params={cat=QUERYHANDLER&wt=json&_=1623443989105} status=0 QTime=3 
162292 [qtp519821334-20] INFO  org.apache.solr.core.SolrCore  – [collection1] webapp=/solr path=/admin/file params={wt=json&_=1623443990846} status=0 QTime=2 
163490 [qtp519821334-20] INFO  org.apache.solr.core.SolrCore  – [collection1] webapp=/solr path=/admin/mbeans params={stats=true&wt=json&_=1623443992014} status=0 QTime=6 
169511 [qtp519821334-20] INFO  org.apache.solr.core.SolrCore  – [collection1] webapp=/solr path=/admin/luke params={numTerms=0&wt=json&_=1623443997965} status=0 QTime=18 
169579 [qtp519821334-20] INFO  org.apache.solr.core.SolrCore  – [collection1] webapp=/solr path=/admin/luke params={show=schema&wt=json&_=1623443998139} status=0 QTime=0 
184421 [qtp519821334-20] INFO  org.apache.solr.core.SolrCore  – [collection1] webapp=/solr path=/admin/luke params={show=schema&wt=json&_=1623444012932} status=0 QTime=10 
194511 [qtp519821334-20] INFO  org.apache.solr.core.SolrCore  – [collection1] webapp=/solr path=/admin/luke params={show=schema&wt=json&_=1623444023012} status=0 QTime=1 
194657 [qtp519821334-20] INFO  org.apache.solr.core.SolrCore  – [collection1] webapp=/solr path=/analysis/field params={analysis.showmatch=true&analysis.fieldname=addr_from&wt=json&_=1623444023177} status=0 QTime=41 
203383 [qtp519821334-13] ERROR org.apache.solr.handler.admin.ShowFileRequestHandler  – Can not find: admin-extra.html [/Users/bhagvan.kommadi/Desktop/solr-4.7.0/langauages/solr/collection1/conf/admin-extra.html]
203400 [qtp519821334-13] INFO  org.apache.solr.core.SolrCore  – [collection1] webapp=/solr path=/admin/file/ params={file=admin-extra.html&_=1623444031910} status=404 QTime=17 
203400 [qtp519821334-20] INFO  org.apache.solr.core.SolrCore  – [collection1] webapp=/solr path=/admin/luke params={show=index&numTerms=0&wt=json&_=1623444031900} status=0 QTime=38 
203592 [qtp519821334-12] INFO  org.apache.solr.core.SolrCore  – [collection1] webapp=/solr path=/admin/system params={wt=json&_=1623444031908} status=0 QTime=223 
^C211775 [Thread-0] INFO  org.eclipse.jetty.server.Server  – Graceful shutdown SocketConnector@0.0.0.0:8983
211778 [Thread-0] INFO  org.eclipse.jetty.server.Server  – Graceful shutdown o.e.j.w.WebAppContext{/solr,file:/Users/bhagvan.kommadi/Desktop/solr-4.7.0/langauages/solr-webapp/webapp/},/Users/bhagvan.kommadi/Desktop/solr-4.7.0/langauages/webapps/solr.war
212786 [Thread-0] INFO  org.apache.solr.core.CoreContainer  – Shutting down CoreContainer instance=1720339
212788 [Thread-0] INFO  org.apache.solr.core.SolrCore  – [collection1]  CLOSING SolrCore org.apache.solr.core.SolrCore@72c8e7b
212790 [Thread-0] INFO  org.apache.solr.update.UpdateHandler  – closing DirectUpdateHandler2{commits=0,autocommits=0,soft autocommits=0,optimizes=0,rollbacks=0,expungeDeletes=0,docsPending=0,adds=0,deletesById=0,deletesByQuery=0,errors=0,cumulative_adds=0,cumulative_deletesById=0,cumulative_deletesByQuery=0,cumulative_errors=0}
212791 [Thread-0] INFO  org.apache.solr.update.SolrCoreState  – Closing SolrCoreState
212791 [Thread-0] INFO  org.apache.solr.update.DefaultSolrCoreState  – SolrCoreState ref count has reached 0 - closing IndexWriter
212792 [Thread-0] INFO  org.apache.solr.update.DefaultSolrCoreState  – closing IndexWriter with IndexWriterCloser
212813 [Thread-0] INFO  org.apache.solr.core.SolrCore  – [collection1] Closing main searcher on request.
212816 [Thread-0] INFO  org.apache.solr.core.CachingDirectoryFactory  – Closing NRTCachingDirectoryFactory - 2 directories currently being tracked
212819 [Thread-0] INFO  org.apache.solr.core.CachingDirectoryFactory  – looking to close /Users/bhagvan.kommadi/Desktop/solr-4.7.0/langauages/solr/collection1/data/index [CachedDir<>]
212819 [Thread-0] INFO  org.apache.solr.core.CachingDirectoryFactory  – Closing directory: /Users/bhagvan.kommadi/Desktop/solr-4.7.0/langauages/solr/collection1/data/index
212820 [Thread-0] INFO  org.apache.solr.core.CachingDirectoryFactory  – looking to close /Users/bhagvan.kommadi/Desktop/solr-4.7.0/langauages/solr/collection1/data [CachedDir<>]
212820 [Thread-0] INFO  org.apache.solr.core.CachingDirectoryFactory  – Closing directory: /Users/bhagvan.kommadi/Desktop/solr-4.7.0/langauages/solr/collection1/data
212831 [Thread-0] INFO  org.eclipse.jetty.server.handler.ContextHandler  – stopped o.e.j.w.WebAppContext{/solr,file:/Users/bhagvan.kommadi/Desktop/solr-4.7.0/langauages/solr-webapp/webapp/},/Users/bhagvan.kommadi/Desktop/solr-4.7.0/langauages/webapps/solr.war
apples-MacBook-Air:langauages bhagvan.kommadi$

You can launch the browser pointing to http://localhost:8983/solr/#/fieldperlanguage/. You can upload the CSV from the browser by clicking on the Documents and upload the emailvalues.csv as CSV. Similarly, you can upload the emails.xml as XML Document Type. The screenshot below shows the query results.

Solr Field per language - query results
Query results

You can query from the browser for Russian language results using the raw query q=language:"ru". The results are shown below in the screenshot.

Solr Field per language - for russian
Language results for russian

You can also query for English language emails using q = language:"en". The results are shown in the screenshot below:

Language results for english

3. Download the Source Code

Download
You can download the full source code of this example here: Apache Solr Multilingual Search: Field-per-language Example

Bhagvan Kommadi

Bhagvan Kommadi is the Founder of Architect Corner & has around 20 years’ experience in the industry, ranging from large scale enterprise development to helping incubate software product start-ups. He has done Masters in Industrial Systems Engineering at Georgia Institute of Technology (1997) and Bachelors in Aerospace Engineering from Indian Institute of Technology, Madras (1993). He is member of IFX forum,Oracle JCP and participant in Java Community Process. He founded Quantica Computacao, the first quantum computing startup in India. Markets and Markets have positioned Quantica Computacao in ‘Emerging Companies’ section of Quantum Computing quadrants. Bhagvan has engineered and developed simulators and tools in the area of quantum technology using IBM Q, Microsoft Q# and Google QScript. He has reviewed the Manning book titled : "Machine Learning with TensorFlow”. He is also the author of Packt Publishing book - "Hands-On Data Structures and Algorithms with Go".He is member of IFX forum,Oracle JCP and participant in Java Community Process. He is member of the MIT Technology Review Global Panel.
Subscribe
Notify of
guest

This site uses Akismet to reduce spam. Learn how your comment data is processed.

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
Back to top button