
Apache Solr Search Framework

Introduction:

  • SOLR stands for "Searching On Lucene with Replication". Its main functions are indexing and searching.
  • Solr is an open source enterprise search server (a web application) created by Yonik Seeley in 2004.
  • Version 4.0, released in 2012, added cloud support (SolrCloud).
  • Solr uses the Lucene search library and extends it.
  • Solr exposes Lucene's Java APIs as RESTful services.

Terminology:


Solr Instance:
In Apache Solr, a Solr Instance is an instance of Solr running in a JVM. In standalone mode, Solr runs as a single instance.

Solr Core:
In Apache Solr, a Solr Core is also known simply as a “Core”. A Solr Core = an instance of an Apache Lucene index + its Solr configuration.

Indexing:
In Apache Lucene and Solr, indexing is the process of adding a Document’s content to the index so that it can be searched easily. Apache Solr uses Apache Lucene’s inverted index technique to index its documents, which is why Solr can search so quickly.
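
To illustrate the idea of an inverted index, here is a minimal sketch in Python. It is a toy model only, not how Lucene is actually implemented: the function names and the whitespace-based tokenizing are assumptions made for illustration.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, term):
    """Return the sorted ids of the documents that contain the term."""
    return sorted(index.get(term.lower(), set()))


# Two tiny example documents (hypothetical data).
docs = {
    "1": "Solr extends Lucene",
    "2": "Lucene is a search library",
}
index = build_inverted_index(docs)
print(search(index, "lucene"))  # ['1', '2']
print(search(index, "solr"))    # ['1']
```

Because the index is keyed by term, a lookup does not have to scan every document, which is the reason this structure makes searching fast.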

Document:
Solr's basic unit of information is a "document", which is a set of data that describes something.
Documents are composed of fields, which are more specific pieces of information.
Fields can contain different kinds of data. A name field, for example, contains text.
<doc>
    <field name="id">12345</field>
    <field name="name">Perficient, Nagpur</field>
    <field name="org">Perficient India Private Limited</field>
</doc>

Field:
In Apache Solr, a Field is the actual data stored in a Document. It is a key-value pair: the key is the field name and the value is that field's data.
One Document can contain one or more Fields. Apache Solr uses this Field data to index the Document's content.
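
Since a Document is just a set of name/value Fields, the XML document shown earlier can be generated from a plain dictionary. This is a minimal sketch using Python's standard library; the helper name build_doc is hypothetical and not part of Solr.

```python
import xml.etree.ElementTree as ET


def build_doc(fields):
    """Build a <doc> element from a dict of field name -> field value."""
    doc = ET.Element("doc")
    for name, value in fields.items():
        field = ET.SubElement(doc, "field", {"name": name})
        field.text = str(value)
    return doc


doc = build_doc({
    "id": "12345",
    "name": "Perficient, Nagpur",
    "org": "Perficient India Private Limited",
})
xml_text = ET.tostring(doc, encoding="unicode")
print(xml_text)
```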
   

Why Apache Solr?

  • Advanced full-text search capabilities.
  • Optimized for high-volume web traffic.
  • Standards-based open interfaces: XML, JSON, CSV, etc.
  • Suggestions (autocomplete).
  • Support for "More Like This" (finding similar documents).
  • Support for pagination.

What to Do with Solr?

       Put documents into Solr as input data, via XML, JSON or CSV.
       Query it via an HTTP GET request.
       Get the results back in XML, JSON or CSV format.
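
As a rough sketch of the query step, the snippet below builds the URL for an HTTP GET request to the /select handler. The host and port (localhost:8983, Solr's default), the core name hellosolr and the helper name build_select_url are assumptions for illustration; q, wt, start and rows are standard Solr query parameters, and start/rows are what pagination uses.

```python
from urllib.parse import urlencode


def build_select_url(base_url, core, query, fmt="json", start=0, rows=10):
    """Build a /select URL; 'fmt' selects the response format (json, xml, csv)."""
    params = {"q": query, "wt": fmt, "start": start, "rows": rows}
    return f"{base_url}/solr/{core}/select?{urlencode(params)}"


# First page of results for documents whose name field matches "Perficient".
url = build_select_url("http://localhost:8983", "hellosolr", "name:Perficient")
print(url)

# Second page: skip the first 10 results.
next_page = build_select_url(
    "http://localhost:8983", "hellosolr", "name:Perficient", start=10
)
print(next_page)
```

Opening the printed URL in a browser, or fetching it with any HTTP client, returns the matching documents from a running Solr server.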

Request Handler:

  1. /admin
  2. /select
  3. /spell

Key difference between Lucene and Solr:

       Lucene is a library whose API is used to create search indexes and to read and write the documents in them.
       To use this functionality, an application has to call the library directly.
       Solr provides RESTful web services for interacting with that library.

Solr Configuration:

schema.xml
It is usually the first file you configure when setting up a new Solr installation.
The schema declares:
           what kinds of fields there are
           which field should be used as the unique/primary key
           which fields are required
           how to index and search each field
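
To make these four points concrete, here is a small sketch that embeds a schema fragment in a Python string and reads it back. The fragment is illustrative only, not a complete schema: the field names come from the document example above, and the field types string and text_general are assumed to be defined elsewhere in the schema, as they are in Solr's default configsets.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment; a real schema.xml also contains <fieldType> definitions.
SCHEMA_FRAGMENT = """
<schema name="example" version="1.6">
  <field name="id" type="string" indexed="true" stored="true" required="true"/>
  <field name="name" type="text_general" indexed="true" stored="true"/>
  <field name="org" type="text_general" indexed="true" stored="true"/>
  <uniqueKey>id</uniqueKey>
</schema>
"""

root = ET.fromstring(SCHEMA_FRAGMENT.strip())
fields = root.findall("field")

unique_key = root.findtext("uniqueKey")                                 # primary key field
required = [f.get("name") for f in fields if f.get("required") == "true"]
field_types = {f.get("name"): f.get("type") for f in fields}           # how each field is indexed

print(unique_key)
print(required)
print(field_types)
```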
solrconfig.xml
It is usually the second file you configure when setting up a new Solr installation, after schema.xml.
Used for configuring indexing and search components.
The more commonly-used elements in solrconfig.xml are:
           data directory location
           cache parameters
           request handlers
           search components
core.properties
Used for defining properties at core level.
       The core.properties file is a simple Java Properties file where each line is just a key=value pair, e.g., name=core1. Notice that no quotes are required.
       A minimal core.properties file looks like the example below. However, it can also be empty, in which case the core name defaults to the name of the directory that contains the file.
       name=my_core_name
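
The key=value format is simple enough to read with a few lines of Python. This is a simplified sketch, not a full Java Properties parser (it ignores escapes, ":" separators and line continuations), and the function name is hypothetical.

```python
def parse_core_properties(text):
    """Parse simple key=value lines, skipping blank lines and comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


print(parse_core_properties("name=my_core_name\n"))  # {'name': 'my_core_name'}
print(parse_core_properties(""))                     # {}
```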
solr.xml
       Used for configuration at the level of the whole Solr instance, i.e. settings that apply to all cores. Field types are defined in the schema, not here.
       You can find solr.xml in your $SOLR_HOME directory (usually server/solr) in standalone mode.


Implementation of Apache Solr:

  • Create a core using the command “solr create -c <core-name>”.

  • The new core does not contain any data yet.
  • So, we need to give it data as input in the form of XML, JSON, etc. I am giving it XML.
  • Navigate to "F:\Installers\solr-7.3.1\example\exampledocs".
  • Type java -Dc=hellosolr -jar post.jar *.xml and press Enter.
  • Query the indexed data using the Admin UI.
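
As an alternative to post.jar, documents can also be sent to the core's /update handler over HTTP. The sketch below only builds the request and does not send it. It assumes Solr is running on localhost:8983 with a core named hellosolr (the name used in the command above), and the helper name build_update_request is hypothetical.

```python
import urllib.request


def build_update_request(base_url, core, docs_xml):
    """Build a POST request that adds the given <doc> elements and commits."""
    url = f"{base_url}/solr/{core}/update?commit=true"
    body = f"<add>{docs_xml}</add>".encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/xml"},
        method="POST",
    )


request = build_update_request(
    "http://localhost:8983",
    "hellosolr",
    '<doc><field name="id">12345</field></doc>',
)
print(request.get_method(), request.full_url)

# To actually send it to a running Solr server:
# urllib.request.urlopen(request)
```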


Thank You…!!!
