
Apache Solr Search Framework

Introduction:

  • SOLR is commonly expanded as "Searching On Lucene with Replication". Its main functions are indexing and searching.
  • Solr is an open-source enterprise search server (a web application) created by Yonik Seeley in 2004.
  • Version 4.0, released in 2012, added cloud support (SolrCloud).
  • Solr is built on the Lucene search library and extends it.
  • Solr exposes Lucene's Java APIs as RESTful services.

Terminology:


Solr Instance:
In Apache Solr, a Solr Instance is an instance of Solr running in a JVM. In standalone mode, there is only one Solr instance.

Solr Core:
In Apache Solr, a Solr Core is also known simply as a “Core”. In other words, a Solr Core = an instance of an Apache Lucene index + its Solr configuration.

Indexing:
In Apache Lucene or Solr, indexing is the process of adding a document's content to the Solr index so that it can be searched easily. Apache Solr uses Apache Lucene's inverted index technique to index its documents: instead of scanning every document at query time, it keeps a map from each term to the documents that contain it. That is why Solr can search very quickly.
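
To make the inverted index idea concrete, here is a minimal toy sketch in Python. It is not how Lucene is actually implemented (Lucene adds analysis, scoring, compressed on-disk segments and much more); the function names and sample texts are made up for illustration.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercased term to the sorted ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Naive analysis step: lowercase and split on whitespace.
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def search(index, term):
    """Look a single term up in the index; no document scanning is needed."""
    return index.get(term.lower(), [])


# Hypothetical sample documents, keyed by id.
docs = {
    "1": "Solr extends Lucene",
    "2": "Lucene is a search library",
    "3": "Solr is a search server",
}
index = build_inverted_index(docs)
print(search(index, "lucene"))
print(search(index, "server"))
```

A lookup is a single dictionary access, whatever the number of documents, which is the property that makes term search fast.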

Document:
Solr's basic unit of information is a "document", which is a set of data that describes something.
Documents are composed of fields, which are more specific pieces of information.
Fields can contain different kinds of data. For example, a name field contains text.
<doc>
    <field name="id">12345</field>
    <field name="name">Perficient, Nagpur</field>
    <field name="org">Perficient India Private Limited</field>
</doc>

Field:
In Apache Solr, a Field is the actual data stored in a Document. It is a key-value pair: the key is the field name and the value is that field's data.
One Document can contain one or more Fields. Apache Solr uses this Field data to index the Document's content.
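
As a rough sketch of the document/field relationship, the Python below treats a document as a dictionary of field names to values and serializes it into Solr's XML update format, where the <doc> element is wrapped in an <add> element. The helper name to_solr_add_xml is made up for this example; the field values come from the sample document above.

```python
import xml.etree.ElementTree as ET


def to_solr_add_xml(fields):
    """Serialize one document (a dict of field name -> value) as <add><doc>...</doc></add>."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in fields.items():
        # Each key-value pair becomes one <field name="..."> element.
        field = ET.SubElement(doc, "field", name=name)
        field.text = str(value)
    return ET.tostring(add, encoding="unicode")


xml_payload = to_solr_add_xml({
    "id": "12345",
    "name": "Perficient, Nagpur",
    "org": "Perficient India Private Limited",
})
print(xml_payload)
```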
   

Why Apache Solr?

  • Advanced full-text search capabilities.
  • Optimized for high-volume web traffic.
  • Standards-based open interfaces: XML, JSON, CSV, etc.
  • Suggestions (autocomplete).
  • "More Like This" support for finding similar documents.
  • Pagination support.

What to Do with Solr?

  • Put documents into Solr as input data via XML, JSON or CSV.
  • Query it via an HTTP GET request.
  • Get the matching index data back in XML, JSON or CSV format.
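
The query step can be sketched as building a GET URL for the /select handler. This assumes a local Solr on its default port 8983 and a core named hellosolr (the name used later in this post); build_select_url is a hypothetical helper, while q, start, rows and wt are standard Solr query parameters.

```python
from urllib.parse import urlencode


def build_select_url(base_url, core, query, start=0, rows=10, wt="json"):
    """Build a GET URL for the /select handler of the given core."""
    params = {
        "q": query,      # the search query, e.g. field:value
        "start": start,  # offset of the first result (used for pagination)
        "rows": rows,    # number of results per page
        "wt": wt,        # response format: json, xml or csv
    }
    return f"{base_url}/{core}/select?{urlencode(params)}"


# Assumed local installation and core name.
url = build_select_url("http://localhost:8983/solr", "hellosolr", "name:Perficient")
print(url)
```

Opening the printed URL in a browser, or fetching it with any HTTP client, returns the matching documents once the core contains data. Increasing start by rows gives the next page of results.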

Request Handler:

  1. /admin
  2. /select
  3. /spell

Key difference between lucene and Solr:

       Lucene is a Java library (API) for creating search indexes and for reading and writing the documents in them.
       To use this functionality, an application has to call the library directly.
       Solr provides RESTful web services on top of the library, so clients can interact with it over HTTP.

Solr Configuration:

schema.xml
It is usually the first file you configure when setting up a new Solr installation.
The schema declares:
           what kinds of fields there are
           which field should be used as the unique/primary key
           which fields are required
           how to index and search each field
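
To show what those declarations look like, the sketch below embeds a heavily trimmed schema fragment in Python and reads the field names, required fields and unique key back out of it. The fragment is an illustrative assumption: a real schema.xml also needs the fieldType definitions for string and text_general, so this would not load in Solr as-is.

```python
import xml.etree.ElementTree as ET

# Trimmed, illustrative schema fragment (fieldType definitions omitted).
SCHEMA_XML = """
<schema name="example" version="1.6">
  <field name="id" type="string" indexed="true" stored="true" required="true"/>
  <field name="name" type="text_general" indexed="true" stored="true"/>
  <field name="org" type="text_general" indexed="true" stored="true"/>
  <uniqueKey>id</uniqueKey>
</schema>
"""


def summarize_schema(schema_xml):
    """Return the fields, the required fields and the unique key declared in a schema."""
    root = ET.fromstring(schema_xml)
    fields = {f.get("name"): f.get("type") for f in root.findall("field")}
    required = [f.get("name") for f in root.findall("field") if f.get("required") == "true"]
    return {
        "fields": fields,
        "required": required,
        "unique_key": root.findtext("uniqueKey"),
    }


summary = summarize_schema(SCHEMA_XML)
print(summary)
```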
solrconfig.xml
It is usually the second file you configure when setting up a new Solr installation, after schema.xml.
Used for configuring indexing and search components.
The more commonly-used elements in solrconfig.xml are:
           data directory location
           cache parameters
           request handlers
           search components
core.properties
Used for defining properties at core level.
       The core.properties file is a simple Java properties file where each line is just a key=value pair, e.g., name=core1. Notice that no quotes are required.
       A minimal core.properties file looks like the example below. However, it can also be empty.
       name=my_core_name
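
Because the format is just key=value lines, it is easy to read. The Python below is a simplified sketch: parse_core_properties is a made-up helper and it deliberately ignores parts of the full Java properties syntax (escapes, line continuations, ":" separators).

```python
def parse_core_properties(text):
    """Parse simple key=value lines, skipping blank lines and comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        # Java properties files use '#' or '!' for comment lines.
        if not line or line.startswith(("#", "!")):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


print(parse_core_properties("name=my_core_name\n"))
print(parse_core_properties(""))  # an empty file is also valid
```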
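solr.xml
       Used for configuration that applies to the whole Solr instance rather than to a single core. Field types and similar index structure are defined in the schema, not here.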
       You can find solr.xml in your $SOLR_HOME directory (usually server/solr) in standalone mode.


Implementation of Apache Solr:

  • Create a core with the command “solr create -c <core-name>”.
  • The new core does not have any data to index yet.
  • So we need to supply input data in a format such as XML or JSON. Here I am using XML.
  • Navigate to "F:\Installers\solr-7.3.1\example\exampledocs".
  • Run the command java -Dc=hellosolr -jar post.jar *.xml (here the core is named hellosolr).
  • Retrieve the indexed data by querying it through the Admin UI.
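
The indexing step can also be done without post.jar by sending the XML to the core's /update handler over HTTP. The sketch below only builds the request, so it runs without a server. It assumes a local Solr on port 8983 and the hellosolr core; build_update_request is a hypothetical helper, and the document is the sample from earlier in this post.

```python
import urllib.request


def build_update_request(base_url, core, xml_payload, commit=True):
    """Build (but do not send) a POST request for the core's /update handler."""
    url = f"{base_url}/{core}/update"
    if commit:
        # commit=true makes the new documents visible to searches right away.
        url += "?commit=true"
    return urllib.request.Request(
        url,
        data=xml_payload.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


payload = '<add><doc><field name="id">12345</field></doc></add>'
request = build_update_request("http://localhost:8983/solr", "hellosolr", payload)
print(request.get_method(), request.full_url)

# To actually index the document against a running Solr, send the request:
# urllib.request.urlopen(request)
```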


Thank You…!!!
