This article shows how to use Lucene.Net in an ASP.NET
web application. The sample application builds Lucene index documents
from a SQL Server database and then searches and retrieves data from
those index documents. It also shows how to search for multiple terms
across multiple Lucene.Net document fields.
What is Lucene.net?
Lucene.Net is a high-performance information retrieval (IR)
library, also known as a search engine library. It provides APIs for
creating full-text indexes and for adding advanced, precise search
to programs. Lucene.Net is not a ready-to-use application like a
web search or file search tool; it is a framework library.
Architecture of Lucene.Net
The diagram shows the process of creating indexes, searching, and
retrieving data from Lucene index documents.
Figure 1: Lucene architecture
Main parts of Lucene.Net
Store Directory – The Directory is the place where the index is stored.
Analyzer – The Analyzer is responsible for breaking the text down into single words or terms.
IndexWriter – The IndexWriter coordinates the Analyzer and writes the results to the Directory for storage.
Document – The Document is what gets indexed by the IndexWriter. You can think of a Document as an entity that you want to retrieve.
Field – The Document contains a list of Fields that describe it. Every Field has a name and a value, and each value contains the text that you want to make searchable.
IndexSearcher – The IndexSearcher performs the actual search.
Search Term – A Term is the most basic construct for searching. A Term consists of two parts: the name of the field you wish to search and the value to look for in that field.
Search Query – The Query uses the Term and works with the IndexSearcher to provide the results.
Hits – This represents the list of documents returned by the search. A Hits object can be iterated over and is responsible for getting the documents from the search.
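To make the roles above concrete, here is a minimal toy model of the same pipeline in plain C#. It is not the Lucene.Net API: the ToyIndex class, its method names and the sample people are all hypothetical, and the "analyzer" only lower-cases and splits on a few separators. It is a sketch of the idea of an inverted index, nothing more.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy model only. None of these names exist in Lucene.Net.
public static class ToyIndex
{
    // "Analyzer": break text into lower-cased single-word terms.
    public static IEnumerable<string> Analyze(string text)
    {
        return text.ToLowerInvariant()
                   .Split(new[] { ' ', ',', '.', ';' }, StringSplitOptions.RemoveEmptyEntries);
    }

    // "IndexWriter": map each field:term key to the ids of the documents containing it.
    public static Dictionary<string, SortedSet<int>> Build(IList<Dictionary<string, string>> documents)
    {
        var index = new Dictionary<string, SortedSet<int>>();
        for (int docId = 0; docId < documents.Count; docId++)
        {
            foreach (var field in documents[docId])
            {
                foreach (string term in Analyze(field.Value))
                {
                    string key = field.Key + ":" + term;
                    if (!index.ContainsKey(key))
                    {
                        index[key] = new SortedSet<int>();
                    }
                    index[key].Add(docId);
                }
            }
        }
        return index;
    }

    // "IndexSearcher" with one "Term": look up a field name and value, return the matching ids ("hits").
    public static List<int> Search(Dictionary<string, SortedSet<int>> index, string field, string value)
    {
        SortedSet<int> hits;
        return index.TryGetValue(field + ":" + value.ToLowerInvariant(), out hits)
            ? hits.ToList()
            : new List<int>();
    }
}

public static class ToyIndexDemo
{
    public static void Main()
    {
        // Hypothetical sample documents, each a set of field name/value pairs.
        var documents = new List<Dictionary<string, string>>
        {
            new Dictionary<string, string> { { "FirstName", "John" }, { "DesigName", "Software Engineer" } },
            new Dictionary<string, string> { { "FirstName", "Mary" }, { "DesigName", "Test Engineer" } },
        };

        var index = ToyIndex.Build(documents);
        Console.WriteLine(string.Join(",", ToyIndex.Search(index, "DesigName", "engineer"))); // 0,1
        Console.WriteLine(string.Join(",", ToyIndex.Search(index, "FirstName", "mary")));     // 1
    }
}
```

The point of the sketch is that the search never scans the documents themselves; it only looks up a field/term key that was prepared at indexing time, which is why the analyzer used for indexing and for searching should agree.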
Follow these steps to create the
Lucene.Net index and run a search:
Step: 1 Creating a database
Here I create a database named dbLucene with three
tables: Category, Designation and Person.
Category
Figure 2: Category table
Designation
Figure 3: Designation table
Person
Figure 4: Person table
Then insert a few records into the tables; they are needed to create
the index documents.
Step: 2 Create a web application
Create a web application with two pages, Index.aspx and
Search.aspx.
The index page should contain one button, which creates the
Lucene.Net index documents from the database.
On the search page, add three controls: a text box to enter the
search keyword, a button to trigger the search, and a grid view to display
the search results.
Step: 3 Add Lucene.Net reference
You can download the Lucene.Net DLL from the link below.
Add the reference…
Figure 5: Adding lucene reference to the project
Then import the namespaces in the code:
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;
using System.Diagnostics;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;
using Lucene.Net.Store;
// Aliases avoid clashes with System.IO.Directory and System.Version
using Directory = Lucene.Net.Store.Directory;
using Version = Lucene.Net.Util.Version;
Step: 4 Creating lucene.net index documents
The following code demonstrates how to create the Lucene index.
The index is built when you call the CreatePersonsIndex method:
// Fetches all person details, including designation and category names
public DataSet GetPersons()
{
    string sqlQuery = @"SELECT dbo.Person.FirstName, dbo.Person.LastName, dbo.Designation.DesigName,
                               dbo.Category.CategoryName, dbo.Person.Address
                        FROM dbo.Designation
                        RIGHT OUTER JOIN dbo.Person ON dbo.Designation.DesignationId = dbo.Person.DesignationId
                        LEFT OUTER JOIN dbo.Category ON dbo.Person.CategoryId = dbo.Category.CategoryId";
    return GetDataSet(sqlQuery);
}
// Runs the query and returns the result as a DataSet
public DataSet GetDataSet(string sqlQuery)
{
    DataSet ds = new DataSet();
    // The using blocks dispose the connection, command and adapter even if Fill throws
    using (SqlConnection sqlCon = new SqlConnection("Data Source=datasource;Database=dbLucene;User Id=user;Password=password"))
    using (SqlCommand sqlCmd = new SqlCommand(sqlQuery, sqlCon))
    using (SqlDataAdapter sqlAdap = new SqlDataAdapter(sqlCmd))
    {
        sqlCmd.CommandType = CommandType.Text;
        // Fill opens the connection if needed and closes it again afterwards
        sqlAdap.Fill(ds);
    }
    return ds;
}
// Creates the Lucene.Net index from the person details
public void CreatePersonsIndex(DataSet ds)
{
    // Location where the index files are stored
    string indexFileLocation = @"D:\Lucene.Net\Data\Persons";
    Lucene.Net.Store.Directory dir = Lucene.Net.Store.FSDirectory.GetDirectory(indexFileLocation, true);
    // The final "true" creates a new index, replacing any existing one
    IndexWriter indexWriter = new IndexWriter(dir, new StandardAnalyzer(), true);
    try
    {
        indexWriter.SetRAMBufferSizeMB(10.0);
        indexWriter.SetUseCompoundFile(false);
        indexWriter.SetMaxMergeDocs(10000);
        indexWriter.SetMergeFactor(100);
        // Check the table count first; ds.Tables[0] throws if the DataSet has no tables
        if (ds.Tables.Count > 0)
        {
            DataTable dt = ds.Tables[0];
            foreach (DataRow dr in dt.Rows)
            {
                // One Document per person row
                Document doc = new Document();
                foreach (DataColumn dc in dt.Columns)
                {
                    // One stored, tokenized Field per column: column name and value from the query
                    doc.Add(new Field(dc.ColumnName, dr[dc.ColumnName].ToString(), Field.Store.YES, Field.Index.TOKENIZED));
                }
                // Write the Document to the index
                indexWriter.AddDocument(doc);
            }
        }
    }
    finally
    {
        // Close the writer even if indexing fails, so the index lock is released
        indexWriter.Close();
    }
}
protected void btnCreateIndex_Click(object sender, EventArgs e)
{
    CreatePersonsIndex(GetPersons());
}
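One detail of the indexing loop is worth seeing in isolation: because the query uses outer joins, DesigName or CategoryName can be NULL for some persons, and calling ToString() on such a value yields an empty string. The sketch below reproduces only the row-to-field mapping with an in-memory DataTable; the RowFields helper and the sample row are hypothetical and no Lucene.Net types are involved.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class RowFields
{
    // Turns one DataRow into (column name, text) pairs, the same mapping
    // CreatePersonsIndex applies before adding each pair as a Field.
    public static List<KeyValuePair<string, string>> FromRow(DataRow row)
    {
        var pairs = new List<KeyValuePair<string, string>>();
        foreach (DataColumn column in row.Table.Columns)
        {
            // DBNull.Value.ToString() returns "", so a NULL column becomes an empty field value.
            pairs.Add(new KeyValuePair<string, string>(column.ColumnName, row[column].ToString()));
        }
        return pairs;
    }
}

public static class RowFieldsDemo
{
    public static void Main()
    {
        var table = new DataTable();
        table.Columns.Add("FirstName", typeof(string));
        table.Columns.Add("DesigName", typeof(string));
        table.Rows.Add("John", DBNull.Value); // hypothetical person with no designation

        foreach (var pair in RowFields.FromRow(table.Rows[0]))
        {
            Console.WriteLine(pair.Key + " = '" + pair.Value + "'");
        }
        // FirstName = 'John'
        // DesigName = ''
    }
}
```

So a person without a designation or category is still indexed and can still be found through the other fields.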
The Lucene.Net index documents are now created in the index
file location. If you open the folder,
you can see the index files.
Step: 5 Search and get the hits from the indexed documents
The following code implements the search. When you enter the search
text in the text box and fire the search event, Lucene.Net searches the
indexed documents using the index searcher and returns hits for any
records that match the boolean query.
The search is multi-field: the entered keyword is searched in
several fields of each Lucene index document.
The code below demonstrates this. Copy the methods and call the
SearchPersons method from the search button's click handler.
public void SearchPersons(string searchString)
{
    // Results are collected in a list
    List<SearchResults> searchResults = new List<SearchResults>();
    // Location where the index files are stored
    string indexFileLocation = @"D:\Lucene.Net\Data\Persons";
    Lucene.Net.Store.Directory dir = Lucene.Net.Store.FSDirectory.GetDirectory(indexFileLocation);
    // Fields to search; the keyword is matched against each of them
    string[] searchfields = new string[] { "FirstName", "LastName", "DesigName", "CategoryName" };
    IndexSearcher indexSearcher = new IndexSearcher(dir);
    try
    {
        // Build the boolean query and get the hits
        var hits = indexSearcher.Search(QueryMaker(searchString, searchfields));
        for (int i = 0; i < hits.Length(); i++)
        {
            // Load the stored document once per hit and copy its field values
            Document doc = hits.Doc(i);
            SearchResults result = new SearchResults();
            result.FirstName = doc.Get("FirstName");
            result.LastName = doc.Get("LastName");
            result.DesigName = doc.Get("DesigName");
            result.Address = doc.Get("Address");
            result.CategoryName = doc.Get("CategoryName");
            searchResults.Add(result);
        }
    }
    finally
    {
        // Close the searcher even if the query fails
        indexSearcher.Close();
    }
    GridView1.DataSource = searchResults;
    GridView1.DataBind();
}
// Builds the boolean query
public BooleanQuery QueryMaker(string searchString, string[] searchfields)
{
    var parser = new MultiFieldQueryParser(Lucene.Net.Util.Version.LUCENE_29, searchfields, new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_29));
    var finalQuery = new BooleanQuery();
    // A "+" anywhere in the input makes every term mandatory; otherwise any term may match
    var occur = searchString.Contains("+") ? BooleanClause.Occur.MUST : BooleanClause.Occur.SHOULD;
    // Strip characters that would otherwise be read as query syntax
    string searchText = searchString.Replace("+", "").Replace("\"", "").Replace("'", "");
    // Split the search string into separate search terms by word
    string[] terms = searchText.Split(new[] { " " }, StringSplitOptions.RemoveEmptyEntries);
    foreach (string term in terms)
    {
        string bare = term.Replace("*", "");
        // Skip terms that are empty once "*" is removed; a lone "*" would be a
        // leading wildcard, which the query parser rejects by default
        if (bare.Length == 0)
        {
            continue;
        }
        // The trailing "*" turns each term into a prefix match
        finalQuery.Add(parser.Parse(bare + "*"), occur);
    }
    return finalQuery;
}
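To see what QueryMaker actually hands to the parser, the text preparation can be tried on its own without Lucene.Net. The sketch below mirrors that preparation in a hypothetical QueryTerms.Prepare helper (the name and the out parameter are illustrative, not part of the article's code). It also skips terms that are empty once "*" is removed, which is a small safety assumption of this sketch.

```csharp
using System;
using System.Collections.Generic;

public static class QueryTerms
{
    // Returns the strings that would be passed to parser.Parse, without touching Lucene.Net.
    // allRequired is true when the clauses would be MUST, false when they would be SHOULD.
    public static List<string> Prepare(string searchString, out bool allRequired)
    {
        // A "+" anywhere in the input makes every term mandatory.
        allRequired = searchString.Contains("+");

        string cleaned = searchString.Replace("+", "").Replace("\"", "").Replace("'", "");
        var prepared = new List<string>();
        foreach (string term in cleaned.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
        {
            string bare = term.Replace("*", "");
            if (bare.Length > 0)
            {
                prepared.Add(bare + "*"); // trailing wildcard means prefix match
            }
        }
        return prepared;
    }
}

public static class QueryTermsDemo
{
    public static void Main()
    {
        bool required;
        Console.WriteLine(string.Join(" ", QueryTerms.Prepare("+john smi*", out required))); // john* smi*
        Console.WriteLine(required);                                                         // True
        Console.WriteLine(string.Join(" ", QueryTerms.Prepare("\"mary\" o'neil", out required))); // mary* oneil*
        Console.WriteLine(required);                                                              // False
    }
}
```

Note that a single "+" switches all terms to mandatory, not only the term it is attached to, and that quotes are dropped rather than treated as phrase markers.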
// Holds one row of the search results
public class SearchResults
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string DesigName { get; set; }
    public string Address { get; set; }
    public string CategoryName { get; set; }
}
// Calls the search function on button click
protected void btnSearch_Click(object sender, EventArgs e)
{
    SearchPersons(TextBox1.Text);
}
Summary
Searching data using Lucene.Net gives your application a fast
data retrieval mechanism. Once you have used Lucene.Net, you will
appreciate the features and flexibility it brings to the search process.
Hopefully the above introduction and code samples have helped whet
your appetite to learn more.
Hope this helps,
Sony.
Very clearly explained. Thanks for sharing this!

Thank You Ravinder...

This was very instructive as to how to build indexes directly from MS SQL tables.
However some of the syntax is no longer supported on versions later than LUCENE_29, example:
Lucene.Net.Store.Directory dir = Lucene.Net.Store.FSDirectory.GetDirectory(indexFileLocation);
Causes error (not just deprecated) on LUCENE_30.
Had to figure out alternative in very badly documented environment.
But thanks. It really helped.

wow its working thanks for sharing this

Thanks. It was very helpful.

Very helpful....

Thanks...
please send the output

Thanks for sharing

Will you please share updating lucene.net index documents periodically