Friday 8 June 2007

An introduction to indexing

I'm writing this post because I couldn't find a solid guide to using the Windows Indexing Service with ASP.NET, so I've pulled together a bit of a guide of my own in the hope it helps someone else.

Setting up

Setting up the indexing service itself is fairly simple.

Follow this guide:

http://www.windowsnetworking.com/articles_tutorials/Making-Windows-Server-2003-Indexing-Service-Useful.html

That guide only covers creating a query form in classic ASP.

To create one with .NET, see the sample code I explain later in this post.

Indexing PDFs

If you're looking to index PDFs as well, download and install Adobe's PDF IFilter on your server:

http://www.adobe.com/support/downloads/detail.jsp?ftpID=2611

It works automatically once you stop and restart the indexing service. Indexing the PDFs does take a while, so be patient.

.NET Query Form

Basic implementation of .NET query forms is explained in the links below:

http://www.codeproject.com/aspnet/search.asp

and

http://idunno.org/articles/278.aspx

The .NET implementation of the form itself is dead easy. The problem is that most search forms need to offer both "Any phrase" and "Exact match" as options.

The exact match is the tricky part. The query for each option is shown below, where strSearch holds the text the user typed:

'Any Words
'strQuery is an illustrative variable name for the finished query text
strQuery = "Select DocTitle,Filename,Size,PATH,URL, Rank, Characterization, Write from SCOPE('deep traversal of ""/documents""') where FREETEXT('" & strSearch & "')"

'Search Exact Phrase
'Wrap the phrase in double quotes ("""" is a string holding one double quote)
strSearch = """" & strSearch & """"
strQuery = "Select DocTitle,Filename,Size,PATH,URL, Rank, Characterization, Write from SCOPE('deep traversal of ""/documents""') where contains('" & strSearch & "')"

The search phrase is wrapped in double quotes because the indexing service only treats the words inside contains as one exact phrase when they are quoted. That took me quite a while to work out!
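
To keep the two variants in one place, the quoting can be wrapped in a small helper. This is only a sketch: BuildQuery and IndexQueryBuilder are names I've made up for illustration, and the two clean-up steps (doubling single quotes and dropping double quotes from the user's text) are my assumptions about what keeps the string literal intact, not something I've checked against the indexing service documentation.

```vbnet
Imports System

Module IndexQueryBuilder

    ' Column list and scope copied from the queries above.
    Private Const SelectClause As String = _
        "Select DocTitle,Filename,Size,PATH,URL, Rank, Characterization, Write " & _
        "from SCOPE('deep traversal of ""/documents""') where "

    ' Hypothetical helper: builds the query text for either search option.
    Function BuildQuery(ByVal searchText As String, ByVal exactPhrase As Boolean) As String
        ' Assumption: doubling single quotes stops the input closing the SQL string early.
        Dim cleaned As String = searchText.Trim().Replace("'", "''")

        If exactPhrase Then
            ' Assumption: double quotes typed by the user are dropped so they
            ' cannot end the quoted phrase early.
            cleaned = cleaned.Replace("""", "")
            Return SelectClause & "contains('""" & cleaned & """')"
        End If

        Return SelectClause & "FREETEXT('" & cleaned & "')"
    End Function

    Sub Main()
        Console.WriteLine(BuildQuery("annual report", False))
        Console.WriteLine(BuildQuery("annual report", True))
    End Sub

End Module
```

With the search text "annual report", the first call ends in FREETEXT('annual report') and the second ends in contains('"annual report"'), which is the double-quoted form described above.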

The standard contains predicate seems to work for me, although I may come back and correct this if it turns out not to do quite the job I expected.
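
For completeness, here is a rough sketch of sending one of the query strings above to the indexing service through its OLE DB provider (MSIDXS) and getting the results back as a DataTable. RunQuery and IndexQueryRunner are made-up names, and "Web" is just a placeholder catalog name, so swap in whichever catalog you created when setting up. The linked articles cover binding the results to a page.

```vbnet
Imports System
Imports System.Data
Imports System.Data.OleDb

Module IndexQueryRunner

    ' Assumption: "Web" is a placeholder; use the name of your own catalog.
    Private Const ConnectionText As String = "Provider=MSIDXS;Data Source=Web"

    ' Hypothetical helper: runs a finished query string and returns the matching rows.
    Function RunQuery(ByVal strQuery As String) As DataTable
        Dim results As New DataTable()

        Using connection As New OleDbConnection(ConnectionText)
            Using adapter As New OleDbDataAdapter(strQuery, connection)
                ' Fill opens the connection, reads the rows and closes it again.
                adapter.Fill(results)
            End Using
        End Using

        Return results
    End Function

End Module
```

The returned table has one column per name in the Select list (DocTitle, Filename and so on), so it can be bound to a grid or repeater on the results page.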

If anybody has any questions on this, leave a comment with your email address and I'll get back to you.