Adding full text indexing to your app is a hOOt!

Search. Everyone uses it all day, every day. The web has had it forever in Net years, and now we're growing to expect it just about everywhere else.

So you're a dev and you want to build full text indexing and searching into your app. But you don't want to build in a dependency on something that has to be locally installed and configured, like Windows Search. You've searched the web and found some other libraries, but they seem a little much for what you need.

Plus, being the coder you are, you'd also like to understand just how full text indexing and searching works (when you have a few spare cycles to dig into that, anyway). So you'd like an article with some details, a sample app, an engine you could embed in your app, and the source to it all.

Wouldn't all that just be a...

hOOt - full text search engine

hOOt is an extremely small and fast embedded full text search engine for .NET, built from scratch using an inverted WAH bitmap index. Most people are familiar with an Apache project by the name of Lucene.net, which is a port of the original Java version. Many people have complained in the past that the .NET version of Lucene is not maintained, and many unsupported ports of the original exist. To get around this I have created this project, which does the same job and is smaller, simpler and faster. hOOt is part of my upcoming RaptorDB document store database, and was so successful that I decided to release it as a separate entity in the meantime.
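
To make "inverted bitmap index" a little more concrete, here is a minimal sketch of the idea in Python. It is not hOOt's code or API: the class and method names are made up for illustration, and each bitmap is a plain, uncompressed Python integer, where hOOt applies WAH (Word-Aligned Hybrid) compression to its bitmaps. Each word maps to a bitmap in which bit N is set when document N contains that word, so combining words becomes a bitwise operation.

```python
import re
from collections import defaultdict


class BitmapInvertedIndex:
    """Toy inverted index: each word maps to a bitmap of document numbers.

    Hypothetical illustration only; the bitmaps here are uncompressed ints.
    """

    def __init__(self):
        self.docs = []                    # document number -> original text
        self.bitmaps = defaultdict(int)   # word -> int used as a bitmap

    def add(self, text):
        """Index one document and return its document number."""
        doc_id = len(self.docs)
        self.docs.append(text)
        for word in set(re.findall(r"\w+", text.lower())):
            self.bitmaps[word] |= 1 << doc_id   # set bit doc_id for this word
        return doc_id

    def bitmap(self, word):
        """Bitmap of documents containing word (0 if the word is unknown)."""
        return self.bitmaps.get(word.lower(), 0)

    @staticmethod
    def doc_ids(bitmap):
        """Expand a bitmap into a sorted list of document numbers."""
        return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


index = BitmapInvertedIndex()
index.add("hOOt is a small embedded search engine")   # document 0
index.add("Lucene is a larger search library")        # document 1
index.add("RaptorDB is a document store")             # document 2

both = index.bitmap("search") & index.bitmap("engine")   # AND of two bitmaps
either = index.bitmap("hoot") | index.bitmap("lucene")   # OR of two bitmaps
print(index.doc_ids(both), index.doc_ids(either))        # [0] [0, 1]
```

Because a query reduces to a handful of bitwise operations over whole bitmaps, this layout stays fast as the document count grows, and long runs of identical bits are what make the bitmaps good candidates for compression.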

hOOt uses the following articles :

Based on the response and reaction of users to this project, I will upgrade and enhance hOOt to full feature compatibility with lucene.net, so show your love.

...

Why Another Full Text Indexer?

I have always been fascinated by how Google searches in general, and by Lucene's indexing technique and internal algorithms, but the code was just too difficult to follow, and anyone who has worked with lucene.net will attest that it is a complicated and convoluted piece of code. While some people are trying to create a more .NET-optimized version, the fact of the matter is that it is not easy to do with that code base. What amazes me is that nobody has rewritten it from scratch. hOOt is much simpler, smaller and faster than lucene.net.

One of the reasons for creating hOOt was to implement full text search on string columns in RaptorDB - the document store version. Hopefully more people will be able to use and extend hOOt instead of lucene.net, as it is much easier to understand and change.

Features

hOOt has been built with the following features in mind:

  • Blazing fast operating speed (see performance test section)
  • Incredibly small code size.
  • Uses WAH compressed BitArrays to store information.
  • Multi-threaded implementation meaning you can query while indexing.
  • Tiny size: only a 38kb DLL (lucene.net is ~300kb).
  • Highly optimized storage, typically ~60% smaller than lucene.net (the more in the index the greater the difference).
  • Query strings are parsed on spaces with the AND operator (i.e. all words must exist).
  • Wildcard characters are supported (*,?) in queries.
  • OR operations are done by default (like lucene).
  • AND operations require a (+) prefix (like lucene).
  • NOT operations require a (-) prefix (like lucene).
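
As a rough illustration of the query rules in the list, here is a small Python sketch. It is not hOOt's parser: the `search` function, the sample index and the exact semantics are assumptions made for this example. The list describes both AND-on-spaces and OR-by-default; the sketch takes the Lucene-style reading, where plain terms are ORed together, `+term` must match, `-term` must not match, and `*` and `?` act as wildcards. How hOOt or Lucene actually combine optional and required terms may differ.

```python
from fnmatch import fnmatchcase


def search(query, index, all_docs):
    """Evaluate a query against index, a dict of word -> set of document numbers.

    Assumed semantics (illustrative only): plain terms are ORed, "+term"
    must match, "-term" must not match, and * / ? are wildcards in a term.
    """
    optional, required, excluded = [], [], []
    for token in query.lower().split():
        if token.startswith("+"):
            required.append(token[1:])
        elif token.startswith("-"):
            excluded.append(token[1:])
        else:
            optional.append(token)

    def docs_for(pattern):
        # Union the documents of every indexed word the pattern matches.
        hits = set()
        for word, docs in index.items():
            if fnmatchcase(word, pattern):
                hits |= docs
        return hits

    if optional:
        result = set()
        for term in optional:
            result |= docs_for(term)      # OR by default
    else:
        result = set(all_docs)            # only +/- terms: start from everything
    for term in required:
        result &= docs_for(term)          # + prefix narrows the result (AND)
    for term in excluded:
        result -= docs_for(term)          # - prefix removes matches (NOT)
    return sorted(result)


# Hypothetical sample data: word -> documents containing it.
index = {
    "hoot": {0, 2},
    "lucene": {1, 2},
    "search": {0, 1},
    "searching": {2},
}
all_docs = {0, 1, 2}

print(search("hoot lucene", index, all_docs))       # [0, 1, 2]
print(search("+hoot +lucene", index, all_docs))     # [2]
print(search("search* -lucene", index, all_docs))   # [0]
print(search("h??t", index, all_docs))              # [0, 2]
```

Sets keep the example short; in an engine built on bitmaps, the same three steps would map onto bitwise OR, AND and AND-NOT.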

...

The article continues on and covers just how to use hOOt in your app.

And best of all, the article goes into some depth on just how it works, how the indexing goes, how the results are saved and searched.

So you'd expect this to be some kind of code beast, right? A massively sized project?

Nope.

The zip with the source and sample app is 57k.

Here's the Solution (as in, this is it):

[Screenshot: the hOOt Solution]

The project works just as expected. Being meta, I used hOOt to index hOOt...

[Screenshot: hOOt indexing hOOt]

So if you've ever thought that your app would benefit from full text indexing and searching, but the existing libraries put you off and you didn't want to take a dependency on a third party solution, then hOOt could be the thing you've been hoping for...
