Tuesday, October 16, 2007

Intranet and file system search tools on Windows

Recently I have looked into the challenges and requirements of search tools for knowledge management. In my testing, I have been focusing on tools that can run on a Unix box and index several sources of information. Testing of those tools is still ongoing.

Now I have another use for search tools, this time running on a Windows server. The requirements, e.g. which sources to index, are the same as for the Unix tools still being tested.

Using the very good searchtools.com website, I found some interesting tools:
  • Mnogosearch Windows
  • Zoom search engine
  • Apache Solr
  • OmniFind
So far I have set up Mnogosearch for Windows (the MSSQL version) with SQL Express 2005, but I still have to set up the search integration with IIS. I have stalled this test, mainly because of the price: it is so expensive that I could almost get a GSA mini instead. For testing, the trial version's limit of indexing 1 kb of data from each file is okay, but it is just too expensive to put more work into. On top of that, the Windows version seems to be falling behind in releases and does not appear to be actively maintained.

I have not tested Apache Lucene Solr yet. It may be hard for me to test, as it is Java-based and I don't have a ready-to-run test environment for it. From what I have read, Solr should be able to index an intranet, and hopefully shared drives too, but I have to look into it!

OmniFind, like Solr, is based on Lucene, but seems like a better package for me to test. It is free, can index file systems, and sounds too good to be true:
  • Install it in 3 clicks, configure it in minutes.
  • Searches up to 500,000 documents.
  • Search both the enterprise and the Internet from a single interface.
  • Incorporates open source Apache Lucene technology to deliver the best of community innovation with IBM's enterprise features.

I have installed the Zoom search engine on my laptop, indexed a directory with some .doc, .txt, .cmd etc. files, and published the resulting search page on an IIS web server. Simple, and it works! In the free version Zoom will only index static files, with a maximum of 50 documents. This is annoying; I would rather have the full version for, e.g., 30 days. Notes so far:

  • Cheap, $99 for pro, $299 for enterprise use.
  • Very easy setup.
  • Search does not find documents by filename: a word that appears in the filename is not matched!
  • Can reindexing be automated?
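
On the reindexing question, one tool-independent approach would be a small script, run on a schedule, that checks whether anything in the indexed directory has changed and only then starts the indexer. The Python sketch below is just that, a sketch: the function names are my own, and `indexer_command` is a hypothetical placeholder for whatever command line the chosen search tool offers for rebuilding its index (I have not verified what Zoom provides here).

```python
import os
import subprocess


def snapshot(root):
    """Map every file path under root to its (modification time, size)."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file disappeared between listing and stat
            state[path] = (st.st_mtime, st.st_size)
    return state


def needs_reindex(previous, current):
    """True if any file was added, removed, or modified since the last snapshot."""
    return previous != current


def reindex_if_changed(root, previous, indexer_command):
    """Run the indexer only when the directory changed; return the new snapshot.

    indexer_command is a hypothetical argument list, e.g. the search tool's
    own command-line indexer, passed straight to subprocess.run.
    """
    current = snapshot(root)
    if needs_reindex(previous, current):
        subprocess.run(indexer_command, check=True)
    return current
```

The script would keep the returned snapshot between runs (in memory for a long-running process, or saved to a file) and could be started from the Windows task scheduler at whatever interval suits the data.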
