When looking for
Windows search util solutions, I stumbled upon
OmniFind, which seemed too good to be true:
- Install it in 3 clicks, configure it in minutes.
- Free; searches up to 500,000 documents.
- Search both the enterprise and the Internet from a single interface.
- Incorporates open source Apache Lucene technology to deliver the best of community innovation with IBM's enterprise features.
But OmniFind really lived up to that! Downloading, installing, configuring, and test-indexing a website and a filesystem location: all done in 15 minutes!
The server OS requirements are not my favorites, but for the enterprise they make sense, and they are what you would expect from IBM. Their favorites are, of course, Red Hat and SUSE. Too bad for me: my favorite Linux is Debian, and of course I always vouch for FreeBSD.
The supported platforms are:

- 32-bit Red Hat Enterprise Linux Version 4, Update 3
- 32-bit SUSE Linux Enterprise 10
- 32-bit Windows XP SP2
- 32-bit Windows 2003 Server SP1

Some notes from the testing so far:
Indexing filesystems with .doc and .xls files works like a charm, and
the search results can be browsed "as html" and "cached". Very useful!
OmniFind installs as its own
web service, on a port of your choice. I changed the
search page appearance with a company logo and disabled all the Yahoo links. All very simple from the OmniFind admin control panel!
To search for a string inside any word, you should add a
wildcard. For example, search for "regression*" to make sure you locate occurrences of "regressions".
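The effect of the trailing wildcard can be sketched with Python's fnmatch module, used here only to mimic the matching behavior (OmniFind itself searches with Lucene, not fnmatch):

```python
import fnmatch

docs = ["regression", "regressions", "regressive"]

# Without the wildcard, only the exact token matches.
exact = [d for d in docs if fnmatch.fnmatch(d, "regression")]
# With a trailing wildcard, "regressions" is found as well.
wild = [d for d in docs if fnmatch.fnmatch(d, "regression*")]

print(exact)  # ['regression']
print(wild)   # ['regression', 'regressions']
```

Note that "regressive" is matched by neither pattern, since it diverges from the "regression" prefix.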
Reindexing seems to be something you have to wrap in your own scripts and schedule yourself, e.g. with at jobs.
Crawler management scripts let you start and stop a crawler from the command line, or schedule those start and stop actions.
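As a sketch of such scheduling, a pair of cron entries could drive a nightly re-crawl. The script names and paths below are placeholders of my own, not OmniFind's actual crawler management scripts:

```shell
# Hypothetical cron entries; script names and paths are assumptions.
# Start the filesystem crawler every night at 02:00,
# and stop it again before business hours.
0 2 * * * /opt/omnifind/bin/startCrawler.sh --crawler filesystem
0 6 * * * /opt/omnifind/bin/stopCrawler.sh  --crawler filesystem
```

The same idea works with one-off at jobs if you only need an occasional re-crawl rather than a recurring schedule.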
Cleaning the index of documents that should not be crawled is not so friendly. It seems you have to delete the entire source, e.g. a website, and then crawl it again, which can be tiresome for a big website.
The
language pack should be installed before you start crawling your big sources, as you will have to crawl them all over again once the language pack has been installed.
Crawling
protected websites was possible: I tested an https:// site protected by
basic authentication, and it worked fine. Crawling behind
form-based authentication, as in a company portal document handling system, should also be possible:
HTML form-based authentication:

- Form name (optional). Example: loginPage
- Form action. Example: http://www.example.org/authentication/login.do
- HTTP method: POST or GET. Example: POST
- Form parameters (optional). Example: userid and myuserID

So far, I am very pleased with OmniFind, and I recommend everyone give it a try.
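To illustrate what a form-based login looks like from a crawler's side, here is a minimal Python sketch. The URL and parameter names are taken from the examples above and are placeholders, not a real endpoint; this is not OmniFind's own code:

```python
from urllib.parse import urlencode

# Hypothetical form-login settings, mirroring the fields above.
form_action = "http://www.example.org/authentication/login.do"
http_method = "POST"
form_params = {"userid": "myuserID", "password": "secret"}  # placeholder values

# A crawler would POST this encoded body to the form action,
# keep the session cookie it gets back, and then send that
# cookie with every subsequent crawl request.
body = urlencode(form_params)
print(body)  # userid=myuserID&password=secret
```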
OmniFind might be the single point of entry for knowledge search that your organization needs to bring knowledge from many sources to life and into use!
How would you go about performing #7 without some type of SEM? Ideally, you would combine SEM with NSM, which is what I plan on doing. Any suggestions? I've read through several of your posts regarding CS-MARS, etc. and I can understand how SEMs don't give you enough information to act upon alerts as they are alert-centric and usually don't provide you with session data or full content data, but at least they can point you in the right direction of further investigation. They provide you with what Daniel from the OSSEC project calls a LIDS (log-based intrusion detection system) and then do the job of correlating them from numerous devices. So how would you do the above (#7) without some sort of SEM?
SEM = Security Event Management. HTH