Pack rats take heart: Desktop search engines let you sift your information
When it comes to keeping old data around, especially in inboxes, most users are pack rats to some degree. Deleting old messages is tedious, and most people only do it when forced to free up some space in the mailbox. Mailbox quotas have grown steadily since Microsoft first launched Exchange Server in 1996, and today, most corporate mailboxes allow between 150MB and 300MB, not counting the collection of Personal Store files (PSTs) that most Microsoft Outlook users accumulate over time. Cached Exchange Mode in Microsoft Office Outlook 2003 lets users keep a complete local copy of their Exchange mailboxes. And larger PC disks can keep even more data. So why worry? Just store, store, store. This unchecked accumulation is fine until a message that a user needs becomes a needle in a vast data haystack. Fortunately, Google and Microsoft both offer desktop search engines that can help users dig through that haystack. Which tool should you consider? Let's take a look at how Microsoft Lookout and Google Desktop stack up.
Microsoft Lookout is a free Outlook add-on that provides better indexing and retrieval capabilities than Outlook's standard Find, Advanced Find, or Search Folders features can provide. Microsoft hasn't said when the technology will appear in its products, although a recently released new version of Windows Desktop Search (available at http://toolbar.msn.com) competes with Google Desktop. (Windows Desktop Search has an optional Adobe add-in that lets you search PDFs.) Lookout's relative maturity (Microsoft acquired the product and its developer in July 2004) makes it a good bet that it will be included in Outlook soon, but in the interim, you can use Lookout. Be aware that the original Lookout developers warn that you can't expect formal support, patches, or enhanced versions because they're busy integrating their work with Microsoft products. However, I haven't experienced any major problems with Lookout in the last year; the product seems pretty stable.
Google is synonymous with search in the minds of most computer users, so Google Desktop (launched in October 2004 with a second version released in August 2005)—with its ability to search Outlook data alongside files and the Web—gives Microsoft significant competition in the desktop search space. Table 1 shows a comparison of the two products' features. Let's take a closer look at their functionality.
Installing and Indexing
You can download Lookout 1.2 at http://www.lookoutsoft.com/Lookout/lookoutinfo.html; Google Desktop is available from http://desktop.google.com. Lookout supports Outlook 2000 and later, but before you can install Lookout, you need to install the .NET Framework 1.1 (available at http://msdn.microsoft.com/netframework/downloads/framework1_1). Google Desktop supports Windows 2000 Service Pack 3 (SP3) and Windows XP and doesn't require the .NET Framework. Apart from email, Google Desktop indexes Web pages only if they're held in the browser cache; it supports both Internet Explorer (IE) and Mozilla Firefox. Google Desktop also indexes hard-disk files, including Microsoft Office documents. Both products install painlessly and quickly. After installation, Outlook loads Lookout as a COM add-in every time Outlook starts. Google Desktop runs as a personal Web server that uses three processes and loads a personal browser helper DLL into IE.
Lookout's normal operating model is to index data when Outlook is inactive: It wakes up hourly and adds new items to the index, but only if Outlook isn't busy. By default, Lookout rebuilds the complete index from scratch weekly to ensure that the index is up to date and valid. Lookout signals indexing activity through discreet pop-ups, which users can suppress. Users can use the settings on the Lookout Options dialog box's Index tab to control when Lookout indexes data. As Figure 1 shows, you can also use these settings to specify which files Lookout indexes. Because the product is an Outlook add-in, the obvious place to begin is to index Outlook data, both in Exchange folders (mailbox and public folders) and PSTs. You can add sources to the index by clicking Add Outlook (to specify folders in mailboxes, public stores, and PSTs) or Add Files (to add files on any disk that your PC can access).
Google Desktop waits for a low level of system activity, then performs its indexing in much the same way as Lookout does. However, the Google Desktop Preferences dialog box offers fewer options to control the program. Users can select which data sources to index and, in a rudimentary fashion, which to exclude. Google Desktop isn't tied to Outlook in the same way as Lookout is, so it can also index Microsoft Outlook Express messages. Note that Google asks you to send them non-personal usage data and crash information. I opted to disable this choice, and I think many other users will do the same, mostly because of privacy concerns. A better definition of non-personal usage data would help convince me to cooperate.
The biggest difference between Lookout and Google Desktop is Google's close integration with the browser, which lets the product intercept search requests as they're made, split them between local and Web data (sending the request on to Google to search the Web), then integrate the two streams of returned data into a seamless response. Also, Google Desktop doesn't index email attachments or the contents of public folders, whereas Lookout does. If you let it, Google Desktop can index AOL Instant Messenger (AIM) online chats (but not Microsoft Instant Messenger). I don't think many people will choose this option because online chats have always been considered a transient, almost throwaway, kind of communication. But if you opt to index IM conversations, you can launch new chats with correspondents from remembered conversations.
By default, Google Desktop indexes all local disks. This approach might create a problem if you use the tool on a shared PC because the index might incorporate data from sources that you don't want to include in search results.
Both programs begin indexing data immediately after installation, but Lookout gives you a finer degree of control over the indexing operation than Google Desktop does. For example, if you want to build an index immediately, you can simply click Indexer in the Lookout menu, then click Start. Lookout also displays a progress bar and reports details as indexing proceeds. If the PC is inactive, indexing proceeds quickly. In my case, Lookout indexed 23,520 documents across a range of Exchange and disk folders in about 10 minutes on a PC equipped with a 1.6GHz Pentium M and 512MB of RAM. I used Outlook 2003 running in Cached Exchange Mode, so all data was local. If you index data in public folders that aren't synchronized with the local cache or use a version of Outlook that doesn't support Cached Exchange Mode, indexing will be slower because Lookout will need to connect to the Exchange server to read the content.
Google Desktop waits for the PC to be inactive for 30 seconds before it begins to index. In my test, Google Desktop took significantly longer than Lookout to build the initial index, particularly for email. In addition, Google Desktop slowed my PC more when it was building the index than Lookout did. This effect on performance might be of concern if you run Google Desktop on older hardware. Some of the slowing can be explained by the fact that Google Desktop indexes more data than Lookout (and remember, you have less control over what Google Desktop indexes than you have with Lookout). Lookout is faster at indexing email, probably because its close integration with Outlook gives it faster access to the data than the Web-based Google Desktop has.
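The idle-wait behavior described above can be pictured with a few lines of Python. This is only a minimal sketch of the general idea, not how Google Desktop is implemented; the IdleGate class and its method names are hypothetical, and the only figure taken from the text is the 30-second threshold.

```python
IDLE_THRESHOLD_SECONDS = 30  # the inactivity period quoted for Google Desktop


class IdleGate:
    """Hypothetical helper: decide whether background indexing may start."""

    def __init__(self, threshold=IDLE_THRESHOLD_SECONDS):
        self.threshold = threshold
        self.last_activity = 0.0

    def record_activity(self, now):
        """Note that the user did something at time `now` (in seconds)."""
        self.last_activity = now

    def may_index(self, now):
        """True once the PC has been idle for at least `threshold` seconds."""
        return now - self.last_activity >= self.threshold


gate = IdleGate()
gate.record_activity(now=100.0)
print(gate.may_index(now=120.0))  # only 20 seconds idle
print(gate.may_index(now=130.0))  # 30 seconds idle
```

Passing the time in as a parameter, rather than reading the system clock inside the class, keeps the sketch easy to test.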
Once the initial index is built, both programs do a reasonable job of inserting new data-source items into the index. Google Desktop adds new items faster than Lookout and includes email messages in search results minutes after they're sent or received—quite impressive. By comparison, Lookout can take an hour or more to include a new item in the index. Neither program is quick to remove deleted items; you might have to wait for a complete rebuild (by default, a weekly task for Lookout) before deleted items are removed. It's hard to say just when Google Desktop gets around to removing deleted items because the program doesn't provide a UI to control when the program flushes old data. However, if you find obsolete data, you can remove it from the index by using the Remove option.
Figure 2 shows the Lookout Options dialog box's Advanced tab. This tab lets you specify where Lookout stores its index files and choose the type of files to index (typically Office file types). The index files reside in the \Reader folder under the root reported by Lookout, so the full path to the folder is \Documents and Settings\Username\Local Settings\Application Data\LookoutSoftware\Lookout\Data.Outlook\Index\Reader. Figure 2 also shows the list of file types that Lookout will index. The file types list appears in an editable field, so I attempted to add "pdf" to the list so that Lookout would index the many Adobe PDF files in my mailbox and on my disk. Unfortunately, this attempt didn't work. The Lookout Help file didn't tell me how to include additional file types in an index, so I assume that Lookout requires updated software before it can read and index new file types, just as Microsoft SharePoint Portal Server requires an IFilter to support file types for indexing. Because Lookout is now in maintenance mode, I don't expect Microsoft to add functionality in this area.
The default location for Google Desktop index files is C:\Documents and Settings\Username\Local Settings\Application Data\Google\Google Desktop Search, but you can set an alternative location in the registry at HKEY_CURRENT_USER\Software\Google\Google Desktop\data_dir. Google Desktop 1.0 can't index PDFs either, but it can index the file names of PDFs and other files that it finds on your disk, and it adds this data to the index so that you can search by filename. The recently released Google Desktop 2.0 does recognize PDFs, as well as a slew of other file types that Google has added.
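If you want to check where the index lives on a given PC, a short script can look for the registry override and fall back to the default folder. The following Python sketch makes several assumptions: that data_dir is a string value under the HKEY_CURRENT_USER\Software\Google\Google Desktop key (the text doesn't say whether it's a value or a subkey), that the profile is on drive C, and that the function names are purely illustrative. It uses the standard winreg module, which exists only on Windows, so the import is guarded.

```python
import ntpath

try:
    import winreg  # Windows-only standard library module
except ImportError:
    winreg = None

KEY_PATH = r"Software\Google\Google Desktop"
VALUE_NAME = "data_dir"  # assumed to be a string value under KEY_PATH


def default_index_dir(username, drive="C:"):
    """Build the default index folder quoted in the article."""
    return ntpath.join(
        drive + "\\", "Documents and Settings", username,
        "Local Settings", "Application Data",
        "Google", "Google Desktop Search",
    )


def index_dir(username):
    """Return the registry override if one is set, else the default folder."""
    if winreg is not None:
        try:
            with winreg.OpenKey(winreg.HKEY_CURRENT_USER, KEY_PATH) as key:
                value, _value_type = winreg.QueryValueEx(key, VALUE_NAME)
                if value:
                    return value
        except OSError:
            pass  # key or value is missing, so use the default
    return default_index_dir(username)


print(index_dir("Username"))
```

On a PC without the registry value (or on a non-Windows system), the script simply prints the default location.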
The size of index files depends on what data you index. For example, I index my complete mailbox and a set of folders on my local disk. When I last measured, my mailbox contained 19,852 items, or 943MB of data, and the folders that Lookout indexed contained 3628 files spanning 5.92GB of data. The resulting index files span 68.5MB, or roughly 1 percent of the indexed data. On my PC, the Google Desktop files occupy roughly twice as much disk space as the Lookout files. However, this comparison isn't fair because Google Desktop indexes far more data than Lookout. You can find out how many files are in the index by right-clicking the Google Desktop Search icon in the desktop tray and selecting Status.
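The "roughly 1 percent" figure is easy to verify. The short Python sketch below uses only the sizes quoted above and assumes that 1GB equals 1024MB; the function name is illustrative.

```python
MB_PER_GB = 1024  # assumption: binary gigabytes

mailbox_mb = 943                # mailbox data indexed by Lookout
folders_mb = 5.92 * MB_PER_GB   # local folders indexed by Lookout
index_mb = 68.5                 # size of the resulting Lookout index files


def index_overhead_percent(index_size_mb, *source_sizes_mb):
    """Return the index size as a percentage of the total indexed data."""
    total_mb = sum(source_sizes_mb)
    return 100.0 * index_size_mb / total_mb


percent = index_overhead_percent(index_mb, mailbox_mb, folders_mb)
print(f"Index is {percent:.2f}% of the indexed data")
```

With these inputs, the result comes out just under 1 percent, which matches the estimate in the text.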
To perform a search, users input words or phrases into each tool's text box. Figure 3 shows a Lookout search for any item that includes RSG and cached Exchange mode. The items at the top of the list most closely match the search criteria. Users can double-click an item in the list to view its content (unless the item has been moved or deleted since Lookout indexed it). Users can further limit searches by specifying a time or other restriction that Lookout supports, including Only show items from the last week or last month, Only show items sent by people who appear in your Outlook Contacts, Only show items stored in Outlook folders, Only show items stored on the PC drives that you index, and Only show messages with attachments. Note that when users pass the mouse over an email message in the list of returned items, Lookout displays the first couple of lines of text from the message content to help them determine whether it's the desired item.
To handle complex queries—for example, a message from Kieran McCorry, with a Word attachment that includes ExBPA in the text—Lookout provides a Search Builder, which you access by clicking the lightning icon next to the Search for field. The Search Builder, which Figure 4 shows, lets you build complex searches that combine different criteria to narrow the search. However, the Lookout Help file provides only rudimentary information about how to build complex searches, so the Search Builder can take some time to master.
Searching with Google Desktop is exactly the same as executing any Google search, which is part of the elegance and power of the implementation. Input a search term, and the results appear almost instantly. As Figure 5 shows, a search for items that mentioned EXBPA and Foster got nine hits: seven messages, one file, and one cached Web page. Date order is the default, but you can switch to relevance order by clicking the Sort by relevance option. Like its Web counterpart, Google Desktop supports Boolean operators and many of the keywords that Google fans already know. For example, Google Desktop lets you limit searches to Word documents by specifying filetype:DOC, or you can exclude anything in the search that refers to security by including -security. And you can limit searches to email by including the filetype:email keyword in the search. Google Desktop is a powerful search engine, but dealing with it is rather like being a UNIX administrator all over again: You have to know command-line switches and arcane syntax to get full value out of the engine. Most Outlook users will probably prefer the UI of Lookout's Search Builder. Google Desktop would benefit from an advanced search page that lets users more easily build queries.
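Until such a page exists, a small helper can assemble query strings for users who don't want to memorize the syntax. The Python sketch below is an illustration only: build_query is a hypothetical function, not part of any Google API, and it handles just the two operators mentioned above (filetype: and a leading minus sign to exclude a word). The resulting string is what a user would type into the search box.

```python
def build_query(terms, filetype=None, exclude=()):
    """Assemble a Google Desktop-style query string.

    Handles only the operators mentioned in the article:
    filetype:<type> and a leading minus sign to exclude a word.
    """
    parts = list(terms)
    if filetype:
        parts.append(f"filetype:{filetype}")
    parts.extend(f"-{word}" for word in exclude)
    return " ".join(parts)


# Word documents that mention EXBPA but not security
print(build_query(["EXBPA"], filetype="DOC", exclude=["security"]))
# Email messages that mention both EXBPA and Foster
print(build_query(["EXBPA", "Foster"], filetype="email"))
```

The first call produces EXBPA filetype:DOC -security, and the second produces EXBPA Foster filetype:email.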
When you select an email message, Google Desktop displays a cached copy of the message's content, an HTML version of the message with all formatting removed. You can reply, forward the message, or view its formatted version in Outlook as long as the message is still in the location it was in when Google indexed it. Even if you've deleted the message, the cached copy remains available until Google removes it from the cache. Another useful feature of Google Desktop is its ability to group messages with the same subject into conversation threads so that you can read all the contributions at one time, as Figure 6 shows.
The Best of Both Worlds
Desktop search engines are far from perfect. (For a word of caution about possible performance effects of using these products, see the Web sidebar "Caution!" at InstantDoc ID 47709.) The ideal desktop search tool would combine the characteristics of both Lookout and Google Desktop. Lookout is a more Outlook-centric program, and its UI makes Lookout more approachable and easier to use in some respects, especially in building complex queries. Google Desktop seamlessly merges personal data with the Web and provides a simple, fast way to navigate a morass of information. Both programs reward those who use their advanced features, so it's worth your while to show users how to drill down through the masses of hits that an initial search can produce to find the information that they really need. I suspect that many people will install and use both programs and track the developments that Microsoft and Google make as they deliver their finished products. Neither product is perfect, but both are free, and both let users organize life just a little better.