LINUX PICKS AND PANS

Recoll Looks High, Looks Low, Finds Your File With Ease

Not all search tools are the same. Just like users have a variety of Linux distros that appeal to a wide range of needs, search apps do different tasks for different users.

One of the best search tools I have found is a clever app called “Recoll.” I have an extensive collection of research and published articles, plus countless files built from years of lecturing, writing and creating content. These are stored on both internal and external hard drives.

I get consistently more refined search hits with Recoll than with apps such as Tracker, Strigi and Searchmonkey, to name just a few. For the record, though, Searchmonkey is a close second among the search tools I have used.

Recoll is jack-rabbit fast and has a clean, intuitive interface. Plus, it does not drain system resources.

Zappy Underpinnings

Recoll, lightweight by structure but not in its results, is powered by xapian-core, with an added boost from qmake and Qt.

The Xapian Project is an open source probabilistic information retrieval library. Its search engine library is written in C++, with bindings that allow its use from other languages as well.

After its lengthy first-use indexing, you can easily and quickly refresh the index by selecting File/Update Index or using the keyboard combination of Alt F/Alt I. Recoll stores its index in the ~/.recoll/xapiandb/ directory. By default, the indexing process begins from your home directory and includes any mounted partitions or SMB shares found there.
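Because the index lives in an ordinary directory, it is easy to check how much disk space it takes. The following is a minimal Python sketch, not part of Recoll itself. It assumes the default index location mentioned above, and the helper name directory_size_bytes is made up for illustration.

```python
import os
from pathlib import Path

# Default index location named in the article; adjust if yours differs.
DEFAULT_INDEX_DIR = Path(os.path.expanduser("~/.recoll/xapiandb"))


def directory_size_bytes(root: Path) -> int:
    """Sum the sizes of all regular files below root (0 if root is missing)."""
    if not root.is_dir():
        return 0
    return sum(p.stat().st_size for p in root.rglob("*") if p.is_file())


if __name__ == "__main__":
    size_mib = directory_size_bytes(DEFAULT_INDEX_DIR) / (1024 * 1024)
    print(f"{DEFAULT_INDEX_DIR}: {size_mib:.1f} MiB")
```

Running it before and after an index update gives a rough idea of how much the index grows as you add directories.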

Refining Moments

It’s easy to customize the search locations Recoll uses. For instance, go to the Recoll preferences menu to specify which directories Recoll should index. In this same configuration panel you can designate which files to ignore.

The process involves highlighting an existing location and clicking the minus button. Clicking the plus button opens a file manager window to select new locations. The same quick and simple procedure works for selecting file types to filter out of searches.

Separate tabs give you access to global and local parameters along with Web history. You can also set up Recoll's index scheduling and set other options such as the Aspell spelling checker, whether to follow symbolic links and more.

The indexer can run in two ways: as a thread inside the graphical user interface (GUI), or externally as a cron-scheduled program.

Ample Features

Recoll supports an impressive set of file types and compressed formats. Through a mix of native and external support it handles text, HTML, OpenOffice, MS Office, PostScript, MP3 and other audio files, and JPEG. Add to this list PDF (via pdftotext) and RTF (via unrtf).

Powerful query facilities let you conduct Boolean searches. This includes searching for phrases and filtering on file types and directories.
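For readers who have never looked under the hood of a Boolean search, here is a toy Python sketch of the idea. It is not how Recoll or Xapian work internally. The sample file names and text are invented, and the functions build_index, search_all and search_any are hypothetical names used only for illustration.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercase word to the set of documents that contain it."""
    index = defaultdict(set)
    for name, text in docs.items():
        for word in text.lower().split():
            index[word].add(name)
    return index


def search_all(index, terms):
    """Boolean AND: documents that contain every term."""
    sets = [index.get(t.lower(), set()) for t in terms]
    return set.intersection(*sets) if sets else set()


def search_any(index, terms):
    """Boolean OR: documents that contain at least one term."""
    return set().union(*(index.get(t.lower(), set()) for t in terms))


# Invented sample documents, for illustration only.
DOCS = {
    "lecture.txt": "linux desktop search tools",
    "notes.txt": "desktop search with recoll",
    "recipe.txt": "slow cooker chili",
}

if __name__ == "__main__":
    idx = build_index(DOCS)
    print(sorted(search_all(idx, ["desktop", "recoll"])))
    print(sorted(search_any(idx, ["linux", "chili"])))
```

An AND query narrows the hit list to files that contain every term, while an OR query widens it to files that contain any of them.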

Recoll also has support for multiple character sets and provides storage using Unicode UTF-8. It can switch stemming language after indexing. It requires no database setup, Web server or exotic search language terms.

Fancy Face

One of my biggest attractions to Recoll is its pretty face. Well, there is beauty in simplicity. That is the basis for Recoll’s GUI.

Menu categories are File, Tools, Preferences and Help. Each offers just the essential tools for working with the search tasks. The bottom portion of the interface is a spacious viewing window to see results.

Below the menu row is a scant array of icons. These provide fast access to the Advanced/Complex Search tool, document history, the Term Explorer Tool and several navigational arrows. Click on the last icon in the row to show the search results as a table.

The third row holds a button to clear the search window, the search entry line and its search button, plus a button to select the type of search to perform.

Facial Closeup

The drop-down menus are deliberately thin, but not bare. Recoll avoids a cluttered look by providing just what is needed, with no bloat.

Erasing search and document history falls in the File menu. Handy additions are options to show missing helpers and view in full screen. The Help menu also has the option to show missing helpers.

The Tools menu list has the same options as are available in the icon row below it. The Preferences menu lets you access the Indexing configuration, query configuration and the External index dialog. It also lets you select stemming or the languages to include.

Pondering Terms

I have used Recoll for an extended time. I have become dependent on the Term Explorer Tool. This ingenious feature lets me search the full index terms list. This is very useful when I cannot remember the exact spelling or only know the beginning of the item’s name.

This tool has four modes of operation. In Wildcard Mode I can search with shell-like wildcards such as *, ? and []. In Regular Expression Mode the search term is anchored at the start of each hit, so words that merely contain the term after some prefix are automatically excluded.
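The difference between a shell-style wildcard and a pattern anchored at the start of the word can be sketched with Python's standard library. This is only an analogy using the fnmatch and re modules, not Recoll's own code, and the term list is invented for illustration.

```python
import fnmatch
import re

# Invented sample of index terms, for illustration only.
TERMS = ["press", "pression", "expression", "pressure", "impress", "prose"]


def wildcard_terms(pattern, terms):
    """Return terms that match a shell-style pattern (*, ? and []) in full."""
    return [t for t in terms if fnmatch.fnmatchcase(t, pattern)]


def anchored_regex_terms(pattern, terms):
    """Return terms whose beginning matches the regular expression."""
    rx = re.compile(pattern)
    # re.match only matches at the start of the string, so the pattern
    # behaves as if it were anchored at the beginning of each term.
    return [t for t in terms if rx.match(t)]


if __name__ == "__main__":
    print(wildcard_terms("*press*", TERMS))
    print(anchored_regex_terms("press", TERMS))
```

With this sample list, the wildcard pattern *press* picks up every term that contains "press" anywhere, while the anchored pattern keeps "pression" and drops "expression".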

Two other modes for the Term Explorer Tool are Stem Expansion and Spelling/Phonetic. Stem Expansion is often the most useful as it reacts to the usual user input processing. The spelling mode includes a best guess usage for words in the index that seem to be a close match of the misspelled search term.
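The best-guess idea behind the spelling mode can be illustrated in a few lines of Python. This sketch uses the standard difflib module as a stand-in; it is an assumption for illustration only and is not the mechanism Recoll uses. The word list and the function name suggest_terms are likewise made up.

```python
import difflib

# Invented sample of index terms, for illustration only.
INDEX_TERMS = ["receive", "recipe", "retrieve", "index", "search"]


def suggest_terms(misspelled, terms, limit=3):
    """Return up to `limit` index terms that look similar to the input."""
    return difflib.get_close_matches(misspelled, terms, n=limit, cutoff=0.6)


if __name__ == "__main__":
    print(suggest_terms("recieve", INDEX_TERMS))
```

Raising the cutoff makes the suggestions stricter, and lowering it lets in looser matches.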

Searching Skills

I am by no means an expert in search string tactics. That does not matter to Recoll. To do a simple search, all that is needed is a search term entered into the search entry line. Tell Recoll which type of search to apply (any term, all terms, file name or query language) and then click the Search button.

Right-click on any listing in the search results pane. This opens a choice window for further action. You can select preview, open, copy file name, URL, locate similar files and preview or open parent document/folders.

Performing an Advanced Search is just as simple. Start by clicking the first icon in the toolbar row. This allows defining more precise criteria. Enter choices in the various entry windows and select from the drop-down choices. These provide an extensive combination of fields, captions, extensions, keywords, recipient and author names.

You can further select a directory from the Browse windows. Also, you can restrict file types and add them to the exceptions list from the “Restrict File Types” section at the bottom.

Bottom Line

Recoll is a powerful yet simple-to-use full-text desktop search tool that indexes the contents of many file formats. You can perform simple searches as well as advanced ones that filter on author, file size or file format and combine terms with operators like “AND” or “OR.”

This search tool needs almost no setup and has nearly no learning curve. It gives superior results whether you are a skilled search pro or a search-challenged novice.

Jack M. Germain has been writing about computer technology since the early days of the Apple II and the PC. He still has his original IBM PC-Jr and a few other legacy DOS and Windows boxes. He left shareware programs behind for the open source world of the Linux desktop. He runs several versions of Windows and Linux OSes and often cannot decide whether to grab his tablet, netbook or Android smartphone instead of using his desktop or laptop gear.
