Publisher's Description:
From Tran Nam Quang
DocFetcher is an Open Source desktop search application: it allows you to quickly access documents on your computer by typing keywords. You can think of it as Google for your local document repository. The application is currently available for Windows and GTK-based Linux distributions.
How It Works
* You specify one or more folders to make searchable, e.g. "C:\MyDocuments".
* DocFetcher extracts the text from all documents in "C:\MyDocuments" that it is able to read, e.g. HTML, MS Word and PDF, and stores the result of this processing in "index files". The indexing process might take a few minutes (about 3 minutes for 600 documents).
* Now you can type keywords into DocFetcher's search box, e.g. "fourier analysis", hit Enter, and DocFetcher will list all documents inside "C:\MyDocuments" that contain these words, most of the time in less than a second.
* What if the original document repository is changed? Then the index files will get out of sync with the repository. However: (1) DocFetcher can listen to file system events and automatically update its index files while it is running. (2) In contrast to completely (re-)building an index, an index update is usually a matter of seconds.
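The index-then-search idea described above can be illustrated with a small inverted index. This is only a sketch of the general technique, not DocFetcher's actual implementation: the function names, the sample documents, and the rule that a hit must contain every query word are all assumptions made for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each word to the set of document names that contain it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in tokenize(text):
            index[word].add(name)
    return index


def update_document(index, name, old_text, new_text):
    """Re-index a single changed document instead of rebuilding everything."""
    for word in set(tokenize(old_text)):
        index[word].discard(name)
    for word in set(tokenize(new_text)):
        index[word].add(name)


def search(index, query):
    """Return the documents that contain every word of the query (assumed AND semantics)."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical sample repository: file name -> extracted text.
documents = {
    "notes.txt": "Fourier analysis decomposes a signal into sinusoids.",
    "todo.txt": "Buy milk and review the analysis report.",
}
index = build_index(documents)
hits = search(index, "fourier analysis")
```

Building the index touches every document once, which is why the first run is the slow part. A lookup only intersects a few sets, and an update only touches the words of the one document that changed, which matches the "matter of seconds" behaviour described above.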
Supported Document Formats
* HTML and plain text (both customizable)
* Portable Document Format (pdf)
* Microsoft Office Word (doc), Excel (xls) and PowerPoint (ppt)
* OpenOffice.org Writer, Calc, Draw and Impress
* Rich Text Format (rtf)
* AbiWord (abw, abw.gz, zabw)
* Microsoft Compiled HTML Help (chm)
* Microsoft Visio (vsd)
* (In the works: MS Office 2007)
Features
* Detection of HTML pairs (e.g. "foo.htm" and a folder named "foo_files")
* Various file operations on the document repository (e.g. creating folders, inserting new files) can be performed through DocFetcher's interface.
* Customizable text and HTML file extensions (e.g. "nfo", "cpp", "java", "py", "shtml", and so on)
* Regular expression based exclusion of files from indexing
* Automatic index updates on changes to the indexed documents (optional)
* Preview panel with search-term highlighting and a simple built-in web browser
* Search results can be sorted and filtered by different criteria (filetype, filesize, path, etc.).
* A portable version is available for both Windows and Linux; among other things, it is useful in combination with volume encryption (TrueCrypt).
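Two of the features above, regular expression based exclusion and HTML pair detection, can be sketched in a few lines. The helper names and patterns below are hypothetical, and matching each pattern against the whole file name (rather than the full path) is an assumption; DocFetcher's own matching rules may differ.

```python
import re
from pathlib import PurePath


def is_excluded(path, patterns):
    """Return True if the file name fully matches any exclusion pattern."""
    name = PurePath(path).name
    return any(re.fullmatch(pattern, name) for pattern in patterns)


def html_pair_folder(path):
    """Return the expected companion folder name for an HTML file, else None."""
    p = PurePath(path)
    if p.suffix.lower() in {".htm", ".html"}:
        return p.stem + "_files"
    return None


# Hypothetical exclusion patterns: temporary files and Office lock files.
patterns = [r".*\.tmp", r"~\$.*"]

skip_tmp = is_excluded("docs/draft.tmp", patterns)
keep_pdf = is_excluded("docs/report.pdf", patterns)
companion = html_pair_folder("foo.htm")
```

With these patterns, "draft.tmp" would be skipped during indexing while "report.pdf" would be kept, and "foo.htm" is paired with a folder named "foo_files", as in the example given in the feature list.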