This is how you set up full-text search for your documents with
Swish-E. I got most of the information from an article in the German
Linux Magazine and only added some settings.
Preliminaries:
You need the jpginfo script to extract information from *.jpg images and the
swish.config configuration file with my filter settings for the following file
formats: *.xml *.htm *.html *.tex *.txt *.ps *.pdf *.doc *.rtf *.xls *.jpg *.sxw
*.sxc *.sxi *.m *.java *.c *.cpp *.h. Edit the file to add more. To automate
index updates, download the swish_update script.
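
For orientation, here is a rough sketch of what filter settings in a Swish-E
configuration can look like. It is not the author's swish.config: the selection
of formats, the converter options, and the output path are assumptions made for
illustration. It writes an example file to a temporary location so nothing in
/etc is touched.

```shell
#!/bin/sh
# Write an EXAMPLE Swish-E config; this is not the author's swish.config.
# The output path is a hypothetical scratch location.
OUT="${TMPDIR:-/tmp}/swish.config.example"

cat > "$OUT" <<'EOF'
# Only index files with these extensions
IndexOnly .html .htm .txt .pdf .doc .ps
# Choose a parser per extension (filter output is treated as plain text)
IndexContents HTML* .html .htm
IndexContents TXT* .txt .pdf .doc .ps
# Convert binary formats to text before indexing; %p is the file path
FileFilter .pdf pdftotext "'%p' -"
FileFilter .doc catdoc "'%p'"
FileFilter .ps pstotext "'%p'"
EOF

echo "wrote $OUT"
```

Adding another format would then mean listing its extension under IndexOnly and
IndexContents and, if it is not plain text, adding a FileFilter line for a
converter that prints text to standard output.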
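Installation (run as root):
# Install Swish-E, the helper programs used by the filters, and Apache
apt-get install swish-e jhead libjpeg-progs catdoc xpdf-utils pstotext apache
# Install the JPEG information script
cp jpginfo /usr/local/bin
chmod 755 /usr/local/bin/jpginfo
# Install the Swish-E configuration with the filter settings
cp swish.config /etc
# Make the search CGI and its configuration available to Apache
cp /usr/lib/swish-e/*.cgi /usr/lib/cgi-bin
cp /usr/share/doc/swish-e/html/.swishcgi.conf /usr/lib/cgi-bin
# Run the index update daily via cron
cp swish_update /etc/cron.daily
chmod 755 /etc/cron.daily/swish_update
# Link /data into the web root as /var/www/data and make www-data its owner
ln -s /data /var/www
chown www-data /var/www/data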
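
The swish_update script itself is not reproduced on this page. As a minimal
sketch of what such a cron job might do, the following assumes that the
documents live in /data and that the index is written to
/var/lib/swish-e/index.swish-e; both paths, and the function name
update_index, are hypothetical. Only the swish-e options -c (configuration
file), -i (input directory) and -f (index file) are used.

```shell
#!/bin/sh
# Hypothetical stand-in for swish_update; the real script may differ.
CONFIG=/etc/swish.config                 # installed in the steps above
DOCS=/data                               # assumed document directory
INDEX=/var/lib/swish-e/index.swish-e     # assumed index location

update_index() {
    # With --dry-run, only print the command that would be executed.
    if [ "${1:-}" = "--dry-run" ]; then
        echo "swish-e -c $CONFIG -i $DOCS -f $INDEX"
    else
        swish-e -c "$CONFIG" -i "$DOCS" -f "$INDEX"
    fi
}

# Show the command without touching any index.
update_index --dry-run
```

Calling update_index without arguments would rebuild the whole index on each
run, which is the simplest approach for a daily job.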
Usage:
Run the update script (/etc/cron.daily/swish_update) once to build the index,
then point your browser to
http://localhost/cgi-bin/swish.cgi
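
If the search page shows no results, it can help to query the index from the
command line first. The sketch below assumes a hypothetical index location
(adjust INDEX to wherever your update script writes the index) and uses the
swish-e options -f (index file), -w (search words) and -m (maximum number of
results). The function name search_index is made up for this example.

```shell
#!/bin/sh
# Assumed index location; change it to match your setup.
INDEX=/var/lib/swish-e/index.swish-e

search_index() {
    # Query the index directly, bypassing Apache and the CGI script.
    if command -v swish-e >/dev/null 2>&1 && [ -r "$INDEX" ]; then
        swish-e -f "$INDEX" -w "$1" -m 10
    else
        echo "swish-e or index $INDEX not available" >&2
        return 1
    fi
}

# Example call (commented out so that sourcing this file has no side effects):
# search_index "linux"
```

If the command-line query returns hits but the browser does not, the problem is
more likely in the CGI configuration than in the index.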