Full text indexing

This page describes how to set up full text search for your documents with Swish-E. Most of the information comes from an article in the German Linux Magazine; I only added some settings.
Preliminaries: You need this script (jpginfo) to extract information from *.jpg images, and the configuration file (swish.config) with my filter settings for the following file formats: *.xml *.htm *.html *.tex *.txt *.ps *.pdf *.doc *.rtf *.xls *.jpg *.sxw *.sxc *.sxi *.m *.java *.c *.cpp *.h. Edit the configuration file to add more formats. To automate index updates, download this script (swish_update).
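For readers who want to see what filter settings of this kind look like, here is a minimal sketch that prints a sample configuration. It is not the swish.config offered above, only an illustration built from standard Swish-E directives (IndexDir, IndexFile, IndexOnly, FileFilter, IndexContents). The index file path is hypothetical, and /data as the document directory is an assumption based on the symlink in the installation steps.

```shell
#!/bin/sh
# Print a hypothetical sample configuration to stdout.
# This is an illustration, not the author's swish.config.
print_sample_config() {
    cat <<'EOF'
# Directory tree to index (assumed: /data, as in the installation steps)
IndexDir /data
# Where to write the index (hypothetical path)
IndexFile /var/lib/swish-e/index.swish-e
# Only index files with these extensions
IndexOnly .html .htm .txt .pdf .doc
# Convert PDF and Word files to plain text before indexing;
# %p is replaced by the full path of the file being filtered
FileFilter .pdf pdftotext "'%p' -"
FileFilter .doc catdoc "'%p'"
# Parse the filtered output as plain text
IndexContents TXT* .pdf .doc
EOF
}

print_sample_config
```

Adding a format would then probably mean one more extension in IndexOnly plus a matching FileFilter line for a tool that writes plain text to standard output.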
Installation:
apt-get install swish-e jhead libjpeg-progs catdoc xpdf-utils pstotext apache
cp jpginfo /usr/local/bin
chmod 755 /usr/local/bin/jpginfo
cp swish.config /etc
cp /usr/lib/swish-e/*.cgi /usr/lib/cgi-bin
cp /usr/share/doc/swish-e/html/.swishcgi.conf /usr/lib/cgi-bin
cp swish_update /etc/cron.daily
chmod 755 /etc/cron.daily/swish_update
ln -s /data /var/www
chown www-data /var/www/data
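After the installation it can help to check that the helper programs are actually on the PATH. The following is a small sketch, not part of the original setup. The list of binary names is an assumption about what the packages above provide (for example pdftotext from xpdf-utils and rdjpgcom from libjpeg-progs); adjust it to whatever your filter settings call.

```shell
#!/bin/sh
# Print every command name from the arguments that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Assumed binaries: swish-e, jhead, rdjpgcom (libjpeg-progs),
# catdoc, pdftotext (xpdf-utils), pstotext.
missing=$(missing_tools swish-e jhead rdjpgcom catdoc pdftotext pstotext)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line
    echo "Missing tools:" $missing
else
    echo "All filter tools found."
fi
```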
Usage: Run the update script (/etc/cron.daily/swish_update) once by hand to build the index, then point your browser to
http://localhost/cgi-bin/swish.cgi
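If the web form returns nothing, indexing and searching can also be tried from the command line. This is a rough sketch using the standard swish-e options -c (configuration file), -f (index file) and -w (query words). The function names and the index path in the example are hypothetical; the real index location depends on the IndexFile setting in your configuration.

```shell
#!/bin/sh
# Binary name; can be overridden, e.g. for a dry run with SWISH=echo.
SWISH=${SWISH:-swish-e}

# Rebuild the index from a configuration file (-c).
build_index() {
    "$SWISH" -c "$1"
}

# Search an index file (-f) for a query (-w).
search_index() {
    "$SWISH" -f "$1" -w "$2"
}

# Example calls (hypothetical index path), commented out so that
# sourcing this file does not start an indexing run:
# build_index /etc/swish.config
# search_index /var/lib/swish-e/index.swish-e 'linux AND magazine'
```

Setting SWISH=echo before calling the functions prints the arguments instead of running swish-e, which is a quick way to check them.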