PostgreSQL 8.2 with tsearch2 and Dutch Snowball stemmer

13 June 2007

A quick walk-through for compiling PostgreSQL 8.2.4 with the tsearch2 full-text search extension and a Dutch Snowball stemmer on Debian Etch:

  1. sudo su -
  2. cd /usr/src
  3. apt-get build-dep postgresql-8.1
  4. wget ftp://ftp4.nl.postgresql.org/postgresql.zeelandnet.nl/latest/postgresql-8.2.4.tar.bz2
  5. tar jxvf postgresql-8.2.4.tar.bz2
  6. cd postgresql-8.2.4
  7. wget http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/tsearch_snowball_82-20070504.gz
  8. gunzip tsearch_snowball_82-20070504.gz
  9. patch -b -p0 < tsearch_snowball_82-20070504
  10. ./configure
  11. make
  12. make install
  13. cd contrib
  14. make
  15. make install
  16. cd tsearch2/gendict
  17. wget http://snowball.tartarus.org/algorithms/dutch/stem.c
  18. wget http://snowball.tartarus.org/algorithms/dutch/stem.h
  19. ./config.sh -n nl -s -p dutch_ISO_8859_1 -v -C'Dutch Stemmer. Snowball'
  20. cd ../../dict_nl
  21. make
  22. make install
  23. wget http://snowball.tartarus.org/algorithms/dutch/stop.txt -O /usr/local/pgsql/share/contrib/dutch.stop
  24. adduser postgres
  25. mkdir /usr/local/pgsql/data
  26. chown postgres /usr/local/pgsql/data
  27. su - postgres
  28. /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data
  29. /usr/local/pgsql/bin/postgres -D /usr/local/pgsql/data >/tmp/logfile 2>&1 &
  30. /usr/local/pgsql/bin/createdb tsearch2
  31. /usr/local/pgsql/bin/psql tsearch2 < /usr/local/pgsql/share/contrib/tsearch2.sql
  32. /usr/local/pgsql/bin/psql tsearch2 < /usr/local/pgsql/share/contrib/dict_nl.sql
  33. /usr/local/pgsql/bin/psql tsearch2
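
Step 23 saves the Snowball stop list exactly as published. That file follows Snowball's convention of putting commentary after a vertical bar, while the stop lists bundled with tsearch2 hold one bare word per line. If the Dutch stop words do not seem to take effect, a filter along these lines may help. It is a minimal sketch, and the function name clean_stoplist is made up for illustration:

```shell
# clean_stoplist: hypothetical helper, not part of PostgreSQL or Snowball.
# Reads a Snowball-style stop list on stdin and prints one bare word per line.
clean_stoplist() {
    # sed: drop everything from the first "|" (Snowball's comment marker).
    # awk: keep the first whitespace-separated field, skip empty lines.
    # LC_ALL=C treats the input as plain bytes, so the file's encoding
    # does not matter to either tool.
    LC_ALL=C sed 's/|.*$//' | LC_ALL=C awk 'NF > 0 { print $1 }'
}
```

It could replace the plain download in step 23 like this:

  wget -q -O - http://snowball.tartarus.org/algorithms/dutch/stop.txt | clean_stoplist > /usr/local/pgsql/share/contrib/dutch.stop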

If everything went well, you are now at the psql prompt of the tsearch2 database.
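
If instead createdb complained that it could not connect, the server started in the background in step 29 may simply not have finished starting when step 30 ran. One way around that is to retry the command a few times. The helper below is a sketch under that assumption; retry is a made-up name, not a PostgreSQL tool:

```shell
# retry: hypothetical helper. Runs a command until it succeeds,
# at most $1 times, sleeping $2 seconds between attempts.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while ! "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            return 1    # give up: the command never succeeded
        fi
        n=$((n + 1))
        sleep "$delay"
    done
    return 0
}
```

Step 30 would then become:

  retry 10 1 /usr/local/pgsql/bin/createdb tsearch2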

Now update the tsearch2 dictionary table and add a Dutch configuration:

  1. UPDATE pg_ts_dict SET dict_initoption='contrib/dutch.stop' WHERE dict_name='nl';
  2. INSERT INTO pg_ts_cfg (ts_name, prs_name, locale) VALUES ('dutch', 'default', 'nl_NL');
  3. INSERT INTO pg_ts_cfgmap (SELECT 'dutch', tok_alias, dict_name FROM pg_ts_cfgmap WHERE ts_name='default');
  4. UPDATE pg_ts_cfgmap SET dict_name='{nl}' WHERE ts_name='dutch' AND dict_name='{en_stem}';
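
The four statements can also be fed to psql from the shell, the same way tsearch2.sql was loaded in step 31. The sketch below wraps them in a transaction so that a failing statement leaves the tables unchanged; that wrapping and the function name dutch_config_sql are additions for illustration, not part of the original recipe. Note the plain ASCII single quotes, which SQL requires:

```shell
# dutch_config_sql: hypothetical helper that prints the configuration
# statements from the list above, wrapped in a single transaction.
dutch_config_sql() {
    # The quoted 'EOF' keeps the shell from expanding anything in the SQL.
    cat <<'EOF'
BEGIN;
UPDATE pg_ts_dict SET dict_initoption='contrib/dutch.stop' WHERE dict_name='nl';
INSERT INTO pg_ts_cfg (ts_name, prs_name, locale) VALUES ('dutch', 'default', 'nl_NL');
INSERT INTO pg_ts_cfgmap (SELECT 'dutch', tok_alias, dict_name FROM pg_ts_cfgmap WHERE ts_name='default');
UPDATE pg_ts_cfgmap SET dict_name='{nl}' WHERE ts_name='dutch' AND dict_name='{en_stem}';
COMMIT;
EOF
}
```

Run as the postgres user:

  dutch_config_sql | /usr/local/pgsql/bin/psql tsearch2
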
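Test the new configuration:

select to_tsvector('dutch', 'ik ga naar school');
   to_tsvector
------------------
 'ga':2 'schol':4
(1 row)

The stop words "ik" and "naar" are dropped, and "school" is stemmed to "schol".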
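
The same check can be run from the shell, for instance after rebuilding. This is a minimal sketch: check_dutch and the PSQL override variable are made-up names, and the expected string is simply the result shown above.

```shell
# check_dutch: hypothetical smoke test for the Dutch configuration.
# Returns 0 if to_tsvector gives the result shown above, 1 if the
# result differs, and 2 if psql itself fails.
check_dutch() {
    # PSQL may be set to point at a different psql binary.
    psql_bin=${PSQL:-/usr/local/pgsql/bin/psql}
    # -t prints rows only, -A turns off column alignment.
    got=$("$psql_bin" -t -A -c "select to_tsvector('dutch', 'ik ga naar school');" tsearch2) || return 2
    [ "$got" = "'ga':2 'schol':4" ]
}
```

Run as the postgres user:

  check_dutch && echo "Dutch stemmer OK"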