I'm really at my wits' end with this problem, and I really hope someone can help me. I am using PostgreSQL 9.3. My database contains mostly, but not only, German text, so it is encoded in UTF-8. I want to set up a full-text search that supports the German language; nothing special so far. But the search is behaving really strangely, and I can't find out what I am doing wrong.
So, take the following table as an example:

    select * from test;
          a
    -------------
     ein Baum
     viele Bäume
     Überleben
     Tisch
     Tische
     Café

    \d test
    Tabelle »public.test«
     Spalte | Typ  | Attribute
    --------+------+-----------
     a      | text |

    sintext=# \d
                Liste der Relationen
     Schema |        Name         |   Typ   | Eigentümer
    --------+---------------------+---------+------------
     (...)
     public | test                | Tabelle | paf

Now, let's have a look at some text search examples:

    select * from test where to_tsvector('german', a) @@ plainto_tsquery('Baum');
          a
    -------------
     ein Baum
     viele Bäume

    select * from test where to_tsvector('german', a) @@ plainto_tsquery('Bäume');
    --> no hits

    select * from test where to_tsvector('german', a) @@ plainto_tsquery('Überleben');
    --> no hits

    select * from test where to_tsvector('german', a) @@ plainto_tsquery('Tisch');
       a
    --------
     Tisch
     Tische

Here, Tische is the plural of Tisch (table) and Bäume is the plural of Baum (tree). So, obviously, umlauts do not work, while the text search otherwise performs well.
But what really confuses me is that a) non-German special characters do match:

    select * from test where to_tsvector('german', a) @@ plainto_tsquery('Café');
      a
    ------
     Café

and b) if I don't use the German dictionary, there is no problem with umlauts (but of course no real text search either):

    select * from test where to_tsvector(a) @@ plainto_tsquery('Bäume');
          a
    -------------
     viele Bäume

So, if I use the German dictionary for text search, just the German special characters do not work? Seriously? What the hell is wrong here? I really can't figure it out, please help!
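
One check that might narrow this down, offered as a sketch rather than a confirmed diagnosis: in the failing queries above, `to_tsvector` is given the `'german'` configuration explicitly, while `plainto_tsquery` is called with one argument and therefore falls back to whatever `default_text_search_config` is set to. Assuming that default is not `german` (an assumption, nothing above confirms it), the document side and the query side may be normalised differently. The statements below only use the `test` table from this post; the column aliases are made up for illustration.

```sql
-- Which configuration do the one-argument forms fall back to?
SHOW default_text_search_config;

-- Compare the lexemes produced on the document side and on the query side.
-- The aliases (doc_german, query_default, query_german) are illustrative names.
SELECT to_tsvector('german', 'Bäume')     AS doc_german,
       plainto_tsquery('Bäume')           AS query_default,
       plainto_tsquery('german', 'Bäume') AS query_german;

-- Same search as above, but with the configuration named on both sides.
SELECT *
FROM test
WHERE to_tsvector('german', a) @@ plainto_tsquery('german', 'Bäume');
```

If `doc_german` and `query_default` show different lexemes while `query_german` matches `doc_german`, the mismatch would be in the configuration rather than in the encoding of the umlauts.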