Is there a magic function or operator that makes the full-text parser ignore some tokens, so that words joined by '.' or '/' are still indexed separately?

    select to_tsvector('the quick. brown fox') @@ 'brown'  -- returns true
    select to_tsvector('the quick,brown fox') @@ 'brown'   -- returns true
    select to_tsvector('the quick.brown fox') @@ 'brown'   -- returns false, should return true
    select to_tsvector('the quick/brown fox') @@ 'brown'   -- returns false, should return true

1 Answer

I'm afraid you are probably stuck. If you run your text through ts_debug, you will see that 'quick.brown' is parsed as a hostname and 'quick/brown' is parsed as a filesystem path. Sadly, the parser really isn't that clever.

My only suggestion is that you preprocess your texts to convert these separator characters ('.' and '/') to spaces before they reach the parser. You could easily create a function in plpgsql to do that.
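As a rough sketch of that preprocessing idea, the function below swaps '.' and '/' for spaces and then builds the tsvector. The name to_tsvector_split is made up for illustration, the set of characters to replace is an assumption based on the two failing examples in the question, and the 'english' configuration is assumed from the english_stem dictionary that appears in the ts_debug output.

```sql
-- Hypothetical helper (name and replaced characters are assumptions).
-- translate() maps each character in './' to the character at the same
-- position in the third argument, here a space.
CREATE OR REPLACE FUNCTION to_tsvector_split(doc text)
RETURNS tsvector
LANGUAGE plpgsql
IMMUTABLE
AS $$
BEGIN
    -- The configuration is named explicitly so the function does not
    -- depend on the default_text_search_config setting.
    RETURN to_tsvector('english', translate(doc, './', '  '));
END;
$$;

-- Usage, mirroring the failing queries from the question:
select to_tsvector_split('the quick.brown fox') @@ 'brown';
select to_tsvector_split('the quick/brown fox') @@ 'brown';
```

With the separators replaced, the parser should see quick and brown as ordinary words, the same as in the 'the quick. brown fox' case that already matches. The trade-off is that genuine hostnames, file paths, and decimal numbers such as 3.14 would be split apart as well, so the character list is worth adjusting to your data.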

    nicg=# select ts_debug('the quick.brown fox');
                                  ts_debug
    ---------------------------------------------------------------------
     (asciiword,"Word, all ASCII",the,{english_stem},english_stem,{})
     (blank,"Space symbols"," ",{},,)
     (host,Host,quick.brown,{simple},simple,{quick.brown})
     (blank,"Space symbols"," ",{},,)
     (asciiword,"Word, all ASCII",fox,{english_stem},english_stem,{fox})
    (5 rows)

As you can see from the above, you don't get separate tokens for quick and brown; the whole of 'quick.brown' is indexed as a single host token.
