2

We've built some software that searches accounting data (orders, customers, etc) which often has a user-visible number padded with leading zeroes. Users would like to use full text search, but not have to type all those leading zeros. eg: searching for "12345" matches "0000012345"

It seems to me that the most elegant solution would be a custom dictionary that operated on uint tokens. Unfortunately, I am having real difficulty finding any documentation on writing lexer functions. Ideally, I would like to write such a function in SQL or pl/SQL rather than having to resort to maintaining a C extension.

1 Answer 1

6

You can create your own dictionary template. For this you need to create files: zero_dict.c, zero_dict.sql.in, Makefile. And copy them to the directory "contrib/zero_dict".

File zero_dict.c:

#include "postgres.h" #include "fmgr.h" #include "tsearch/ts_public.h" #ifdef PG_MODULE_MAGIC PG_MODULE_MAGIC; #endif PG_FUNCTION_INFO_V1(dzero_init); Datum dzero_init(PG_FUNCTION_ARGS); Datum dzero_init(PG_FUNCTION_ARGS) { PG_RETURN_POINTER(NULL); } PG_FUNCTION_INFO_V1(dzero_lexize); Datum dzero_lexize(PG_FUNCTION_ARGS); Datum dzero_lexize(PG_FUNCTION_ARGS) { char *in = (char *) PG_GETARG_POINTER(1); int32 len = PG_GETARG_INT32(2); char *txt; TSLexeme *res; int n; if ((n = strspn(in, "0")) != 0 && in[n] != '\0') { txt = pnstrdup(in + n, len - n); res = palloc0(sizeof(TSLexeme) * 2); res[0].lexeme = txt; PG_RETURN_POINTER(res); } else { PG_RETURN_POINTER(NULL); } } 

File zero_dict.sql.in:

SET search_path = public; BEGIN; CREATE OR REPLACE FUNCTION dzero_init(internal) returns internal as 'MODULE_PATHNAME' language C; CREATE OR REPLACE FUNCTION dzero_lexize(internal,internal,internal,internal) returns internal as 'MODULE_PATHNAME' language C with (isstrict); CREATE TEXT SEARCH TEMPLATE zerodict( LEXIZE = dzero_lexize, INIT = dzero_init); END; 

File Makefile:

subdir = contrib/zero_dict top_builddir = ../.. include $(top_builddir)/src/Makefile.global MODULE_big = zero_dict OBJS = zero_dict.o DATA_built = zero_dict.sql DOCS = include $(top_srcdir)/contrib/contrib-global.mk 

Then you need to execute the following commands:

make make install psql DBName < zero_dict.sql 

If you will create the dictionary:

create text search dictionary zerodict (template=zerodict); 

And you can execute the query:

dicts=# select ts_lexize('zerodict', '0000012345'); ts_lexize ----------- {12345} (1 row) 

For more information you can look over: http://www.sai.msu.su/~megera/postgres/fts/fts.pdf

Sign up to request clarification or add additional context in comments.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.