0

In my Postgres 13 DB I want to update a column mother_tongues_id in my table based on name columns.

Names are like "Le Minh Thien" and I want to check if those contain words like 'le' or 'lee' then set nationality to 'Vietnam'. The words check table is some thing like this:

language noun
vietnam +lee+le+long+la+
chinese +lu+zhu+

Here is my UPDATE code:

UPDATE resume SET mother_tongues_id = ( SELECT id FROM mother_tongues WHERE check_language(resume.id) = id limit 1); WHERE mother_tongues_id IS NULL; 

Here is my check_language() function:

CREATE OR REPLACE FUNCTION check_language(rid integer) RETURNS integer AS $BODY$ DECLARE i int = (SELECT (LENGTH(CONCAT(first_name, ' ',last_name)) - LENGTH(replace(CONCAT(first_name, ' ',last_name), ' ', ''))) FROM resume where id = rid); DECLARE sname varchar = ''; DECLARE sid int = 0 LOOP SET i = i - 1; SET sname = SELECT split_part(SELECT CONCAT(first_name, ' ',last_name) FROM resume where id = rid, ' ', i) if select id from mother_tongues WHERE noun ILIKE '%+sname+%' then SET sid = id; end if; IF i = 1 THEN EXIT; END IF; END LOOP; if sid > 0 then RETURN sid; else RETURN NULL; end if; END; $BODY$ LANGUAGE plpgsql; 

I try to create a function in Postgres with loop to split the name and check the words, however always error.

2 Answers 2

4

Your function is extremely inefficient, errors aside.
You certainly don't need a function for this to begin with. A plain UPDATE with a correlated subquery matching with the array overlap operator && does the job:

UPDATE resume SET mother_tongues_id = ( SELECT m.id FROM mother_tongues m WHERE string_to_array(lower(concat_ws(' ', r.first_name, r.last_name)), ' ') && string_to_array(trim(m.noun, '+'), '+') LIMIT 1 ) WHERE mother_tongues_id IS NULL; 

Of course, this is just putting lipstick on a pig. You should really start by fixing your DB schema. As a minimum, convert mother_tongues.noun to text[], so we don't have to do it on the fly. And add a GIN index on that column. See:

Or properly normalize the design.

More fundamentally, deciding nationality based on typical names is a dubious approach. Besides being hardly valid, there can be any number of conflicts, like multiple matches. Currently, LIMIT 1 simply takes an arbitrary match in this case.

If you really need to loop in a PL/pgSQL function (which you don't in this case), read here:

Sign up to request clarification or add additional context in comments.

Comments

0

You have some syntax errors, here is a fix, have a try again.

CREATE OR REPLACE FUNCTION check_language(rid integer) RETURNS integer AS $BODY$ DECLARE i int = (SELECT (LENGTH(CONCAT(first_name, ' ',last_name)) - LENGTH(replace(CONCAT(first_name, ' ',last_name), ' ', ''))) FROM resume where id = rid); sname varchar(255) = ''; sid int = 0; BEGIN LOOP SET i = i - 1; SET sname = (SELECT split_part(CONCAT(first_name, ' ',last_name), ' ', i) FROM resume where id = rid); if select id from mother_tongues WHERE noun ILIKE '%+sname+%' then SET sid = id; end if; IF i = 1 THEN EXIT; END IF; END LOOP; if sid > 0 then RETURN sid; else RETURN NULL; end if; END; $BODY$ LANGUAGE plpgsql; UPDATE resume SET mother_tongues_id = ( SELECT id FROM mother_tongues WHERE check_language(resume.id) = id limit 1 ) WHERE mother_tongues_id IS NULL; 

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.