I'm looking for some assistance in debugging a REGEXP_REPLACE() statement in Snowflake.

I want to replace the | (pipe) characters with #, but only the ones inside double-quoted strings.

Example:

"Foreign Corporate| Name| Registration"|"99999"|"Valuation Research" 

Required Result:

"Foreign Corporate# Name# Registration"|"99999"|"Valuation Research" 

I tried the pattern (?!(([^"]"){2})[^"]*$)[|] with the substitution \1# on regex101.com. It works there, but it doesn't work in Snowflake.

2 Answers

The regexp functions in Snowflake do not support lookahead and lookbehind. If you want to use regular expressions with lookahead and lookbehind, you can do so in a JavaScript UDF.

Note that the regular expression here finds all the pipes including those inside double quotes. I was able to find a regular expression that finds pipes outside double quotes, which is why this UDF splits by those findings and rejoins the string. If you can find a regular expression that finds the pipes inside rather than outside the double quotes, you can simplify the UDF. However, splitting it allows other possibilities such as removing wrapping quotes if you want to do that.

set my_string = '"Foreign Corporate| Name| Registration"|"99999"|"Valuation Research"';

create or replace function REPLACE_QUOTED_PIPES(STR string)
returns string
language javascript
as
$$
    const search = `(?!\\B"[^"]*)\\|(?![^"]*"\\B)`;
    const searchRegExp = new RegExp(search, 'g');
    var splits = STR.split(searchRegExp);
    var out = "";
    var del = "|";
    for (var i = 0; i < splits.length; i++) {
        if (i == splits.length - 1) del = "";
        out += splits[i].replace(/\|/g, '#') + del;
    }
    return out;
$$;

select REPLACE_QUOTED_PIPES($my_string);
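The simplification mentioned above might look roughly like this: rather than finding the pipes outside the quotes, match each double-quoted segment and replace the pipes inside it, which needs no lookahead at all. This is only a sketch in plain JavaScript. The name `replaceQuotedPipes` is hypothetical, and it assumes the quotes are balanced and that fields contain no escaped quotes.

```javascript
// Hypothetical helper: replace pipes only inside double-quoted segments.
// Assumes balanced quotes and no escaped quotes inside a field.
function replaceQuotedPipes(str) {
  // Match each "..." segment, then swap the pipes within that match only.
  return str.replace(/"[^"]*"/g, (quoted) => quoted.replace(/\|/g, '#'));
}

const input = '"Foreign Corporate| Name| Registration"|"99999"|"Valuation Research"';
console.log(replaceQuotedPipes(input));
// prints "Foreign Corporate# Name# Registration"|"99999"|"Valuation Research"
```

Inside the UDF, the same expression would presumably be applied to the STR argument in place of the split-and-rejoin loop. Pipes next to unquoted fields are left alone, because only quoted segments are matched.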

1 Comment

This works great! Thanks for your help. Appreciate your response.

Different approach, just using REPLACE

  1. Replace "|" with a string that will never appear in your data. I've used @@@ in my example
  2. Replace the remaining pipes with #
  3. Replace the dummy string, @@@, back to the original value "|"

e.g.

replace(replace(replace(sample_text,'"|"','@@@'),'|','#'),'@@@','"|"') 

SQL statement to show each step:

select sample_text
      ,replace(sample_text,'"|"','@@@') r1
      ,replace(replace(sample_text,'"|"','@@@'),'|','#') r2
      ,replace(replace(replace(sample_text,'"|"','@@@'),'|','#'),'@@@','"|"') r3
from test_solution;
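For anyone who wants to trace the three steps outside Snowflake, the same idea can be sketched in plain JavaScript. The function name `replaceViaPlaceholder` is hypothetical, and like the SQL above it assumes the placeholder @@@ never occurs in the data.

```javascript
// Hypothetical helper mirroring the three nested REPLACE calls.
// Assumes the placeholder string never occurs in the data.
function replaceViaPlaceholder(text, placeholder = '@@@') {
  const r1 = text.split('"|"').join(placeholder); // step 1: protect the "|" delimiters
  const r2 = r1.split('|').join('#');             // step 2: remaining pipes become #
  const r3 = r2.split(placeholder).join('"|"');   // step 3: restore the delimiters
  return { r1, r2, r3 };
}

const sample = '"Foreign Corporate| Name| Registration"|"99999"|"Valuation Research"';
console.log(replaceViaPlaceholder(sample));
```

The r1, r2 and r3 fields correspond to the columns of the same names in the SQL statement.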

2 Comments

That's a very smart approach for achieving the main objective rather than trying a complex regex, thumbs up!
Thanks for your response. This works if there are "s before and after the pipe. I also have a case where there are no "s before and after the pipe. Example: "Foreign Corporate| Name| Registration"|"99999"|Test|Test123.
