0

I'm looking for some assistance in debugging a REGEXP_REPLACE() statement.

I have been using an online regular expressions editor to build expressions, and then the SF regexp_* functions to implement them. I've attempted to remain consistent with the SF regex implementation, but I'm seeing an inconsistency in the returned results that I'm hoping someone can explain :)

My intent is to replace commas within the text (excluding commas with double-quoted text) with a new delimiter (#^#).

Sample text string:

"Foreign Corporate Name Registration","99999","Valuation Research",,"Active Name",02/09/2020,"02/09/2020","NEVADA","UNITED STATES",,,"123 SOME STREET",,"MILWAUKEE","WI","53202","UNITED STATES","123 SOME STREET",,"MILWAUKEE","WI","53202","UNITED STATES",,,,,,,,,,,, 

RegEx command and Substitution (working in regex101.com):

([("].*?["])*?(,) 
\1#^# 

regex101.com Result:

"Foreign Corporate Name Registration"#^#"99999"#^#"Valuation Research"#^##^#"Active Name"#^#02/09/2020#^#"02/09/2020"#^#"NEVADA"#^#"UNITED STATES"#^##^##^#"123 SOME STREET"#^##^#"MILWAUKEE"#^#"WI"#^#"53202"#^#"UNITED STATES"#^#"123 SOME STREET"#^##^#"MILWAUKEE"#^#"WI"#^#"53202"#^#"UNITED STATES"#^##^##^##^##^##^##^##^##^##^##^##^# 

When I try and implement this same logic in SF using REGEXP_REPLACE(), I am using the following statement:

SELECT TOP 500 A.C1 ,REGEXP_REPLACE((A."C1"),'([("].*?["])*?(,)','\\1#^#') AS BASE FROM "<Warehouse>"."<database>"."<table>" AS A 

This statement returns the result for BASE:

"Foreign Corporate Name Registration","99999","Valuation Research",,"Active Name",02/09/2020,"02/09/2020","NEVADA","UNITED STATES",,,"123 SOME STREET",,"MILWAUKEE","WI","53202","UNITED STATES","123 SOME STREET",,"MILWAUKEE","WI","53202","UNITED STATES"#^##^##^##^##^##^##^##^##^##^##^##^# 

As you can see when comparing the results, the SF result set is only replacing commas at the tail-end of the text.

Can anyone tell me why the results between regex101.com and SF are returning different results with the same statement? Is my expression non-compliant with the SF implementation of RegEx - and if yes, can you tell me why?

Many many thanks for your time and effort reading this far!

Happy Wednesday, Casey.

0

1 Answer 1

0

The use of .*? to achieve lazy matching for regexing is limited to PCRE, which Snowflake does not support. To see this, in regex101.com, change your 'flavor" to be anything other than PCRE (PHP); you will see that your ([("].*?["])*?(,) regex no longer achieves what you are expecting.

I believe that this will work for your purposes:

REGEXP_REPLACE(A.C1,'("[^"]*")*,','\\1#^#') 
Sign up to request clarification or add additional context in comments.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.