0

I have two tables temp and md respectively. There is a field called uri_stem which has certain details that I want to omit from temp but not from md. I need to make a comparison that is able to compare certain patterns and remove them from temp if there are similar patterns in md.

Right now I am using this code to remove data similar to the patterns I want to omit, but I want some method that is able to compare the patterns from the md table rather than me hardcording each one. Hope the explanation is clear enough.

FROM spfmtr01.tbl_1c_apps_log_temp where uri_stem not like '%.js' and uri_stem not like '%.css' and uri_stem not like '%.gif' and uri_stem not like '%.png' and uri_stem not like '%.html' and uri_stem not like '%.jpg' and uri_stem not like '%.jpeg' and uri_stem not like '%.ico' and uri_stem not like '%.htm' and uri_stem not like '%.pdf' and uri_stem not like '%.Png' and uri_stem not like '%.PNG' 
1
  • This should help you partially Commented Jun 11, 2014 at 10:51

2 Answers 2

0

This example is based on answer I mentioned in my comment.

SQLFiddle

Sample data:

drop table if exists a, b; create table a (testedstr varchar); create table b (condstr varchar); insert into a values ('aa.aa.jpg'), ('aa.aa.bjpg'), -- no match ('aa.aa.jxpg'), -- no match ('aa.aa.jPg'), ('aa.aa.aico'), -- no match ('aa.aa.ico'), ('bb.cc.dd.icox'), -- no match ('bb.cc.dd.cco'); -- no match insert into b values ('jpg'), ('ico'); 

Explanation:

  • in table a we have strings we would like to test (stored in column testedstr)
  • in table b we have strings we would to like to use as testing expresions (stored in column condstr)

SQL:

with cte as (select '\.(' || string_agg(condstr,'|') || ')$' condstr from b) select * from a, cte where testedstr !~* condstr; 

Explanation:

  • in the first line we will aggregate all patterns we would like to test into one string; as a result we will get jpg|ico string (aggregated into single row).
  • in the second line we crossjoin tested table with our testing expression (from the first line) and use regular expression to perform the test.
  • the regular expression at the end looks like \.(jpg|ico)$

For older versions, you should use answer provided by @Bohemian. For my sample data it would look like (adjusted for multiple possible dots) this (SQLFiddle:

select * from a where lower(reverse(split_part(reverse(testedstr),'.',1))) not in (select lower(condstr) from b) 

Without reverse function (SQLFiddle):

select *, lower(split_part(testedstr,'.',length(testedstr)- length(replace(testedstr,'.','')) + 1)) as extension from a where lower(split_part(testedstr,'.',length(testedstr)- length(replace(testedstr,'.','')) + 1)) not in (select lower(condstr) from b) 
Sign up to request clarification or add additional context in comments.

5 Comments

@user3455309 Exactly as it is written, did you check example on SQLFiddle?
Hi, Thanks for the help but we use a very low version of Postgresql. 8.2 I guess so it's not supporting cte. Any other alternative to this code?
@user3455309 Option 1: Upgrade, Option2: See my edited answer.
Thanks. While upgrade sounds a real good option its unfortunately not in my hands. I tried your query but its throwing an error 'function reverse(character varying) does not exist'.
@user3455309 Yeah, my fault, there is no reverse function in ancient versions. See my updated answer, this time I checked that all functions are supported by 8.2
0

First let's refactor the many conditions into just one:

where lower(substring(uri_stem from '[^.]+$')) not in ('js', 'css', 'gif', 'png', 'html', 'jpg', 'jpeg', 'ico', 'htm', 'pdf') 

In this form, it's easy to see how the list of values can be selected instead of coded:

where lower(substring(uri_stem from '[^.]+$')) not in ( select lower(somecolumn) from sometable) 

Note the use of lower() to avoid problems of dealing with variants of case.

You could also code it as a join:

select t1.* from mytable t1 left join sometable t2 on lower(somecolumn) = lower(split_part(uri_stem, '.', 2)) where t2.somecolumn is null -- filter out matches 

4 Comments

This assumes that there is only single dot in the string. Wouldn't it be better to use something like select lower(reverse(split_part(reverse('aaa.bbb.cCc'),'.',1))) to get part of string after the last dot in string?
Exactly what I wanted to mention. The string has more than one dot. Here is an example of a string that needs to be compared. /Core/Scripts/jquery.flipCounter.1.2.js
@TomasGreif yes, you're right, but there's an easier fix than that: you can use regex substring(uri_stem from '[^.]+$') to extract all trailing non-dot chars - see edit
@user3455309 my first cut didn't cater for multiple dots, but see edit for how you can use regex substring(uri_stem from '[^.]+$') to extract all trailing non-dot chars, which works for any number of dots (including no dots)

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.