SITUATION
I am working on a travel engine site and am writing a complex query to match visitors' search queries with their bookings based on IP Address, Destination and Date so I can work out the conversion ratio later.
PROBLEM
There needs to be multiple conversion ratios based on a parameter (In this case, the utm_source which I extract from a RequestUrl stored in the searches table). The problem is some users make multiple searches from different locations.. sometimes we get the utm_source in the request and sometimes not... and of course we need to match to only 1 booking. See screenshot of query result below to better understand:
See the 3rd and 4th rows have the same booking ID, etc.. but different values for the Value column. I need to select only 1 of these, but not both. Basically, if there is more than 1, I need to choose the 1 that is not "N/A".
MY QUERY:
SELECT DISTINCT "B"."Id" AS "BookingId", "PQ"."IPAddress", "PQ"."To", "PQ"."SearchDate", "PQ"."Value" FROM ( SELECT DISTINCT "IPAddress", "To", "CreatedAt"::date AS "SearchDate", COALESCE(SUBSTRING("RequestUrl", 'utm_source=([^&]*)'), 'N/A') AS "Value" FROM dbo."PackageQueries" WHERE "SiteId" = '<The ID>' AND "CreatedAt" >= '<Start Date>' AND "CreatedAt" < '<End Date>' ) AS "PQ" INNER JOIN dbo."Bookings" AS "B" ON "PQ"."IPAddress" = "B"."IPAddress" AND "B"."To" = "PQ"."To" AND "B"."BookingDate"::date = "PQ"."SearchDate" WHERE "B"."SiteId" = '<The ID>' AND "B"."BookingStatus" = 2 AND "B"."BookingDate" >= '<Start Date>' AND "B"."BookingDate" < '<End Date>' ORDER BY "B"."Id", "PQ"."IPAddress", "PQ"."To"; 