1

Given a table structure like:

CAT BUSINESS_NR TIME_A VERSION SOME_CODE
ABC 123 2009-02-19T00:00:00 1 OPP
ABC 456 2009-03-18 00:00:00 1 ZUM
ABC 444 2009-03-18 00:00:00 1 ZUM
ABC 444 2009-03-18 00:00:00 2 MUZ
ABC 456 2009-04-18 00:00:00 2 XXX
ABC 456 2009-04-18 00:00:00 3 XXX
ABC 456 2009-04-18 00:00:00 4 UIO
ABC 456 2009-05-18 00:00:00 5 RQA
DEF 637 2018-02-16 00:00:00 1 FAW
DEF 789 2018-02-17 00:00:00 1 WER
SPZ 123 2018-02-16 00:00:00 1 AAA
SPZ 123 2018-02-17 00:00:00 2 BBB
SPZ 123 2018-02-18 00:00:00 3 AAA
SPZ 123 2018-02-19 00:00:00 4 CCC
SPZ 123 2018-02-20 00:00:00 5 AAA
SPZ 123 2018-02-21 00:00:00 6 DDD
SPZ 123 2018-02-22 00:00:00 7 DDD
SPZ 123 2018-02-23 00:00:00 8 DDD
SPZ 123 2018-02-24 00:00:00 9 EEE
SPZ 123 2018-02-25 00:00:00 10 EEE
SPZ 123 2018-02-26 00:00:00 11 DDD
SPZ 123 2018-02-27 00:00:00 12 BBB
SPZ 123 2018-02-26 00:00:00 13 EEE
SPZ 123 2018-02-27 00:00:00 14 EEE
GHI 248 2018-02-17 00:00:00 1 QWE
GHI 248 2019-02-17 00:00:00 2 PPP
GHI 357 2020-02-16 00:00:00 1 FFF
GHI 420 2020-02-16 00:00:00 1 QDS
GHI 357 2020-02-16 00:00:00 2 GGG
GHI 357 2020-02-16 00:00:00 3 LLL
GHI 357 2020-02-16 00:00:00 4 LLL
GHI 357 2020-08-16 00:00:00 4 FFF
GHI 357 2020-10-16 00:00:00 5 ZZZ

... which one can easily create by:

CREATE TABLE blah (CAT, BUSINESS_NR, TIME_A, VERSION, SOME_CODE) AS SELECT 'ABC', 123, TIMESTAMP '2009-02-19 00:00:00 UTC', 1, 'OPP' FROM DUAL UNION ALL SELECT 'ABC', 456, TIMESTAMP '2009-03-18 00:00:00 UTC', 1, 'ZUM' FROM DUAL UNION ALL SELECT 'ABC', 444, TIMESTAMP '2009-03-18 00:00:00 UTC', 1, 'ZUM' FROM DUAL UNION ALL SELECT 'ABC', 444, TIMESTAMP '2009-03-18 00:00:00 UTC', 2, 'MUZ' FROM DUAL UNION ALL SELECT 'ABC', 456, TIMESTAMP '2009-04-18 00:00:00 UTC', 2, 'XXX' FROM DUAL UNION ALL SELECT 'ABC', 456, TIMESTAMP '2009-04-18 00:00:00 UTC', 3, 'XXX' FROM DUAL UNION ALL SELECT 'ABC', 456, TIMESTAMP '2009-04-18 00:00:00 UTC', 4, 'UIO' FROM DUAL UNION ALL SELECT 'ABC', 456, TIMESTAMP '2009-05-18 00:00:00 UTC', 5, 'RQA' FROM DUAL UNION ALL SELECT 'DEF', 637, TIMESTAMP '2018-02-16 00:00:00 UTC', 1, 'FAW' FROM DUAL UNION ALL SELECT 'DEF', 789, TIMESTAMP '2018-02-17 00:00:00 UTC', 1, 'WER' FROM DUAL UNION ALL SELECT 'SPZ', 123, TIMESTAMP '2018-02-16 00:00:00 UTC', 1, 'AAA' FROM DUAL UNION ALL SELECT 'SPZ', 123, TIMESTAMP '2018-02-17 00:00:00 UTC', 2, 'BBB' FROM DUAL UNION ALL SELECT 'SPZ', 123, TIMESTAMP '2018-02-18 00:00:00 UTC', 3, 'AAA' FROM DUAL UNION ALL SELECT 'SPZ', 123, TIMESTAMP '2018-02-19 00:00:00 UTC', 4, 'CCC' FROM DUAL UNION ALL SELECT 'SPZ', 123, TIMESTAMP '2018-02-20 00:00:00 UTC', 5, 'AAA' FROM DUAL UNION ALL SELECT 'SPZ', 123, TIMESTAMP '2018-02-21 00:00:00 UTC', 6, 'DDD' FROM DUAL UNION ALL SELECT 'SPZ', 123, TIMESTAMP '2018-02-22 00:00:00 UTC', 7, 'DDD' FROM DUAL UNION ALL SELECT 'SPZ', 123, TIMESTAMP '2018-02-23 00:00:00 UTC', 8, 'DDD' FROM DUAL UNION ALL SELECT 'SPZ', 123, TIMESTAMP '2018-02-24 00:00:00 UTC', 9, 'EEE' FROM DUAL UNION ALL SELECT 'SPZ', 123, TIMESTAMP '2018-02-25 00:00:00 UTC', 10, 'EEE' FROM DUAL UNION ALL SELECT 'SPZ', 123, TIMESTAMP '2018-02-26 00:00:00 UTC', 11, 'DDD' FROM DUAL UNION ALL SELECT 'SPZ', 123, TIMESTAMP '2018-02-27 00:00:00 UTC', 12, 'BBB' FROM DUAL UNION ALL SELECT 'SPZ', 123, TIMESTAMP '2018-02-26 00:00:00 UTC', 13, 'EEE' FROM DUAL UNION ALL SELECT 'SPZ', 123, TIMESTAMP '2018-02-27 00:00:00 UTC', 14, 'EEE' FROM DUAL UNION ALL SELECT 'GHI', 248, TIMESTAMP '2018-02-17 00:00:00 UTC', 1, 'QWE' FROM DUAL UNION ALL SELECT 'GHI', 248, TIMESTAMP '2019-02-17 00:00:00 UTC', 2, 'PPP' FROM DUAL UNION ALL SELECT 'GHI', 357, TIMESTAMP '2020-02-16 00:00:00 UTC', 1, 'FFF' FROM DUAL UNION ALL SELECT 'GHI', 420, TIMESTAMP '2020-02-16 00:00:00 UTC', 1, 'QDS' FROM DUAL UNION ALL SELECT 'GHI', 357, TIMESTAMP '2020-02-16 00:00:00 UTC', 2, 'GGG' FROM DUAL UNION ALL SELECT 'GHI', 357, TIMESTAMP '2020-02-16 00:00:00 UTC', 3, 'LLL' FROM DUAL UNION ALL SELECT 'GHI', 357, TIMESTAMP '2020-02-16 00:00:00 UTC', 4, 'LLL' FROM DUAL UNION ALL SELECT 'GHI', 357, TIMESTAMP '2020-08-16 00:00:00 UTC', 4, 'FFF' FROM DUAL UNION ALL SELECT 'GHI', 357, TIMESTAMP '2020-10-16 00:00:00 UTC', 5, 'ZZZ' FROM DUAL 

... I want to count each change of SOME_CODE for each CAT per month. A change means a change of SOME_CODE within a Business_nr.

I started with the following approach:

select cat, moenat, sum(counta) from ( select cat, trunc(TIME_A, 'MON') as moenat, count(distinct SOME_CODE) as counta from blah group by cat, BUSINESS_NR, trunc(TIME_A, 'MON') ) z group by cat, moenat order by cat, moenat 

... which outputs almost what I want:

CAT MONTH sum(counta)
SPZ FEB 2018 5
... ... ...

But I need all real changes AND the condition: filter out duplicates that are coming one after the other, like I can demonstrate by CAT='SPZ':

CAT BUSINESS_NR TIME_A VERSION SOME_CODE
SPZ 123 2018-02-21 00:00:00 6 DDD
SPZ 123 2018-02-22 00:00:00 7 DDD
SPZ 123 2018-02-23 00:00:00 8 DDD

and so on.

So for SPZ: versions 6 to 8, 9-10 and 13-14 should count each as one (not three or two times) because there was no effective change going on.

That means the expected query output (in FEB 2018) for CAT='SPZ' therefore must be like:

CAT MONTH sum(counta)
SPZ FEB 2018 10
... ... ...

... for 10 times there was an effective change of the code (starting value with version 1 'AAA' must count too, because each record indicates a change).

FYI: maybe parts of the solution (MATCH_RECOGNIZE) of this topic could help: SQL query for counting changes of values in a column but I am not sure.

Hint: order is defined by higher version number as well as time_a information.

Thanks for any new appoach.

2
  • Yes, MATCH_RECOGNIZE might work well; see asktom.oracle.com/pls/apex/… Commented Mar 11, 2021 at 14:28
  • Your example for SPZ is clear, but the business nr is same for all records. What output would you want to see for GHI ? If there is only 1 row, for a business nr/month combo, does that count as a change ? Do you look within a business nr but across months ? Commented Mar 11, 2021 at 14:39

1 Answer 1

2

You can use the LAG function to look at the previous version to see if some_code matches the current version, then SUM to get the total of all the code changes.

 SELECT cat, moenat, SUM (version_change) AS total_version_changes FROM (SELECT cat, TRUNC (TIME_A, 'MON') AS moenat, CASE WHEN some_code <> LAG (some_code) OVER (PARTITION BY cat ORDER BY version, time_a) OR LAG (some_code) OVER (PARTITION BY cat ORDER BY version, time_a) IS NULL --Needed for version 1 THEN 1 ELSE 0 END AS version_change FROM blah) GROUP BY cat, moenat ORDER BY cat, moenat; CAT MOENAT TOTAL_VERSION_CHANGES ______ ____________ ________________________ ABC 01-FEB-09 1 ABC 01-MAR-09 2 ABC 01-APR-09 2 ABC 01-MAY-09 1 DEF 01-FEB-18 2 GHI 01-FEB-18 1 GHI 01-FEB-19 1 GHI 01-FEB-20 4 GHI 01-AUG-20 1 GHI 01-OCT-20 1 SPZ 01-FEB-18 10 
Sign up to request clarification or add additional context in comments.

2 Comments

seems to be almost fully correct. Only thing I am thinking about is the case: GHI 01-FEB-20 5... here I would expect 4 because LLL shall be counted as duplicate.... strange because it seems to work for: SPZ 123 2018-02-27 00:00:00 14 EEE
@DerBenniAusA - You are correct. I have modified the window clauses of the LAGs to correct the issue. Please look at the updated answer.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.