
I have a table that outputs similar to this (although in thousands):

     EMPNO ENAME      TRANDATE      AMT
---------- ---------- --------- -------
       100 Alison     21-MAR-96   45000
       100 Alison     12-DEC-78   23000
       100 Alison     24-OCT-82   11000
       101 Linda      15-JAN-84   16000
       101 Linda      30-JUL-87   17000
       102 Celia      31-DEC-90   78000
       102 Celia      17-SEP-96   21000
       103 James      21-MAR-96   45000
       103 James      12-DEC-78   23000
       103 James      24-OCT-82   11000
       104 Robert     15-JAN-84   16000
       104 Robert     30-JUL-87   17000

My desired output would be similar to this:

     EMPNO ENAME      TRANDATE      AMT PAGE
---------- ---------- --------- ------- ----
       100 Alison     21-MAR-96   45000    1
       100 Alison     12-DEC-78   23000    1
       100 Alison     24-OCT-82   11000    1
       101 Linda      15-JAN-84   16000    2
       101 Linda      30-JUL-87   17000    2
       102 Celia      31-DEC-90   78000    2
       102 Celia      17-SEP-96   21000    2
       103 James      21-MAR-96   45000    3
       104 Robert     12-DEC-78   23000    4
       104 Robert     24-OCT-82   11000    4
       104 Robert     15-JAN-84   16000    4
       104 Robert     30-JUL-87   17000    4

Basically, it should add a new field identifying the page each row belongs to. The page break is based on a row count, with EMPNO batches "kept together": PAGE increments by 1 whenever the rows of the next EMPNO batch would not all fit on the current page. This is for the Excel limit, since Excel (prior to 2007) does not allow more than about 65,000 rows in a single sheet. In the sample's case the page limit is only 4 rows. The limit number is static.
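For clarity, here is a hypothetical Python sketch (not part of the question itself) of the paging rule described above, using the sample's limit of 4 rows per page:

```python
# Hypothetical sketch of the paging rule: walk the rows in EMPNO order and
# start a new page whenever the next EMPNO's whole batch of rows would not
# fit on the current page.
PAGE_LIMIT = 4  # 65000 in the real Excel case; 4 for the sample

def assign_pages(rows, limit=PAGE_LIMIT):
    """rows: list of (empno, ...) tuples, already ordered by empno."""
    # Group consecutive rows by EMPNO so each batch is kept together
    groups = []
    for row in rows:
        if groups and groups[-1][0][0] == row[0]:
            groups[-1].append(row)
        else:
            groups.append([row])

    page, used, result = 1, 0, []
    for group in groups:
        if used > 0 and used + len(group) > limit:
            page += 1   # the whole batch won't fit: start a new page
            used = 0
        used += len(group)
        result.extend((row, page) for row in group)
    return result

# The question's input has EMPNO batches of sizes 3, 2, 2, 3, 2
sample = [(e,) for e in [100]*3 + [101]*2 + [102]*2 + [103]*3 + [104]*2]
pages = [page for _, page in assign_pages(sample)]
print(pages)
```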

  • So, can you guarantee that no EMPNO will have > 65000 records? Commented Aug 9, 2010 at 11:09
  • Also, which version of Excel are you using? Excel 2007 permits an insane 1,048,576 rows per worksheet. Commented Aug 9, 2010 at 11:13
  • Well, I was hoping it would support even the Excel versions prior to 2007. (office.microsoft.com/en-us/excel-help/…) Commented Aug 10, 2010 at 4:56
  • In my current scenario (and in my opinion), a single EMPNO most likely would not have more than 65k records. Gary's solution worked for my problem. Many thanks to all of you guys! Commented Aug 10, 2010 at 5:24

4 Answers


ThinkJet is right that some of the other answers don't cater for the 'keep together' requirement. However, I think this can be done without resorting to a user-defined aggregate.

Sample data

create table test (empno number, ename varchar2(20), trandate date, amt number);

insert into test values (100, 'Alison', to_date('21-MAR-1996'), 45000);
insert into test values (100, 'Alison', to_date('12-DEC-1978'), 23000);
insert into test values (100, 'Alison', to_date('24-OCT-1982'), 11000);
insert into test values (101, 'Linda',  to_date('15-JAN-1984'), 16000);
insert into test values (101, 'Linda',  to_date('30-JUL-1987'), 17000);
insert into test values (102, 'Celia',  to_date('31-DEC-1990'), 78000);
insert into test values (102, 'Celia',  to_date('17-SEP-1996'), 21000);
insert into test values (103, 'James',  to_date('21-MAR-1996'), 45000);
insert into test values (103, 'James',  to_date('12-DEC-1978'), 23000);
insert into test values (103, 'James',  to_date('24-OCT-1982'), 11000);
insert into test values (104, 'Robert', to_date('15-JAN-1984'), 16000);
insert into test values (104, 'Robert', to_date('30-JUL-1987'), 17000);

Now, determine the end row of each empno segment (using RANK to find the start and COUNT..PARTITION BY to find the number in the segment).

Then use the CEIL division from APC's solution to group them into their 'pages'. Again, as pointed out by ThinkJet, there is a problem in the specification: it doesn't cater for the situation where there are more records in an empno 'keep together' segment than can fit on a page.

select empno, ename,
       ceil((rank() over (order by empno) +
             count(1) over (partition by empno)) / 6) as chunk
from   test
order  by 1;

As pointed out by ThinkJet, this solution isn't bullet proof.

drop table test purge;
create table test (empno number, ename varchar2(20), trandate date, amt number);

declare
  cursor csr_name is
    select rownum emp_id,
           decode(rownum, 1,'Alan',   2,'Brian',   3,'Clare',   4,'David',  5,'Edgar',
                          6,'Fred',   7,'Greg',    8,'Harry',   9,'Imran', 10,'John',
                         11,'Kevin', 12,'Lewis',  13,'Morris', 14,'Nigel', 15,'Oliver',
                         16,'Peter', 17,'Quentin',18,'Richard',19,'Simon', 20,'Terry',
                         21,'Uther', 22,'Victor', 23,'Wally',  24,'Xander',
                         25,'Yasmin',26,'Zac') emp_name
    from   dual connect by level <= 26;
begin
  for c_name in csr_name loop
    for i in 1..11 loop
      insert into test
      values (c_name.emp_id, c_name.emp_name,
              (date '2010-01-01') + i, to_char(sysdate,'SS') * 1000);
    end loop;
  end loop;
end;
/

select chunk, count(*)
from   (select empno, ename,
               ceil((rank() over (order by empno) +
                     count(1) over (partition by empno)) / 25) as chunk
        from   test)
group  by chunk
order  by chunk;

So with a chunk size of 25 and a group size of 11, we get jumps where 33 rows land in a chunk despite the 25-row limit. Large chunk sizes and small groups should make this infrequent, but you'd want to allow some leeway: maybe set the chunk size to 65,000 rather than going all the way up to 65,536.
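The overflow can be reproduced outside the database. This Python sketch recomputes the RANK-plus-COUNT formula above for 26 groups of 11 rows with a 25-row chunk limit:

```python
import math
from collections import Counter

# Recompute the chunking formula ceil((rank + group_count) / limit):
# 26 EMPNO groups of 11 rows each, chunk limit of 25 rows.
GROUPS, GROUP_SIZE, LIMIT = 26, 11, 25

chunk_sizes = Counter()
rank = 1  # RANK() OVER (ORDER BY empno): rows before this group, plus one
for _ in range(GROUPS):
    chunk = math.ceil((rank + GROUP_SIZE) / LIMIT)
    chunk_sizes[chunk] += GROUP_SIZE
    rank += GROUP_SIZE

# Some chunks collect three 11-row groups: 33 rows despite the 25-row limit
print(max(chunk_sizes.values()))
```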


5 Comments

Thanks so much! This works for my case! And I really thought the query would be longer. I understand the problem of 65k+ records with the same EMPNO, but in my case it most likely won't happen.
There is the same error as in the other answers: random gaps at the end of a page are not handled in the calculation. E.g. suppose you have 3 groups of rows (A, B, C) with 6 rows in each group, and try to place them on pages of 6 rows each: Group A - rank() = 1 - OK, 1st page; Group B - rank() = 7 - OK, 2nd page filled from the top, with 4 rows left blank on the 1st page; Group C - rank() = 13 - ERROR: it seems to fit on the 2nd page but must be placed on the 3rd, because the REAL rank() value must include the empty rows from the 1st page: 6+4+6+1=17. There is no way to predict the number of empty rows before actually constructing the pages.
This can happen even with 65k+ rows, even at the end of the 2nd page: e.g. 5 free rows on the 1st page + 4 free rows on the 2nd page + a next block of between 5 and 9 rows ...
Ah, I get ThinkJet's issue. There is room for 5 rows at the end of page 1, but the next group is six rows. Those six rows get pushed to page 2 correctly, but those six rows aren't correctly counted as being part of page 2. So when it comes to working out how many can fit on page 2, it may get it wrong.
I think I kinda get the issue. I was just too excited and quick to check on my tests (with a 50k chunk that correctly output a two-page file). But I did notice the decrease in the chunks when I tested the query with a smaller chunk size (in the 300s).

It's too tricky, or even impossible, to do such a thing in plain SQL.

But, with some limitations, the problem can be solved with the help of user-defined aggregate functions.

First, create an object type implementing the ODCIAggregate interface:

create or replace type page_num_agg_type as object
(
  -- Purpose: pagination with "keep together" option

  -- Attributes
  -- Current page number
  cur_page_number    number,
  -- Cumulative number of rows per page, incremented in blocks
  cur_page_row_count number,
  -- Row-by-row counter to detect page overflow while placing a single block
  page_row_counter   number,

  -- Member functions and procedures
  static function ODCIAggregateInitialize(sctx in out page_num_agg_type)
    return number,
  member function ODCIAggregateIterate(self in out page_num_agg_type,
                                       value in number)
    return number,
  member function ODCIAggregateTerminate(self in page_num_agg_type,
                                         returnValue out number,
                                         flags in number)
    return number,
  member function ODCIAggregateMerge(self in out page_num_agg_type,
                                     ctx2 in page_num_agg_type)
    return number
);

Create type body:

create or replace type body page_num_agg_type is

  -- Member procedures and functions
  static function ODCIAggregateInitialize(sctx in out page_num_agg_type)
    return number
  is
  begin
    sctx := page_num_agg_type(1, 0, 0);
    return ODCIConst.Success;
  end;

  member function ODCIAggregateIterate(self in out page_num_agg_type,
                                       value in number)
    return number
  is
    -- !!! WARNING: HARDCODED !!!
    RowsPerPage number := 4;
  begin
    self.page_row_counter := self.page_row_counter + 1;

    -- Main operation: determine the page number
    if (value > 0) then
      -- First row of a new block
      if (self.cur_page_row_count + value > RowsPerPage) then
        -- The new block of records reaches into the next page - switch to it
        self.cur_page_number    := self.cur_page_number + 1;
        self.cur_page_row_count := value;
        self.page_row_counter   := 1;
      else
        -- Just increment the row count and continue on the current page
        self.cur_page_row_count := self.cur_page_row_count + value;
      end if;
    else
      -- Row from the previous block
      if (self.page_row_counter > RowsPerPage) then
        -- A single block of rows exceeds the page size - wrap to the next page
        self.cur_page_number    := self.cur_page_number + 1;
        self.cur_page_row_count := self.cur_page_row_count - RowsPerPage;
        self.page_row_counter   := 1;
      end if;
    end if;

    return ODCIConst.Success;
  end;

  member function ODCIAggregateTerminate(self in page_num_agg_type,
                                         returnValue out number,
                                         flags in number)
    return number
  is
  begin
    -- Return the current page number as the result
    returnValue := self.cur_page_number;
    return ODCIConst.Success;
  end;

  member function ODCIAggregateMerge(self in out page_num_agg_type,
                                     ctx2 in page_num_agg_type)
    return number
  is
  begin
    -- Can't act in parallel - raise an error on merge attempts
    raise_application_error(-20202,
      'PAGE_NUM_AGG_TYPE can''t act in parallel mode');
    return ODCIConst.Success;
  end;

end;

Create an aggregate function to use with the type:

create function page_num_agg (input number) return number
  aggregate using page_num_agg_type;

Next, prepare the data and use the new function to calculate page numbers:

with data_list as (
  -- Your example data as source
  select 100 as EmpNo, 'Alison' as EmpName, to_date('21-MAR-96','dd-mon-yy') as TranDate, 45000 as AMT from dual union all
  select 100 as EmpNo, 'Alison' as EmpName, to_date('12-DEC-78','dd-mon-yy') as TranDate, 23000 as AMT from dual union all
  select 100 as EmpNo, 'Alison' as EmpName, to_date('24-OCT-82','dd-mon-yy') as TranDate, 11000 as AMT from dual union all
  select 101 as EmpNo, 'Linda'  as EmpName, to_date('15-JAN-84','dd-mon-yy') as TranDate, 16000 as AMT from dual union all
  select 101 as EmpNo, 'Linda'  as EmpName, to_date('30-JUL-87','dd-mon-yy') as TranDate, 17000 as AMT from dual union all
  select 102 as EmpNo, 'Celia'  as EmpName, to_date('31-DEC-90','dd-mon-yy') as TranDate, 78000 as AMT from dual union all
  select 102 as EmpNo, 'Celia'  as EmpName, to_date('17-SEP-96','dd-mon-yy') as TranDate, 21000 as AMT from dual union all
  select 103 as EmpNo, 'James'  as EmpName, to_date('21-MAR-96','dd-mon-yy') as TranDate, 45000 as AMT from dual union all
  select 103 as EmpNo, 'James'  as EmpName, to_date('12-DEC-78','dd-mon-yy') as TranDate, 23000 as AMT from dual union all
  select 103 as EmpNo, 'James'  as EmpName, to_date('24-OCT-82','dd-mon-yy') as TranDate, 11000 as AMT from dual union all
  select 104 as EmpNo, 'Robert' as EmpName, to_date('15-JAN-84','dd-mon-yy') as TranDate, 16000 as AMT from dual union all
  select 104 as EmpNo, 'Robert' as EmpName, to_date('30-JUL-87','dd-mon-yy') as TranDate, 17000 as AMT from dual union all
  select 105 as EmpNo, 'Monica' as EmpName, to_date('30-JUL-88','dd-mon-yy') as TranDate, 31000 as AMT from dual union all
  select 105 as EmpNo, 'Monica' as EmpName, to_date('01-JUL-87','dd-mon-yy') as TranDate, 19000 as AMT from dual union all
  select 105 as EmpNo, 'Monica' as EmpName, to_date('31-JAN-97','dd-mon-yy') as TranDate, 11000 as AMT from dual union all
  select 105 as EmpNo, 'Monica' as EmpName, to_date('17-DEC-93','dd-mon-yy') as TranDate, 33000 as AMT from dual union all
  select 105 as EmpNo, 'Monica' as EmpName, to_date('11-DEC-91','dd-mon-yy') as TranDate, 65000 as AMT from dual union all
  select 105 as EmpNo, 'Monica' as EmpName, to_date('22-OCT-89','dd-mon-yy') as TranDate, 19000 as AMT from dual
),
ordered_data as (
  select
    -- Source table fields
    src_data.EmpNo,
    src_data.EmpName,
    src_data.TranDate,
    src_data.AMT,
    -- Calculate row count per one employee
    count(src_data.EmpNo) over (partition by src_data.EmpNo) as emp_row_count,
    -- Calculate rank of row inside employee data, sorted in output order
    rank() over (partition by src_data.EmpNo
                 order by src_data.EmpName, src_data.TranDate) as emp_rnk
  from data_list src_data
)
-- Final step: calculate page number for rows
select
  -- Source table data
  ordered_data.EmpNo,
  ordered_data.EmpName,
  ordered_data.TranDate,
  ordered_data.AMT,
  -- Aggregate all data with our new function: pass the row count only
  -- on the first row of each employee, 0 otherwise
  page_num_agg(
    decode(ordered_data.emp_rnk, 1, ordered_data.emp_row_count, 0)
  ) over (order by ordered_data.EmpName, ordered_data.TranDate) as page_number
from ordered_data
order by ordered_data.EmpName, ordered_data.TranDate

And, finally ...

Disadvantages of this solution:

  1. Hardcoded page row count.
  2. Requires some specific data preparation in query to use aggregate function properly.

Advantages of this solution:

  1. Just works :)

Updated: improved to handle oversized blocks, example modified.
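To sanity-check the page numbering outside Oracle, the iterate logic above can be restaged in plain Python (a hypothetical test harness, with the same hardcoded page size of 4):

```python
ROWS_PER_PAGE = 4  # matches the value hardcoded in ODCIAggregateIterate

class PageNumAgg:
    """Plain-Python restaging of the ODCIAggregateIterate logic above."""
    def __init__(self):
        self.cur_page_number = 1
        self.cur_page_row_count = 0
        self.page_row_counter = 0

    def iterate(self, value):
        """value: block row count on the first row of a block, 0 otherwise."""
        self.page_row_counter += 1
        if value > 0:
            # First row of a new block
            if self.cur_page_row_count + value > ROWS_PER_PAGE:
                # The new block doesn't fit: switch to the next page
                self.cur_page_number += 1
                self.cur_page_row_count = value
                self.page_row_counter = 1
            else:
                self.cur_page_row_count += value
        elif self.page_row_counter > ROWS_PER_PAGE:
            # A single block exceeds the page size: wrap to the next page
            self.cur_page_number += 1
            self.cur_page_row_count -= ROWS_PER_PAGE
            self.page_row_counter = 1
        return self.cur_page_number

agg = PageNumAgg()
pages = []
for size in [3, 2, 2, 3, 2]:          # rows per EMPNO, as in the question
    for i in range(size):
        pages.append(agg.iterate(size if i == 0 else 0))
print(pages)
```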

1 Comment

I haven't really done any user-defined aggregate functions before. I'm allowed to have procedures though.

The following SQL statement splits the twenty records in my EMP table into five pages of four rows each:

SQL> select empno
  2  ,      ename
  3  ,      deptno
  4  ,      ceil((row_number() over (order by deptno, empno)/4)) as pageno
  5  from emp
  6  /

     EMPNO ENAME          DEPTNO     PAGENO
---------- ---------- ---------- ----------
      7782 BOEHMER            10          1
      7839 SCHNEIDER          10          1
      7934 KISHORE            10          1
      7369 CLARKE             20          1
      7566 ROBERTSON          20          2
      7788 RIGBY              20          2
      7876 KULASH             20          2
      7902 GASPAROTTO         20          2
      7499 VAN WIJK           30          3
      7521 PADFIELD           30          3
      7654 BILLINGTON         30          3
      7698 SPENCER            30          3
      7844 CAVE               30          4
      7900 HALL               30          4
      8083 KESTELYN           30          4
      8084 LIRA               30          4
      8060 VERREYNNE          50          5
      8061 FEUERSTEIN         50          5
      8085 TRICHLER           50          5
      8100 PODER              50          5

20 rows selected.

SQL>

3 Comments

This will not work. An important condition is that an empno can't be in two groups. In your case, empno 101 will end up in two separate groups.
Thanks! This code works for me! (Update: Ok... maybe not. Good find, Michael.)
It just doesn't work - wrong results. E.g. where are the extra empty lines at the end of a page counted for the 3rd page and later?

How about this one: (100 is the row limit per page)

select a.*,
       floor(count(1) over (order by empno, empname) / 100) + 1 as page
from   source_table a
order  by page, empno, empname;
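Restating this running-count approach in Python (with a hypothetical page limit of 4 instead of 100) shows its weakness: it ignores the "keep together" requirement and can split an EMPNO across pages.

```python
import math

LIMIT = 4  # page limit, scaled down from 100 for the sample
empnos = [100]*3 + [101]*2 + [102]*2 + [103]*3 + [104]*2

# floor(running_count / LIMIT) + 1, as in the SQL above.
# Note: the first page gets only LIMIT-1 rows because the running count
# starts at 1, and the last rows of EMPNO 104 land on different pages.
pages = [math.floor(c / LIMIT) + 1 for c in range(1, len(empnos) + 1)]
print(list(zip(empnos, pages)))
```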

