
I have defined a function with a table-valued parameter that returns the median of the values passed to it. The function is defined as:

    CREATE FUNCTION [dbo].[fn_GetMedian] (@List TypeMedian READONLY)
    RETURNS INT
    AS
    BEGIN
        <function body>
    END

And the table type TypeMedian is defined as:

    CREATE TYPE [dbo].[TypeMedian] AS TABLE
    (
        [VALUE] [int] NULL
    )

I have a table Listing that is already populated, and a table RESULT that is to be filled from it. The table structures are:

    LISTING(ListingCol1, ListingCol2, ListingCol3, ListingCol4, ListingCol5)
    RESULT(Col1, Col2, Col3, Col4, Col5)

The Listing table has more than 1000 rows of data.

All columns in both tables are of type int. I want to fill the columns of the RESULT table, which are calculated as follows:

    Col1 = SUM(ListingCol1)
    Col2 = SUM(ListingCol2)
    Col3 = dbo.fn_GetMedian(ListingCol3)
    Col4 = dbo.fn_GetMedian(ListingCol4)
    Col5 = dbo.fn_GetMedian(ListingCol5)

And I'm doing so as:

    INSERT INTO RESULT (Col1)
    SELECT SUM(ListingCol1) FROM Listing

    UPDATE RESULT SET Col2 = (SELECT SUM(ListingCol2) FROM Listing)

    DECLARE @tbl_Median TypeMedian

    INSERT INTO @tbl_Median
    SELECT ListingCol3 FROM Listing

    UPDATE RESULT SET Col3 = dbo.fn_GetMedian(@tbl_Median)

    -- For next column
    DELETE FROM @tbl_Median

    INSERT INTO @tbl_Median
    SELECT ListingCol4 FROM Listing

    UPDATE RESULT SET Col4 = dbo.fn_GetMedian(@tbl_Median);

I repeat this update query for the remaining columns. How could I do that in a single query?


2 Answers

3

For two sums and three medians, on a single table, I honestly can't see the benefit of using a complicated dynamic or function-based solution.

It is quite easy to construct a single query using Peter Larsson's median method that I showed you before:

    CREATE TABLE dbo.Listing
    (
        ListingCol1 integer NULL,
        ListingCol2 integer NULL,
        ListingCol3 integer NULL,
        ListingCol4 integer NULL,
        ListingCol5 integer NULL
    );

    CREATE TABLE dbo.Result
    (
        Col1 integer NULL,
        Col2 integer NULL,
        Col3 integer NULL,
        Col4 integer NULL,
        Col5 integer NULL
    );

    -- Just to show indexes are helpful for the median calculations
    CREATE INDEX i ON dbo.Listing (ListingCol3);
    CREATE INDEX j ON dbo.Listing (ListingCol4);
    CREATE INDEX k ON dbo.Listing (ListingCol5);

Solution

    INSERT dbo.Result
    (
        Col1, Col2, Col3, Col4, Col5
    )
    SELECT
        SC.SumCol1,
        SC.SumCol2,
        SQ3.MedianCol3,
        SQ4.MedianCol4,
        SQ5.MedianCol5
    FROM
    (
        -- Sums + counts needed for the median calculations
        SELECT
            SumCol1 = SUM(L.ListingCol1),
            SumCol2 = SUM(L.ListingCol2),
            CountCol3 = COUNT_BIG(L.ListingCol3),
            CountCol4 = COUNT_BIG(L.ListingCol4),
            CountCol5 = COUNT_BIG(L.ListingCol5)
        FROM dbo.Listing AS L
    ) AS SC
    CROSS APPLY
    (
        -- Median for column 3
        SELECT MedianCol3 = AVG(1.0 * SQ.ListingCol3)
        FROM
        (
            SELECT L3.ListingCol3
            FROM dbo.Listing AS L3
            WHERE L3.ListingCol3 IS NOT NULL
            ORDER BY L3.ListingCol3 ASC
                OFFSET (SC.CountCol3 - 1) / 2 ROWS
                FETCH NEXT 1 + (1 - SC.CountCol3 % 2) ROWS ONLY
        ) AS SQ
    ) AS SQ3
    CROSS APPLY
    (
        -- Median for column 4
        SELECT MedianCol4 = AVG(1.0 * SQ.ListingCol4)
        FROM
        (
            SELECT L4.ListingCol4
            FROM dbo.Listing AS L4
            WHERE L4.ListingCol4 IS NOT NULL
            ORDER BY L4.ListingCol4 ASC
                OFFSET (SC.CountCol4 - 1) / 2 ROWS
                FETCH NEXT 1 + (1 - SC.CountCol4 % 2) ROWS ONLY
        ) AS SQ
    ) AS SQ4
    CROSS APPLY
    (
        -- Median for column 5
        SELECT MedianCol5 = AVG(1.0 * SQ.ListingCol5)
        FROM
        (
            SELECT L5.ListingCol5
            FROM dbo.Listing AS L5
            WHERE L5.ListingCol5 IS NOT NULL
            ORDER BY L5.ListingCol5 ASC
                OFFSET (SC.CountCol5 - 1) / 2 ROWS
                FETCH NEXT 1 + (1 - SC.CountCol5 % 2) ROWS ONLY
        ) AS SQ
    ) AS SQ5;
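To see how the OFFSET and FETCH arithmetic picks out the middle row or rows, here is a minimal sketch of the same technique on a single column. The table variable @Sample, its column Val, the variable @Cnt, and the five sample values are made up for illustration and are not part of the original schema. It assumes SQL Server 2012 or later, where OFFSET/FETCH is available.

```sql
-- Hypothetical sample data (not from the original post): four values and one NULL
DECLARE @Sample table (Val integer NULL);
INSERT @Sample (Val) VALUES (7), (1), (NULL), (5), (3);

-- COUNT_BIG(column) ignores NULLs, so @Cnt is 4 here
DECLARE @Cnt bigint = (SELECT COUNT_BIG(S.Val) FROM @Sample AS S);

SELECT Median = AVG(1.0 * SQ.Val)
FROM
(
    SELECT S.Val
    FROM @Sample AS S
    WHERE S.Val IS NOT NULL
    ORDER BY S.Val ASC
        -- Even count: (4 - 1) / 2 = 1 with integer division, so skip one row (the value 1)
        OFFSET (@Cnt - 1) / 2 ROWS
        -- Even count: 1 + (1 - 0) = 2, so keep two rows (the values 3 and 5)
        FETCH NEXT 1 + (1 - @Cnt % 2) ROWS ONLY
) AS SQ;
```

With an even count the query averages the two middle values (3 and 5 here, giving 4). With an odd count, @Cnt % 2 is 1, so FETCH NEXT keeps a single row and the average is just that middle value.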

Expected execution plan shape:

[Image: execution plan]

2

"I repeat this update query for the remaining columns. How could I do that in a single query?"

Performance implications aside for a moment, the easiest way to accomplish this in a single query** is by creating a User-Defined Aggregate (UDA) via SQLCLR. This is what I described in my answer to your related question:

How can I pass column to function in sql?

If you had such a function, you could do the following:

INSERT INTO RESULT(Col1, Col2, Col3, Col4, Col5) SELECT SUM(ListingCol1) AS [Col1], SUM(ListingCol2) AS [Col2], dbo.AggMedian(ListingCol3) AS [Col3], dbo.AggMedian(ListingCol4) AS [Col4], dbo.AggMedian(ListingCol5) AS [Col5] FROM LISTING; 

With that said, performance still needs to be considered, and the approach shown above will not suit every situation. If the LISTING table has millions of rows (at least millions per grouping), then it might not perform acceptably. But if each grouping has 5,000 rows, or maybe even 10,000 or something along those lines, and the process doesn't run multiple times per minute, then you should be fine. Of course, as with anything, it should be tested against your actual data to determine whether there is a performance issue.

** By "easiest way in a single query", I am assuming that the desired method is one that can be easily applied in multiple situations, especially ad hoc queries.
