An SQL Pivot Procedure
April 21, 2015 Paul Tuohy
The basic concept of a pivot is that row values are transformed into column values. I have often wished that DB2 had a pivot procedure. But until that day arrives I will have to make do with a workaround that you might find useful too. The workaround is in the form of a stored procedure that can be used to pivot any column in any table. Before getting to the stored procedure, let’s look at a simple pivot example. These examples are using the SALES table in the DB2 sample database. To create the DB2 sample database on your system, call the SQL stored procedure CREATE_SQL_SAMPLE, and providing the name of the schema you want to create. CALL QSYS/CREATE_SQL_SAMPLE('MYSCHEMA') A Simple Pivot Sample code 1 shows the select statement and result set for selecting a summary of sales by sales person. select sales_person, sum(sales) as sales from sales group by sales_person order by sales_person; SALES_PERSON SALES GOUNOT 50 LEE 91 LUCCHESSI 14 Sample code 1: Summary of sales by salesperson. Pivoting this data would result in the single line shown in sample code 2. GOUNOT LEE LUCCHESSI 50 91 14 Sample code 2: A simple pivoted result. Unfortunately, the select statement required to generate that single line result set is quite a bit of work, as shown in sample code 3. select sum(case when sales_person = 'GOUNOT' then sales end) as GOUNOT, sum(case when sales_person = 'LEE' then sales end) as LEE, sum(case when sales_person = 'LUCCHESSI' then sales end) as LUCCHESSI from sales; Sample code 3: Select statement to generate a simple pivoted result. We will examine this select statement in more detail in a moment, but let’s look at a slightly more complicated (and more meaningful) example first. A More Complicated Pivot Sample code 4 shows the shows an extension of the select statement and result set used in sample code 1. We have added sub-totals by region. select region, sales_person, sum(sales) as sales from sales group by region, sales_person order by region, sales_person; REGION SALES_PERSON SALES Manitoba GOUNOT 15 Manitoba LEE 23 Manitoba LUCCHESSI 3 Ontario-North GOUNOT 1 Ontario-North LEE 8 Ontario-South GOUNOT 10 Ontario-South LEE 34 Ontario-South LUCCHESSI 8 Quebec GOUNOT 24 Quebec LEE 26 Quebec LUCCHESSI 34 Sample code 4: Summary of sales by region and salesperson. Pivoting this data is really beneficial, as shown in sample code 5. REGION GOUNOT LEE LUCCHESSI Manitoba 15 23 3 Ontario-North 1 8 - Ontario-South 10 34 8 Quebec 24 26 34 Sample code 5: A better example of a pivoted result. And the required select statement is not that different from the original, as shown in sample code 6. select region, sum(case when sales_person = 'GOUNOT' then sales end) as GOUNOT, sum(case when sales_person = 'LEE' then sales end) as LEE, sum(case when sales_person = 'LUCCHESSI' then sales end) as LUCCHESSI from sales group by region order by region; Sample code 6: Select statement to generate a better example of a pivoted result. The Problem Is If you examine the select statements in sample code 3 and 6, you will see that a column is defined for each of the possible values for the SALES_PERSON column. When the statement is run, the value of the SALES column will be accumulated into the relevant column (GOUNOT, LEE or LUCCHESSI) based on the value of the of the SALES_PERSON column. But what happens if a fourth sales person comes into play? Yes, we have to change the select statement and add a column for the new salesperson. Personally, I prefer a maintenance-free solution. Using a Stored Procedure The stored procedure DO_PIVOT accepts six parameters:
The following procedure call provides the result set shown in sample code 2. CALL DO_PIVOT('SQLSTAND', 'SALES', 'SALES_PERSON', 'SALES', DEFAULT, DEFAULT); And the following procedure call provides the result set shown in sample code 5. CALL DO_PIVOT('SQLSTAND', 'SALES', 'SALES_PERSON', 'SALES', DEFAULT, 'REGION'); The DO_PIVOT Procedure The DO_PIVOT procedure dynamically constructs the required select statement to generate the required result set. First, the procedure generates a list of the distinct values for the requested pivot column. Then, it constructs the dynamic select statement, adding a column definition for each of the distinct values in the generated list. Finally, the procedure runs the generated statement and returns the result set. Sample code 7 shows the code required to create the DO_PIVOT procedure. Just copy and paste, change the schema name, and run the statements. SET SCHEMA = WHER_YOU_WANT_IT; CREATE PROCEDURE DO_PIVOT (IN FOR_SCHEMA CHARACTER (10) , IN FOR_TABLE CHARACTER (10) , IN PIVOT_COLUMN VARCHAR (250) , IN VALUE_COLUMN VARCHAR (250) , IN AGG_FUNCTION VARCHAR (5) DEFAULT 'SUM' , IN GROUP_COLUMN VARCHAR (250) DEFAULT NULL ) LANGUAGE SQL MODIFIES SQL DATA PROGRAM TYPE SUB CONCURRENT ACCESS RESOLUTION DEFAULT DYNAMIC RESULT SETS 1 OLD SAVEPOINT LEVEL COMMIT ON RETURN NO BEGIN DECLARE SQLCODE INTEGER DEFAULT 0 ; DECLARE SQL_STATEMENT VARCHAR ( 5000 ) ; DECLARE PIVOT_VALUE VARCHAR ( 20 ) ; DECLARE PAD CHAR ( 2 ) DEFAULT ' ' ; DECLARE C1 CURSOR FOR D1 ; DECLARE C2 CURSOR WITH RETURN FOR D2 ; SET SCHEMA = FOR_SCHEMA ; -- Get the list of values available for the pivot column -- Each value will be a column in the return set SET SQL_STATEMENT = 'select distinct ' || PIVOT_COLUMN || ' from ' || FOR_TABLE || ' order by 1' ; PREPARE D1 FROM SQL_STATEMENT ; OPEN C1 ; -- Construct a dynamic select statement for the pivot SET SQL_STATEMENT = 'select ' ; -- If requested, add the Group By Column -- to the select clause IF GROUP_COLUMN IS NOT NULL THEN SET SQL_STATEMENT = SQL_STATEMENT || GROUP_COLUMN ; SET PAD = ', ' ; END IF ; -- For each possible value for the Pivot Column, -- add a case statement to perform the requested -- aggregate function on the Value Column FETCH NEXT FROM C1 INTO PIVOT_VALUE ; WHILE ( SQLCODE >= 0 AND SQLCODE <> 100 ) DO SET SQL_STATEMENT = SQL_STATEMENT || PAD || AGG_FUNCTION || '(CASE WHEN ' || PIVOT_COLUMN || ' = ''' || PIVOT_VALUE || ''' THEN ' || VALUE_COLUMN || ' END) AS ' || PIVOT_VALUE ; SET PAD = ', ' ; FETCH NEXT FROM C1 INTO PIVOT_VALUE ; END WHILE ; CLOSE C1 ; -- Specify the table to select from SET SQL_STATEMENT = SQL_STATEMENT || ' from ' || FOR_TABLE ; -- If requested, add the Group By Column -- to the select clause IF GROUP_COLUMN IS NOT NULL THEN SET SQL_STATEMENT = SQL_STATEMENT || ' group by ' || GROUP_COLUMN || ' order by ' || GROUP_COLUMN; END IF ; PREPARE D2 FROM SQL_STATEMENT ; OPEN C2 ; END ; LABEL ON ROUTINE DO_PIVOT ( CHAR(), CHAR(), VARCHAR(), VARCHAR(), VARCHAR(), VARCHAR() ) IS 'Perform a General Purpose Pivot'; COMMENT ON PARAMETER ROUTINE DO_PIVOT ( CHAR(), CHAR(), VARCHAR(), VARCHAR(), VARCHAR(), VARCHAR() ) (FOR_SCHEMA IS 'Schema for Table' , FOR_TABLE IS 'For Table' , PIVOT_COLUMN IS 'Name of Column to be Pivoted' , VALUE_COLUMN IS 'Column to be Aggregated for Pivot' , AGG_FUNCTION IS 'Use Aggregate Function' , GROUP_COLUMN IS 'Group on Column' ) ; Sample code 7: Code to create the DO_PIVOT stored procedure. Something to Play With This is a procedure I have found useful on more than one occasion and I hope you find it just as useful. But I really wish there was a pivot function in DB2! Paul Tuohy is CEO of ComCon, an iSeries consulting company, and is one of the co-founders of System i Developer, which hosts the RPG & DB2 Summit conferences. He is an award-winning speaker who also speaks regularly at COMMON conferences, and is the author of “Re-engineering RPG Legacy Applications,” “The Programmers Guide to iSeries Navigator,” and the self-study course called “iSeries Navigator for Programmers.” Send your questions or comments for Paul to Ted Holt via the IT Jungle Contact page. RELATED STORIES Creating Pivot Tables on the iSeries
|
[…] Googolando un po’ ho trovato invece una fantastica stored procedure su ITJungle che mi era sfuggita: DO_PIVOT di Paul Tuohy […]