An SQL Pivot Procedure

April 21, 2015 Paul Tuohy

The basic concept of a pivot is that row values are transformed into column values. I have often wished that DB2 had a pivot procedure. But until that day arrives I will have to make do with a workaround that you might find useful too.

The workaround is in the form of a stored procedure that can be used to pivot any column in any table. Before getting to the stored procedure, let’s look at a simple pivot example.

These examples are using the SALES table in the DB2 sample database. To create the DB2 sample database on your system, call the SQL stored procedure CREATE_SQL_SAMPLE, and providing the name of the schema you want to create.

CALL QSYS/CREATE_SQL_SAMPLE('MYSCHEMA')

A Simple Pivot

Sample code 1 shows the select statement and result set for selecting a summary of sales by sales person.

select sales_person, sum(sales) as sales
   from sales 
   group by sales_person
   order by sales_person;

SALES_PERSON   SALES
GOUNOT             50	
LEE                91	
LUCCHESSI          14

Sample code 1: Summary of sales by salesperson.

Pivoting this data would result in the single line shown in sample code 2.

GOUNOT         LEE   LUCCHESSI
     50          91          14

Sample code 2: A simple pivoted result.

Unfortunately, the select statement required to generate that single line result set is quite a bit of work, as shown in sample code 3.

select 
   sum(case when sales_person = 'GOUNOT' then sales end) 
    as GOUNOT,
   sum(case when sales_person = 'LEE' then sales end) 
    as LEE,
   sum(case when sales_person = 'LUCCHESSI' then sales end)
    as LUCCHESSI
   from sales;

Sample code 3: Select statement to generate a simple pivoted result.

We will examine this select statement in more detail in a moment, but let’s look at a slightly more complicated (and more meaningful) example first.

A More Complicated Pivot

Sample code 4 shows the shows an extension of the select statement and result set used in sample code 1. We have added sub-totals by region.

select region, sales_person, sum(sales) as sales
   from sales 
   group by region, sales_person
   order by region, sales_person;

REGION         SALES_PERSON   SALES
Manitoba       GOUNOT             15
Manitoba       LEE                23
Manitoba       LUCCHESSI           3
Ontario-North  GOUNOT              1
Ontario-North  LEE                 8
Ontario-South  GOUNOT             10
Ontario-South  LEE                34
Ontario-South  LUCCHESSI           8
Quebec         GOUNOT             24
Quebec         LEE                26
Quebec         LUCCHESSI          34

Sample code 4: Summary of sales by region and salesperson.

Pivoting this data is really beneficial, as shown in sample code 5.

REGION       GOUNOT         LEE   LUCCHESSI
Manitoba          15          23           3
Ontario-North      1           8           -
Ontario-South     10          34           8
Quebec            24          26          34

Sample code 5: A better example of a pivoted result.

And the required select statement is not that different from the original, as shown in sample code 6.

select 
   region,
   sum(case when sales_person = 'GOUNOT' then sales end) 
    as GOUNOT,
   sum(case when sales_person = 'LEE' then sales end) 
    as LEE,
   sum(case when sales_person = 'LUCCHESSI' then sales end)
    as LUCCHESSI
   from sales
   group by region
   order by region;

Sample code 6: Select statement to generate a better example of a pivoted result.

The Problem Is

If you examine the select statements in sample code 3 and 6, you will see that a column is defined for each of the possible values for the SALES_PERSON column. When the statement is run, the value of the SALES column will be accumulated into the relevant column (GOUNOT, LEE or LUCCHESSI) based on the value of the of the SALES_PERSON column.

But what happens if a fourth sales person comes into play? Yes, we have to change the select statement and add a column for the new salesperson.

Personally, I prefer a maintenance-free solution.

Using a Stored Procedure

The stored procedure DO_PIVOT accepts six parameters:

The name of the schema (library).
The name of the table.
The name of the column to be pivoted.
The name of the column to be aggregated for the pivot.
The aggregate function to be performed (defaults to SUM).
The name of the column to group by (defaults to null, which results in a single line result set).

The following procedure call provides the result set shown in sample code 2.

CALL DO_PIVOT('SQLSTAND',
              'SALES',
              'SALES_PERSON', 
              'SALES', 
              DEFAULT, 
              DEFAULT);

And the following procedure call provides the result set shown in sample code 5.

CALL DO_PIVOT('SQLSTAND',
              'SALES',
              'SALES_PERSON', 
              'SALES', 
              DEFAULT, 
              'REGION');

The DO_PIVOT Procedure

The DO_PIVOT procedure dynamically constructs the required select statement to generate the required result set.

First, the procedure generates a list of the distinct values for the requested pivot column. Then, it constructs the dynamic select statement, adding a column definition for each of the distinct values in the generated list.

Finally, the procedure runs the generated statement and returns the result set.

Sample code 7 shows the code required to create the DO_PIVOT procedure. Just copy and paste, change the schema name, and run the statements.

SET SCHEMA = WHER_YOU_WANT_IT;

CREATE PROCEDURE DO_PIVOT
   (IN FOR_SCHEMA CHARACTER (10) , 
    IN FOR_TABLE CHARACTER (10) , 
    IN PIVOT_COLUMN VARCHAR (250) , 
    IN VALUE_COLUMN VARCHAR (250) , 
    IN AGG_FUNCTION VARCHAR (5) DEFAULT 'SUM' , 
    IN GROUP_COLUMN VARCHAR (250) DEFAULT NULL ) 
    LANGUAGE SQL 
    MODIFIES SQL DATA 
    PROGRAM TYPE SUB 
    CONCURRENT ACCESS RESOLUTION DEFAULT 
    DYNAMIC RESULT SETS 1 
    OLD SAVEPOINT LEVEL COMMIT ON RETURN NO 

BEGIN

   DECLARE SQLCODE INTEGER DEFAULT 0 ;
   DECLARE SQL_STATEMENT VARCHAR ( 5000 ) ;
   DECLARE PIVOT_VALUE VARCHAR ( 20 ) ;
   DECLARE PAD CHAR ( 2 ) DEFAULT ' ' ;

   DECLARE C1 CURSOR FOR D1 ;
   DECLARE C2 CURSOR WITH RETURN FOR D2 ;

   SET SCHEMA = FOR_SCHEMA ;

   -- Get the list of values available for the pivot column
   -- Each value will be a column in the return set
   SET SQL_STATEMENT = 'select distinct ' 
                       || PIVOT_COLUMN  
                       || ' from ' 
                       || FOR_TABLE 
                       || ' order by 1' ;

   PREPARE D1 FROM SQL_STATEMENT ;
   OPEN C1 ;

   -- Construct a dynamic select statement for the pivot
   SET SQL_STATEMENT = 'select ' ;

   -- If requested, add the Group By Column 
   -- to the select clause
   IF GROUP_COLUMN IS NOT NULL THEN
      SET SQL_STATEMENT = SQL_STATEMENT || GROUP_COLUMN ;
      SET PAD = ', ' ;
   END IF ;

   -- For each possible value for the Pivot Column, 
   -- add a case statement to perform the requested 
   -- aggregate function on the Value Column
   FETCH NEXT FROM C1 INTO PIVOT_VALUE ;
   WHILE ( SQLCODE >= 0 AND SQLCODE <> 100 ) DO
      SET SQL_STATEMENT = SQL_STATEMENT 
                          || PAD 
                          || AGG_FUNCTION 
                          || '(CASE WHEN ' 
                          || PIVOT_COLUMN 
                          || ' = ''' 
                          || PIVOT_VALUE 
                          || ''' THEN ' 
                          || VALUE_COLUMN 
                          || '  END) AS ' 
                          || PIVOT_VALUE ;
      SET PAD = ', ' ;
      FETCH NEXT FROM C1 INTO PIVOT_VALUE ;
   END WHILE ;
   CLOSE C1 ;

   -- Specify the table to select from
   SET SQL_STATEMENT = SQL_STATEMENT 
                       || ' from ' 
                       || FOR_TABLE ;

   -- If requested, add the Group By Column
   -- to the select clause
   IF GROUP_COLUMN IS NOT NULL THEN
      SET SQL_STATEMENT = SQL_STATEMENT 
                          || ' group by ' 
                          || GROUP_COLUMN 
                          || ' order by ' 
                          || GROUP_COLUMN;
   END IF ;

   PREPARE D2 FROM SQL_STATEMENT ;
   OPEN C2 ;

END ;

LABEL ON ROUTINE DO_PIVOT 
   ( CHAR(), CHAR(), VARCHAR(), VARCHAR(), VARCHAR(), VARCHAR() )  
   IS 'Perform a General Purpose Pivot';

COMMENT ON PARAMETER ROUTINE DO_PIVOT 
   ( CHAR(), CHAR(), VARCHAR(), VARCHAR(), VARCHAR(), VARCHAR() ) 
   (FOR_SCHEMA IS 'Schema for Table' , 
    FOR_TABLE IS 'For Table' , 
    PIVOT_COLUMN IS 'Name of Column to be Pivoted' , 
    VALUE_COLUMN IS 'Column to be Aggregated for Pivot' , 
    AGG_FUNCTION IS 'Use Aggregate Function' , 
    GROUP_COLUMN IS 'Group on Column' ) ;

Sample code 7: Code to create the DO_PIVOT stored procedure.

Something to Play With

This is a procedure I have found useful on more than one occasion and I hope you find it just as useful. But I really wish there was a pivot function in DB2!

Paul Tuohy is CEO of ComCon, an iSeries consulting company, and is one of the co-founders of System i Developer, which hosts the RPG & DB2 Summit conferences. He is an award-winning speaker who also speaks regularly at COMMON conferences, and is the author of “Re-engineering RPG Legacy Applications,” “The Programmers Guide to iSeries Navigator,” and the self-study course called “iSeries Navigator for Programmers.” Send your questions or comments for Paul to Ted Holt via the IT Jungle Contact page.

Table of Contents

Content archive

Recent Posts

Subscribe

Pages

Search