Presenting Vertical Data Horizontally
July 10, 2013 Victor Pisman
Presenting “vertical” data in a “horizontal” format is a challenging task in data management. Here is one method you can use to make SQL carry out this type of transformation. Consider table CUSTGRP, which has two fields: a customer ID and a group code. Notice that one customer can have multiple group codes.
Here’s one way to build a result set that has only one record per customer, with multiple fields populated with different group codes. First, determine the number of groups to which each customer belongs: select id, count(*) as count from custgrp group by id
Next, join the table to itself, defining the table once for each column of the resulting repeating group. That is, if you want to allow for four repetitions, define the table four times. In this example, I allow for six group codes per customer. The join is the key to this technique. The common field, in this case customer ID, must be joined on an equal condition, but the repeating group must be joined on either a less than or a greater than condition. select a.id, a.group as x1, b.group as x2, c.group as x3, d.group as x4, e.group as x5, f.group as x6 from custgrp as a left join custgrp as b on a.id = b.id and a.group < b.group left join custgrp as c on b.id = c.id and b.group < c.group left join custgrp as d on c.id = d.id and c.group < d.group left join custgrp as e on d.id = e.id and d.group < e.group left join custgrp as f on e.id = f.id and e.group < f.group
The result set contains multiple rows for each customer. One row per customer contains all of the group codes, and the other rows do not. The only thing remaining to do is to select only those rows with all the group codes for each customer. with summary as ( select id, count(*) as count from custgrp group by id), temp1 as ( select a.id, a.group as x1, b.group as x2, c.group as x3, d.group as x4, e.group as x5, f.group as x6 from custgrp as a left join custgrp as b on a.id = b.id and a.group < b.group left join custgrp as c on b.id = c.id and b.group < c.group left join custgrp as d on c.id = d.id and c.group < d.group left join custgrp as e on d.id = e.id and d.group < e.group left join custgrp as f on e.id = f.id and e.group < f.group order by 1, 2 ) select t.* , s.count from temp1 as t join summary as s on t.id = s.id where s.count = ( case when x1 is not null then 1 else 0 end + case when x2 is not null then 1 else 0 end + case when x3 is not null then 1 else 0 end + case when x4 is not null then 1 else 0 end + case when x5 is not null then 1 else 0 end + case when x6 is not null then 1 else 0 end ) order by 1 The SQL query contains two common table expressions–SUMMARY and TEMP1–to carry out the first two SQL statements. The remainder of the SQL expression combines those common table expressions to produce the final result set. select t.* , s.count from temp1 as t join summary as s on t.id = s.id where s.count = ( case when x1 is not null then 1 else 0 end + case when x2 is not null then 1 else 0 end + case when x3 is not null then 1 else 0 end + case when x4 is not null then 1 else 0 end + case when x5 is not null then 1 else 0 end + case when x6 is not null then 1 else 0 end ) Each case statement returns either zero or one, depending on whether a repeating group value is null or not. The system counts the number of non-null values in the repeating group in the TEMP1 result set by summing the zeros and ones. Only one TEMP1 row of each group matches the count in the SUMMARY result set. Those are the rows that make it into the result set.
Be sure to allow for enough repetitions. In this example, any customers that belonged to seven or more groups would not be included in the result set. RELATED STORIES Treasury of new DB2 6.1 Features, Part 2: Grouping Sets and Super Groups Build Pivot Tables over DB2 Data Load a Spreadsheet from a DB2/400 Database: Part 2 Load a Spreadsheet from a DB2/400 Database Creating Pivot Tables on the iSeries
|