Grouping a Union
March 19, 2008 Hey, Ted
We have several sales history files–one for each year. When I need to combine data for more than one year, I have to use an SQL UNION. I am trying to group “unioned” data and can’t seem to get it right. Is it possible to group a dataset that is built by a union? –Tom This appears to be an easy problem to resolve, but appearances can be deceiving. If you’re not careful, you can get the wrong results when you UNION two or more tables, especially when you’re summarizing data. Divide the query into two steps: one to union the data, the other to group it. Assuming you have two files, one for the current year and one for the previous year, here’s one possibility. WITH Sales AS (SELECT Customer, Item, QtyShipped, UnitPrice FROM CurrYear UNION ALL SELECT Customer, Item, QtyShipped, UnitPrice FROM PrevYear ) SELECT Customer, Item, SUM(QtyShipped), SUM(QtyShipped * UnitPrice) FROM Sales GROUP by Customer, Item ORDER BY Customer, Item; The WITH defines a common table expression, which creates a temporary table of the combined data. This temporary table is called SALES. Notice I used UNION ALL, not UNION, so duplicate rows would not be omitted. The last SELECT summarizes the combined data on customer and item number. Here’s another query that gives the same results. It summarizes each file, then summarizes the two summaries. I doubt this method is any better than the first one. WITH SalesSummary AS (SELECT Customer, Item, SUM(QtyShipped) AS qty, SUM(QtyShipped * UnitPrice) AS sale FROM CurrYear GROUP by Customer, Item UNION ALL SELECT Customer, Item, SUM(QtyShipped) AS qty, SUM(QtyShipped * UnitPrice) AS sale FROM PrevYear GROUP by Customer, Item) SELECT Customer, Item, SUM(qty), SUM(sale) FROM SalesSummary GROUP by Customer, Item ORDER BY Customer, Item; I googled and found a similar question at http://joecelkothesqlapprentice.blogspot.com/. I based the following SQL query on Joe Celko’s answer. SELECT sale.Customer, sale.Item, SUM(sale.QtyShipped), SUM(sale.QtyShipped * sale.UnitPrice) FROM (SELECT Customer, Item, QtyShipped, UnitPrice FROM CurrYear UNION ALL SELECT Customer, Item, QtyShipped, UnitPrice FROM PrevYear) AS sale GROUP by Customer, Item ORDER BY Customer, Item; This query differs from the first example in that the union has been embedded in the FROM clause. BTW, in an ideal world, all sales history would be stored in one big file and transactions would be extracted by date. I find it more practical to keep separate files of history. It appears to me that many other people think the same. –Ted RELATED STORIES Time for a Common Table Expression, Part 2
|