Guru: Common Table Expressions Can Replace Query Chains
June 5, 2017 Ted Holt
Modernization efforts often concentrate on the database and programs. That is well and good, but there is more to modernization. Replacing Query for IBM i with more modern query tools is also important. But what do you do with those queries that people depend on? More confusing, what do you do with query chains?
A query chain is a series of queries that run one after another, consolidating and reformatting data in temporary physical files, in order to produce a resulting data set, often in report form. It’s not unusual to see small CL programs like the following one:
PGM RUNQRY QRY(SA001Q) RUNQRY QRY(SA002Q) RUNQRY QRY(SA003Q) RUNQRY QRY(SA004Q) OUTTYPE(*PRINTER) ENDPGM
In this example, the first three queries load physical files with summary information. The last query combines all the data and builds a report.
Query | Purpose | Input files | Output file |
SA001Q | Sum last year sales by item | INVHDR, INVLINE | SA001OUT |
SA002Q | Sum last year sales YTD by item | INVHDR, INVLINE | SA002OUT |
SA003Q | Sum current year YTD sales by item | INVHDR, INVLINE | SA003OUT |
SA004Q | Print one day’s invoiced items | INVHDR, INVLINE, SA001OUT, SA002OUT, SA003OUT |
The report looks something like this:
Item Qty Invoice Line Date Cust Last Last Current year year YTD YTD A-1 5 10011 2 20170531 101 0 0 8 A-7 2 10011 3 20170531 101 1 1 6 B-1 3 10011 1 20170531 101 11 9 12
Fortunately, converting such query chains into something more modern, something SQL-based, is not difficult. Here’s one way to approach it.
First, use the Retrieve Query Management Query (RTVQMQRY) command convert each query to SQL. RTVQMQRY reads the query definition and writes comparable SQL to a source physical file.
RTVQMQRY QMQRY(SA001Q) SRCFILE(MYLIB/SQLSRC) + SRCMBR(SA001QSQL) ALWQRYDFN(*YES)
I chose to put the SQL equivalent of query SA001Q into member SA001QSQL of source physical file SQLSRC in library MYLIB. Be sure to specify ALWQRYDFN(*YES), as this is the parameter that makes RTVQMQRY look for a query definition object.
Here’s the generated SQL source code for query SA001Q:
H QM4 05 Q 01 E V W E R 01 03 17/05/31 18:45 V 1001 050 V 5001 004 *HEX SELECT ALL T02.ITEM, SUM(T02.QUANTITY) FROM MYLIB/TINVHDR T01 INNER JOIN MYLIB/TINVLINE T02 ON T01.INVNO = T02.INVNO WHERE T01.INVDATE BETWEEN 20160101 AND 20161231 GROUP BY T02.ITEM ORDER BY T02.ITEM ASC
Ignore the lines that precede the SELECT. Copy and paste the SELECT statement into a new source member as a common table expression, named after the output file. Modify as needed.
with SA001OUT as
(SELECT
ALL T02.ITEM, SUM(T02.QUANTITY) AS QUANTITY01
FROM TINVHDR T01 INNER JOIN
TINVLINE T02
ON T01.INVNO = T02.INVNO
WHERE T01.INVDATE BETWEEN 20160101 AND 20161231
GROUP BY T02.ITEM),
I had to add correlation name QUANTITY01, since that’s what the field was called in the query. I also chose to remove the qualifying library names and to delete the ORDER BY clause, which is not needed.
After converting all four queries, copying, pasting, and tweaking the SQL source, I end up with this:
with SA001OUT as (SELECT ALL T02.ITEM, SUM(T02.QUANTITY) AS QUANTITY01 FROM TINVHDR T01 INNER JOIN TINVLINE T02 ON T01.INVNO = T02.INVNO WHERE T01.INVDATE BETWEEN 20160101 AND 20161231 GROUP BY T02.ITEM), SA002OUT as (SELECT ALL T02.ITEM, SUM(T02.QUANTITY) AS QUANTITY01 FROM TINVHDR T01 INNER JOIN TINVLINE T02 ON T01.INVNO = T02.INVNO WHERE T01.INVDATE BETWEEN 20160101 AND 20160531 GROUP BY T02.ITEM), SA003OUT as (SELECT ALL T02.ITEM, SUM(T02.QUANTITY) AS QUANTITY01 FROM TINVHDR T01 INNER JOIN TINVLINE T02 ON T01.INVNO = T02.INVNO WHERE T01.INVDATE BETWEEN 20170101 AND 20170531 GROUP BY T02.ITEM) SELECT T02.QUANTITY, T01.INVNO, T02.LINENO, T01.INVDATE, T01.CUSTNO, T03.QUANTITY01, T04.QUANTITY01, T05.QUANTITY01 FROM TINVHDR T01 LEFT OUTER JOIN TINVLINE T02 ON T01.INVNO = T02.INVNO LEFT OUTER JOIN SA001OUT T03 ON T02.ITEM = T03.ITEM LEFT OUTER JOIN SA002OUT T04 ON T02.ITEM = T04.ITEM LEFT OUTER JOIN SA003OUT T05 ON T02.ITEM = T05.ITEM WHERE T01.INVDATE = 20170531 ORDER BY T02.ITEM ASC;
The parts in red are modifications. Notice also that I removed the library names, as I am wild about using the library list. In fact, I had to remove the library names from the work files, otherwise the main SELECT (the last one) would have read physical files SA001OUT, SA002OUT, and SA003OUT instead of the common table expressions of the same names.
Where you go from here is up to you. Modify the SQL as you wish. Put your new SQL in a stored procedure. Embed it in an RPG program. Run it in a GUI query tool.
This is not the only way to convert a query chain to SQL, but it’s an easy, straightforward way. It probably won’t perform any worse than the queries it replaced, and now you have something you can work with.
Ted, I prefer a more inline approach using join table. Seems to run much faster.
SELECT T02.QUANTITY, T01.INVNO, T02.LINENO, T01.INVDATE,
T01.CUSTNO, IFNULL(T03.QUANTITY01,0), IFNULL(T04.QUANTITY01),
IFNULL(T05.QUANTITY01)
FROM TINVHDR T01
JOIN TINVLINE T02 ON T01.INVNO = T02.INVNO
JOIN TABLE( SELECT SUM(T02.QUANTITY) AS QUANTITY01
FROM TINVHDR X
JOIN TINVLINE T02 ON T02.INVNO = X.INVNO
WHERE X.INVNO = T01.INVNO
AND X.INVDATE BETWEEN 20160101 AND 20161231
) ON 1=1 AS T03
JOIN TABLE( SELECT SUM(T02.QUANTITY) AS QUANTITY01
FROM TINVHDR X
JOIN TINVLINE T02 ON T02.INVNO = X.INVNO
WHERE X.INVNO = T01.INVNO
AND X.INVDATE BETWEEN 20160101 AND 20160531
) ON 1=1 AS T04
JOIN TABLE( SELECT SUM(T02.QUANTITY) AS QUANTITY01
FROM TINVHDR X
JOIN TINVLINE T02 ON T02.INVNO = X.INVNO
WHERE X.INVNO = T01.INVNO
AND T01.INVDATE BETWEEN 20170101 AND 20170531
) ON 1=1 AS T05
WHERE T01.INVDATE = 20170531
ORDER BY T02.ITEM ASC;
Little typo error on the IFNULL, but you should get the jest of it.
Hi Ted, is it possible to create unique clustered indexes on a view of multiple joined tables?
Indexes are created over tables, Allan. Not views. So, no, I don’t think it is possible.