More V5R3 SQL Enhancements
July 14, 2004 Michael Sansoterra
My last article briefly covered the new Binary and Varbinary data types, some new scalar functions, and the sequence object that is new to the iSeries implementation of DB2. The good news is there are more SQL enhancements to cover!
DISCLAIMER
Once again, since I don’t yet have access to a machine with V5R3, this information was gleaned from the new V5R3 manuals and may be subject to revision! I’d love to hear from someone who has a beta of V5R3 who can offer more insight on these features.
NAMED COLUMN JOIN
A new shorthand join syntax called a “named column join” is implemented with the USING keyword. For example, the following:
Select * From Order Join OrderLines On OrderLines.Company=Order.Company And OrderLines.OrderID=Order.OrderID
Can be shortened to:
Select * From Order Join OrderLines Using (Company,OrderID)
As evidenced by the syntax, this shorthand can only be used when the tables being joined have the same column names. Outer and exception joins are allowed as well.
LATERAL CORRELATION
Lateral correlation is an enhancement that allows a nested select to reference columns in higher levels of the subquery hierarchy.
Consider the following query, which is supposed to join the order header to the summary of several detail files (order lines, order charges, and order taxes):
Select O.OrderID, O.CustName, OL.LineTotal, OC.OrdChgTotal, OT.TaxTotal From Orders O Left Join (Select OrderID, Sum(NetAmt) As LineTotal From OrderLines Group By OrderID) OL On OL.OrderID=O.OrderID Left Join (Select OrderID, Sum(ChgAmt) As OrdChgTotal From OrderCharges Group By OrderID) OC On OC.OrderID=O.OrderID Left Join (Select OrderID, Sum(TaxAmt) As TaxTotal From OrderTaxes Group By OrderID) OT On OT.OrderID=O.OrderID
Notice that the joins must take place outside of the nested table expression, because a correlation to the Orders table (O) is not allowed within the nested select.
Using lateral correlation, the query can be rewritten as follows:
Select O.OrderID, O.CustName, OL.LineTotal, OC.OrdChgTotal, OT.TaxTotal From Orders O, Lateral (Select Sum(NetAmt) As LineTotal From OrderLines Lines Where Lines.OrderID=O.OrderID) As OL, Lateral (Select Sum(ChgAmt) As OrdChgTotal From OrderCharges Charges Where Lines.OrderID=O.OrderID) As OC, Lateral (Select Sum(TaxAmt) As TaxTotal From OrderTaxes Taxes Where Taxes.OrderID=O.OrderID) As OT
As you can see, the LATERAL keyword allows the nested selects to reference columns in the Orders table (O). This enhancement should improve performance in queries where joins to one or more nested selects is required.
EXCEPT AND INTERSECT KEYWORDS
Similar to UNION DISTINCT and UNION ALL, EXCEPT and INTERSECT are used for combining multiple row sets. Here is the syntax:
Select * From Table1 Except [Distinct] Select * From Table2 Select * From Table1 Intersect [Distinct] Select * From Table2
As with UNIONs, the number of columns in both SELECTs must be the same, and the data types of the columns must be compatible. The EXCEPT keyword is used to process all rows in the first row set (table1) that are not in the second row set (table2). The EXCEPT keyword produces results similar in concept to an Exception Join. Say, for example, you want a list of all open orders that have no shipments against them yet. EXCEPT can be used for this purpose:
Select OrderID, CustName From OpenOrders Except Select OrderID, CustName From Shipments
This query will only return rows from the OpenOrders table that don’t exist in the Shipments table.
The INTERSECT keyword is used to combine row sets only under the condition that the same row exists in both row sets. In the prior example changing the EXCEPT keyword to INTERSECT will only pull open orders that have at least one shipment against them:
Select OrderID, CustName From OpenOrders Intersect Select OrderID, CustName From Shipments
Both EXCEPT and INTERSECT are “distinct” operations; that is, all duplicate rows are removed. To clarify this concept for other programmers, the optional DISTINCT keyword may be specified after EXCEPT or INTERSECT.
/* Select All Late Orders */ Select * From OpenOrders Where ExpShipDate<Current_Date And OrderID In (Select OrderID From OpenOrders Except Distinct Select OrderID From Shipments)
SQL PRECOMPILER ENHANCEMENTS
The following is a quick run down on some of the enhancements beneficial to embedded SQL programmers.
USING AN UPDATE WITH A DATA STRUCTURE
For some time now, embedded SQL programmers have had the ability to insert a row into a table from a data structure, as in this RPG example:
D dsOrderLine E DS ExtName(OrderLine) C/Exec SQL C+ Insert Into OrderLine C+ Values(:dsOrderLine) C/End-Exec
The Update statement now works with a data structure as well. The syntax for the update uses the ROW keyword, as follows:
D dsOrderLine E DS ExtName(OrderLine) C/Exec SQL C+ Update OrderLine C+ Set Row=:dsOrderLine C+ Where OrderNo=:OrderNo C+ And LineNo=:LineNo C/End-Exec
Of course, the columns in the table or view should match the fields in the data structure in number and compatible data type. It would be nice if only a subset of columns could be updated as follows (although the manual seems to imply that this isn’t possible.):
D dsShipInfo DS D ShipNo 10I 0 D ShipQty 9P 3 D BOQty 9P 3 D InvAmt 17P 4 C/Exec-SQL C+ Update OrderLine C+ Set (ShipNo, ShipQty, BOQty, InvAmt)=:dsShipInfo C+ Where Current Of OrderLines C/End-Exec
USING THE GET DIAGNOSTICS STATEMENT
As with SQL routines, embedded SQL programs may now use the GET DIAGNOSTICS statement to retrieve information about the status of the last executed SQL statement. In prior releases, variables in the SQL Communications Area provided this feedback. Using GET DIAGNOSTICS in embedded SQL has the advantage of being able to return multiple condition messages, returning more message data, and being consistent with procedures and functions written in the SQL procedural language.
SUPPORT FOR BINARY AND VARBINARY KEYWORDS
The SQL precompiler has been enhanced to allow high-level languages to recognize the new Binary and VarBinary data types. The SQLType keyword is used to declare special data types for use by SQL, as shown in this RPG example:
D TrackData S SQLType(Binary:50)
The SQL precompiler simply replaces the above declaration with the following standard code:
D TrackData S 50A
Since a standard character variable is used for character and binary data, it appears the only reason for using the SQLType keyword is to assure the precompiler that you recognize the distinction between binary and character data. Within the RPG program, TrackData is used just like any other character variable.
Here is a declaration of a VarBinary variable in C:
SQL TYPE IS VARBINARY(50) TrackData;
RPG PRE-COMPILER ENHANCEMENTS
The RPG pre-compiler now includes support for qualified data structure names, nested /Copy directives, and the LikeRec and LikeDS keywords. The precompiler’s inability to handle some of the features of the standard compiler like nested /COPYs has been a bane to many. Thankfully, the pre-compiler can now handle some of these features.
For example, using host variables from qualified data structures is just as you’d expect, with the data structure name and subfield name separated by a period:
DdsScreenData DS Qualified D Item D DueDate C/Exec SQL C+ Select * C+ From MfgOrders C+ Where MfgItem=:dsScreenData.Item C+ And DueDate>=:dsScreenData.DueDate C/End-Exec
As with the Qualified keyword, the LikeDS and LikeRec keywords generate data structures requiring the use of qualified subfield names. Before V5R3, these qualified names were not allowed to be used by embedded SQL.
UNICODE SUPPORT
Unicode is an encoding scheme designed to be used as a universal character set. Instead of using different character sets to identify different alphabets, Unicode aims to store all characters in the world within a single character set. In addition to UCS-2, the iSeries now supports Unicode UTF-8 and UTF-16 transformation formats. UTF-8’s CCSID is 1208 and UTF-16’s CCSID is 1200.
Create Table UnicodeDemo ( UCS2 Graphic(10) CCSID 13488, UTF8 Char(10) CCSID 1208, /* New in V5R3 */ UTF16 Graphic(10) CCSID 1200) /* New in V5R3 */
For a brief explanation of Unicode, see the character conversion portion of the SQL Reference guide.
In the future, I suspect iSeries applications will migrate to one of the Unicode translation formats in order to provide superior international character support.
EXTERNAL PROCEDURE CALLS FOR SERVICE PROGRAMS
A subprocedure (or function) in a service program can now be the target of a stored procedure call. Here’s a sample stored procedure definition and the corresponding procedure interface in RPG program OrderReport:
Create Procedure xxxxx/OrderReport (parmStartDate In Date, parmEndDate In Date) Result Sets 1 External Name 'xxxxx/OrderReport(GETDATA)' Language RPGLE Parameter Style General Reads SQL Data // RPG procedure interface for program // order report P GetData B Export D GetData PI D StartDate D D EndDate D
The External Name keywords specify the library, service program name, along with the subprocedure name enclosed in parenthesis. In RPG, exported subprocedure names are always forced to upper case by default, so make sure the subprocedure name is in upper case in your Create Procedure definition.
This is another welcome enhancement, as a service program’s procedures can now be accessed directly by SQL as a stored procedure call without writing a “wrapper” program.
IMPROVING PERFORMANCE OF SQL ROUTINES
Although it’s not an “enhancement,” another thing worth mentioning is a new section in the V5R3 SQL programming guide that discusses how to improve the efficiency of routines (stored procedures, functions, and the like) written in SQL. This section gives some background on how the SQL procedural processor translates SQL code into C, and notes several things that can hinder the efficiency of the code. This material can be found in “improving performance of procedures and functions” on the iSeries Information Center.
THE POWER OF SQL ON THE ISERIES
During my long, dreary, overextended days on V3R2, I was convinced SQL/400 was forever doomed to be lacking in features when compared with other implementations of SQL. As evidenced with the new powerful join techniques, encryption functions, and pre-compiler enhancements, the iSeries implementation of SQL is as powerful as they come. I used to envy the features offered by other systems, including those in Microsoft Access! Now, it’s the other way around: I wish other systems had the features of the iSeries. After two articles I still didn’t cover all the enhancements in this release!
However, the one thing I envy in other versions of SQL is the capability to issue an UPDATE statement using a join syntax (similar to SQL Server’s syntax). Since IBM has been so good to us, I’m hoping it will make this enhancement sometime soon.
Special thanks to IBM’s Kent Milligan for supplying the syntax for the embedded UPDATE statement using a host structure.
Michael Sansoterra is a programmer/analyst for i3 Business Solutions, an IT services firm based in Grand Rapids, Michigan. E-mail: msansoterra@itjungle.com
For the lateral correlation rewrite, the example shown above is incorrect as it omitted the LEFT JOIN from the original query. It should be something like this:
Select O.OrderID, O.CustName, OL.LineTotal, OC.OrdChgTotal, OT.TaxTotal
From SalesOrderHeader O
Left Join Lateral (Select Sum(1) As LineTotal From SalesOrderDetail Lines Where Lines.OrderID=O.OrderID) As OL ON 1=1
Left Join Lateral (Select Sum(ChgAmt) As OrdChgTotal From SalesOrderCharges Charges Where Lines.OrderID=O.OrderID) As OC ON 1=1
Left Join Lateral (Select Sum(TaxAmt) As TaxTotal From SalesOrderTaxes Taxes Where Taxes.OrderID=O.OrderID) As OT ON 1=1