Guru: Alternate SQL Row-Selection Criteria Revisited

February 24, 2020 Ted Holt

At the RPG & DB2 Summit last October, I chanced to overhear Rob Bestgen of IBM telling Paul Tuohy of System i Developer about a practice that he and others in IBM Lab Services had seen, a practice which they found horrifying. “Oh, no!” I thought. “I’ve written about that for itjungle.com!” Are you ready to be horrified?

Rob was telling Paul that they have seen SQL queries in which portions of the WHERE clause are enabled or disabled through the use of switches. I did not invent this technique — I realized more than a few years ago that I’ve never had an original thought in my life. But I have used the technique many times — with great success, I might add — and even wrote about it in this august publication in 2003, almost 16 1/2 years ago. Here’s how it works.

Let’s say we have a query that retrieves a set of customers. (A more realistic example would be to retrieve customer orders, but I’m trying to minimize the amount of source code.) We need to be able to build this set from three different criteria. Sometimes we need to retrieve all the customers. Sometimes we need only those customers to whom we’ve assigned a certain class code. Sometimes we need only the customers within a range of ZIP (postal) codes.

How do we do this? We could write the query once and clone it twice, but many of us are averse to cloning source code. If we change one thing, we have to change the copies too. Or we could use switches to enable and disable pieces of a WHERE clause.

dcl-s  SelectAll       char(1)    inz('0');
dcl-s  SelectByClass   char(1)    inz('0');
dcl-s  SelectByZip     char(1)    inz('0');

dcl-s  iClass          char(1);
dcl-s  iFromZip        char(5);
dcl-s  iThruZip        char(5);

dcl-ds  Customer  qualified;
    Account       packed ( 5);
    Name          char   (20);
    ZipCode       char   ( 5);
    Class         char   ( 1);
end-ds;

exec sql declare Inp cursor for
   select cusnbr, cusname, cuszip, cusclass
     from customers
   where
       (
          (:SelectAll = '1')
          OR
          (:SelectByClass = '1' AND cusclass = :iClass)
          OR
          (:SelectByZip = '1' AND
                cuszip BETWEEN :iFromZip AND :iThruZip)
        );

exec sql  open Inp;

dow '1';
   exec sql  fetch Inp into :Customer;
   if SqlState >= '02000';
      leave;
   endif;
*** do something with the retrieved row
enddo;

exec sql  close Inp;

These are the switches.

dcl-s  SelectAll       char(1)    inz('0');
dcl-s  SelectByClass   char(1)    inz('0');
dcl-s  SelectByZip     char(1)    inz('0');

The values of these switches activate and deactivate row-selection criteria. If SelectAll has the value 1, this is the WHERE clause.

WHERE 
 (                                     
  ('1' = '1')                  
OR                                  
  ('0' = '1' AND cusclass = ' ')                
OR                                  
  ('0' = '1' AND cuszip BETWEEN '     ' AND '     ')

The first condition is true for every row, so every row gets selected. If SelectByClass is 1 and iClass is B, this is the WHERE clause:

WHERE 
 (                                     
  ('0' = '1')                  
OR                                  
  ('1' = '1' AND cusclass = 'B')
OR 
  ('0' = '1' AND cuszip,1,5 BETWEEN '     ' AND '     ')
 )

The first condition is false, but the second one proves true for any rows that have a customer class of B. Class B customers are selected.

IBM does not like this technique. Rob told Paul and me — I could not restrain myself from joining the conversation — that this sort of thing is pretty much guaranteed to make the query optimizer ignore indexes.

What do the folks at IBM think we should do instead? Use dynamic SQL.

dcl-s  SelectAll       char(1)    inz('0');
dcl-s  SelectByClass   char(1)    inz('0');
dcl-s  SelectByZip     char(1)    inz('0');

dcl-s  iClass          char(1);
dcl-s  iFromZip        char(5);
dcl-s  iThruZip        char(5);

dcl-s  Stmt            varchar(512);

dcl-ds  Customer  qualified;
    Account       packed ( 5);
    Name          char   (20);
    ZipCode       char   ( 5);
    Class         char   ( 1);
end-ds;

Stmt = 'select cusnbr, cusname, cuszip, cusclass from customers;

select;
   when SelectByClass = '1';
      Stmt += ' where cusclass = ?';
   when SelectByZip = '1';
      Stmt += ' where cuszip between ? and ?';
endsl;

exec sql prepare x from :Stmt;

exec sql  declare Inp cursor for x;

select;
   when SelectByClass = '1';
      exec sql  open Inp using :iClass;
   when SelectByZip = '1';
      exec sql  open Inp using :iFromZip, :iThruZip;
   other;
      exec sql  open Inp;
endsl;

dow '1';
   exec sql  fetch Inp into :Customer;
   if SqlState >= '02000';
      leave;
   endif;
*** do something with the retrieved row
enddo;

exec sql  close Inp;

The code is slightly longer, but no more complex. It works the same way, in that switches still control the selection of rows. Notice that there are three opens, since each version of the WHERE has a different argument list.

However, the code is not identical. The static version allows you to enable both selection by class and selection by ZIP code at the same time. The dynamic version could be made to work that way, but I didn’t go to the trouble to make that happen.

Now that I have delivered the message from IBM to you, permit me to end with a few observations.

First, as technical editor of The Four Hundred, I do my best to deliver practical information to help you do your job. That’s true of the articles I write and those written by other authors. I do all I can to make sure that what we publish is not only accurate, but best practice. When I published this technique in 2003, I did so with a clear conscience, believing that I was sharing a legitimate programming technique.

Second, the performance of dynamic SQL has improved since 2003. In those days, every expert I listened to urged me to avoid dynamic SQL as much as possible.

Third, performance is relative. A table scan against a table with millions of rows does not perform well in general. A table scan against a few thousand rows may be no big deal.

Fourth, dynamic SQL still suffers from a limitation it’s always had — the program doesn’t know it’s using the tables. In the examples given above, Display Program References (DSPPGMREF) tells me that the program that uses static SQL uses the customers file. DSPPGMREF against the dynamic version doesn’t. That means that your documentation package probably doesn’t know either.

And last, since IBM knows that some of us use this technique, perhaps they could make the optimizer look for it and optimize the query accordingly. The folks at IBM are smart (I mean this sincerely, not sarcastically), and if such is possible, they can find a way to make it happen.

I still don’t see anything wrong with this static SQL technique. To me, it’s one more tool in my toolbox. I leave it to you to decide when and where to use it, if at all.

3 thoughts on “Guru: Alternate SQL Row-Selection Criteria Revisited”

Christopher Burns says:

February 24, 2020 at 7:50 am

I use the switch technique (or a reasonable facsimile) all the time. Most notably, with custom built “Work With” type subfiles in a business application. I include a filter line for the user to narrow down the list. It has been very popular with my client base. And it involves a where clause made up of expressions such as these:

WHERE :c2custno = 0 or dbCustNo = :c2custno

So if the user leaves the customer number field on the filter line empty, the database row is selected. Otherwise the customer number in the row is compared to the filter.

Rob Berendt says:

February 24, 2020 at 9:25 am

Another technique I’ve seen is to always use the same where clauses but just stuff the ones you want all this time with loval and hival (IDK if you also have to state or is null).
This:
where
(
(:SelectAll = ‘1’)
OR
(:SelectByClass = ‘1’ AND cusclass = :iClass)
OR
(:SelectByZip = ‘1’ AND
cuszip BETWEEN :iFromZip AND :iThruZip)
);

Become:
where
((cusclass BETWEEN :iFromClass AND :iThruClass)
AND
(cuszip BETWEEN :iFromZip AND :iThruZip)
);
if you want all you just stuff low values into :iFromClass and iFromZip and high values into :iThruClass and :iThruZip
if you want a single class you stuff the same value in the from and to.

Betsy Robbins says:

February 25, 2020 at 10:56 am

Horrified? No, I’m delighted! I was in the process of writing a dynamic cursor in a place where I really thought it was overkill, when I saw this tip. Now I can easily choose between a handful of unprocessed records from a small table, or one specific unprocessed record from that table, based on whether the key fields were passed to the program. (I’m converting a batch program that runs once an hour for five months, to trigger and data-queue driven real-time processing; but I need to leave the batch option in place for now.) Thanks, Ted!

This Issue Sponsored By

Table of Contents

Content archive

Recent Posts

Subscribe

Pages

Search