Guru: Faster Fetching, Revisited
November 13, 2017 Paul Tuohy
While visiting a client recently, I was shown how they were using a multi-row FETCH into a multiple-occurrence data structure to retrieve large sets of data. When I asked why they were not using a data structure array instead, I was referred to an IT Jungle article in which Ted Holt answered a question on whether it was faster to perform a row-at-a-time FETCH or a multi-row FETCH when using embedded SQL.
Ted provided a comparison between a row-at-a-time FETCH, a multi-row FETCH into a data structure array, and a multi-row FETCH into a multiple-occurrence data structure. At the end of the article, Ted concluded, “The file with which I tested had 1.3 million records. The first two programs used 11 seconds of CPU time. However, the last one used only seven seconds of CPU time. Using an array data structure was no faster than the single-row fetch. The multiple-occurrence data structure was faster than the array data structure.”
Loath as I am to disagree with Ted, his conclusion did not correspond with my experience. I would expect the data structure array and the multiple-occurrence data structure to be comparable in performance. (Underneath the covers, it should be the exact same process for both.) And I would expect both of them to be faster than a row-at-a-time FETCH.
I copied the programs from the original article, removed the logic for printing details, and added logic to calculate the elapsed time (in microseconds, the unit returned by %diff with *MS) between declaring the cursor and closing the cursor. All three programs ran over a table with the same format and the same number of rows.
This is the code for the program that performed a row-at-a-time FETCH.
ctl-Opt option(*srcstmt:*nodebugio);

dcl-Ds Transaction qualified inz;
   Batch     packed(3:0);
   Code      char(2);
   Order     char(7);
   Item      char(15);
   Warehouse char(3);
   Quantity  packed(11:3);
end-Ds;

dcl-C SqlEof const('02000');

dcl-S DetailLine Char(132);
dcl-S fromstamp  Timestamp;
dcl-S tostamp    Timestamp;
dcl-S diff       Int(10);

exec sql set option commit = *none;

*inlr = *on;
fromstamp = %timestamp();

exec sql declare Input cursor for
            select batch, tcode, ordno, itnbr, house, trqty
              from transaction
             order by key;

exec sql open Input;

dow '1';
   exec sql fetch next from Input into :Transaction;
   if sqlstt >= SqlEof;
      leave;
   endif;
enddo;

exec sql close Input;

tostamp = %timestamp();
diff = %diff(toStamp: fromStamp: *ms);
dsply ('Single: ' + %char(diff));

return;
This is the equivalent program using a multi-row FETCH into a data structure array.
ctl-Opt option(*srcstmt:*nodebugio);

dcl-Ds Transaction qualified inz dim(1000);
   Batch     packed(3:0);
   Code      char(2);
   Order     char(7);
   Item      char(15);
   Warehouse char(3);
   Quantity  packed(11:3);
end-Ds;

dcl-S getRows Int(10) inz(%elem(transaction));

dcl-C SqlEof const('02000');

dcl-S DetailLine Char(132);
dcl-S fromstamp  Timestamp;
dcl-S tostamp    Timestamp;
dcl-S diff       Int(10);

exec sql set option commit = *none;

*inlr = *on;
fromstamp = %timestamp();

exec sql declare Input cursor for
            select batch, tcode, ordno, itnbr, house, trqty
              from transaction
             order by key;

exec sql open Input;

dou sqlstt >= SqlEof;
   exec sql fetch next from Input for :getRows rows into :Transaction;
enddo;

exec sql close Input;

tostamp = %timestamp();
diff = %diff(toStamp: fromStamp: *ms);
dsply ('Array: ' + %char(diff));

return;
This is the equivalent program using a multi-row FETCH into a multiple-occurrence data structure.
ctl-Opt option(*srcstmt:*nodebugio);

dcl-Ds Transaction qualified inz occurs(1000);
   Batch     packed(3:0);
   Code      char(2);
   Order     char(7);
   Item      char(15);
   Warehouse char(3);
   Quantity  packed(11:3);
end-Ds;

dcl-S getRows Int(10) inz(1000);

dcl-C SqlEof const('02000');

dcl-S DetailLine Char(132);
dcl-S fromstamp  Timestamp;
dcl-S tostamp    Timestamp;
dcl-S diff       Int(10);

exec sql set option commit = *none;

%occur(transaction) = 1;

*inlr = *on;
fromstamp = %timestamp();

exec sql declare Input cursor for
            select batch, tcode, ordno, itnbr, house, trqty
              from transaction
             order by key;

exec sql open Input;

dou sqlstt >= SqlEof;
   exec sql fetch next from Input for :getRows rows into :Transaction;
enddo;

exec sql close Input;

tostamp = %timestamp();
diff = %diff(toStamp: fromStamp: *ms);
dsply ('MODS: ' + %char(diff));

return;
I called the three programs multiple times and in different combinations. There were always minor differences on each call, but the duration differences between the three were always similar to these:
Method | Time (microseconds) |
Single | 14,161,000 |
Array | 1,835,000 |
MODS | 1,842,000 |
Both multi-row FETCH methods were always approximately eight times faster than a single-row FETCH. The difference between the FETCH into a multiple-occurrence data structure and the FETCH into a data structure array was negligible, and easily accounted for by the fact that my system was doing other things. Although the data structure array program was usually faster (by a small margin), in a few of the tests the multiple-occurrence data structure program was the faster of the two.
So, it would appear that a multi-row FETCH is significantly faster than a row-at-a-time FETCH. Given the choice between a data structure array and a multiple-occurrence data structure, I will opt for the data structure array as the easier of the two to manage and to comprehend.
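For completeness: the test programs above discard the rows they fetch. In a real program each block would be processed before the next FETCH. A minimal sketch of that pattern, reusing the declarations from the data structure array program (rowsFetched, i, and the processing loop are additions for illustration, not part of the original tests):

```rpg
dcl-S rowsFetched Int(10);
dcl-S i           Int(10);

exec sql open Input;

dou sqlstt >= SqlEof;
   exec sql fetch next from Input for :getRows rows into :Transaction;

   // ROW_COUNT reports how many rows this FETCH actually returned;
   // the final block is usually a partial one.
   exec sql get diagnostics :rowsFetched = ROW_COUNT;

   for i = 1 to rowsFetched;
      // process Transaction(i) here
   endfor;
enddo;

exec sql close Input;
```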
Editor’s Note:
I am grateful to Paul Tuohy for revisiting this subject.
I did not mean for my earlier article to be taken as authoritative. I was asked a question, and I expressed my opinion that multiple-row fetch was faster than single-row fetch, even though I had no statistics to back up that opinion. I shared the results of one test that I had run, hardly a conclusive finding.
To anyone who uses multiple-occurrence data structures for multi-row fetch because of my article, I say that what you’re doing is not wrong and I wouldn’t change existing programs. Like Paul, I prefer data structure arrays to multiple-occurrence data structures, and I would recommend you use data structure arrays also.
–Ted
Hi Ted,
Instead of a loop, can you simply do one fetch and specify “fetch first 1000 rows only”?
Glenn
Yes, Glenn, but only if you won’t need more than 1000 rows in the result set.
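Glenn’s suggestion can be sketched by adding the FETCH FIRST clause to the cursor from the programs above, after which a single multi-row FETCH (no loop) reads the whole capped result set:

```rpg
exec sql declare Input cursor for
            select batch, tcode, ordno, itnbr, house, trqty
              from transaction
             order by key
             fetch first 1000 rows only;

exec sql open Input;

// One FETCH suffices: the cursor can never return more than 1000 rows,
// which matches the dimension of the Transaction array.
exec sql fetch next from Input for :getRows rows into :Transaction;

exec sql close Input;
```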
I can easily use the MODS subfields in any other embedded SQL statement; I can’t do this with data structure array elements, as the precompiler doesn’t seem to handle array subscripts in host variables. Unless there was an enhancement I missed, this seems to be a large advantage for the MODS over data structure arrays.
If I want a fixed/max number of records where that number is not too high, I will frequently do a single fetch with “for x rows” into an array DS and then loop through the array, as opposed to looping through multiple fetches.
I have a question – how does the value chosen for “Exec SQL Set Option ALWBLK = ???” affect the results, if at all?
Additionally, once you get your data into the data structure array, you can perform array operations on it – SORTA, for example, to reorder the data in a subfile based on a column the user clicked on.
D.J.
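D.J.’s SORTA point can be sketched as follows, using the Transaction array from the programs above (rowsFetched, holding the number of rows actually fetched, and the choice of sort keys are arbitrary, for illustration only):

```rpg
// Sort the fetched rows ascending by item number.
// %subarr limits the sort to the elements actually filled by the FETCH,
// so unused trailing elements are left untouched.
sorta %subarr(Transaction(*).Item: 1: rowsFetched);

// Or descending by quantity:
sorta(d) %subarr(Transaction(*).Quantity: 1: rowsFetched);
```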