Closing the Gaps
January 12, 2011 Hey, Ted
We have a database file to which records are added throughout the day. The key is a five-digit packed decimal whole number, and the program that writes to the file increments this number when adding a new record to the file. During the day, another program processes and deletes records throughout the file, leaving gaps between sequence numbers. At the end of each day, we would like to renumber the remaining records, beginning again at sequence number one, to remove the gaps and be prepared for a fresh start the next day. Can we use an SQL UPDATE statement to renumber the sequence numbers? –Brian Sure, it’s possible, but with one UPDATE command? That depends. Let’s consider some possibilities. First, let’s set the stage. Assume 50 records were written to the file during the day, and all but four–sequence numbers 1, 4, 12, and 17–were processed and deleted from your file. If the arrival sequence coincides with the sequence number sequence, you can probably get away with one UPDATE. (I can’t take credit for this technique. I learned it from Joe Celko. Visit this site and search for CloseMotorpoolGaps.) D ReSeqUpd1 pr extpgm('RESEQUPD1') D ouStatus 8a D ReSeqUpd1 pi D ouStatus 8a D SqlEof c const('02000') D AllOK c const('00-00000') D UnexpectedError... D c const('99-99999') /free *inlr = *on; ouStatus = AllOK; monitor; exec sql update reseq as a set a.sequence = (select count(b.sequence) from reseq as b where b.sequence <= a.sequence); if SqlState > SqlEof; ouStatus = '10-' + SqlState; return; endif; on-error; ouStatus = UnexpectedError; endmon; return; I say probably, because there is no guarantee the system will process the records in arrival sequence. However, since the entire file will be updated (i.e., there is no WHERE clause), chances are pretty close to certain that the system will update in arrival sequence. Suppose, however, that the sequence of the records according to key does not match arrival sequence, as could occur if you’re reusing deleted records. What then? Let’s say that the remaining records, in arrival sequence, have sequence numbers 17, 12, 4, and 1. When the system tries to assign 4 to the first record (sequence number 17), it chokes with SQL state 23505 (A violation of the constraint imposed by a unique index or a unique constraint occurred) because there is already another row with a key value of 4. In this case, you could update using a two-step process:
Here are a couple of possibilities. In this version, I add the number of active records in the file to the sequence numbers before re-sequencing them. D RowCount s 10u 0 D ReSeqUpd2 pr extpgm('RESEQUPD2') D ouStatus 8a D ReSeqUpd2 pi D ouStatus 8a D SqlEof c const('02000') D AllOK c const('00-00000') D UnexpectedError... D c const('99-99999') /free *inlr = *on; ouStatus = AllOK; monitor; exec sql select count(*) into :RowCount from reseq; if RowCount <= *zero; return; endif; if SqlState > SqlEof; ouStatus = '10-' + SqlState; endif; if SqlState >= SqlEof; return; endif; exec sql update reseq as a set a.sequence = a.sequence + :RowCount; if SqlState > SqlEof; ouStatus = '20-' + SqlState; return; endif; exec sql update reseq as a set a.sequence = (select count(b.sequence) from reseq as b where b.sequence <= a.sequence); if SqlState > SqlEof; ouStatus = '30-' + SqlState; return; endif; on-error; ouStatus = UnexpectedError; endmon; return; After the first update, the rows have sequence numbers 21, 16, 8, and 5. After the second update, the sequence numbers are 4, 3, 2, and 1, of course. Here’s a similar version, built on the assumption that sequence numbers are always positive. In the first update, subtract the highest sequence number in use in order to convert the key values to numbers less than 1. D MaxSequence s 5p 0 D ReSeqUpd2 pr extpgm('RESEQUPD2') D ouStatus 8a D ReSeqUpd2 pi D ouStatus 8a D SqlEof c const('02000') D AllOK c const('00-00000') D UnexpectedError... D c const('99-99999') /free *inlr = *on; ouStatus = AllOK; monitor; exec sql select max(Sequence) into :MaxSequence from reseq; if SqlState > SqlEof; ouStatus = '10-' + SqlState; endif; if SqlState >= SqlEof; return; endif; exec sql update reseq as a set a.sequence = a.sequence - :MaxSequence; if SqlState > SqlEof; ouStatus = '20-' + SqlState; return; endif; exec sql update reseq as a set a.sequence = (select count(b.sequence) from reseq as b where b.sequence <= a.sequence); if SqlState > SqlEof; ouStatus = '30-' + SqlState; return; endif; on-error; ouStatus = UnexpectedError; endmon; return; After the first update, the rows have key values of 0, -5, -13, and -16. After the second update, the key values are 4, 3, 2, and 1, as before. Of course, if you reorganized the file, putting the records back into key sequence, the first example would work and you wouldn’t need two updates. RGZPFM FILE(RESEQ) KEYFILE(*FILE) If UPDATE had an ORDER BY clause (but it doesn’t, and I doubt it ever will), the first version would work, no matter the physical sequence of the records. –Ted
|
I was wrong. UPDATE has an ORDER BY clause now.
https://www.itjungle.com/2018/01/08/guru-update-conundrum-finally-solved/