Guru: Continue A Sequence When Inserting
January 15, 2018 Ted Holt
Carlos writes, “Hey, Ted! I have a question regarding inserting rows into a file that is keyed on a sequence number. I need to insert more rows, and I need the new rows to have the next available sequence numbers. Can I perform this task with an SQL INSERT statement, or do I have to use record-level access?”
If the sequence number were defined as an identity column, Carlos would have no problem. Unfortunately, the sequence number column is a simple numeric field. Fortunately, I was able to give Carlos two solutions. It would not surprise me if you can suggest others.
Let’s set up the problem. The table (physical file) into which Carlos wants to insert the new rows — let’s call it FILEA — looks something like this:
Sequence | DataA |
1 | A |
2 | B |
3 | C |
4 | D |
5 | E |
The table with the new data — FILEB — looks like this:
DataB |
M |
N |
O |
Carlos wanted to insert three new rows beginning with sequence number 6. The easy way to do this is to use a function I wrote a long time ago — the INCR function. It’s a simple RPG routine that returns an integer value that is one greater than the previous value.
insert into filea select incr(5), field3 from fileb order by datab
Now FILEA looks like this:
Sequence | DataA |
1 | A |
2 | B |
3 | C |
4 | D |
5 | E |
6 | M |
7 | N |
8 | O |
If you don’t want to use my function, you can use IBM’s code instead. Create a sequence. I’ll unimaginatively call it SEQ1.
create or replace sequence mylib.seq1 as dec (3) start with 6
Use the NEXT VALUE expression to retrieve the next sequence number from SEQ1.
insert into filea select next value for seq1, datab from fileb order by datab
If it were not for one little-bitty limitation of CREATE SEQUENCE and ALTER SEQUENCE, I would not have mentioned my INCR function. That limitation is this: the START WITH value must be a numeric literal. I tried to retrieve the largest sequence number from FILEA into a variable and set the sequence to the next integer value, but no soap. However, I was able to make it work with INCR. Here’s a dynamic compound statement that worked properly.
begin declare NextSeq integer; values (select max(field1) from filea) into NextSeq; insert into filea select incr(NextSeq), datab from fileb order by datab; end
I’m very grateful for the functionality that IBM packs into the world’s best relational database management system. I’m also grateful that I can write my own code when necessary. We have the best of both worlds.
–Build Temp Tables
Declare Global Temporary Table FileA (
Sequence Numeric(9, 0) ,
DataA Char(10)
) With Replace;
Declare Global Temporary Table FileB (
DataB Char(10)
) With Replace;
–Load Tables
Insert Into QTemp.FileA Values
(1, ‘A’),
(2, ‘B’),
(3, ‘C’),
(4, ‘D’),
(5, ‘E’);
Insert Into QTemp.FileB Values
(‘M’),
(‘N’),
(‘O’);
–Check your data
Select * From QTemp.FileA;
–Run the insert with Row_Number
Insert Into QTemp.FileA
(Select Row_Number ()
Over (Order by (DataB)) + (Select Count(*)
From QTemp.FileA),
a.*
From QTemp.FileB a);
–Validate Result
Select * From QTemp.FileA;
Hey Ted. Great topic. Assuming FILEA has a unique key of SEQUENCE, I’m wondering what would happen if two or more of these insert jobs were running at the same time (each using a different input file of course) and they tripped over each other when writing to FILEA. Is it possible to modify the compound statement to use the next available number? Would you have to change the one INSERT statement to use a cursor?
Would a CTE help to do this?
Hey Ted,
You could also use the row_number function along with a little subquery.
Insert into filea
select row_number( ) over () + (select max(Sequence) from filea), datab
from fileb
order by datab;
This “seeds” the row number with the highest Sequence number in filea giving you the incremented numbers you are wanting from the row_number function for the insert.
Of course no other insert operation could run at the same time.
Same idea as Barry’s just a little different in little different in the subquery.
Glenn, when using a sequence, each job get’s it’s own set of numbers. You can adjust how big that set is when you create the sequence. The result is that sequence numbers may not be inserted sequentially if multiple jobs are pounding on it. By default each job caches 20 values on first access, so Job1 gets sequence numbers 1-20, job2 get 21-40, and job3 gets 41-60. If these three jobs then start inserting records rapid fire you could get sequence numbers added like 1, 41, 21, 22, 42, 43, 2, 23, 3, 44, etc. The jobs don’t step on each other, and they don’t contend for data area locks unless they all run through their cache at the same time, but the sequence numbers are not sequential, and they could have gaps of up to 19 numbers if a job grabs a set of numbers and only uses one of them before it ends. You can also set the cache to 1 value, but then you would have contention for the data area. This wouldn’t be an issue if the file contention you would get by grabbing the largest number in the file each time wasn’t an issue.