Guru: A Philosophically Engineered Approach to the Processing of Parameters, Take Two
February 1, 2021 Ted Holt
A strange thing happened to me recently. I was writing a new program and, like a good programmer, was not reinventing the wheel. I was calling a utility program that calculated the values I needed. However, this utility program, which had always worked correctly, was giving me invalid data. How is it possible that a program can work properly for a long time and suddenly go bad?
Rick Cook ably answered this question when he wrote, “Programming today is a race between software engineers striving to build bigger and better idiot-proof programs, and the Universe trying to produce bigger and better idiots. So far, the Universe is winning.” In this case, I was the idiot. I’ll come back to this in a bit.
But first, are you up for a challenge? Look at the following two-column, two-row table.
First | Second |
1 | 4 |
2 | 6 |
What values will be in the table after executing the following statement?
update ThisTable set First = Second / 2, Second = First * 2
I’ll come back to this later. Let’s get to the subject at hand — using parameters in programs.
There are at least two ways to classify parameters. Here’s one way:
- Input: The value of the parameter upon entry matters and we will not change it.
- Output: The value of the parameter upon entry does not matter and we will change it.
- Input-output: The value of the parameter upon entry matters and we may or may not change it.
Here’s another:
- Required: Failure of a caller to pass a value to this parameter is a hard (terminal) error.
- Optional: If the caller does not pass a value to this parameter, we will not read or change this parameter. If the parameter is an input or input-output parameter, we will supply a default value.
As I wrote in December, some years ago I developed a safe way to process parameters. I suggest you read that article before continuing.
The thrust of the article is that each program should have an initialization routine that:
- Verifies that all required parameters have been passed to the program
- Makes working copies of the parameters
- Loads default values into the working copies of the optional parameters that were not passed
- Validates the value of each parameter if possible
- Sends a diagnostic message to the caller for each parameter violation it finds
- Sends an escape message to the caller if at least one parameter error was found
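The steps above can be sketched roughly as follows. This is only an illustration, not the code from the December article: the parameter names inCustomer and inAsOfDate, the work field wrkAsOfDate, and the ErrorCount counter are hypothetical, and the sending of diagnostic and escape messages is left as a comment. The checks are coded in the main logic rather than in a subprocedure because %parms, when used inside a subprocedure, reports the parameters passed to that subprocedure and not those passed to the program.

```rpgle
**free
ctl-opt actgrp(*caller);

// Hypothetical interface: one required and one optional input parameter.
dcl-pi *n;
   inCustomer   packed(7)   const;
   inAsOfDate   date        const options(*nopass);
end-pi;

dcl-s wrkAsOfDate   like(inAsOfDate);   // working copy of the optional parameter
dcl-s ErrorCount    int(10);

// Verify that the required parameter was passed.
if %parms() < %parmnum(inCustomer);
   ErrorCount += 1;
else;
   // Validate the value where possible.
   if inCustomer <= 0;
      ErrorCount += 1;
   endif;
endif;

// Copy the optional parameter if it was passed; otherwise load a default.
if %parms() >= %parmnum(inAsOfDate);
   wrkAsOfDate = inAsOfDate;
else;
   wrkAsOfDate = %date();               // assumed default: today's date
endif;

if ErrorCount > 0;
   // A real program would send a diagnostic message for each violation
   // and then an escape message to the caller.
   return;
endif;

// ... the main logic uses inCustomer and wrkAsOfDate from here on ...
return;
```

From this point on, the rest of the program never touches inAsOfDate, so it does not matter whether the caller passed it.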
I neglected to mention two other important bits of information.
- It is not necessary to make working copies of required input parameters. You can if you want to.
- Input-output parameters have two values — the original (unchanged) value from the parameter itself or from the default value, and the current value in the working copy.
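To make the second point concrete, here is a minimal sketch of an input-output parameter handled through a working copy. The AddInterest procedure, its parameters, and the Balance variable are all hypothetical names chosen for illustration.

```rpgle
**free
ctl-opt dftactgrp(*no) actgrp(*new);

dcl-s Balance packed(9: 2) inz(100.00);

AddInterest(Balance: 5);

*inlr = *on;
return;

// Hypothetical procedure: ioBalance is input-output, inRate is input only.
dcl-proc AddInterest;
   dcl-pi *n;
      ioBalance   packed(9: 2);
      inRate      packed(5: 2) const;
   end-pi;

   dcl-s wrkBalance like(ioBalance);

   wrkBalance = ioBalance;                    // working copy starts at the original value
   wrkBalance += ioBalance * inRate / 100;    // ioBalance still holds the original value
   ioBalance = wrkBalance;                    // store the result once, just before returning
end-proc;
```

Until the last assignment, the original value remains available in ioBalance and the current value lives in wrkBalance.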
I illustrated with a program that used three input parameters, one required and two optional. I mentioned input-output and output parameters, but I didn’t illustrate them. Today I want to fill in that gap.
Assume a business has a table of sales promotions.
Column | Data type | Comment |
ID | character 8 | Primary key |
StartDate | packed decimal 7,0 | CYYMMDD format |
EndDate | packed decimal 7,0 | CYYMMDD format |
Here’s the data:
ID | STARTDATE | ENDDATE |
NEWYEAR | 1210101 | 1210114 |
TAKE5 | 1210101 | 1210128 |
JP | 1210111 | 1210116 |
Here’s a program that other programs can call to retrieve the promotion data. The only input parameter is the promotion ID for which the data is to be returned. The other parameters are output only. The start and end dates are returned in the date data type and the duration of the campaign is calculated in days. If the requested promotion is not found in the table, the program returns a duration of zero.
**free
ctl-opt actgrp(*caller);

dcl-pi GETPROMOT;
   inPromotion   char (8)    const;
   ouStartDate   date;
   ouEndDate     date;
   ouDuration    packed (3);
end-pi;

dcl-f Promotions disk usage(*input) keyed;

chain inPromotion Promotion;
if %found();
   ouStartDate = %date(StartDate: *cymd);
   ouEndDate = %date(EndDate: *cymd);
   ouDuration = %diff(ouEndDate: ouStartDate: *days) + 1;
else;
   clear ouDuration;
endif;
return;
This is the sort of routine I was speaking about in the first paragraph, the type of code that runs for years without problem and suddenly breaks.
Here’s a stripped-down calling program for illustration only.
**free
ctl-opt actgrp(*new);

dcl-s StartDate     date;
dcl-s EndDate       date;
dcl-s Days          packed (3);
dcl-s PromotionID   char (8);

dcl-pr GETPROMOT extpgm;
   inPromotion   char (8)    const;
   ouStartDate   date;
   ouEndDate     date;
   ouDuration    packed (3);
end-pr GETPROMOT;

*inlr = *on;
PromotionID = 'TAKE5';
GETPROMOT (PromotionID: StartDate: EndDate: Days);
return;
You will probably not be surprised if I tell you that StartDate, EndDate, and Days take the values 2021-01-01, 2021-01-28, and 28, respectively.
Now look at this caller and guess what the returned values will be.
**free
ctl-opt actgrp(*new);

dcl-s Dummy         date;
dcl-s Days          packed (3);
dcl-s PromotionID   char (8);

dcl-pr GETPROMOT extpgm;
   inPromotion   char (8)    const;
   ouStartDate   date;
   ouEndDate     date;
   ouDuration    packed (3);
end-pr GETPROMOT;

*inlr = *on;
PromotionID = 'TAKE5';
GETPROMOT (PromotionID: Dummy: Dummy: Days);
return;
If you guessed that Dummy is 2021-01-28 and Days is 1, give yourself a round of applause. This is the problem to which I referred in the first paragraph. I was only interested in the duration, not the starting and ending days, so I passed the same dummy variable into both date fields.
The duration was improperly calculated because the programmer used the output parameters, not copies.
ouDuration = %diff(ouEndDate: ouStartDate: *days) + 1;
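The same effect can be reproduced without a database file. The following is a minimal sketch, not part of the original programs; the SetAndAdd procedure and the variables Same and Total are hypothetical. Because RPG passes these parameters by reference, passing one variable twice makes two parameter names refer to the same storage.

```rpgle
**free
ctl-opt dftactgrp(*no) actgrp(*new);

dcl-s Same    packed(3);
dcl-s Total   packed(3);

SetAndAdd(Same: Same: Total);
// Both of the first two parameters refer to Same, so Total is 4, not 3.

*inlr = *on;
return;

// Hypothetical procedure with three output parameters.
dcl-proc SetAndAdd;
   dcl-pi *n;
      ouFirst    packed(3);
      ouSecond   packed(3);
      ouTotal    packed(3);
   end-pi;

   ouFirst  = 1;
   ouSecond = 2;                    // also changes ouFirst when both share storage
   ouTotal  = ouFirst + ouSecond;   // reads the output parameters back: the flaw
end-proc;
```

Just as in the promotion program, the flaw is that an output parameter is read back after another assignment may have changed it.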
Here’s the same called program with working copies of the output variables.
**free
ctl-opt actgrp(*caller);

dcl-pi GETPROMOT;
   inPromotion   char (8)    const;
   ouStartDate   date;
   ouEndDate     date;
   ouDuration    packed (3);
end-pi;

dcl-f Promotions disk usage(*input) keyed;

dcl-s wrkStart      like(ouStartDate);
dcl-s wrkEnd        like(ouEndDate);
dcl-s wrkDuration   like(ouDuration);

chain inPromotion Promotion;
if %found();
   wrkStart = %date(StartDate: *cymd);
   wrkEnd = %date(EndDate: *cymd);
   wrkDuration = %diff(wrkEnd: wrkStart: *days) + 1;
endif;

ouStartDate = wrkStart;
ouEndDate = wrkEnd;
ouDuration = wrkDuration;
return;
This version has a work variable for each output parameter. All calculations are done with the work variables. The output parameters are not referenced until just before the return to the caller. I chose not to define a work variable for the first parameter. This version works properly with both of the example calling programs. Another idiot has met his match!
Do you remember the SQL question I asked above? Here’s the answer.
First | Second |
2 | 2 |
3 | 4 |
How did you do? Do you see that IBM has taken a similar approach in the UPDATE statement? The values of FIRST and SECOND are obviously saved in some sort of buffer and are not operated upon directly. How do I know? Because the calculation that changes SECOND does not use the changed value of FIRST, and also because Kent Milligan told me several years ago that the input values of the columns are saved in a buffer. The new values of the columns are not loaded until the rewrite to the table.
It may seem like a lot of trouble to define and use work variables instead of directly accessing the parameters, but I don’t think it is. Peace of mind is worth a lot.
RELATED STORIES
A Philosophically Engineered Approach to the Processing of Parameters