Guru: Refactoring RPG – GOTO

August 27, 2018 Ted Holt

When I first learned COBOL, I coded loops the way all the programmers in my shop did — with GO TO. Paragraph names were labels, not routines. Then I took a class in COBOL and learned structured programming. I’ve never looked back. I wish other people felt the same way, because I don’t like to work on GOTO-laden programs.

Injudicious use of branching — in RPG that would be the GOTO and CABxx op codes — is a major reason I refactor. GOTO plays havoc with program “logic”, a word I hesitate to use in this context. The minute someone asks me to insert a pop-up window or otherwise alter the flow of a program, the whole thing falls apart.

In the following paragraphs, I share a few ways to replace branching operations with more robust programming practice. I use RPG for my examples, but what I write applies to other languages, especially procedural languages like COBOL and CL.

1. Classic Patterns

A good place to begin is to look for classic patterns and replace branching with structured op codes. The classic patterns and their RPG op codes are:

Top-tested loop (DOW)
Bottom-tested loop (DOU)
Selection
- Simple selection (IF, IF/ELSE, IF/ELSEIF/ELSE).
- Case structure (SELECT)

I feel safe in saying that all modern languages support these patterns. I would add two more patterns to the list:

Counted loop (FOR, DO)
Middle-tested loop (Not implemented in RPG)

In general, branching backwards through the code is a loop; branching forward is selection.

Here’s a question for you. Which pattern does this fit?

C     CHGCOD        CABNE     1             TAG200
C                   MOVE . . .
C                   MOVE . . .
C                   ADD  . . .
C                   MOVE . . .
C                   GOTO      TAG220
C     TAG200        TAG
C                   MOVE . . .
C                   MOVE . . .
C                   SUB  . . .
C                   MOVE . . .
C     TAG220        TAG

If you said that this is selection or if/then/else or something similar, give yourself a point. It is indeed selection, and is easily refactored like this:

C     CHGCOD        IFEQ      1                                  
C                   MOVE . . .
C                   MOVE . . .
C                   ADD  . . .
C                   MOVE . . .
C                   ELSE                                         
C                   MOVE . . .
C                   MOVE . . .
C                   SUB  . . .
C                   MOVE . . .
C                   ENDIF

Change those MOVE’s, that ADD, and that SUB to EVAL’s and you’re only a step away from free-form code.

How about this one?

C                     Z-ADD1         NX
C           NXTOPT    TAG
C           OPT,NX    CASEQ'A'       ADDSR
C           OPT,NX    CASEQ'C'       CHGSR
C           OPT,NX    CASEQ'D'       DELSR
C                     END
C                     EXCPTE1
C                     ADD  1         NX
C           NX        CABLE12        NXTOPT

If you said that it’s a bottom-tested loop, give yourself a point. That answer is correct. However, a closer look shows that it’s a counted loop, with NX ranging from 1 to 12, so if you said that it’s a counted loop, give yourself two points.

C                   FOR       NDX = 1 to 12
C     OPT(NDX)      CASEQ     'A'           ADDSR
C     OPT(NDX)      CASEQ     'C'           CHGSR
C     OPT(NDX)      CASEQ     'D'           DLTSR
C                   END
C                   EXCEPT    E1
C                   ENDFOR

What pattern does this one fit?

C     RD100         TAG
C                   READ      QCUSTCDT                               44
C     *IN44         CABEQ     *ON           RD110
C                   MOVE      . . .
C                   MOVE      . . .
C                   MOVE      . . .
C                   MOVE      . . .
C                   EXCEPT    E1
C                   GOTO      RD100
C     RD110         TAG

Did you identify this as a middle-tested loop? The READ must be done at least once. The MOVE’s and EXCPT may not be done at all. RPG, CL, and COBOL programs are full of this type of loop because this is the way programmers think. This is why I add middle-tested loops to the list of classic patterns.

Within such loops I typically find I/O operations, especially EXFMT and the various READx op codes, and scanning operations, such as the SCAN op code, the %SCAN function, and the various table-and-array lookup functions and op codes. I have already written about the practicality of middle-tested loops and I won’t belabor the point.

This code easily converts like this:

C                   DOW       '1'
C                   READ      QCUSTCDT
C                   IF        %EOF()
C                   LEAVE
C                   ENDIF
C                   MOVE      . . .
C                   MOVE      . . .
C                   MOVE      . . .
C                   MOVE      . . .
C                   EXCEPT    E1
C                   ENDDO

The following table contains some common patterns in pseudocode and their structured equivalents.

Pattern	Structured programming equivalent
A: if condition goto B . . . goto A B:	top-tested loop (do while)
A: . . . if condition goto A	bottom-tested loop (do until)
A: . . . if condition goto B . . . goto A B:	middle-tested loop
A: . . . if condition goto A . . . if condition goto A . . . if condition goto A . . . goto A	any loop with ITER
. . . if condition goto A . . . if condition goto A . . . if condition goto A . . . A:	case structure (SELECT)
if condition . . . goto A if . . . . . . goto A . . . A:	case structure (SELECT)
if condition goto A . . . A:	if/then
if condition goto A . . . goto B A: . . . B:	if/then/else

2. Indiscriminate Branching

Not all code fits these patterns. After all, they don’t call it “spaghetti code” for nothing. Here’s another example. What would you do with this one?

C     OPTION        CABNE     '1'           TAG300
 . . .
C     TAG300        TAG

Would you convert this to an if/then structure? Not so fast! How many lines do the three dots represent? If only a few, then yes, such a conversion would probably be appropriate. If a lot, then maybe not, or at least not yet.

“A few” and “a lot” are subjective terms. I prefer to keep the number of lines between an IF and the corresponding ENDIF to what I can easily view on a screen, something like 25 or fewer, certainly no more than 50. (If that seems unrealistic, keep in mind that it’s a preference, not a hard-and-fast rule.) The same applies to loops.

If those three dots represent dozens or even hundreds of lines, I look for ways to reduce that number before tackling the GOTO. I might try to refactor that code into one or more routines, as I wrote about previously.

In this case, I might try to move the lines between the GOTO and TAG into a subroutine, like this:

C     OPTION        IFEQ      '1'
C                   EXSR      OPTION1
C                   ENDIF
 . . .
C     OPTION1       BEGSR
 . . . (the code that followed the CABNE goes here)
C                   ENDSR

Whether I can create such a subroutine or not depends on what’s in those lines of code. In most cases I can get away with this. However, one thing that tends to foul up the whole works is (as you may have guessed) GOTO. Let me expand the example slightly.

C     OPTION        CABNE     '1'           TAG300
C                   ADD       . . .
C                   MOVE      . . .
C     CTYPE         IFNE      1
C                   MOVE      . . .
C                   GOTO      SKIP1
C                   ENDIF
C                   MOVE      . . .
C                   MOVE      . . .
C     SKIP1         TAG
C                   MOVE      . . .
C                   MOVE      . . .
C                   MOVE      . . .
C     TAG300        TAG

There is another GOTO/TAG pair, but they are within the new routine. There are no branches to destinations before the first line (CABNE line) or after the last line (TAG300). If there are no branches to SKIP1 from elsewhere in the program, we can create a new subroutine.

C                   IF        OPTION = 1
C                   EXSR      OPTION1
C                   ENDIF
 . . .
C     OPTION1       BEGSR
C                   ADD       . . .
C                   MOVE      . . .
C     CTYPE         IFNE      1
C                   MOVE      . . .
C                   GOTO      SKIP1
C                   ENDIF
C                   MOVE      . . .
C                   MOVE      . . .
C     SKIP1         TAG
C                   MOVE      . . .
C                   MOVE      . . .
C                   MOVE      . . .
C                   ENDSR

But suppose that the code between the CABNE and the TAG does branch out of the range. Do we lose hope? No. Branching out of the range does not automatically prevent you from creating a routine. Consider this:

C     TAG120        TAG
 . . .
C     OPTION        CABNE     1             TAG300
C                   ADD       . . .
C                   MOVE      . . .
C     CTYPE         IFNE      1
C                   MOVE      . . .
C                   GOTO      TAG120
C                   ENDIF
C                   MOVE      . . .
C                   MOVE      . . .
C     TAG300        TAG

We have a problem: Control may branch to TAG120 from within the new subroutine, and branching in and out of subroutines is a no-no. Here’s an easy way around that.

D TAGNAME         S             10
 . . .
C     TAG120        TAG
 . . .
C                   IF        OPTION = 1
C                   EXSR      OPTION1
C     TAGNAME       CABEQ     'TAG120'      TAG120
C                   ENDIF
 . . .
C     OPTION1       BEGSR
C                   EVAL      TAGNAME = *BLANKS
C                   ADD       . . .
C                   MOVE      . . .
C     CTYPE         IFNE      1
C                   MOVE      . . .
C                   EVAL      TAGNAME = 'TAG120'
C                   LEAVESR
C                   ENDIF
C                   MOVE      . . .
C                   MOVE      . . .
C                   ENDSR

I’ve defined a 10-byte character variable called TAGNAME, and I use it to control branching after calling the subroutine. One TAGNAME variable is sufficient to handle all such cases in a program.

I’ve added a new first line to the subroutine to set the TAGNAME to blanks. If the OPTION1 subroutine needs to branch to TAG120, it loads the TAGNAME variable with an appropriate value (I use the name of the TAG itself) and exits the subroutine. Upon return to the caller, control branches as it always did.

While it might seem that I made no progress by removing one GOTO and adding one CABEQ, I have made progress by creating a new subroutine. Further refactoring may eventually eliminate the TAGNAME variable.

3. Purism Is Not Required

I’ve had success with these simple techniques. I’ve no doubt they will work for you, too. I don’t refactor to eliminate GOTO out of philosophical reasons. I eliminate GOTO when the use of GOTO prevents me from changing the program source code to handle new requirements. Otherwise, I leave it alone. You don’t have to remove all GOTO’s from working code. They won’t breed, as rats or roaches would.

One thought on “Guru: Refactoring RPG – GOTO”

Hassan Farooqi says:

August 28, 2018 at 4:01 pm

Back in 1979 when I was programming in Cobol on IBM 370, I read the book “Structured Programming in Cobol”, and never used GO TO again. Replaced it with a variety of PERFORM statements. You have put it beautifully, so a man with average programming aptitude can also understand.

This Issue Sponsored By

Table of Contents

Content archive

Recent Posts

Subscribe

Pages

Search