Recursion and the Alternatives
March 9, 2005 Ted Holt
The code for this article is available for download.
Recursion is usually defined as the ability of a process (a program, a subprocedure, and so forth) to call itself. It is a fun topic to write about and a fun technique to use in programming. Recursion often simplifies the effort required to code a task in a computer language. However, recursion is not without its drawbacks. In this article, I hope to help readers determine when recursion is appropriate and to suggest alternatives to use when recursion is not suitable.
An Example of Recursion
To illustrate the concepts that I present here, I have created a miniature bill-of-materials database. Those who are not familiar with bill of materials (“bill” for short) might think of a bill as the list of ingredients in a recipe, but what you are making might be a can of soup or a car instead of a cake. I like to use bill of materials as an example because it is practical, unlike the factorial and exponentiation examples commonly found in computer science texts.
A bill of materials is recursive by nature, since a component of a manufactured good may consist of components of its own. A green salad consists of green leafy vegetables, toppings, and salad dressing. The salad dressing consists of oil, vinegar, spices, and the like. Some of the toppings, such as croutons, consist of more than one ingredient. One of them is bread, which is also made of various ingredients.
My example structure file, STRUCT, contains two fields–parent and component. In a real bill-of-materials application, the file would have other fields, but I wish to keep this example as simple as possible. Notice that some items are parents in some records and components in others.
PARENT COMPONENT A10 C32 A10 F98 A10 K22 A11 C31 A11 D13 A11 D18 A11 F23 A12 N45 A12 G81 A15 A19 A15 C97 A17 C31 A17 F98 A17 K23 A19 A12 A19 B17 G81 C32 G81 C97 Z09 A19 Z09 M35 Z09 C31
Look at the structure for A15. It contains an A19, which contains an A12, which contains a G81, which contains a C32.
Let’s add another file to the illustration. This is the ITEM file, which describes the items. In our example, there are only two fields–item number and item class.
ITEMNBR ITCLASS A10 20 A11 20 A12 20 A15 20 A17 20 A19 20 B17 05 C31 00 C32 00 C97 11 D13 05 D18 11 F23 11 F98 15 G81 11 K22 17 K23 15 M35 15 N45 17 Z09 20
For the program example, we are primarily interested in two classes. Class 00 indicates raw materials. Class 20 indicates items that we sell (end items). Logical file ENDITEM includes Class 20 items only.
A R ITEMREC PFILE(ITEM) A ITEMNBR A ITCLASS A K ITEMNBR A S ITCLASS COMP(EQ '20')
Finally, we’re ready for some program source code. Our challenge for today is to find the raw materials used in each item we sell. For items A10, A11, A17, and Z09, that’s easy, because their raw materials are found at the first level in the structure file. Or are they? It turns out that Z09 uses two raw materials, only one of which is at the first level. The raw materials for items A12, A15, and A19 are also buried in lower levels. How will we find them?
We can use recursion. The following RPG program has a recursive subprocedure, Chase, which follows a bill of materials down level to level until it finds a class 00 item.
The program reads the logical file of Class 20 items and calls the Chase routine for each item. Chase accepts two parameters–the end item and the parent item whose bill is to be exploded. When Chase finds a raw material, it writes a record of the end item and raw material to file ITEMSEARCH.
H dftactgrp(*no) actgrp(*new) option(*srcstmt: *nodebugio) FEndItem if e k disk prefix(EI_) rename(ItemRec:EIRec) FItem if e k disk prefix(I_) FStruct if e k disk prefix(S_) FItemSearcho e disk prefix(SRCH_) D Chase pr D EndItemNbr like(I_ItemNbr) value D Parent like(I_ItemNbr) value C dow '1' C read EndItem C if %eof() C leave C endif C callp Chase (EI_ItemNbr: EI_ItemNbr) C enddo C eval *inlr = *on P Chase b D Chase pi D EndItemNbr like(I_ItemNbr) value D Parent like(I_ItemNbr) value D SaveComponent s like(I_ItemNbr) C SaveKey klist C kfld Parent C kfld SaveComponent C Parent setll Struct C dow '1' C Parent reade Struct C if %eof() C return C endif C S_Component chain Item C if not %found() C return C endif C if I_ItClass = '00' C eval SRCH_Parent = EndItemNbr C eval SRCH_Component = I_ItemNbr C write SearchRec C return C endif C eval SaveComponent = S_Component C callp Chase (EndItemNbr: S_Component) C SaveKey setgt Struct C enddo P e
In order to understand the Chase routine, let’s follow what happens when the main loop reads item A19 from ENDITEM. The system activates the first instance of Chase, which I’ll call Chase 1, passing it the arguments A19, A19. Chase 1 begins reading the components of A19. The first one it finds is A12. Chase 1 activates another copy of itself–we’ll call it Chase 2–passing it the values A19, A12. Chase 2 reads the first component of A12–G81–and activates Chase 3 with the arguments A19, G81. Chase 3 reads the first component of G81, which is C32. Since C32 is a 00-class item, Chase 3 writes to ItemSearch and returns to Chase 2. Chase 2 reads the next component of A12, which is N45 and activates Chase 3 again. Since N45 has no components, control returns to Chase 2. Since A12 has no more components, control returns to Chase 1. Chase 1 reads the next component of A19, which is B17, and activates Chase 2. B17 has no components, so control returns to Chase 1. A19 has no more components, so the Chase routine is finished. The following list shows the activations of the Chase routine:
Level End item Parent Chase 1 A19 A19 Chase 2 A19 A12 Chase 3 A19 G81 Chase 3 A19 N45 Chase 2 A19 B17
Running this RPG program against the entire ENDITEM file produces the following output:
PARENT COMPONENT A10 C32 A11 C31 A12 C32 A15 C32 A17 C31 A19 C32 Z09 C32 Z09 C31
Efficient Database Retrieval
Recursive routines like Chase work well with small amounts of data, but give them a lot of work to do and they quickly run out of steam. The immense amount of I/O caused by repeatedly resetting the file pointer via the SETLL and SETGT operations slows the routine down considerably. I learned this lesson the hard way several years ago. Fortunately, I found a better alternative. It turns out that the database can find the components much more quickly. Here is the basic idea.
Begin by extracting the 00-class components at the first level. Then treat the non-00-class components of the first level as parents and look for their 00-class components. After that, take the non-00-class components at this second level, treat them as parents, and extract their 00-level components. Continue this process from level to level until there are no more non-00-class components to consider.
In this example, the first iteration would find the 00-class components of items A10, A11, A17, and Z09. The second iteration would find item A12, whose raw material is two levels down from the end item. The third iteration would find item A19, whose raw material is three levels from the end item. The last iteration finds raw material for items A15 and Z09, which have raw materials at the fourth level.
Here is the complete output:
PARENT COMPONENT A11 C31 A17 C31 Z09 C31 A10 C32 A12 C32 A19 C32 A15 C32 Z09 C32
If you compare this output to the output of the recursive routine, you will find the same records, but not in the same order. But that’s OK, because we’ve got plenty of ways to sort the result record set.
My version of this algorithm uses two work files. I use the first work file to hold the items that need to be exploded to the next structure level. In the second one I place the items that will have to be exploded at the next lower level. After each iteration, I move the contents of the second work file to the first work file. When the second file turns up empty, I know that there are no more levels of structure to look through.
Because it is so long, I do not include the source code of my routine here. To see how I implement this routine, take a look at ITEM02R in the downloadable code. I take some short cuts for performance reasons, but the basic idea is the same.
Iteration
The other alternative to recursion is iteration. I’ve heard several times that someone has proven that anything that can be done with recursion can also be done with iteration, but no one has ever told me who proved it. Let’s look at an example.
Suppose you need to multiply all the elements of an array, similar to the way the XFOOT op code adds the elements of an array. You can use recursion to carry out this task because the product of all the elements of an array is the first array element times the product of the remaining elements of the array. See the recursive ArrProduct subprocedure in the following example:
H dftactgrp(*no) actgrp(*new) D ArrProduct pr 8f D Array 5i 0 dim(10) D BgnElem 5u 0 value D EndElem 5u 0 value D A s 5i 0 dim(10) D B s 5i 0 dim(10) D C s 5i 0 dim(10) D Product s 8f /free A(1) = 1; A(2) = 2; A(3) = 3; A(4) = 4; A(5) = 5; Product = ArrProduct(A:1:5); B = 2; Product = ArrProduct(B:1:%elem(B)); C = 2; C(8) = *zero; Product = ArrProduct(C:1:%elem(C)); *inlr = *on; /end-free P ArrProduct b D ArrProduct pi 8f D Array 5i 0 dim(10) D BgnElem 5u 0 value D EndElem 5u 0 value /free if Array(BgnElem) = *zero; return *zero; elseif BgnElem = EndElem; return Array(BgnElem); else; return Array(BgnElem) * ArrProduct(Array: BgnElem+1: EndElem); endif; /end-free P e
But why not use a loop instead?
P ArrProduct b D ArrProduct pi 8f D Array 5i 0 dim(10) D BgnElem 5u 0 value D EndElem 5u 0 value D D Ndx s 5u 0 D Product s 8f /free Product = 1; for Ndx = BgnElem to EndElem; Product *= Array (Ndx); endfor; return Product; /end-free P e
Here’s another possibility for a looping solution:
P ArrProduct b D ArrProduct pi 8f D Array 5i 0 dim(10) D BgnElem 5u 0 value D EndElem 5u 0 value D D Ndx s 5u 0 D Product s 8f /free Product = 1; Ndx = *zero; dow Product <> *zero and Ndx < EndElem; Ndx += 1; Product *= Array (Ndx); enddo; return Product; /end-free P e
A loop will perform faster and use less resources than a recursive routine. Recursion is unnecessary and an inferior technique for solving this problem.
So When Do I Recur?
Consider using recursion when iteration would require arrays to store values for each occurrence of the loop. The Chase routine required repositioning of the STRUCT file upon entry to Chase and when calling the next level of Chase. Repositioning the file pointer correctly each time was successful because each occurrence of Chase had its own Parent parameter and SaveComponent local variable. To have used a loop would have required two arrays to store these values and a pointer to keep up with the active element of each array.
Consider using recursion if very little I/O is involved. The example RPG program reads the entire ENDITEM file, which might have thousands of records. If the program were rewritten to accept an end item number as an input parameter and return a component as an output parameter, and if the program were called only from an interactive inquiry program, very little I/O would take place and the routine would be better than the file-joining database technique.
I like recursion. I have liked it since I was introduced to it some twenty-plus years ago in computer science class. Recursion is a good technique, but it has its place. I don’t use recursion if I have a better alternative.
RELATED ARTICLE:
“Recursive Calls Using Subprocedures,” by Kevin Vandever
Ted Holt is managing technical editor of Four Hundred Guru. His favorite recursion exercise, which he wrote in Pascal in 1983, was a program that found all possible ways to place eight queens on a chessboard in such a way that no two queens were attacking one another.
Click here to contact Ted Holt by e-mail.