A Reusable Routine for Doubly-Linked Lists, Part 1
January 19, 2011 Miguel Ortiz Martín and Ted Holt
In Implementing Linked Lists in RPG, Ted Holt explained that a linked list consists of a sequence of data records, or nodes, where each node keeps a reference–a pointer–to the next node in the sequence. This sequence might or might not be sorted. Due to the fact that each node points only to the next node in the sequence, this kind of list is also called a singly-linked list. Doubly-linked lists differ from singly-linked lists in that each node of the list keeps pointers to both the previous and the next nodes in the sequence. Whereas a singly-linked list can only be navigated from beginning to end, a doubly-linked list can be navigated in both directions. A doubly-linked list can be viewed as two singly-linked lists formed from the same data items in opposite orders. When Ted wrote the example programs for his linked list and binary tree articles, he wrote the minimal amount of code needed to illustrate the concepts. In this article, Miguel Ortiz Martín presents a more practical approach. First, Miguel’s code consists of a module of reusable code. The module has routines that create, delete, and otherwise manipulate doubly-linked lists. Whereas Ted’s routines have to be modified for each different application, Miguel’s routines take care of all the details, no matter what type of data the nodes contain. Second, Miguel’s lists use sentinel nodes. Use of sentinel nodes greatly simplifies the list-manipulation routines, as you will see. Let’s take a look at how this reusable code is put together. External Storage When designing a linked list–either singly- or doubly-linked–one important design fact is how to store data within nodes. One option is to directly store data within the node itself. This option is called the internal storage model. This is the approach Ted used in his articles. The other option is that nodes merely store a reference to the data. This option is called the external storage model. Two advantages of the internal storage model are that memory management is simpler and data access is more efficient. The external storage model, on the other hand, has the advantage of being more generic, which makes it easy to build generic linked lists for any kind of data. Under the external storage model, a doubly-linked list node looks like this: d node_t ds Qualified d Based( typedef ) d data * d prev * d next * The fact that the node contains only pointers is the reason that a node can “contain” any type of data. (The node doesn’t really contain data; it points to data stored elsewhere in memory.) And the fact that a node can contain any type of data is the reason that Miguel’s routines are generic and reusable. A procedure that creates a node must first allocate enough memory for both the node itself and its data. The Append routine, which adds a new node to the end of the list, is a good example. p Append b d pi d prev_p * Value d elem_p * Value d elemsize 10i 0 Const //local variables d prev_node ds Likeds( node_t ) d Based( prev_p ) d this_node ds Likeds( node_t ) d Based( this ) /free //allocate memory required to hold the new node's info & data node_p = %alloc( %size( this_node )); node.data = %alloc( elemsize ); //store the node's data memcpy( this_node.data : elem_p : elemsize ); //rearrange pointers prev_node.next = this; this_node.prev = prev_p; this_node.next = prev_node.next; /end-free p e The Append routine begins by allocating two areas of memory: one for the node, and one for the data that is associated with the node. Memory copy (memcpy) is an MI routine that comes in handy when dealing with data of an unspecified type. After the node has been created and the data stored, the pointers must be set so that the previous node points to the new node and vice versa. Note that the procedure receives an extra parameter with the length of the list’s elements. This extra parameter is necessary because RPG pointers do not keep information about the data to which they point. That is, they don’t know the length of the data, so the calling routine must tell the procedure the amount of memory required to hold the data. In the same way, a procedure that deletes a node must de-allocate both the node and the associated data. p Delete b d pi d prev_p * Value d elem_p * Value //local variables d prev_node ds Likeds( node_t ) d Based( prev_p ) d this_node ds Likeds( node_t ) d Based( this ) /free //rearrange pointers prev_node.next = this_node.next; dealloc(N) this_node.data; dealloc(N) this; /end-free p e Sentinel Nodes Sentinel nodes are nodes that do not hold data, but only pointers to next and/or previous nodes. Placing sentinel nodes at the beginning and ending of a linked list ensures that every node in the list has a next and previous node. The fact that every node in the list, including the first and the last nodes, has both a predecessor and a successor simplifies list operations, such as searching for an element and traversing the list. On the other hand, adding sentinel nodes complicates other operations like creating a new empty list. Creating sentinel nodes is almost like creating nodes that have data. The only differences are: 1) no memory for data is assigned; and 2) the addresses of the sentinel nodes are stored within the list definition. An implementation for a doubly-linked list’s list definition in RPG should look like this: d list_t ds Qualified d Based( typedef ) d head_sentinel... d * d tail_sentinel... d * d numelems 10i 0 d elemsize 10i 0 Note: Several fields will be added later for other purposes. To create an empty doubly-linked list, begin by creating both sentinel nodes, set the next and previous pointers to link the nodes, and allocate the memory required to store the list’s header definition itself. Subprocedure DblLnkst_init illustrates the process. p DblLnkLst_init... p b Export d pi Like( linklist_t ) d elemsize Like( linklist_elemsize_t ) d Const d comparator Like( linklist_comparator_t ) d Options( *nopass ) d Value //locals d this_list ds Likeds( list_t ) d Based( this ) d head_sentinel ds Likeds( node_t ) d Based( head_sentinel_p ) d tail_sentinel ds Likeds( node_t ) d Based( tail_sentinel_p ) /free // Allocate memory for head and tail sentinel nodes head_sentinel_p = %alloc( %size( head_sentinel )); tail_sentinel_p = %alloc( %size( tail_sentinel )); head_sentinel.data = *null; head_sentinel.next = tail_sentinel_p; head_sentinel.prev = *null; tail_sentinel.data = *null; tail_sentinel.next = *null; tail_sentinel.prev = head_sentinel_p; //Assign memory to store information about the head of the list this = %alloc( %size( this_list )); this_list.head_sentinel = head_sentinel_p; this_list.tail_sentinel = tail_sentinel_p; this_list.elemsize = elemsize; this_list.numelems = 0; return this; /end-free p e Comparing and Sorting Nodes As we have said previously, a linked list–either singly- or doubly-linked–may or may not be sorted. Whether you choose to sort your list as you load it (with an insertion sort algorithm) or sort it once it is fully loaded, there are several sorting algorithms you can use to achieve this objective. It is your responsibility to tell the list manager module how to compare the list’s nodes. Languages like C, C++, and Java have functions like strcmp() or methods like compare() that compare data values. Such functions (or methods), receive two strings (or objects) say “a” and “b”, and return a negative integer value when “a” is less than “b”, a zero value when a is equal to “b”, and a positive value when a is greater than “b”. Since RPG does not have a similar built-in function, you must define a comparator, a procedure that compares nodes. The comparator procedure must:
As we have already mentioned, it is necessary to pass the element size as a parameter because RPG pointers do not keep information about the length of the data to which they point. If you wish, you can pass another, optional parameter to tell the comparator function to compare elements in reverse order. A positive value (the default) compares elements in ascending order, and a negative value compares in descending order. The DBLLNKLST module contains a simple comparator based on hex values that supports this logic. p list_comparator_string... p b d pi 10i 0 d a * Value d b * Value d size Like( elemsize_t ) d Const d seq 10i 0 Options( *nopass ) d Const //locals d result s 10i 0 Inz( 0 ) /free if %str( a : size ) < %str( b : size ); result = -1; elseif %str( a : size ) > %str( b : size ); result = 1; endif; if %parms( ) >= 4; result *= seq; endif; return result; /end-free p e This procedure is adequate if a simple comparator based on hex values is enough to sort your list. When “Hex” Comparison Is Not Enough The supplied comparison procedure should be adequate for sorting doubly linked lists of character strings, but might fail in comparing numeric data lists or more complex data ones, i.e., lists of data records. For this kind of lists, you have to define your own sorting algorithm that must match the default sorting algorithm parameter passing. This is particularly true for most numeric data types. Except for the unsigned integer data types–3u, 5u, 10u and 20u types–numeric data does not sort properly based on its hex value. For example, to sort a list of “short integers,” you should use a comparator like the following one: //==================================================================// //Procedure....: list_comparator_5i0 // //Description..: // //==================================================================// p list_comparator_5i0... p b d pi 10i 0 d a * Value d b * Value d size Like( elemsize_t ) d Const d seq 10i 0 Options( *nopass ) d Const //variables locales d int_a s 5i 0 Based( a ) d int_b s 5i 0 Based( b ) d result s 10i 0 Inz( 0 ) /free if int_a > int_b; result = 1; elseif int_a < int_b; result = -1; endif; if %parms( ) >= 4; result *= seq; endif; return result; /end-free p e Here’s another example of a comparator procedure. This one sorts file QIWS/QCUSTCDT by ZIP code and city name: //==================================================================// //Procedure....: elem_comparator // //Description..: Sort QIWS/ QCUSTCDT Table by // // 1.- Zip Code // // 2.- City Name // //==================================================================// p elem_comparator... p b d pi 10i 0 d a_p * Value d b_p * Value d size 10i 0 Const d seq 10i 0 Options( *nopass ) d Const //variables locales d a ds Likeds( lstRcd_t ) d Based( a_p ) d b ds Likeds( lstRcd_t ) d Based( b_p ) d result s 10i 0 Inz( 0 ) /free if a.zipcod < b.zipcod; result = -1; elseif a.zipcod > b.zipcod; result = 1; else; if a.city < b.city; result = -1; elseif a.city > b.city; result = 1; endif; endif; if %parms( ) >= 4; result *= seq; endif; return result; /end-free p e These user-defined comparison procedures must be defined in the module that creates the list, and must be passed to the DBLLNKLST module via a procedure pointer parameter to the function, List_init, which initializes the list object. This parameter is defined as an optional one, and, if not passed, the default comparison procedure will be used. The comparator procedure is stored within the list’s definition data structure as a procedure pointer. d list_t ds Qualified d Based( typedef ) d head_sentinel... d * d tail_sentinel... d * d numelems 10i 0 d elemsize 10i 0 d comparator ProcPtr Note: Several additional fields will be added later for other purposes. Optionally, you can use DblLnkLst_changeComparator function to change the comparison function and then sort the list with an alternative order. More to Come That’s enough information for today. Part 2 will tell you more about the inner workings of the DBLLNKLST module and show you how to use it in your applications. Miguel Ortiz-Martin has been working as a software developer in AS/400 environments since 1990. He is currently employed by SEMPSA, a Spanish subsidiary of British Cookson Group, in Madrid, Spain. Send your questions or comments for Miguel to Ted Holt via the IT Jungle Contact page. RELATED STORIES Implementing Binary Trees in RPG Implementing Linked Lists in RPG Resolving Field Names at Runtime, Part 3
|