Using SQL Joins With Tree Structures

March 20, 2013 Ted Holt

Note: The code accompanying this article is available for download here.

The reason tree structures are important to me is not that they are so numerous, but that I use them every day. I don’t write programs to process tree structures every day, but many of the programs I have written still run every day. For that reason, I must know fast ways to extract data from tree structures. One such method is the SQL join.

This article is a rewrite of the topic “Efficient Database Retrieval” from my article Recursion and the Alternatives, which I wrote eight years ago. I am revisiting this topic because this powerful technique deserves more attention than what I gave it then.

To illustrate this technique, let’s assume that you and I work for a manufacturer of wheelbarrows. Our company offers three models of wheelbarrows: small, medium, and large. (Sorry, no grandes, ventis, trentes, etc.)

First, we need an item master file.

select * from products order by 1

Item ID       Description               Type
==========    =====================     ====
AXLE          Axle                      COMP
BARROW1       Small wheelbarrow         FG  
BARROW2       Medium wheelbarrow        FG  
BARROW3       Large wheelbarrow         FG  
BOLT1         Small bolt                COMP
BOLT2         Big bolt                  COMP
BUCKET1       Small bucket              COMP
BUCKET2       Medium bucket             COMP
BUCKET3       Large bucket              COMP
HANDLE1       Short handle              COMP
HANDLE2       Medium handle             COMP
HANDLE3       Long handle               COMP
KITCOMMON     Common items              KIT 
KITHARDWARE   Hardware                  KIT 
LABELSAFETY   Safety label              COMP
LABEL1        Product label small       COMP
LABEL2        Product label medium      COMP
LABEL3        Product label large       COMP
LEG           Leg                       COMP
STRAP         Strap                     COMP
TIRE          Tire                      COMP
TUBE          Inner tube                COMP
WHEEL         Wheel                     ASSY
WHEELASSEM    Wheel assembly            ASSY
WHEELBLANK    Wheel blank               COMP

The third column tells us the type of each item:

FG = finished good
COMP = component
ASSY = assembly
KIT = kit

We also need a product structure file, to tell us what makes up what.

select * from prodstruct order by 1, 2

            Component                Quantity
Parent       sequence  Component        per
===========  ========  ============= ========
BARROW1          10    BUCKET1           1 
BARROW1          20    HANDLE1           2 
BARROW1          30    KITCOMMON         1 
BARROW1          40    LABEL1            1 
BARROW2          10    BUCKET2           1 
BARROW2          20    HANDLE2           2 
BARROW2          30    KITCOMMON         1 
BARROW2          40    LABEL2            1 
BARROW3          10    BUCKET3           1 
BARROW3          20    HANDLE3           2 
BARROW3          30    KITCOMMON         1 
BARROW3          40    LABEL3            1 
KITCOMMON        10    KITHARDWARE       2 
KITCOMMON        20    LABELSAFETY       1 
KITCOMMON        30    LEG               2 
KITCOMMON        40    STRAP             1 
KITCOMMON        50    WHEELASSEM        1 
KITHARDWARE      10    BOLT1             4 
KITHARDWARE      20    BOLT2             2
WHEEL            10    WHEELBLANK        2
WHEELASSEM       10    AXLE              1
WHEELASSEM       20    TIRE              1
WHEELASSEM       30    TUBE              1
WHEELASSEM       40    WHEEL             1

It isn’t readily obvious that this is a tree structure, but this graphic (click to enlarge graphic after opening the link), based on the product structure of a small wheelbarrow, makes it more understandable.

Let’s suppose further that we have been asked to make the computer tell someone how many of each base part (a part that has no components) is needed to fill an order. If a customer orders a dozen small wheelbarrows, what items do we need?

A common way to determine such information is to work downward through the tree, multiplying as we go along. For instance, 12 wheelbarrows require 12 common-item kits, which require 12 wheel assemblies, which require 12 wheels, which require 24 blanks.

The order for 12 small wheelbarrows requires the following traversal:

12 buckets
12 x 2 = 24 short handles
12 x 1 x 2 = 24 hardware kits
12 x 1 x 1 = 12 safety labels
12 x 1 x 2 = 24 legs
12 x 1 x 1 = 12 straps
12 x 1 x 1 x 1 = 12 axles
12 x 1 x 1 x 1 = 12 tires
12 x 1 x 1 x 1 = 12 tubes
12 x 1 x 1 x 1 x 2 = 24 wheel blanks
12 x 1 x 1 = 12 product labels

This sort of tree traversal requires a recursive routine, such as the one found under “An Example of Recursion” in the afore-mentioned article of eight years ago.

A recursive routine is great for retrieving small amounts of data, such as the bill of material for a single item. But it’s too slow for retrieving large amounts of data. In those cases, I use a join.

Here’s the idea. Begin at top level, with the finished goods. Explode them into their components. Explode those components into their components. Then explode those components into their components. Continue the process until the explosion returns no result set.

I wrote a CL program to illustrate the technique. It uses the same database tables that I used above. There’s also another table, Product Explosion, to hold the result set.

create table ProdExpl
  (FinishGood     char (12),
   QtyPer1        dec  ( 3),
   Parent         char (12),
   ParentType     char ( 4),
   QtyPer2        dec  ( 3),
   CompSeq        dec  ( 3),
   Component      char (12),
   CompType       char ( 4),
   Level          dec  ( 3))

The first column, FinishGood, ties all components together for a finished good, regardless of level. The Component column holds the base parts; i.e., the ones that have no components of their own. The Parent column is the parent item of the Component item.

Quantity-Per-1 is the number of Component in the finished good. Quantity-Per-2 is the number of Component in its parent.

This is only one possibility for the formatted data. Put whatever data you need in the result set.

Here’s the explosion routine.

/* ============================================================ */
/* Use SQL joins to explode a bill of materials                 */
/* ============================================================ */

pgm

   dcl &SqlCmd       *char   len( 1024 )
   dcl &NbrCurRcd    *dec    len(   10 )
   dcl &Level        *dec    len(    3 )
   dcl &LevelA       *char   len(    3 )

 /* Create the work files, or clear them if they already exist. */

   clrpfm     ProdExpl

   clrpfm     qtemp/WorkA
   monmsg     cpf3142 exec(do)
      crtdupobj  ProdExpl *libl *file qtemp WorkA
   enddo

   clrpfm     qtemp/WorkB
   monmsg     cpf3142 exec(do)
      crtdupobj  ProdExpl *libl *file qtemp WorkB
   enddo

 /* Prime the work file with finished goods. */

   callsubr   IncrLevel
   chgvar &SqlCmd +
      ('insert into WorkB +
          select fg.item, 1, +
                 fg.item, fg.type, 1, +
                 0, +
                 fg.item, fg.type,' *cat &LevelA *bcat +
           'from products AS fg +
           where fg.type = ''FG''')
   runsql sql(&SQLCmd) commit(*none)

   dowhile '1'

 /* If WorkB does not have any records, the explosion is complete. */

      rtvmbrd  file(WorkB) nbrcurrcd(&NbrCurRcd)
      if (&NbrCurRcd *le 0) +
           then(leave)

 /* Add the most recent explosion to previous explosions. */

      cpyf fromfile(WorkB) tofile(ProdExpl) mbropt(*add)

 /* Explode the next level. */

      callsubr Explode
   enddo
   return

/* ============================================================ */
/* Count the levels in the bills of material.                   */
/* ============================================================ */
subr IncrLevel

   chgvar   &Level   (&Level +1)
   chgvar   &LevelA   &Level

endsubr

/* ============================================================ */
/* Retrieve the next level of components.                       */
/* ============================================================ */
subr Explode

   /* Copy WorkB to WorkA, replacing data, & clear WorkB. */
   /* Renaming files is faster than copying data. */

   rnmobj   WorkA  objtype(*file) newobj(WorkX)
   rnmobj   WorkB  objtype(*file) newobj(WorkA)
   rnmobj   WorkX  objtype(*file) newobj(WorkB)
   clrpfm   WorkB

 /* Explode by joining thru the product structure. */

   callsubr   IncrLevel

   chgvar  &SqlCmd +
    ('insert into WorkB +
      select a.FinishGood, a.QTYPER1 * ps.QtyPer,+
             a.component, a.CompType, ps.QtyPer, +
             ps.compseq, +
             ps.component, comp.type,' *cat &LevelA *bcat +
     'from WorkA as a +
      join prodstruct as ps +
        on a.component = ps.parent +
      join products as comp +
        on ps.component = comp.item')
   runsql sql(&SqlCmd) commit(*none)

endsubr

endpgm

Does it work? You bet! Here’s the exploded bill of material for one small wheelbarrow.

(Click graphic to enlarge.)

The first SQL command returned one row per finished good.

(Click graphic to enlarge.)

This says that a small wheelbarrow is made of one small wheelbarrow. This row is not necessary. I included it for one reason only: to prime the SQL statement in subroutine Explode, which does all the work.

The first iteration of the Explode subroutine adds the first level of components.

(Click graphic to enlarge.)

The second call to Explode adds the next level of components. I see that a small wheelbarrow has two small handles (quantity-per #1).

(Click graphic to enlarge.)

And so it goes, until the items in the last level of components have no components of their own.

I recently wrote a program to explode more than 100 thousand bills of material to their base components. It ran in seven minutes on an antiquated machine. I didn’t write and run the recursive equivalent, but from past experience, I know that using a recursive routine would have been much slower, perhaps slow enough to make running the program impractical.

You can also use recursive SQL queries to do the same sort of thing. However, I’ve found that the recursive query does not always give me a way to get the result set in the format I need it in. (Or maybe I still haven’t mastered recursive SQL queries.) In case you’re interested, I’ve included a link below to an article that Mike Sansoterra wrote about recursive SQL.

If tree-traversal through joins is a technique that you can use, I hope the material in this article and my previous article will give you plenty of information to work from. I’m glad to revisit this topic, and I hope my coverage of this technique will help many people.

Table of Contents

Content archive

Recent Posts

Subscribe

Pages

Search

Using SQL Joins With Tree Structures

Share this:

Share this:

Leave a Reply Cancel reply

Table of Contents

Content archive

Recent Posts

Subscribe

Pages

Search