A Generic Character Editing Routine

July 12, 2016 Jon Paris

Note: The code accompanying this article is available for download here.

Recently I was asked if I knew of a way to edit character strings. For example, take a character string representing a product code such as “AX12345Q” and edit it to produce “AX-123-45 Q”. My initial reaction was to reach for an edit word, but sadly, for some strange reason, edit words only work for numerics. So I set about building a subprocedure that offered the necessary flexibility in insert characters and also added a few “defenses” against mismatched parameters.

Before describing the code, let’s look at the prototype for the editing routine. Since some readers may not have adopted fully free-form yet, I’m going to use the “old” type D-spec definitions in this explanation. In the downloadable code package, though, I have supplied both fixed- and free-form versions.

Here’s the prototype:

d EditString      pr           256a   varying
d   InputString                256a   Const Varying
d   EditMask                   256a   Const Varying
d   Marker                       1a   Const Options(*NoPass)

The subprocedure returns a varying length string of up to 256 characters. That should be more than big enough.

The first parameter (InputString) is the field to be edited. I added the Const keyword so that the routine could accept any size and type of character field as input. The same applies to the second parameter, which is the mask to be used. Here Const has the added benefit that a literal can also be specified as the mask parameter. The final parameter allows the character marker field to be changed, hence the option *NoPass. It only needs to be specified if a marker other than the default (the ampersand, “&” in my case) is to be used.
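For readers who have already moved to free-form declarations, the same prototype could be written roughly as follows. This is a sketch translated directly from the D-specs above and is not necessarily identical to the free-form version in the downloadable package.

```rpgle
// Free-form equivalent of the fixed-form prototype shown above
dcl-pr EditString varchar(256);
  InputString varchar(256) const;
  EditMask    varchar(256) const;
  Marker      char(1)      const options(*nopass);
end-pr;
```

The keywords map one for one: a fixed-form "a" field with Varying becomes varchar, and Const and Options(*NoPass) carry over unchanged.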

The mask works like this: Each “&” causes the next character in the input field to be copied to the result. Any other character in the mask string is simply copied to the result. To produce the result I showed above (“AX-123-45 Q”) would require a mask of ‘&&-&&&-&& &’.

If I need to use the & as an insert character, then I must choose an alternate character as the marker and specify it in the third parameter.
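The two cases just described can be illustrated with a pair of calls. This is a minimal sketch that assumes the EditString prototype is in scope; the variable name edited and the choice of “#” as the alternate marker are mine, purely for illustration.

```rpgle
dcl-s edited varchar(256);

// Default marker: each & takes the next character from the input
edited = EditString('AX12345Q': '&&-&&&-&& &');   // 'AX-123-45 Q'

// '#' supplied as the marker, so the & in the mask is copied as-is
edited = EditString('ABCD': '##&##': '#');        // 'AB&CD'
```

In the second call the & is treated like any other insert character because the third parameter tells the routine to look for “#” instead.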

Here are a couple of sample calls taken from the sample test program included with the code package. This first one would convert a string such as ‘9735551212’ to ‘(973) 555-1212’. The second uses the third parameter to specify a space as the marker character, resulting in a string such as ‘111223333’ becoming ‘111-22-3333’.

textOut = EditString(textIn: '(&&&) &&&-&&&&') ;

mask    = '   -  -    ' ;
textOut = EditString(textIn: mask: ' ' ) ;

Now that we have looked at how to call the routine, let’s take a look at the actual code.

      // EditString Procedure - Applies edit mask to input string
    p EditString      b
    d                 pi           256a   Varying
    d   inText                     256a   Const Varying
    d   editMask                   256a   Const Varying
    d   textMark                     1a   Const Options(*NoPass)

(A) d outString       s            256a   Inz varying

    d ix              s              5i 0 Inz
    d mx              s              5i 0 Inz

     // Default marker for input chars to be copied to output is &
     //   If an alternative is required it is supplied as parm 3
(B) d marker          s              1a   Inz('&')


(C)   If %Parms >= %ParmNum( textMark ); 
          marker = textMark; 
      EndIf;

      // copy to outString according to mask marker characters.

(D)    For mx = 1 to %len( editMask ) ;

(E)       If %subst( editMask: mx: 1 ) = marker ; // Marker found

             ix += 1 ;  // So increment input position

(F)          If ix <= %Len( inText ); 
                outString += %subst( inText: ix: 1 ) ;
             Else;
                outString += ' '; // Output blank if no input
             EndIf;

          Else ;

            // Not a text marker so copy from mask to output
(G)          outString += %subst( editMask: mx: 1 ) ;

          EndIf ;

       EndFor ;

       // Check if all of input string was processed and if not
       //    copy balance to output

(H)    If ix < %Len( inText );
          outString += %Subst( inText: ( ix + 1 ) );
       EndIf;

      // return the edited outString.
        return outString ;

    p EditString      e

At (A) I defined the field outString in which the edited return value will be placed. The Inz on the definition ensures that the string is empty when I begin. Defining it as a variable length string also makes it easier to build the result as all I need to do is to use a simple += operation to add characters to the result field as you will see later. The variables ix and mx are used to track the current position in the input string (inText) and the mask (editMask) respectively.

(B) Defines the mask’s default text marker character. I have set it to an initial value of “&”. For your own purposes you might want to change it to a space as that certainly makes the masks easier to read. For my purposes though I will often need to insert spaces and so it is not a good default for me.

Next (C) comes the test that determines if the optional third parameter was passed. If it was, then it is copied into marker to replace the default.

(D) Begins the loop to perform the edit. I used the length of the mask as the controlling condition as it simplifies the testing and processing of exception conditions. (For example, more characters in the input than in the mask.)

At (E) I test if the current position in the mask represents a text marker. If so, the input position is incremented preparatory to copying the character to the result. But first, at (F), I check that there are still input characters remaining. Without this check the subsequent %Subst would fail. If there is an input character available, it is added to the output result. If not, then a blank is added instead.
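To see the blank padding at (F) in action, consider a call where the mask has more markers than the input has characters. This is a sketch that assumes the EditString prototype is in scope; the variable name edited is only for illustration.

```rpgle
dcl-s edited varchar(256);

// Two input characters, but four markers in the mask
edited = EditString('AB': '&&-&&');

// edited is now 'AB-  ' with a length of 5:
//   the third and fourth markers found no input and produced blanks
```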

(G) is the other side of the coin. In other words, I have a mask character that is to be added to the result string.

Finally at (H) I added a check to ensure that all characters in the original input had been processed. I could have just omitted this but felt that adding any remaining characters to the result would make it easier to detect any errors in the mask. If you would prefer to just ignore any additional input or indeed to actually signal an error then just modify this code.
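For those working in fully free-form RPG, the logic walked through above might be sketched like this. It is my own line-for-line translation of the fixed-form listing, not a copy of the free-form version in the download, so treat it as a starting point rather than the definitive code.

```rpgle
// Free-form sketch of EditString: applies an edit mask to an input string
dcl-proc EditString;
  dcl-pi *n varchar(256);
    inText   varchar(256) const;
    editMask varchar(256) const;
    textMark char(1)      const options(*nopass);
  end-pi;

  dcl-s outString varchar(256) inz;   // result, starts out empty
  dcl-s ix        int(5) inz;         // current position in inText
  dcl-s mx        int(5) inz;         // current position in editMask
  dcl-s marker    char(1) inz('&');   // default marker character

  // Use the caller's marker if the third parameter was passed
  if %parms >= %parmnum(textMark);
    marker = textMark;
  endif;

  for mx = 1 to %len(editMask);
    if %subst(editMask: mx: 1) = marker;
      ix += 1;
      if ix <= %len(inText);
        outString += %subst(inText: ix: 1);
      else;
        outString += ' ';             // no input left, output a blank
      endif;
    else;
      outString += %subst(editMask: mx: 1);   // copy the insert character
    endif;
  endfor;

  // Leftover input is appended; remove or change this block
  // if you would rather ignore it or signal an error
  if ix < %len(inText);
    outString += %subst(inText: ix + 1);
  endif;

  return outString;
end-proc;
```

With this logic, a call such as EditString('ABCDEF': '&&-&&') returns 'AB-CDEF': the mask consumes the first four characters and the final If block appends the remaining 'EF', which makes the mismatch between mask and input easy to spot.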

You may be wondering why I wrote a subprocedure rather than (for example) demonstrating the use of the C function sprintf() from RPG. This would have worked just fine, but the construction of the mask is, in my opinion, far less obvious and as a result, would be far harder to understand for other programmers who have to subsequently maintain the code.

I hope you find this utility routine useful. If you have any similar subprocedures that you think would be of interest to other readers, please let me know.

Jon Paris is one of the world’s most knowledgeable experts on programming on the System i platform. Paris cut his teeth on the System/38 way back when, and in 1987 he joined IBM’s Toronto software lab to work on the COBOL compilers for the System/38 and System/36. He also worked on the creation of the COBOL/400 compilers for the original AS/400s back in 1988, and was one of the key developers behind RPG IV and the CODE/400 development tool. In 1998, he left IBM to start his own education and training firm, a job he does to this day with his wife, Susan Gantner–also an expert in System i programming. Paris and Gantner, along with Paul Tuohy and Skip Marchesani, are co-founders of System i Developer, which hosts the new RPG & DB2 Summit conference. Send your questions or comments for Jon to Ted Holt via the IT Jungle Contact page.
