Overlaid Packed Data In Data Structures
August 16, 2016 Jon Paris
Note: The code accompanying this article is available for download here. Sometimes things that should be simple give us difficulty. Jon Paris recently helped a Four Hundred Guru reader to solve a problem with overlaid packed-decimal data in data structures. We’re sharing their conversation for the benefit of others.
Hi, Jon:
My question has to do with overlaid packed-decimal data in a data structure. I store a date in CYMD format in a seven-digit, packed-decimal field. I am trying to extract the two-digit year from that.
    Dcl-Ds cymdDate;
      currentDate packed(7);
      currentYear packed(2) overlay(currentDate:2);
    End-Ds;
When I set currentDate to a value of 1160701 and looked at the value of currentYear in debug, I saw a value of 07, but I was expecting a value of 16. I changed the current year to be defined as follows:
    Dcl-Ds cymdDate;
      currentDate packed(7);
      currentYear packed(2) overlay(currentDate:1);
    End-Ds;
Using this definition, currentYear comes out correctly as 16. Can you help me to understand why the overlay starting at 2 gives me the 3-4 position while the overlay starting at 1 gives me the 2-3 position?
–A Four Hundred Guru Reader
Jon Paris answers:
Thanks for the question and for the opportunity to explain a couple of things that can bite you if you are not aware of the technical details underpinning them. In order to fully explain this I have to go through some basics, so if some of this seems a little elementary, please excuse me.
First of all we need to understand that on IBM i, packed numbers are stored in nyble/nibble form, i.e., each digit is represented by a half-byte (nyble). A seven-digit number containing the value 1234567 would be stored as shown below.
    Byte     1    2    3    4
    Hex      12   34   56   7F
The right-hand (low-order) nyble of the fourth byte is used to represent the sign of the number. For positive numbers this is normally hexadecimal F, although C, which is used on mainframes, is also considered valid. For negative numbers the value would be D. For any given packed field, the number of bytes occupied is the number of digits plus 1 for the sign, divided by 2, with the result rounded up if required. So a seven-digit number occupies (7 + 1) / 2 = 4 bytes.
But what if there is an even number of digits? How are they stored? Basically a zero is normally placed in the first nyble of the number. But read on.
So instead of 1234567, let’s use a “date” such as you did in your example. It isn’t really a date of course; it is just a number that by convention we treat as a date. Since I’m writing this on July 2nd, 2016, I’ll use that value (1160702). You can see the storage layout in the table below:
    Byte     1    2    3    4
    Hex      11   60   70   2F
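The sizing rule and nyble layout described above can be modeled outside RPG. The following is a minimal Python sketch, not IBM i code: the helper names `packed_length` and `pack_decimal` are hypothetical, and the sketch assumes the sign conventions given in the text (F for positive, D for negative).

```python
def packed_length(digits: int) -> int:
    """Bytes occupied by a packed field: (digits + 1 sign nyble) / 2, rounded up."""
    return (digits + 2) // 2


def pack_decimal(value: int, digits: int) -> bytes:
    """Model the nyble layout: zero-padded digits followed by a sign nyble."""
    text = str(abs(value))
    if len(text) > digits:
        raise ValueError(f"{value} does not fit in {digits} digits")
    sign = "d" if value < 0 else "f"  # D = negative, F = the usual positive sign
    # Left-pad with zeros so digits plus sign fill a whole number of bytes.
    # For an even digit count this puts a 0 in the first nyble.
    nybles = text.rjust(packed_length(digits) * 2 - 1, "0") + sign
    return bytes.fromhex(nybles)


print(packed_length(7))                   # 4
print(pack_decimal(1234567, 7).hex(" "))  # 12 34 56 7f
print(pack_decimal(1160702, 7).hex(" "))  # 11 60 70 2f
print(pack_decimal(16, 2).hex(" "))       # 01 6f
```

The last line shows the even-digit case: a two-digit field still needs two bytes, so its first nyble is filled with a zero.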
So far so good, but when you did the overlay for currentYear by specifying that you wanted it to overlay starting in position 2, you were asking it to overlay starting in byte 2, not digit position 2. Since currentYear occupies two bytes ((2 + 1) / 2 = 1.5, rounded up to 2), it covers bytes 2 and 3 and contains the hex value 6070. When you displayed the value in the debugger, because the field was defined as a two-digit packed, the first nyble (the 6) was ignored and the last nyble (the 0) was treated as a positive sign. The result was that you saw the value 07.
When you changed the overlay position to 1, you were defining currentYear as occupying bytes 1 and 2, which contain the hex value 1160. So the value the debugger showed you was 16 because once again it ignored the first nyble (the 1) and the 0 was treated as the positive sign.
Having said all that, had you actually run a program that used this technique you might have been writing to me with a slightly different question. This is because, in most cases, attempting to use currentYear would result in a decimal data error! The reason is that while the debugger may accept invalid values in the first and sign nybles, normal mathematical operations are not so tolerant. Even if you happened upon a situation where you avoided a decimal data error, this would still not be a good practice and should not be used. After all, it confused you (and you wrote it), so imagine what it would do to the programmers who come after you.
The simplest modification to the code that would allow it to work correctly would be to use your original data-structure approach, but define the “date” field as zoned decimal (S), not packed. This will cause the individual digits to each be in a separate byte, and then your original overlay approach (assuming the overlay is also changed to zoned) works just fine.
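The byte-based overlay behavior can be reproduced with a small simulation. This is a rough Python model under stated assumptions, not how the debugger or the RPG runtime is actually implemented: all function names are hypothetical, the "debugger view" simply shows the two digit nybles, and the strict check is a simplification that accepts only F, C, or D as the sign and a zero first nyble. The zoned bytes assume the usual EBCDIC form, with zone F and the digit in the low-order nyble.

```python
def overlay(storage: bytes, start: int, length: int) -> bytes:
    """Byte-based slice; start is 1-based, like the position in OVERLAY(field:start)."""
    return storage[start - 1:start - 1 + length]


def debugger_view_packed2(raw: bytes) -> str:
    """Lenient view of a packed(2) field: show the two digit nybles only."""
    return raw.hex()[1:3]


def strict_packed2(raw: bytes) -> int:
    """Strict view: reject a non-zero first nyble or an invalid sign nyble."""
    lead, tens, units, sign = raw.hex()
    if lead != "0" or sign not in "cdf" or not (tens + units).isdigit():
        raise ValueError(f"decimal data error (simulated) for x'{raw.hex()}'")
    value = int(tens + units)
    return -value if sign == "d" else value


def zoned_digits(raw: bytes) -> str:
    """Zoned decimal keeps one digit per byte, in the low-order nyble."""
    return "".join(str(b & 0x0F) for b in raw)


packed_date = bytes.fromhex("1160701f")       # packed(7) value 1160701
zoned_date = bytes.fromhex("f1f1f6f0f7f0f1")  # zoned(7) value 1160701

at_2 = overlay(packed_date, 2, 2)
at_1 = overlay(packed_date, 1, 2)
print(at_2.hex(), debugger_view_packed2(at_2))  # 6070 07
print(at_1.hex(), debugger_view_packed2(at_1))  # 1160 16

try:
    strict_packed2(at_2)
except ValueError as err:
    print(err)  # the strict check rejects the overlaid bytes

print(zoned_digits(overlay(zoned_date, 2, 2)))  # 16
```

With one digit per byte, the overlay at position 2 lands on the two year digits, which is why the zoned definition behaves the way the original code intended.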
The resulting code would look like this:
    Dcl-Ds cymdDate;
      currentDate zoned(7);
      currentYear zoned(2) overlay(currentDate:2);
    End-Ds;
The biggest problem with anything that relies on the underlying data storage layout is that it is not immediately obvious to those who come after what you are doing. I would recommend wrapping such code in a subprocedure named GetYYfromCYMD or something similar. That way anyone reading the code in the future will know immediately what you are doing without having to concern themselves with the mechanics. Here is a very simple example of such a subprocedure.
    Dcl-Proc GetYYfromCYMD;
      Dcl-PI *N zoned(2);
        cymd zoned(7) Const;
      End-Pi;
      // Zoned storage holds one digit per byte, so byte 2 is digit 2
      Dcl-Ds cymdDate;
        date zoned(7);
        year zoned(2) overlay(date:2);
      End-Ds;
      date = cymd;
      Return year;
    End-Proc;
The downloadable code contains a short program that includes this subprocedure.
There are a variety of other options that you could also have used. For example:
    currentYear = %Subst(%EditC(currentDate : 'X') : 2 : 2);
The ‘X’ edit code on the %EditC preserves any leading zeros, and the %Subst extracts the relevant characters. In this case, of course, currentYear would be an alpha value. Once again this would be a good candidate for a subprocedure to make the intent more obvious.
In fact, since a data structure is implicitly a character field, you could avoid the %EditC and use this expression instead:
    currentYear = %Subst(cymdDate : 2 : 2);
The result would be more efficient, but possibly less obvious.
You could also use the date BIF %SubDt to extract the year portion after using %Date to convert your numeric “date”:
    %SubDt(%Date(currentDate : *CYMD) : *Y)
However, you only want the last two digits of the year, so it quickly gets clumsy and is far from the most efficient method.
I could probably think of many more ways to extract the year, but the “best” method really depends on what you actually want to do with the extracted year number.
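For readers who want to see the logic of these alternatives outside RPG, here is a rough Python analogue. It is only an illustration: the function names are hypothetical, the zero-padded formatting stands in for the ‘X’ edit code, and the date conversion assumes the usual CYMD convention in which a century digit of 0 means 19xx and 1 means 20xx.

```python
from datetime import date


def yy_from_cymd_text(cymd: int) -> str:
    """Zero-pad to seven digits (like edit code X), then take digits 2-3."""
    return f"{cymd:07d}"[1:3]


def date_from_cymd(cymd: int) -> date:
    """Convert CYYMMDD to a date; century digit 0 = 19xx, 1 = 20xx."""
    century, yymmdd = divmod(cymd, 1_000_000)
    yy, mmdd = divmod(yymmdd, 10_000)
    mm, dd = divmod(mmdd, 100)
    return date(1900 + 100 * century + yy, mm, dd)


print(yy_from_cymd_text(1160702))          # 16
print(date_from_cymd(1160702))             # 2016-07-02
print(date_from_cymd(1160702).year % 100)  # 16
```

As in the RPG versions, the text-based extraction yields a character value, while the date-based route yields a full year that then has to be reduced to its last two digits.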
I hope this helps you to understand why your original examples behaved the way they did and also why you should sometimes be cautious of accepting the values shown to you in debug at face value. Remember: When you have any doubts about packed values in debug, the ability to use Eval variableName:x to display the value in hex is your friend.
Jon Paris is one of the world’s most knowledgeable experts on programming on the System i platform. Paris cut his teeth on the System/38 way back when, and in 1987 he joined IBM’s Toronto software lab to work on the COBOL compilers for the System/38 and System/36. He also worked on the creation of the COBOL/400 compilers for the original AS/400s back in 1988, and was one of the key developers behind RPG IV and the CODE/400 development tool. In 1998, he left IBM to start his own education and training firm, a job he does to this day with his wife, Susan Gantner, who is also an expert in System i programming. Paris and Gantner, along with Paul Tuohy and Skip Marchesani, are co-founders of System i Developer, which hosts the new RPG & DB2 Summit conference. Send your questions or comments for Jon to Ted Holt via the IT Jungle Contact page.