Case-Insensitive Sorting and Record Selection with Query/400, Take Two
July 20, 2005 Dear Colleagues
Alert and astute reader Sarah wrote in response to the article, “Case-Insensitive Sorting and Record Selection with Query/400,” that the two sorting options I mentioned are not equivalent. After checking it out, I learned she’s right.
In the EBCDIC coding sequence, lowercase letters precede uppercase letters, which precede numeric digits. Punctuation and special characters are scattered throughout the sequence. See Chapter 9 of the V5R2 Query manual for more information. Sorting a file of item information by item class yields output of the following nature:
ITCLS  ITNBR  ITDSC
k8     A1120  WHOZIT NOZZLE
BU     D881   4" FLAPPITER
B1     D880   3" FLAPPITER
B1     D882   5" FLAPPITER
K4     A1119  WHOZIT
K7     A1121  WHOZIT HOSE
2A     A101   WIDGET
2C     A103   WIDGET MOUNT
25     A102   WIDGET HOLDER
36     A104   WIDGET BRACKET
Notice that the lowercase k precedes the uppercase B.
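For readers who want to see this ordering away from the iSeries, here is a minimal Python sketch. It assumes that Python's cp037 codec is an adequate stand-in for CCSID 37 and that comparing the encoded bytes imitates a hexadecimal sort. The names ebcdic_key and hex_order are mine, not anything Query/400 defines.

```python
# Item classes from the sample file, in arbitrary input order.
item_classes = ["2A", "25", "2C", "36", "BU", "B1", "K4", "K7", "k8"]


def ebcdic_key(value: str) -> bytes:
    """Return the cp037 (EBCDIC) bytes of value; bytes compare by code point."""
    return value.encode("cp037")


# Assumption: byte-wise comparison of cp037 data imitates the hexadecimal sort.
hex_order = sorted(item_classes, key=ebcdic_key)
print(hex_order)
# ['k8', 'BU', 'B1', 'K4', 'K7', '2A', '2C', '25', '36']
```

The lowercase k (hex 92) sorts ahead of the uppercase B (hex C2), and the letters in 2A and 2C sort ahead of the digit in 25.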
Selecting collating sequence option 2 causes English-language systems that work from CCSID 37 to sort in more or less the same EBCDIC sequence, except that uppercase and lowercase letters are weighted equally, so k8 now falls after K7.
ITCLS  ITNBR  ITDSC
BU     D881   4" FLAPPITER
B1     D880   3" FLAPPITER
B1     D882   5" FLAPPITER
K4     A1119  WHOZIT
K7     A1121  WHOZIT HOSE
k8     A1120  WHOZIT NOZZLE
2A     A101   WIDGET
2C     A103   WIDGET MOUNT
25     A102   WIDGET HOLDER
36     A104   WIDGET BRACKET
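The effect of weighting both cases equally can be approximated outside Query/400 with a short Python sketch. It does not use the real language sequence table. It simply assumes that upper-casing a value before encoding it with Python's cp037 codec gives each letter pair one weight. The name language_key is hypothetical.

```python
def language_key(value: str) -> bytes:
    """Give upper and lower case the same weight, then compare as EBCDIC bytes."""
    # Assumption: upper-casing stands in for the equal case weights of the table.
    return value.upper().encode("cp037")


item_classes = ["k8", "BU", "B1", "K4", "K7", "2A", "2C", "25", "36"]
language_order = sorted(item_classes, key=language_key)
print(language_order)
# ['BU', 'B1', 'K4', 'K7', 'k8', '2A', '2C', '25', '36']
```

Letters still precede digits, as they do in EBCDIC, but k8 no longer jumps to the front of the list.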
Sarah pointed out that option 5, System sort sequence, places digits ahead of letters. It also equates letters that have diacritical marks with their unembellished counterparts. Here’s the same sort using shared weights under CCSID 37.
ITCLS  ITNBR  ITDSC
25     A102   WIDGET HOLDER
2A     A101   WIDGET
2C     A103   WIDGET MOUNT
36     A104   WIDGET BRACKET
B1     D880   3" FLAPPITER
B1     D882   5" FLAPPITER
BU     D881   4" FLAPPITER
K4     A1119  WHOZIT
K7     A1121  WHOZIT HOSE
k8     A1120  WHOZIT NOZZLE
This sequence is not exactly like ASCII, but it may be close enough for those who need to sort in an ASCII-compatible sequence.
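To get a feel for the shared-weight behavior, the following Python sketch folds accents and case before comparing. It is only an approximation under my own assumptions: the real shared-weight table is not reproduced, and the function name shared_weight_key is hypothetical. It relies on the fact that digits precede uppercase letters in Unicode order.

```python
import unicodedata


def shared_weight_key(value: str) -> str:
    """Approximate shared weights: fold accents and case, digits before letters."""
    decomposed = unicodedata.normalize("NFD", value)
    # Drop combining marks so an accented letter equals its plain counterpart.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Upper-casing gives both cases one weight; in Unicode order, 0-9 precede A-Z.
    return stripped.upper()


item_classes = ["k8", "BU", "B1", "K4", "K7", "2A", "2C", "25", "36"]
shared_order = sorted(item_classes, key=shared_weight_key)
print(shared_order)
# ['25', '2A', '2C', '36', 'B1', 'BU', 'K4', 'K7', 'k8']
```

Under this key a value such as é1 compares equal to E1, which mirrors the treatment of diacritical marks described above.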
It may also be worth mentioning that you can define a sort sequence of your own to be used in a query. From the Select collating sequence panel, choose option 3 (Define the sequence). Query/400 presents a screen into which you can key your weights of choice. The initial weights are taken from the national language sequence used on your system.
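The idea of a user-defined sequence, one weight per character, can be sketched in Python as a lookup table. Everything here is illustrative: the weights, the grouping of each letter pair, and the names build_weights and custom_key are my own assumptions, not the values Query/400 shows on its panel.

```python
import string


def build_weights(groups):
    """Assign one weight per group; characters in a group sort as equals."""
    weights = {}
    for weight, group in enumerate(groups):
        for ch in group:
            weights[ch] = weight
    return weights


# Hypothetical sequence: digits first, then letters, with each upper and
# lower case pair sharing a single weight.
groups = list(string.digits) + [
    upper + lower
    for upper, lower in zip(string.ascii_uppercase, string.ascii_lowercase)
]
WEIGHTS = build_weights(groups)


def custom_key(value: str) -> list:
    # Characters missing from the table sort last (an arbitrary choice here).
    return [WEIGHTS.get(ch, len(WEIGHTS)) for ch in value]


item_classes = ["k8", "BU", "B1", "K4", "K7", "2A", "2C", "25", "36"]
custom_order = sorted(item_classes, key=custom_key)
print(custom_order)
# ['25', '2A', '2C', '36', 'B1', 'BU', 'K4', 'K7', 'k8']
```

Changing the order of the groups, or splitting a letter pair into two groups, changes the sort without touching the data.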
–Ted