Circumventing Two Limitations of CPYTOIMPF
July 22, 2009 Ted Holt
Dear Intelligent Power-on-i Professional: I deem it grand of IBM to give us great tools that relieve us of the need to write programs to accomplish mundane, quotidian tasks. The Copy to Import File (CPYTOIMPF) command is such an animal. However, nothing’s perfect on this planet, and CPYTOIMPF is no exception. Here are two annoyances or problems (depending on your point of view), and how to get around them. Strip Out Unwanted Blanks I’m using CPYTOIMPF to build a CSV file. CPYTOIMPF leaves lots of blank space in the stream file. Is there any way to tell CPYTOIMPF that I don’t want the extra space? In a word, no. However, Qshell’s SED utility can compress out the unwanted blanks. First, let’s create the white-space-laden CSV file, using the usual convention: commas separate the fields and string values are in quotation marks. CPYTOIMPF FROMFILE(QIWS/QCUSTCDT) TOSTMF(custcdt.temp) MBROPT(*REPLACE) RCDDLM(*LF) STRDLM('"') FLDDLM(', ') The data looks like this: 938472 ,"Henning ","G K","4859 Elm Ave ","Dallas","TX",75217 ,5000 ,3 ,37.00 ,.00 839283 ,"Jones ","B D","21B NW 135 St","Clay ","NY",13041 ,400 ,1 ,100.00 ,.00 392859 ,"Vine ","S S","PO Box 79 ","Broton","VT",5046 ,700 ,1 ,439.00 ,.00 938485 ,"Johnson ","J A","3 Alpine Way ","Helen ","GA",30545 ,9999 ,2 ,3987.50 ,33.50 397267 ,"Tyron ","W E","13 Myrtle Dr ","Hector","NY",14841 ,1000 ,1 ,.00 ,.00 389572 ,"Stevens ","K L","208 Snow Pass","Denver","CO",80226 ,400 ,1 ,58.75 ,1.50 846283 ,"Alison ","J S","787 Lake Dr ","Isle ","MN",56342 ,5000 ,3 ,10.00 ,.00 475938 ,"Doe ","J W","59 Archer Rd ","Sutter","CA",95685 ,700 ,2 ,250.00 ,100.00 693829 ,"Thomas ","A N","3 Dove Circle","Casper","WY",82609 ,9999 ,2 ,.00 ,.00 593029 ,"Williams","E D","485 SE 2 Ave ","Dallas","TX",75218 ,200 ,1 ,25.00 ,.00 192837 ,"Lee ","F L","5963 Oak St ","Hector","NY",14841 ,700 ,2 ,489.50 ,.50 583990 ,"Abraham ","M T","392 Mill St ","Isle ","MN",56342 ,9999 ,3 ,500.00 ,.00 Enter the Qshell environment and key the following command: sed -e 's# *",#",#g' -e 's# *,#,#g' custcdt.temp > custcdt.CSV SED reads from custcdt.temp and writes to custcdt.csv, replacing the contents of the output file. The two “e” switches precede two substitution commands that SED is to carry out on each line of input. In the first substitution command, the search expression.blank, asterisk, backslash, quotation mark, comma. is between the first and second pound signs, and means zero or more blanks followed by a quotation mark and comma. The replacement string is between the second and third pound signs, and contains only a quotation mark and comma. This expression gets rid of the extra blanks after the last name, street address, and city name (second, fourth, and fifth fields). The second substitution replaces zero or more blanks followed by a comma with a single comma. This deletes the blanks that precede the numeric fields. The “g” at the end of each regular expression tells Qshell to replace all occurrences of the search argument in a line, not just the first one. The data looks like this: 938472,"Henning","G K","4859 Elm Ave","Dallas","TX",75217,5000,3, 37.00,.00 839283,"Jones","B D","21B NW 135 St","Clay","NY",13041,400,1, 100.00,.00 392859,"Vine","S S","PO Box 79","Broton","VT",5046,700,1, 439.00,.00 938485,"Johnson","J A","3 Alpine Way","Helen","GA",30545,9999,2, 3987.50,33.50 397267,"Tyron","W E","13 Myrtle Dr","Hector","NY",14841,1000,1, .00,.00 389572,"Stevens","K L","208 Snow Pass","Denver","CO",80226,400,1, 58.75,1.50 846283,"Alison","J S","787 Lake Dr","Isle","MN",56342,5000,3, 10.00,.00 475938,"Doe","J W","59 Archer Rd","Sutter","CA",95685,700,2, 250.00,100.00 693829,"Thomas","A N","3 Dove Circle","Casper","WY",82609,9999,2, .00,.00 593029,"Williams","E D","485 SE 2 Ave","Dallas","TX",75218,200,1, 25.00,.00 192837,"Lee","F L","5963 Oak St","Hector","NY",14841,700,2, 489.50,.50 583990,"Abraham","M T","392 Mill St","Isle","MN",56342,9999,3, 500.00,.00 Onward to the second question! Two-Character Field Separators Here’s one I’ve never seen before. Our corporate IT department wants us to transmit a file in which variable-length fields are separated by a two-character combination. What was wrong with CSV files? Anyway, I’ve got plenty to do, so I tried to use Copy to Import File (CPYTOIMPF). Guess what? The Field Delimiter (FLDDLM) parameter only allows one character. Any suggestions? I think we can make this work. First, run your CPYTOIMPF command, using a single character for a delimiter. Try to use a character that is not found in the data. In this example, I use the backslash character to separate the fields. CPYTOIMPF FROMFILE(QIWS/QCUSTCDT) TOSTMF('custcdt.temp') MBROPT(*REPLACE) RCDDLM(*CRLF) STRDLM(*NONE) FLDDLM('') The data in stream file custcdt.temp looks like this: 938472 Henning G K4859 Elm Ave DallasTX75217 5000 3 37.00 .00 839283 Jones B D21B NW 135 StClay NY13041 400 1 100.00 .00 392859 Vine S SPO Box 79 BrotonVT5046 700 1 439.00 .00 938485 Johnson J A3 Alpine Way Helen GA30545 9999 2 3987.50 33.50 397267 Tyron W E13 Myrtle Dr HectorNY14841 1000 1 .00 .00 389572 Stevens K L208 Snow PassDenverCO80226 400 1 58.75 1.50 846283 Alison J S787 Lake Dr Isle MN56342 5000 3 10.00 .00 475938 Doe J W59 Archer Rd SutterCA95685 700 2 250.00 100.00 693829 Thomas A N3 Dove CircleCasperWY82609 9999 2 .00 .00 593029 WilliamsE D485 SE 2 Ave DallasTX75218 200 1 25.00 .00 192837 Lee F L5963 Oak St HectorNY14841 700 2 489.50 .50 583990 Abraham M T392 Mill St Isle MN56342 9999 3 500.00 .00 Then use QShell’s stream editor, SED, to convert the single delimiter to a two-character delimiter. I’ll illustrate by replacing the backslash separator with a question mark and a vertical bar. sed 's/ */?|/g' custcdt.temp > custcdt.txt The data looks like this: 938472?|Henning?|G K?|4859 Elm Ave?|Dallas?|TX?|75217?|5000?|3? |37.00?|.00 839283?|Jones?|B D?|21B NW 135 St?|Clay?|NY?|13041?|400?|1? |100.00?|.00 392859?|Vine?|S S?|PO Box 79?|Broton?|VT?|5046?|700?|1? |439.00?|.00 938485?|Johnson?|J A?|3 Alpine Way?|Helen?|GA?|30545?|9999?|2? |3987.50?|33.50 397267?|Tyron?|W E?|13 Myrtle Dr?|Hector?|NY?|14841?|1000?|1? |.00?|.00 389572?|Stevens?|K L?|208 Snow Pass?|Denver?|CO?|80226?|400?|1? |58.75?|1.50 846283?|Alison?|J S?|787 Lake Dr?|Isle?|MN?|56342?|5000?|3? |10.00?|.00 475938?|Doe?|J W?|59 Archer Rd?|Sutter?|CA?|95685?|700?|2? |250.00?|100.00 693829?|Thomas?|A N?|3 Dove Circle?|Casper?|WY?|82609?|9999?|2? |.00?|.00 593029?|Williams?|E D?|485 SE 2 Ave?|Dallas?|TX?|75218?|200?|1? |25.00?|.00 192837?|Lee?|F L?|5963 Oak St?|Hector?|NY?|14841?|700?|2? |489.50?|.50 583990?|Abraham?|M T?|392 Mill St?|Isle?|MN?|56342?|9999?|3? |500.00?|.00 Let’s break that SED command down into components. sed 's/ */?|/g' custcdt.temp > custcdt.txt SED reads custcdt.temp and writes the output to custcdt.txt. The single greater-than symbol tells the system to replace the data in file custcdt.txt if it already exists. The substitution expression is messy, so let’s work through it. The part between the first and second forward slashes is the search value. The blank followed by an asterisk means one or more blanks. The two backslashes mean one backslash. (The first backslash is an escape character.) The part between the second and third forward slashes is the replacement value–a question mark and a vertical bar. The “g” following the last slash stands for “global”. That is, replace all occurrences of the search string, not just the first one of each line. Running the Commands To put it all together, embed the CPYTOIMPF and QSH commands in a CL program. Here’s how the second example would look: PGM ... More stuff ... CPYTOIMPF FROMFILE(QIWS/QCUSTCDT) + TOSTMF('custcdt.temp') + MBROPT(*REPLACE) RCDDLM(*CRLF) STRDLM(*NONE) FLDDLM('') QSH CMD('sed ''s/ */?|/g'' custcdt.temp > custcdt.txt') ... More stuff ... ENDPGM Notice the doubled apostrophes in the regular expression of the SED command.
|