Conditional Counting with Open Query File
June 20, 2007 Hey, Ted
Is there a way to make Open Query File (OPNQRYF) count records that meet certain criteria? I want to write a summary query, using OPNQRYF, that counts all records, and that also counts only the records that have a positive value in one of the numeric fields. Can OPNQRYF do anything like that? –Dave When I got Dave’s question, I told him that I thought it could be done, but I’d have to play with it to figure it out. A little while later, Dave had his query and I had another technique to share with you. It would be nice if OPNQRYF had an %IF function to carry out conditional calculations. However, I seriously doubt IBM will ever add such a function. (Then again, I was wrong about CL.) So, we’re back to tricks again, which is not the way I like to write programs. To illustrate the technique Dave implemented, let’s assume an input file that has an association ID field, ASSNID. Dave wants a full count of the file and a count of all records where ASSNID is greater than zero. Since Dave is running a summary query, he needs a pattern file that contains the layout of the record that OPNQRYF is to build. This file contains no data, and does not even have to have a member. Let’s call it WORK. A R REC A FULLCOUNT 5P 0 A ASSNCOUNT 5P 0 Dave has some data to count. Let’s call it DATA. KEY ASSNID === ====== 1 24 2 31 3 0 4 1- 5 2- 6 3 7 2 8 0 9 1 10 5 Here’s the OPNQRYF he used to count his records. opnqryf file((DATA)) format(WORK) + mapfld((FullCount '%count') + (Assn1 '%max(AssnID 0)') + (Assn2 'Assn1 / %max(AssnID 0.1)') + (AssnCount '%sum(Assn2)')) cpyfrmqryf data tofile(qtemp/wrk) crtfile(*yes) mbropt(*replace) clof data FULLCOUNT is straightforward. It uses the %COUNT function to count all the records. Let’s look at how ASSNCOUNT is loaded. First, work field ASSN1 is assigned the larger of two values–ASSNID and zero. If ASSNID is negative, ASSN1 becomes zero. Otherwise ASSN1 takes the value of ASSNID. Next ASSN2 is calculated as ASSN1 divided by the larger of ASSNID and one-tenth. If ASSNID is positive, ASSN2 is ASSN1 divided by ASSNID, which yields the value one. If ASSNID is negative or zero, ASSN2 takes the value zero divided by one-tenth, which is zero. ASSNCOUNT adds up these ones and zeros. I know that’s not intuitive, so you may want to work through a positive, negative, and zero ASSNID. Here is an example of what Dave would get from the data set given above. FULLCOUNT ASSNCOUNT 10 6 Dave’s question got me to wondering about the feasibility of this technique. It’s purely academic interest. Suppose his file also had a one-character CODE field, and I wanted to count the records that had a code A. First, I would need to add a counter field for the code A records to the work file. A R REC A FULLCOUNT 5P 0 A ASSNCOUNT 5P 0 A COUNTA 5P 0 And there would need to be a code field in the data. KEY CODE ASSNID === ==== ====== 1 A 24 2 B 31 3 A 0 4 A 1- 5 C 2- 6 A 3 7 B 2 8 D 0 9 D 1 10 D 5 Here’s what I came up. opnqryf file((DATA)) format(WORK) + mapfld((FullCount '%count') + (Assn1 '%max(AssnID 0)') + (Assn2 'Assn1 / %max(AssnID 0.1)') + (AssnCount '%sum(Assn2)') + (A1 '%or(%xor(%min(%xor(Code ''A'') X''01'') x''01'' ) x''F0'')' + *char 1) + (A1d 'A1' *dec 1) + (CountA '%sum(A1d)')) cpyfrmqryf data tofile(qtemp/wrk) crtfile(*yes) mbropt(*replace) clof data A1 uses logical functions to create a one or zero, depending on the value of CODE. To understand how it works, work from the inside out. Here’s the case when CODE is A. Expression Binary ============== ========= Code 1100 0001 A 1100 0001 XOR 0000 0000 X'01' 0000 0001 MIN 0000 0000 X'01' 0000 0001 XOR 0000 0001 X'F0' 1111 0000 OR 1111 0001 Binary 11110001 is hexadecimal F1, which is the character 1. All records that have a code of A generate a one. Here’s how the expression works when code is B. Expression Binary ============== ========= Code 1100 0010 A 1100 0001 XOR 0000 0011 X'01' 0000 0001 MIN 0000 0001 X'01' 0000 0001 XOR 0000 0000 X'F0' 1111 0000 OR 1111 0000 Binary 11110000 is hex F0, which is the character zero. All records that do not have a code of A generate a zero. A1D converts the character result of the logical operations to a decimal one or zero. COUNTA adds the ones and zeros. Here are the results. FULLCOUNT ASSNCOUNT COUNTA ========= ========= ====== 10 6 4 To count a different value, change the ‘A’ in the innermost parentheses. For example, here’s how to count records with code Z. '%or(%xor(%min(%xor(Code ''Z'') X''01'') x''01'' ) x''F0'')' Let me end with a few editorial comments. First, I’m happy that Dave got his program to work. Second, I did not suggest to him, but I’ll suggest here, that SQL would offer a much better choice. select count(*), sum(case code when 'A' then 1 else 0 end) as ACount from data Unfortunately, I find that many shops don’t have SQL. Which brings me to my third comment. Third, I find that many shops are still writing the same style of code that was being written 20 years ago. Not long ago, I was talking to a developer who had been told, “You can read that Guru newsletter, but you can’t do any of the stuff they talk about in it.” No subprocedures. No SQL. I look at code like the mapped field expressions in the examples above and I groan. I like everything to be straightforward, as obvious and as intuitive as possible. But as long as some shops remain in the Dark Ages, I’ll keep trying to publish techniques to help them get their jobs done. Have a nice day. –Ted
|