Guru: An Introduction to Processing XML With RPG, Part 2
August 29, 2018 Jon Paris
In the first part of this series for Guru Classic, I introduced you to the basics of using RPG’s XML-INTO op-code. In that tip I showed how the count that RPG supplies in the PSDS can be used to determine how many of a repeating element were processed.
However, as I noted at the time, this can only be used when handling a repeating outer element. But what if there is a repeating element within each of those outer elements? In this second part of the series we will be studying how to handle those situations.
Take a look at the sample XML below:
<Customer Id="S987654">
  <Name>Smith and Jones Inc.</Name>
  <Address>
    <City>Jonesboro</City>
    <State>GA</State>
  </Address>
</Customer>
<Customer Id="B012345">
  <Name>Brown and Sons</Name>
  <Address Type="Shipping">
    <City>San Jose</City>
    <State>CA</State>
  </Address>
  <Address Type="Mailing">
    <City>San Francisco</City>
    <State>CA</State>
  </Address>
</Customer>
...
As you can see, it is very similar to the examples we used in part 1. I have just taken it to the next level. First, I have allowed for a customer to have more than one address. In addition, each individual <Address> element can now have a Type attribute associated with it to indicate whether it represents a mailing or shipping address. I demonstrated how to handle attributes in part 1, but this time I am also allowing for the Type to be omitted, in which case the program logic will default it to “Mailing”. You can see the changes I have made to the data structures below. Note in particular that I have allowed for a maximum of 10 addresses per customer.
Dcl-ds customer Dim(99) Qualified;
  id char(7);
  name char(40);
  dcl-ds address Dim(10);
    type char(8);
    state char(2);
    city char(40);
  End-ds;
End-ds;
If you were to leave the rest of the program unchanged it would compile just fine, but if you tried to run it you would get the error: “The XML document does not match the RPG variable; reason code 2.” And if you were to press F1 on this message you would see that the problem arose when processing the address portion of the first customer.
So why did this happen? If you remember, in part 1 I said that the names and hierarchy of the fields within the variable that you use to receive the data must match those in the XML document. What I didn’t point out then was that the rules for what is considered a “match” are very strict and by default:
- All of the fields in the XML must be accommodated
- All of the fields in the DS must be contained within the XML document
That is to say that there must be an exact one-for-one correspondence between the XML document and each and every field in the DS. To put it another way, the XML cannot contain an element for which there is no “home” in the DS, and there must be an XML element to supply data for every occurrence of every field in the DS.
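As an aside, a mismatch of this kind is reported as program status 00353, so it can be trapped with MONITOR rather than left to halt the program. What follows is only a minimal sketch, not the code from the sample programs: the xmlSource declaration and its length are my assumptions (it is presumed to be loaded with the document as in part 1), and the message texts are placeholders.

```rpgle
**free
ctl-opt dftactgrp(*no);

// Same layout as the customer DS shown earlier in this tip
dcl-ds customer dim(99) qualified;
  id char(7);
  name char(40);
  dcl-ds address dim(10);
    type char(8);
    state char(2);
    city char(40);
  end-ds;
end-ds;

// Assumption: holds the XML document text, loaded as in part 1
dcl-s xmlSource varchar(5000);

monitor;
  xml-into customer %xml(xmlSource: 'case=any');
on-error 353;
  // Status 00353: the XML document does not match the RPG variable
  dsply 'XML does not match customer DS';
on-error;
  // Any other failure, for example an XML parsing error
  dsply 'XML-INTO failed';
endmon;

*inlr = *on;
```

Trapping the status this way only reports the problem. It does nothing to relax the matching rules themselves.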
Remember the Type attribute? If you study the XML you will see that I omitted it for the first customer. This is the data that RPG considers to be missing. So one possible solution would be to modify the XML to include a type for the first customer. That would fix things, right? Sadly, as those of you who rushed ahead and tried have discovered, the answer is no. RPG still thinks that there is data missing. What data? Remember I said that by default the XML has to supply “data for each and every occurrence”? We told the compiler that there would be 10 occurrences of the address element, but in our test data we don’t have that many. As a result RPG considers the rest to be “missing”.
The compiler folks anticipated this issue and provided a processing option that allows us to deal with it. The keyword for this is allowmissing, and if we give it a value of yes then RPG will not throw an error under these circumstances. This is what our XML-INTO statement looks like in sample program XMLINTO3:
XML-INTO customer %XML( xmlSource: 'case=any allowmissing=yes');
If we use this option then we must test for a blank field in the address array to determine when we have reached the end. Similarly we can also test for a blank Type field and set the default value when one is encountered. You can see the resulting logic from XMLINTO3 here:
For i = 1 to count;
  Dsply ( 'Id ' + customer(i).Id + ': ' + %TrimR( customer(i).name ));
  a = 1;
  // Loop through addresses
  DoU ( customer(i).address(a).city = *Blanks );
    // Default blank address type to Mailing
    If ( customer(i).address(a).type ) = *Blanks;
      customer(i).address(a).type = 'Mailing';
    EndIf;
    Dsply ( customer(i).address(a).type + ': ' +
            %TrimR(customer(i).address(a).city ) );
    a += 1;
  EndDo;
EndFor;
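One thing to watch with this style of loop: as I read it, the DoU condition is tested after a has been incremented, so a customer with all 10 address slots populated would cause the test to reference an eleventh entry. A bounded scan avoids that. The sketch below is one way to do it, not part of XMLINTO3. The template customer_t and the procedure populatedAddresses are hypothetical names that I have introduced purely for illustration.

```rpgle
**free
// Global declarations
// (subprocedures in a program need ctl-opt dftactgrp(*no))

// Hypothetical template with the same layout as one customer entry
dcl-ds customer_t qualified template;
  id char(7);
  name char(40);
  dcl-ds address dim(10);
    type char(8);
    state char(2);
    city char(40);
  end-ds;
end-ds;

// ... main logic goes here ...

// Hypothetical helper: counts the leading address entries whose
// city is not blank, never indexing past the end of the array
dcl-proc populatedAddresses;
  dcl-pi *n int(10);
    cust likeds(customer_t) const;
  end-pi;

  dcl-s a int(10);

  for a = 1 to %elem(cust.address);
    if cust.address(a).city = *blanks;
      return a - 1;   // first blank entry marks the end
    endif;
  endfor;

  return %elem(cust.address);   // every entry was populated
end-proc;
```

Assuming the customer array is declared with LikeDS(customer_t) Dim(99), the address loop could then be written as For a = 1 to populatedAddresses(customer(i)); which also handles a customer with no addresses at all.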
So, what is wrong with this solution? Basically there is no granularity. Ideally I would like to be able to say that it is OK if the type attribute and some of the address elements are missing. But if a customer name is missing then that indeed should be an error. But the minute I say allowmissing=yes then anything goes. The XML document could effectively be completely empty and RPG would say “looks good to me!”
Clearly that is not satisfactory and there has to be a better way. And indeed there is an alternative that gives us the control we need. The option that controls this is countprefix and it allows us to specify the name of a field into which the compiler can place a count of the number of repeating items loaded. In other words we can specify a count for arrays at any level, in addition to the one that RPG supplies in the PSDS. The critical thing to note here is that RPG will not consider as “missing” any element that can be counted in this manner and so no error is generated.
This is the modified XML-INTO used in program XMLINTO4:
XML-INTO customer %XML( xmlSource: 'case=any countprefix=count_' );
Now all we have to do is to modify our data structures to include count fields wherever we need them. Here’s the result:
Dcl-ds customer Dim(99) Qualified;
  id char(7);
  name char(40);
  count_address int(5);
  dcl-ds address Dim(10);
    count_type int(5);
    type char(8);
    state char(2);
    city char(40);
  End-ds;
End-ds;
Since we want to count the number of address elements loaded, we create a count field by prefixing the element name with our chosen prefix. So the name of the field in this case is count_address, and it must be specified at the same hierarchical level as the item it is counting. If you are paying attention, you may have also noticed that I added a field count_type to the address DS definition. The countprefix support not only allows us to count the number of elements loaded into an array, but it can also be used to check the presence or absence of any optional element. So we can use it to determine whether the type attribute was present, rather than having to test for blanks.
Courtesy of these two new count fields we can now simplify the processing logic to take advantage of them like so:
For i = 1 to count;
  Dsply ( 'Id ' + customer(i).Id + ': ' + %TrimR( customer(i).name ));
  // Loop through addresses
  For a = 1 to customer(i).count_address;
    // Default address type to Mailing if not supplied
    If ( customer(i).address(a).count_type ) = 0;
      customer(i).address(a).type = 'Mailing';
    EndIf;
    Dsply ( customer(i).address(a).type + ': ' +
            %TrimR(customer(i).address(a).city ) );
  EndFor;
EndFor;
This is by no means a comprehensive introduction to the processing of XML with RPG’s built-in support. There are many more features that you may need from time to time and in the next edition of Guru Classic, I will explore some of those.
Jon Paris is one of the world’s foremost experts on programming on the IBM i platform. A frequent author, forum contributor, and speaker at User Groups and technical conferences around the world, he is also an IBM Champion and a partner at Partner400 and System i Developer. He hosts the RPG & DB2 Summit twice per year with partners Susan Gantner and Paul Tuohy.