Guru: Web Services, DATA-INTO and DATA-GEN, Part 3
July 12, 2021 Jon Paris
In this tip I’m going to discuss some of the options used with DATA-GEN and DATA-INTO to deal with the fact that element names in JSON and XML frequently contain characters that are not legal in RPG names.
This is important because RPG’s -INTO and -GEN operations rely on names to map elements. So if the document we are processing uses names that are not legal in RPG how can we handle that? As you will see, when dealing with -INTO operations they are quite easily handled. DATA-GEN, on the other hand, presents a different problem and we will get to that later.
Actually, the rules that apply to RPG names that are used with the -INTO operations are even more restrictive than RPG’s regular rules. In fact, the only characters allowed in names are the letters A-Z, the numbers 0-9, and the underscore. I’m not quite sure why the rules are so restrictive, but they are.
This story contains code, which you can download here.
Let’s take a look at part of the JSON document that I will be processing in this example:

"Customer Count": 3,
"Customer Data": [
  {
    "Customer Id": 12345,
    "Customer Name": "Jacob Two-Two",
    "Customer Address": {
      "Line #1": "123 Any Street",
      "Line #2": "Streetsville",
      "City": "Mississauga",
      "State/Province": "ON",
      "Zip/Post Code": "L5Q 2T8"
    }
  }
  ...
According to the rules for variable names we have a couple of problems here. First, there are a number of elements that have a space in their names (e.g. “Customer Count”), and second there are “#” signs as well as a space in two of the element names (e.g. “Line #1”).
The -INTO operations provide a processing option that will easily handle these situations. It is “case=convert”. When used it tells RPG to modify the element names before attempting to match them to the RPG variable names. Here’s a simplified version of the complete process:
- Convert lowercase letters to uppercase
- Convert accented characters to their unaccented uppercase equivalents. So à, è, and î become A, E, and I.
- Convert remaining “illegal” characters to underscores, and consecutive underscores to a single underscore
- If the name now begins with an underscore, remove it.
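As a rough illustration of those steps, here is a minimal sketch of a procedure that approximates the conversion for unaccented names. The name convertName is hypothetical; it is not part of RPG or of any parser, and it skips the accented-character step entirely. The real conversion is done for you when case=convert is specified.

```rpgle
**free
// Hypothetical helper, for illustration only: approximates the simplified
// case=convert steps listed above. Accented characters are NOT handled.
Dcl-Proc convertName;
  Dcl-PI *N varchar(100);
    name varchar(100) const;
  End-PI;

  Dcl-C LOWER 'abcdefghijklmnopqrstuvwxyz';
  Dcl-C UPPER 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
  Dcl-C LEGAL 'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_';

  Dcl-S work   varchar(100);
  Dcl-S result varchar(100) inz('');
  Dcl-S i      int(10);
  Dcl-S ch     char(1);

  // Convert lowercase letters to uppercase
  work = %Xlate(LOWER : UPPER : name);

  For i = 1 to %Len(work);
    ch = %Subst(work : i : 1);

    // Any character outside A-Z, 0-9 and _ becomes an underscore
    If %Scan(ch : LEGAL) = 0;
      ch = '_';
    EndIf;

    // Skip an underscore that would start the name or repeat the
    // previous underscore (collapses runs, drops a leading underscore)
    If ch = '_'
       and (%Len(result) = 0
            or %Subst(result : %Len(result) : 1) = '_');
      Iter;
    EndIf;

    result += ch;
  EndFor;

  Return result;
End-Proc;
```

Under these assumptions, convertName('Line #1') would produce LINE_1 and convertName('Zip/Post Code') would produce ZIP_POST_CODE. Since RPG names are not case sensitive, those match subfields coded as line_1 and zip_Post_Code.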
Using these rules the basic DS for the Customer elements in this JSON would end up looking like this:
Here are the full DS definitions that will be used in the example:
      Dcl-DS customer_T Template Inz Qualified;
<A>     customer_Id char(5);
        customer_Name varchar(40);
        Dcl-DS customer_Address;
<B>       line_1 varchar(50);
<C>       num_line_2 int(5); // Line 2 is optional
          line_2 varchar(50);
          city varchar(50);
          state_Province varchar(30);
<C>       num_zip_Post_Code int(5); // And so is zip/post code
          zip_Post_Code varchar(10);
        End-DS;
      End-DS;

      // DS to receive response data extracted by DATA-INTO
      Dcl-DS responseData Qualified Inz;
        customer_Count int(5);
        num_customer_Data int(5);
        Dcl-DS customer_Data LikeDS(customer_T) Dim(99);
      End-DS;
You can see at <A> that replacing the space in the name “Customer Id” resulted in the name customer_Id. Similarly at <B> the space and the # character in the name “Line #1” were replaced by a single underscore, resulting in the name line_1. Other element names followed the same pattern. I should point out here that I deliberately used my regular camel case naming convention with a lower case first letter when defining these variables. More on this later.
One other thing that you might have noticed is that I have introduced a couple of fields whose names begin with num_ <C>. I am using these to handle the fact that two of the JSON elements (“Line #2” and “Zip/Post Code”) are optional. I could deal with this by telling DATA-INTO that missing elements are allowed with the option allowmissing=yes, but there is a potential problem with that. It would allow _any_ element to be missing, not just the ones I know to be optional. By using the num_ fields together with the option countprefix=num_ I can tell RPG to count how many of the named elements are present. The count will be 1 if the element is present and 0 if it was omitted. Now if any other element is missing, RPG will throw an error, but it will not complain if these particular elements are missing because I have given it a way to inform me of their status. You can see how the options are specified in the code extract below:
Data-Into responseData
          %Data( '/Home/Paris/JSONStuff/CustomerData.json'
               : 'case=convert countprefix=num_ doc=file ' )
          %Parser( 'YAJL/YAJLINTO' );
If you want to know more about this option and its uses, check out this tip from 2018 where I covered it in relation to XML-INTO. It works exactly the same way with DATA-INTO.
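Once DATA-INTO completes, the count fields can be tested like any other subfield. The following is a minimal sketch that assumes the DS definitions shown earlier; the work fields i and addrLine2 are hypothetical names used only for illustration.

```rpgle
// Hypothetical work fields, for illustration only
Dcl-S i         int(5);
Dcl-S addrLine2 varchar(50);

// num_customer_Data holds the number of "Customer Data" elements loaded
For i = 1 to responseData.num_customer_Data;
  If responseData.customer_Data(i).customer_Address.num_line_2 = 1;
    // "Line #2" was present in the document for this customer
    addrLine2 = responseData.customer_Data(i).customer_Address.line_2;
  Else;
    // "Line #2" was omitted, so line_2 was not loaded by DATA-INTO
    addrLine2 = '';
  EndIf;
  // ... use addrLine2 here ...
EndFor;
```

Using the count as the loop limit also means that the unused elements of the Dim(99) array are never touched.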
If you want to dig into this example in greater depth you can download the code package here.
Generating JSON Elements With Non-RPG Names
Suppose that we wanted not just to consume these kinds of JSON elements but also to generate them? Well DATA-GEN is the obvious choice but how do we handle the requirement for non-RPG names?
Let’s start by taking the DS that we just populated with DATA-INTO (responseData) and simply use DATA-GEN with that as its source and see what happens. Here’s the DATA-GEN operation to do that:
Data-Gen responseData
         %Data( '/Home/Paris/JSONStuff/CustomerDataOut.json'
              : 'countprefix=num_ doc=file' )
         %Gen( 'YAJL/YAJLDTAGEN'
             : '{ "beautify" : true }' );
Looking at this you may wonder why I specified the %Data option countprefix=num_ again. If you guessed that it controls whether optional elements are output or not, congratulations! Without it all elements in the DS would be output, but with it any element associated with a zero count will not be output.
This will produce a JSON file that contains the same basic data as the original, but there is a problem with the element names because DATA-GEN uses your RPG field names as the JSON element names. As a result the generated file would look something like this:

{
  "customer_Count": 3,
  "customer_Data": [
    {
      "customer_Id": "12345",
      "customer_Name": "Jacob Two-Two",
      "customer_Address": {
        "line_1": "123 Any Street",
        "city": "Mississauga",
        ...
Note that without the { "beautify" : true } option to the generator the JSON would not look this pretty. For production purposes you should omit it, but when developing it makes it much easier to see what you are generating.
It probably won’t surprise you that all of the underscores are still there in the names. RPG cannot magically work out what the original names were. We will look at how to deal with this in just a moment. The second thing to notice is that even when there is no underscore involved, as with the “city” element, the name is still wrong because in JSON, unlike RPG, the case of a name matters. In RPG CITY, city, and City are all the same variable but not in JSON. In JSON those are three different names!
So when coding any DS that will be used by DATA-GEN the first thing to remember is that you must forget your own personal coding standards and use the exact spelling of the element name wherever you can. This is about the only time in an RPG program where it matters how you type the name! DATA-GEN will use whatever you type. In this case, for example, I should have used the name City and not city.
So, what about those non-RPG names? DATA-GEN provides an option to deal with this and it allows you to supply a name that will be used in place of the RPG name when writing the data. This option is renameprefix and, as you may have guessed from its name, it is specified in a similar manner to the countprefix option. For each element in the output requiring a different name from that of the RPG variable, you simply provide an additional variable containing the name to actually be used. This sounds a lot more complicated than it is so let me show you how it works.

      ...
<D>   name_customer_Data varchar(30) inz('Customer Data');
      Dcl-DS customer_Data Dim(20);
<D>     name_customer_Id varchar(30) inz('Customer Id');
        customer_Id zoned(5);
        name_customer_Name varchar(30) inz('Customer Name');
        customer_Name varchar(40);
        name_customer_Address varchar(30) inz('Customer Address');
        Dcl-DS customer_Address;
          name_Line_1 varchar(30) inz('Line #1');
          line_1 varchar(50);
          num_Line_2 int(5); // Line 2 is optional
          name_Line_2 varchar(30) inz('Line #2');
          Line_2 varchar(50);
          City varchar(50);
          ...

      Data-Gen outputData
               %Data( '/Home/Paris/JSONStuff/CustomerDataNew.json'
<E>                 : 'doc=file countprefix=num_ renameprefix=name_' )
               %Gen( 'YAJL/YAJLDTAGEN'
                   : '{ "beautify" : true }' );
I’ve highlighted a couple of the name variables needed for this example <D>. In each case, as RPG builds the output stream, the content of the name field is used as the element name in the JSON (or whatever else you are generating) and the associated value comes from its matching pair. So name_customer_Id supplies the name for the data contained in customer_Id and so on.
At <E> you can see the renameprefix option in action. The actual value of the prefix (i.e. the characters following the = sign) can be anything you like, but name_ seemed appropriate to me. To make life simpler and avoid confusion in your shop I would suggest that you agree on a standard name to be used in all cases.
In the accompanying code bundle for this tip I have included a program that reads the original JSON, updates a couple of the elements, and writes out a new version with DATA-GEN. If you study the code you will see that it highlights one drawback to the way that DATA-GEN handles renaming. In cases where you are updating the JSON, it almost forces you to use a second DS and to copy the data (I used EVAL-CORR) from one structure to the other. You could use the DS with the name_ variables but then you’d have to use allowmissing=yes to avoid errors since there will never be data for those fields.
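To give a feel for the two-structure approach, here is a minimal sketch of the copy step. It is not taken from the code bundle. It assumes the responseData and outputData structures from the extracts in this tip, and the loop index i is a hypothetical name.

```rpgle
Dcl-S i int(5);   // hypothetical loop index, for illustration only

For i = 1 to responseData.num_customer_Data;
  // Guard: the target array may have fewer elements than the source
  If i > %Elem(outputData.customer_Data);
    Leave;
  EndIf;

  // Copies subfields that have the same name and a compatible type.
  // The name_ fields exist only in outputData, so they are not
  // touched and keep the element names supplied by Inz.
  Eval-Corr outputData.customer_Data(i) = responseData.customer_Data(i);
EndFor;
```

One thing to watch for: in the extracts shown in this tip customer_Id is char(5) in one structure and zoned(5) in the other. If that is also the case in your own code, EVAL-CORR will not treat the two as corresponding subfields, and that field needs its own assignment.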
Next Time
So far all of our DATA-GEN efforts have built the JSON stream with a single operation. But this is not always possible or practical. In some cases it would also be easier to build the JSON in pieces. Luckily DATA-GEN provides facilities for that and that will be the subject for the next tip in this series.
Jon Paris is one of the world’s foremost experts on programming on the IBM i platform. A frequent author, forum contributor, and speaker at user groups and technical conferences around the world, he is also an IBM Champion and a partner at Partner400 and System i Developer.