Parameter Passing and Performance
June 20, 2007 Ted Holt
There are three ways to pass a parameter to a procedure: by reference, by value, and by read-only reference. These methods are not interchangeable, and passing parameters by value can have unfavorable effects on performance. In the following paragraphs, I explain why I make such a statement, I show you how to define parameters for performance, I list the performance figures from my testing, and I provide some recommendations for parameter passing.

Before I go any further, let me get a couple of things out of the way. First, keep in mind that I am talking about you and the routines you write. If you call someone else’s routine, you have to follow whatever convention they used. If they decided that all the parameters were to be passed by value, you’ll have to pass them by value as well, no matter how you feel about it. Second, I thank Barbara Morris of the RPG compiler team for answering questions I put to her regarding the passing of parameters. Any erroneous conclusions I may have reached are my own fault, of course.

Let’s say you’ve decided to write a subprocedure to be used in ILE programs. Language doesn’t matter, so I’m going to use RPG for the examples. You name your subprocedure DoIt, and it requires one parameter to do its job. Which method are you going to use to pass the parameter to the subprocedure?

By reference?

     P DoIt            b
     D                 pi
     D   Type                         2a

By value?

     P DoIt            b
     D                 pi
     D   Type                         2a   value

By read-only reference?

     P DoIt            b
     D                 pi
     D   Type                         2a   const

First, let’s consider passing by value. The VALUE keyword on the parameter definition means that a caller routine makes a copy of the data that it wants DoIt to accept as a parameter. DoIt operates on the copy, rather than on the data itself. Look at the following example.

     H dftactgrp(*no) actgrp(*new)

     D ItemTypeCode    s              2a

     D DoIt            pr
     D   Type                         2a   value

      /free
        *inlr = *on;
        DoIt (ItemTypeCode);
        return;
      /end-free
      * =========================================
     P DoIt            b
     D                 pi
     D   Type                         2a   value
      /free
        // do some stuff
        return;
      /end-free
     P                 e

The main routine invokes DoIt, passing a copy of ItemTypeCode in the first (and only) parameter. DoIt refers to the copy of ItemTypeCode as Type. If DoIt changes Type, the change occurs to the copy of ItemTypeCode, not to ItemTypeCode itself.

(I do not like this ability to change parameters that are passed by value. I see assignments to such parameters as misleading, since it appears that a data value in the caller is changed, when in fact it is not changed. But this preference of mine is not applicable to the question of performance that I am attempting to address.)

Next, let’s consider read-only reference. Here’s the same example.

     H dftactgrp(*no) actgrp(*new)

     D ItemTypeCode    s              2a

     D DoIt            pr
     D   Type                         2a   const

      /free
        *inlr = *on;
        DoIt (ItemTypeCode);
        return;
      /end-free
      * =========================================
     P DoIt            b
     D                 pi
     D   Type                         2a   const
      /free
        // do some stuff
        return;
      /end-free
     P                 e

The CONST keyword tells the caller to pass the address of ItemTypeCode to DoIt. However, DoIt is not allowed to modify the data it refers to as Type.

Obviously this method doesn’t work if the subprocedure must change the parameter. In that case, you’d omit both VALUE and CONST keywords, which would be passing the parameter by reference.

     H dftactgrp(*no) actgrp(*new)

     D ItemTypeCode    s              2a

     D DoIt            pr
     D   Type                         2a

      /free
        *inlr = *on;
        DoIt (ItemTypeCode);
        return;
      /end-free
      * =========================================
     P DoIt            b
     D                 pi
     D   Type                         2a
      /free
        // do some stuff
        return;
      /end-free
     P                 e

The main routine provides DoIt with a pointer to (i.e., the memory address of) ItemTypeCode. Anything that DoIt does to Type is really done to ItemTypeCode.

The first thing to consider, then, is whether or not the parameter is to be modified.
If a subprocedure is to be able to modify the parameter, you must pass the parameter by reference. But if a subprocedure does not modify a parameter, it’s better to pass the parameter by value or by read-only reference. These last two parameter-passing mechanisms have two advantages over passing by reference.
Does it matter, then, whether I pass parameters by value or pass by read-only reference? From a performance standpoint, I knew it must. After all, the system can allocate memory for a pointer much more quickly than it can allocate memory for a large character variable.

I ran a few tests to get an idea of how the passing of parameters affects performance. The times given in the following tables are CPU seconds, as reported in the job log. This is hardly scientific, but plenty good enough for our purposes.

In the first version of my test program, I passed a 64-byte variable-length character to a subprocedure. I repeatedly invoked the subprocedure within a loop.

     H dftactgrp(*no) actgrp(*new)

     D BigString       s             24a   varying
     D                                     inz('I like cheese.')
     D Index           s             10u 0
     D Limit           s             10u 0
     D                                     inz(500000)

     D DoIt            pr
     D   inString                    64a   varying const

      /free
        *inlr = *on;
        for Index = 1 to Limit;
          DoIt (BigString);
        endfor;
        return;
      /end-free
      * =============================================================
     P DoIt            b
     D                 pi
     D   inString                    64a   varying const
      /free
        return;
      /end-free
     P                 e

The following table shows execution times for various numbers of iterations.
Table 1: Invoking a subprocedure that accepts a 64-byte variable-length character parameter.

Does it seem to make any difference which method you use? I changed the parameter length from 64 bytes to 64 kilobytes (65535 bytes) and reran the tests. The next table looks a bit different from the previous one.
Table 2: Invoking a subprocedure that accepts a 64-kilobyte variable-length character parameter.
Hmmm, passing by value doesn’t look so good anymore, does it? Think about what’s happening. Each time the caller invokes the subprocedure, it must first allocate memory for the parameter. When passing by read-only reference, it must allocate enough memory for a pointer, which is a matter of bytes. But when passing by value, it must allocate 64 kilobytes. Upon return to the caller, the system deallocates the memory. Allocating and deallocating 64 bytes is no big deal, but allocating and deallocating 64 kilobytes is.

I tried another test. This time, I passed an expression, rather than a scalar variable, to the subprocedure. In this case, the system has to do a bit more work before calling the subprocedure. Instead of passing a pointer to a variable, the system must evaluate the expression.

     H dftactgrp(*no) actgrp(*new)

     D BigString       s          65535a   varying
     D                                     inz('I like cheese.')
     D Index           s             10u 0
     D Limit           s             10u 0
     D                                     inz(500000)

     D DoIt            pr
     D   inString                 65535a   varying const

      /free
        *inlr = *on;
        %len(BigString) = 65535;
        for Index = 1 to Limit;
          DoIt (%subst(BigString:32768:32768) +
                %subst(BigString:1:32767));
        endfor;
        return;
      /end-free
      * =============================================================
     P DoIt            b
     D                 pi
     D   inString                 65535a   varying const
      /free
        return;
      /end-free
     P                 e

The next table shows how passing by read-only reference and passing by value handled the extra overhead.
Table 3: Invoking a subprocedure, passing a 64-kilobyte expression as the first parameter.

Runtime went up, way up. However, it went up much more when the parameter was passed by value. What did I learn from my tests?
There is more to this performance issue than what I have covered here. I will get back to you with more information, maybe even by next week.