What Is SMIOSTCPGT and Why Is It Eating My System?
June 20, 2007
Hey, Joe
We’ve been doing some performance analysis on our i5/OS V5R3 system and we’ve discovered that the SMIOSTCPGT process is producing an incredible number of synchronous disk operations, which could be slowing down system performance. What is SMIOSTCPGT and what can we do to stop it from bogging things down?

–Joe

Before I start this tip, I’ll come clean and admit that this question came from my own recent experience, and it’s a problem I’m still trying to solve. I’m presenting the problem here as documentation for other administrators experiencing the same issue and to gather more information on possible fixes.

I discovered the SMIOSTCPGT problem when users on a relatively new partition started complaining that their transaction response time was becoming excessive, sometimes taking over a minute between the time they pressed the ENTER key or a function key and the time the system returned a response. We couldn’t pinpoint an exact cause, as there wasn’t any evidence that the usual suspects (inadequate CPU and memory; runaway or hoggish jobs; disk drive arm and imbalance issues, etc.) were behind the issue. Our page fault rates were within system guidelines, automatic performance tuning was turned on and seemed to be working effectively, and we had segmented all of our subsystems so that each subsystem had its own storage pool and activity levels. To all appearances, the system should have been running great.

With some outside consultants, we ran a more in-depth performance analysis and found the SMIOSTCPGT situation, where this process was producing an incredible number of synchronous disk I/Os. A high rate of synchronous I/O is dangerous to a system because records are read directly into main storage one record at a time. The records aren’t cached or retrieved in groups, as happens with asynchronous reads.
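To put rough numbers on the difference between one-record-at-a-time reads and grouped reads, here is a minimal back-of-the-envelope sketch in Python. Every figure in it (the record count, the group size, and the time per disk read) is a hypothetical value chosen for illustration, not a measurement from my system. It also assumes, for simplicity, that a grouped read costs the same as a single-record read, which a real disk subsystem won't quite deliver.

```python
import math


def disk_operations(records: int, records_per_read: int) -> int:
    """Number of disk reads needed to fetch `records` records
    when each read returns up to `records_per_read` records."""
    return math.ceil(records / records_per_read)


def wait_seconds(operations: int, ms_per_operation: float) -> float:
    """Total time spent waiting if every read must finish before the next starts."""
    return operations * ms_per_operation / 1000.0


# Hypothetical workload figures, for illustration only.
RECORDS = 10_000      # records a job needs to read
GROUP_SIZE = 100      # records returned per grouped read (assumed)
MS_PER_READ = 5.0     # time per disk read in milliseconds (assumed)

one_at_a_time = disk_operations(RECORDS, 1)
grouped = disk_operations(RECORDS, GROUP_SIZE)

print("one record per read:", one_at_a_time, "reads,",
      wait_seconds(one_at_a_time, MS_PER_READ), "seconds waiting")
print("grouped reads:      ", grouped, "reads,",
      wait_seconds(grouped, MS_PER_READ), "seconds waiting")
```

With these made-up numbers, the one-at-a-time case issues 100 times as many reads and waits 100 times as long. The real ratio depends on your hardware and workload, but the direction is the point.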
When a transaction performs a synchronous disk I/O, the user must wait for the disk operation to complete before the transaction can continue processing. And if a number of users are waiting for synchronous I/Os to complete, it can slow down a good portion of your workload. Because of this, a high rate of synchronous disk I/O will have a negative effect on transaction response time.

With a possible cause and a likely culprit, we went out in search of an answer. Unfortunately, you can’t find much documentation on SMIOSTCPGT; Google returns only four links, and none of them provides much useful information. I turned to IBM tech support and here’s what they told me.

SMIOSTCPGT is a storage management static paging service task in the licensed code, and it can be associated with system hang-ups and waits. This problem is documented in APAR MA34627, LIC SYSTEM WAIT / HANG DURING NORMAL OPERATIONS. According to the APAR, this problem has been identified in i5/OS and OS/400 releases going back as far as OS/400 V4R3, but the APAR only identifies the following PTFs for i5/OS V5R3 and above.
These PTFs change the way that static paging tasks are used and they should fix any hanging problems with SMIOSTCPGT. Besides fixing the hanging problems, IBM says that applying the PTF may also improve performance in static paging scenarios where the paging task is overloaded.

It’s important to know that, for V5R3 at least, the PTF is not currently included in the i5/OS cumulative fix pack (cume), so it has to be ordered as an individual PTF. You should also be aware that you will have to IPL your system in order to apply the fix. And since you need to perform an IPL anyway, you might as well order and apply the latest cume at the same time, which also requires a system IPL. That way, you can use the SMIOSTCPGT PTF as an excuse to bring your system up to the latest cume level.

Getting back to my system slowdown, I’ll soon be applying the SMIOSTCPGT PTF to determine if that makes things any better. If not, I’ll look at other avenues for solving the problem, including I/O tracing, application program problems, and issues that may arise from too many remote jobs accessing the database through ODBC, JDBC, and SQL. However, those subjects are for another column at another time.

In the meantime, if you’ve run into SMIOSTCPGT synchronous I/O issues or if you have more information on the problem and its fixes, please feel free to email me using the Contact Us button at the top of this newsletter. I’ll print the best responses in a future column along with the results of my efforts. Hopefully, this information will help other people running up against this issue.

–Joe