How Far We Have Come
June 26, 2017 Timothy Prickett Morgan
When the AS/400 turned 29 last week – and yes, I know the stack is called IBM i and I know that it now runs on what IBM calls Cognitive Systems based on Power processors – I happened to be on an airplane coming back from Frankfurt, Germany, after attending the International Supercomputing Conference. As it turns out, the Power9 processor married to Nvidia’s Tesla coprocessors using its “Volta” GPUs was one of the hot topics of conversation, and so was the coherent, shared memory architecture that will lash CPUs and GPUs into a shared memory space of sorts that resembles the single-level storage that has been at the heart of the System/38 and its “Silverlake” AS/400 successor.
The difference is that the coherence that IBM encoded in its CPF and then OS/400 operating systems made a giant memory space from DRAM main memory and disk drive capacity, which is a neat trick if you think about it since no other platform really ever did this. With the Power9 architecture, IBM is adding coherence between the plain vanilla DRAM on the Power9 main memory controllers and the very high speed but low capacity memory on the GPU accelerators. In the Volta generation, that GPU memory has 16 GB of capacity and a memory bandwidth of 900 GB/sec, about 2.5X that of a Power8 processor and about 10X that of an Intel Xeon processor. This single-level storage will be implemented in Linux, as I learned from conversations with IBM last week, with beta kernel tweaks coming later this year and then rolling out in general availability early next year. My guess is that we can all plan for Power9 announcements in March or April.
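For those who like to see the arithmetic, here is a back-of-the-envelope sketch in Python of what those ratios imply. The only inputs are the figures quoted above (16 GB, 900 GB/sec, roughly 2.5X and 10X); the Power8 and Xeon bandwidths it prints are derived from those rough ratios and are not vendor specifications, and the function names are made up for illustration.

```python
# Back-of-the-envelope sketch: bandwidth figures implied by the ratios in the text.
# Inputs come from the article; the derived numbers are estimates, not vendor specs.

VOLTA_CAPACITY_GB = 16.0      # GPU memory capacity quoted for the Volta generation
VOLTA_BANDWIDTH_GBS = 900.0   # GPU memory bandwidth quoted for the Volta generation
POWER8_RATIO = 2.5            # "about 2.5X that of a Power8 processor"
XEON_RATIO = 10.0             # "about 10X that of an Intel Xeon processor"


def implied_bandwidth(reference_gbs: float, ratio: float) -> float:
    """Bandwidth of the slower part, given the faster part and the quoted ratio."""
    return reference_gbs / ratio


def sweep_time_ms(capacity_gb: float, bandwidth_gbs: float) -> float:
    """Milliseconds needed to read the whole memory once at full bandwidth."""
    return capacity_gb / bandwidth_gbs * 1000.0


power8_gbs = implied_bandwidth(VOLTA_BANDWIDTH_GBS, POWER8_RATIO)
xeon_gbs = implied_bandwidth(VOLTA_BANDWIDTH_GBS, XEON_RATIO)
volta_sweep = sweep_time_ms(VOLTA_CAPACITY_GB, VOLTA_BANDWIDTH_GBS)

print(f"Implied Power8 memory bandwidth: about {power8_gbs:.0f} GB/sec")
print(f"Implied Xeon memory bandwidth:   about {xeon_gbs:.0f} GB/sec")
print(f"Time to sweep 16 GB at 900 GB/sec: about {volta_sweep:.1f} ms")
```

The point of the exercise is the trade-off that coherence has to paper over: the GPU memory is small enough to be read end to end in a few hundredths of a second, while the CPU side has far more capacity behind a much narrower pipe.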
The neat thing about this coherence is that all of the memory attached to the CPUs and GPUs will be linked together using 25 Gb/sec bi-directional links developed by IBM and Nvidia called NVLink. With the Power9, IBM also has more generic 25 Gb/sec links called Bluelink that we think are the same thing, just using a different protocol. With this architecture, it is as if the Power9 chip can have up to three very high-performance math coprocessors attached directly to it in a shared memory cluster, and up to two of these can be chained to make a very powerful compute node.
This is not at all like the Silverlake processor that was launched in June 1988, which did not have floating point math and therefore could not do a very good job running even compound interest calculations, much less the math-dense kind of work that goes into supercomputing simulations or machine learning training algorithms. The Power9-Volta combo will be the best platform in the world on which to run such code.
IBM i midrange shops have reason to be proud about how IBM’s systems designers and partners have come together to create a viable alternative to Intel’s Xeon processors and even its Xeon Phi parallel X86 processors; the combo will also give Intel’s Nervana neuromorphic and Altera FPGA accelerators a run for their money on HPC workloads.
In a sense, the Power9 chip represents a kind of rebirth we have only seen a few times. The System/38 came out in 1979 and shipped the following year in volume, and was followed in 1983 by a simpler System/36 with a flat-file database, lower performance, and a lower price tag to expand the base to smaller and midrange customers on a more modest budget. Ahead of the AS/400 launch 29 years ago, IBM had nearly 72,600 System/36 and 11,400 System/38 machines installed in the United States. Based on trends I have seen with the AS/400, the total base was probably 50 percent again as big outside of the US. Add it all up, and the System/3X base in early 1988 was about the size of the IBM i system base right now.
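If you want to check the adding up, here is a rough sketch in Python. It takes the two US installed base figures as given and assumes that "50 percent again as big" means the base outside the US was about 1.5 times the US base; that reading is an assumption, the inputs are themselves approximate, and the function name is invented for illustration.

```python
# Rough sketch of the System/3X installed base arithmetic from the paragraph above.
# Assumption: "50 percent again as big" means the non-US base was 1.5x the US base.

US_SYSTEM36 = 72_600   # "nearly 72,600" System/36 machines installed in the US
US_SYSTEM38 = 11_400   # System/38 machines installed in the US
NON_US_MULTIPLIER = 1.5  # assumed reading of "50 percent again as big"


def worldwide_base(us_base: int, non_us_multiplier: float) -> int:
    """Estimate the worldwide base as the US base plus a scaled non-US base."""
    non_us_base = round(us_base * non_us_multiplier)
    return us_base + non_us_base


us_base = US_SYSTEM36 + US_SYSTEM38
total_base = worldwide_base(us_base, NON_US_MULTIPLIER)

print(f"US System/3X base:        about {us_base:,}")
print(f"Worldwide System/3X base: about {total_base:,}")
```

Under that reading the US base comes to about 84,000 machines and the worldwide base to a bit over 200,000. A different reading of the multiplier would change the total, so treat the result as an order-of-magnitude figure.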
Weird, isn’t it?
Back then, transaction processing was everything and green screens were fine. Now, every application is crammed with rich media, a slew of languages, and a distributed architecture that makes getting consistent performance a challenge. Back then, we were all younger and computing was without question a lot easier. Think about how sophisticated and ubiquitous computing is now. We have all come a long, long way on this ride, and the Moore’s Law curve has enabled us to make use of many, many orders of magnitude improvements in performance and bang for the buck to still keep this platform relevant.
It is something to be proud of, and we are. We just hope that IBM doesn’t think of IBM i as an afterthought when it rolls out the Power9 platform either later this year or early next. There is no reason at all that IBM i cannot be modernized to provide smoking, GPU-assisted performance on a shared memory cluster of CPUs and GPUs. Maybe, for once in a long, long time, we can get back to bragging about how OS/400 – well, its progeny anyway – crushes the competition. There is still time to do something more than add a footnote to a Linux cluster announcement aimed at maybe 300 potential customers in the supercomputing and hyperscale worlds, and to make a broader announcement, one that includes IBM i shops and is relevant to hundreds of thousands of customers worldwide. I have given some ideas about how to do this, and we think you should tell IBM – through us if you want, directly if you don’t – your thoughts on this. We want to have a lot more happy birthdays with this platform.