z17 Mainframes Give IBM Time To Ramp AI-Accelerated Power11 Systems
April 14, 2025 Timothy Prickett Morgan
We had been expecting the Power11 processors and their new Power Systems servers to be announced sometime around the spring to early summer of this year and to start shipping in volume in the summer, maybe in June or July, with a nice sales bump in the second half of 2025. However, the Power11 launch will now come in the second half of the year, and IBM’s Systems group will be counting on a bump first from the System z17 mainframes that were launched last week.
This has happened before, and more than once, and it is a good thing.
Most recently, the System z15 mainframes were announced in September 2019, ahead of the Power10 chips, which were pushed out to September 2021 for big iron and July 2022 for entry machines. We think that Power10 was supposed to be out in 2020, and it is a good thing it was not, because IBM and the rest of the world were dealing with the coronavirus pandemic that year. So the delay caused by GlobalFoundries pulling the plug on its 10 nanometer and then 7 nanometer chip manufacturing processes, which allowed IBM to revamp the Power10 design from a many-core design similar to Power9 into one with a new implementation of the Power instruction set and much beefier cores suitable for AIX and IBM i workloads, actually worked out for the best. So much work was done that IBM is just making a bunch of tweaks to the 7 nanometer process Samsung uses to etch the chips and to the features on the socket to create Power11.
When the System z mainframes are revamped, there is a big bump in revenues and profits at IBM, and this takes the heat off Power Systems for a few quarters. And given the state of geoecopolitics at the moment, and the fact that Power11 chips are made at the Samsung foundry in Hwaseong, South Korea and the Power Systems servers are made in Guadalajara, Mexico, perhaps this is a good time for Power Systems to think about making its entry machines back at home in the United States or getting a waiver from the Trump administration for the tariffs that have been imposed on those (and many other) countries.
I covered the “Telum” z16 mainframe processor in The Next Platform when it was launched back in August 2021 (again, ahead of a Power Systems launch, which provided revenue air cover), and I also drilled down into the “Telum II” z17 mainframe processor at The Next Platform back in August 2024, when IBM provided some feeds and speeds on the CPU as well as the “Spyre” AI accelerator that will be paired with mainframes and, eventually, Power Systems machines to do AI inference at scale. We got a briefing here at IT Jungle from IBM about the Power11 processor back in November last year, and followed up in December with a look at the memory subsystem coming with the Power11 machines. And then we did a drilldown on the Spyre chip here at IT Jungle as it relates to the AI efforts Big Blue is undertaking to make life easier for IBM i shops, and the statement of direction it put out saying it would eventually integrate Spyre accelerators into Power Systems machines.
The Spyre accelerator is a bigger and badder version of the matrix math engine that was put into the Telum and Telum II processors, and is distinct from the Matrix Math Acceleration (MMA) engines incorporated in the Power10 cores. The Power10 has a pair of MMA units that each have two 512-bit matrix math units, which can drive 2,048 bits of multiplication per clock cycle. They work in 32-bit and 64-bit floating point precision as well as reduced precision using FP16 and BF16 formats. It would not be surprising to see FP8 quarter-precision added in Power11, and FP4 precision added in Power12.
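For those keeping score at home, here is a quick back-of-the-envelope sketch in Python of what those widths work out to per clock cycle. The unit counts come straight from the paragraph above, and the per-precision element counts are simple bit-width divisions rather than official IBM throughput figures:

```python
# Back-of-the-envelope math for the Power10 matrix width described above:
# a pair of MMA units, each with two 512-bit matrix math units. Element
# counts per precision are bit-width divisions, not IBM benchmark numbers.

MMA_UNITS = 2            # pair of MMA units
MATH_UNITS_PER_MMA = 2   # two 512-bit matrix math units apiece
UNIT_WIDTH_BITS = 512    # width of each matrix math unit

bits_per_cycle = MMA_UNITS * MATH_UNITS_PER_MMA * UNIT_WIDTH_BITS
print(f"Bits of multiplication per cycle: {bits_per_cycle}")  # 2,048

# How many elements fit into that width at the supported precisions
for fmt, width in [("FP64", 64), ("FP32", 32), ("FP16", 16), ("BF16", 16)]:
    print(f"{fmt}: {bits_per_cycle // width} elements per cycle")
```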
The point is that Spyre uses the same kind of matrix math units as the Telum and Telum II processors, all of which are distinct from the MMA and vector units in the Power10 processor and, presumably, from those in the Power11 processor as well. It would be nice, from a software standpoint, if IBM had put the same matrix math units into both the Power and z processors, and then built bigger accelerators compatible with them for additional processing capacity.
In announcement letter AD25-0015, IBM launched the z17 mainframe systems, which have enough on-chip matrix math performance to handle 5 million inference operations per second with less than 1 millisecond of response time on a logical partition with two Telum II cores and four zIIP accelerator cores activated and 128 GB of main memory. Each Telum II chip has eight cores, plus an integrated DPU I/O processor and a matrix accelerator. Two of these chips are put into a socket for 16 cores, four of the sockets are put on a system board for 64 cores, and four drawers can be lashed together to create a machine with 256 physical cores and 32 matrix accelerators. A maximum of 208 of those cores can be used for processing, and the others can be designated as zIIPs, zAAPs (for accelerating Java), IFLs (for Linux), I/O processors, or other kinds of uses.
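If you want to check the math on those configurations, here is a minimal sketch in Python that rolls up the per-chip, per-socket, and per-drawer counts given in the announcement into the system totals:

```python
# z17 building blocks as described in announcement letter AD25-0015:
# eight cores plus one matrix accelerator per Telum II chip, two chips
# per socket, four sockets per drawer, up to four drawers per system.

CORES_PER_CHIP = 8
ACCELERATORS_PER_CHIP = 1
CHIPS_PER_SOCKET = 2
SOCKETS_PER_DRAWER = 4
DRAWERS_PER_SYSTEM = 4
MAX_COMPUTE_CORES = 208  # the rest become zIIPs, IFLs, I/O processors, and so on

chips = CHIPS_PER_SOCKET * SOCKETS_PER_DRAWER * DRAWERS_PER_SYSTEM
print(f"Physical cores:      {chips * CORES_PER_CHIP}")         # 256
print(f"Matrix accelerators: {chips * ACCELERATORS_PER_CHIP}")  # 32
print(f"Usable for compute:  {MAX_COMPUTE_CORES}")
```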
The design is not all that different, in concept, from that of the Power E980 and Power E1080, but the internals of the sockets and how they are configured and used are different. Both the new z17 mainframe and the Power11 systems use the next-generation differential DIMM memory cards from IBM, nicknamed “Odyssey” and based on DDR5 memory chips and the “Centaur” memory buffer and controller. These cards are the follow-on to the “Explorer” memory cards used in Power10 and z16 servers, which were based on DDR4 chips. (We wrote about these memory cards back in December 2024.) The Odyssey memory cards offer 2X the density and 3X the bandwidth per CPU socket compared to the Explorer cards.
The thing to remember is that as long as IBM is throwing off lots of cash with its System z hardware, software, and services, it can afford to ride through Power processor transitions like the one we are in now. And everything that it learns with its System z mainframe customers can be applied to the Power Systems platform – like the use of on-chip matrix units or the PCI-Express Spyre accelerator cards.
Which brings us to the point. The Spyre cards are going to be available in Q4 2025 for the z17 mainframes, but we do not really know when they will be paired with the Power11 machinery. We do know that the code assistant for IBM i’s RPG language, which is under development, has not yet been ported to run its inference on the Spyre cards, but that is presumably in the works. The code assistant for COBOL and Java on System z mainframes has been tweaked to run on Spyre, but that is relatively easy considering that Spyre uses the same style of matrix unit as the Telum and Telum II processors.
It would be better if these cards were already shipping for both Power and System z machines. And it would be best if they were shipping for Power Systems on Day One, when the Power11 machines launch. They should be integrated from the get-go, and there should be an AI software stack like the one IBM has created for the System z mainframes – what it is now calling the IBM Z, which we suppose is akin to calling the other platform the IBM i – ready for the Power Systems machines, using both the integrated MMA units on the Power10 and Power11 chips and the Spyre accelerators.
Why should Power Systems have to wait? When we talked to Bargav Balakrishnan, vice president of product management for Power Systems, back in January, we got the distinct impression that Spyre accelerators would be available as a service on the Power Virtual Server cloud and only eventually would be available on premises. But code assist for IBM i will be run locally, as will many AI routines. That will certainly be true in heavily regulated industries like financial services and healthcare. IBM should make enough Spyre cards so Power10 and Power11 customers can get them in Q4 2025, too. It’s that simple. Capture the AI opportunity and make it integrated and seamless, which is not what would happen if IBM shifted customers away from the watsonx AI tools and towards Nvidia’s GPUs and AI Enterprise stack.
RELATED STORIES
Plotting Out Power Systems And IBM i To 2040 And Beyond
Talking Power Systems And IBM i With Bargav Balakrishnan
Power11 Takes Memory Bandwidth Up To, Well, Eleven
IBM Raises The Curtain A Little On Future Power Processors
Power10 Keeps Plugging Along As Power11 Looms For 2025
RPG Code Assist Is The Killer App For AI-Enhanced Power Systems
IBM Previews New Power Tech At TechXchange Event
GenAI Interest ‘Exploding’ for Modernization on IBM i and Z, Kyndryl Says
IBM Shows Off Next-Gen AI Acceleration, On Chip DPU For Big Iron (The Next Platform)
IBM’s AI Accelerator: This Had Better Not Be Just A Science Project (The Next Platform)
Some Thoughts On Big Blue’s GenAI Strategy For IBM i
IBM Bets Big On Native Inference With Big Iron (The Next Platform)