IBM Hints at Triple Redundancy in Power6
March 30, 2006 Timothy Prickett Morgan
If you like a good riddle and lots of speculation, you probably like talking about politics, or the entertainment industry, or the future processors from chip makers. In all three cases, there is a lot more known than anyone is willing to talk about, a lot of misinterpretation and misinformation, and a lot of excitement over what will be popular and what will become yesterday’s news. IBM has been gradually revealing more details about its future Power6 processors, and last week said a few more things about them.
Frank Soltis, the chief architect of the former System/38, AS/400, and iSeries server lines, who also has a hand in the design of the current and future System i kickers to those machines, gave a presentation at the COMMON midrange user group in Minneapolis, Minnesota, last week. Minneapolis is just a few hours up the road from the Rochester labs where the 64-bit PowerPC and Power4 processors were designed and where IBM makes its entry and midrange iSeries and pSeries servers, now called the System i and System p. Soltis is also a professor at the University of Minnesota’s computer science and electrical engineering department, where he teaches graduate courses in processor design. He may not be as famous as Ken Olsen or Michael Dell, but his work on virtualized systems design has had a profound impact on Big Blue and made the company untold billions of dollars. He is the closest thing that the community of 200,000 or so OS/400 user companies has to a visionary like Steve Jobs or Bill Gates. He also likes to talk, and often gets himself into hot water with IBM’s press relations people, who are always trying to muzzle him.
Soltis spent a fair amount of time poking fun at Itanium, a frequent target for IBMers, and didn’t really say anything unusual except to concede that it was a conceptually interesting design that was not compatible with the existing Pentium and Xeon processors and therefore doomed.
That is a radical simplification of what is going on in the Itanium market, and quite frankly, while no one would call Itanium a huge success and certainly not a volume product, Itanium has built up some momentum. And while various Power processors are being used in game machines, set top boxes, car electronics, servers, and even some desktops and laptops (at least for a short while longer), the fact is the Power4, Power5, and future Power6 chips are relegated to the server market, where their volumes are on the same order of magnitude as Itanium. Power, as a family, does better than Itanium in terms of ecosystem and sales. But the Power6 chip may do no better than the future “Montecito” and “Montvale” dual-core Itaniums in terms of revenues and profits.
Soltis also poked fun at the non-announcement of the IBM Systems Agenda last summer, and bashed the press for not covering it (the press is a convenient punching bag). The Systems Agenda is IBM’s five-year plan for its server line from here to 2010, which IBM did not really publicize very effectively. This seemed to be the point Soltis was trying to make, but he is such a smart alec sometimes that you cannot really be sure.
He also said that the press took some of his comments and misrepresented what he said about the convergence in the IBM server line, and drew the conclusion that IBM was converging its Power and mainframe processor architectures. “A lot of people misunderstood and were saying that the mainframe is going to the Power processor,” he explained. “No, it is not.” There has been much speculation on this point, particularly since the exposure of the so-called Project ECLipz, an IBM project that was launched in 2001 supposedly to converge the iSeries, pSeries, and zSeries product lines. The first two families use Power processors, while the last uses mainframe-style processors. An awful lot of people expect the Power6 machines to be able to support mainframe workloads.
IBM has not confirmed or denied this in any official manner, and Soltis’ vague comments on it have to be taken with a grain of salt. For instance, just because Power6 doesn’t support mainframe workloads natively doesn’t mean that they can’t be emulated using software. One such tool already exists: QuickTransit from Transitive. This software was used by Silicon Graphics to run Irix-MIPS applications written for its vintage Origin servers and workstations in emulation mode on its Linux-Itanium Altix supercomputers. Apple is using a variant of QuickTransit to create the “Rosetta” environment, which allows Mac OS applications written for Power-based processors to run on its new machines based on Intel Core Solo and Core Duo processors. And just because the Power6 processor doesn’t support native mainframe instructions doesn’t mean that future Power7 chips won’t. For instance, it is not hard to imagine IBM creating a hybrid Power-mainframe processor complex, porting over some functions–TCP/IP, database acceleration, and I/O processing–to non-mainframe processors and throwing a mainframe-style core into the mix. This way, IBM could design one processor, perhaps with several Power cores and one mainframe core, and just make this one design rather than two separate designs.
See how vague statements only lead to more speculation? And IT vendors know this. But they just can’t stop being vague because they like the sport of telling people they guessed wrong and the anticipation it builds for a future product. And the IT analyst and press communities and some of the IT shops that end up buying this gear all like the speculation, too–no doubt about that. Since last fall, IBM has been wanting people to guess about what is inside the Power6 chip, which will ship in 2007, and because the only way to keep the guessing game going is to give out a few clues, it did this a few times. Soltis gave out a few more clues to the Power6 last week.
Here’s what we knew about Power6 up until now. Last fall, Vijay Lund, vice president of server and storage development at IBM’s Systems and Technology Group, said that the Power6 chip would have approximately 750 million transistors, would be implemented in 65 nanometer technology, and would use a new kind of chip interconnect called “C4,” which at least to my eye is very similar to the socket 1207 interconnect that AMD has created for the future “Rev F” Opteron processors due later this year. Lund said last fall that the Power6 chip would come to market in 2007 and hinted that many system functions were going to be incorporated into the Power6 chip–things that might have otherwise ended up in custom ASICs on the systems or inside low-level microcode. At that time, IBM would not say how many cores were in the Power6 chip, but with the Power5 chip having 276 million transistors that comprised two Power cores, each with dual floating point units and simultaneous multithreading and L1 caches, as well as a shared L2 cache (1.9 MB) and on-chip memory controllers, the Power6 could easily hold a lot more cores with around 750 million transistors.
In February, IBM presented a bunch of papers at the International Solid-State Circuits Conference, after it had attained second silicon with the Power6 chip, and said that the chip would be a dual-core processor, just like Power4 and Power5 before it. That led everyone to wonder what those several hundred million extra transistors were doing. Even if IBM put dedicated L2 caches on each core and brought L3 caches onto the chip, it would be hard to burn through all those transistors unless the L3 caches were very, very large. Intel’s dual-core Montecito has dual 12 MB L3 caches, one for each core, and has a stunning 1.7 billion transistors. But because the Power cores have a very sophisticated interconnect that runs at half clock speed, the Power chips have not needed on-chip L3 cache, much less large L2 caches.
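For a rough sense of the transistor budget in question, the back-of-the-envelope arithmetic can be sketched in a few lines of Python. It uses only the two counts cited above (276 million for Power5 and approximately 750 million for Power6). Treating a whole Power5 chip as a yardstick is an assumption made purely for illustration, not a unit IBM has described.

```python
# Transistor counts as cited in the article (the Power6 figure is approximate).
POWER5_TRANSISTORS = 276_000_000
POWER6_TRANSISTORS = 750_000_000

# How many more transistors Power6 has to spend than Power5.
extra_transistors = POWER6_TRANSISTORS - POWER5_TRANSISTORS

# Power6 budget expressed in multiples of a whole Power5 chip
# (a hypothetical yardstick, not an IBM design unit).
power5_equivalents = POWER6_TRANSISTORS / POWER5_TRANSISTORS

print(f"Extra transistors: {extra_transistors / 1e6:.0f} million")
print(f"Power6 budget in Power5 equivalents: {power5_equivalents:.2f}")
```

On these figures, Power6 has roughly 474 million more transistors than Power5, or about 2.7 times the Power5 count, which is the gap the speculation is trying to explain.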
Back at ISSCC in February, IBM finally confirmed that the Power6 chip is a dual-core processor, and that it would look very much like a Power5 and Power5+ chip conceptually. IBM also said that the chip was expected to be delivered in the range of 4 GHz to 5 GHz, roughly twice as fast as the current clock speed of the Power5 and Power5+ chips, and said further that the instruction pipeline in the Power6 chip would not be changed all that much, which is a bit surprising since cranking up the clock speed on a processor usually means lengthening the pipeline. But IBM has added sophisticated circuits that get each transistor to do more work. Essentially, IBM can move stuff through the Power6 pipeline and keep it better fed without lengthening it, which in turn means it can jack up the clock frequency and do roughly twice as much work in the same or lower power envelope. And, because IBM hasn’t messed with the pipeline, it can also make lower clock speed versions of the Power6 chip that will throw off a lot less heat than the current Power5+ chips.
In his talk last week, Soltis shed some light on what all of those extra transistors might be doing. He said that the Power6 chips would have “total, three-way redundancy” for many of the components on the chip–and he was not more specific about which features would be redundant. He described a scenario where the triply redundant components vote to see what course of action they are supposed to take, which will ensure higher levels of reliability. “We are making sure we have continuous availability,” Soltis explained. “We are building in a tremendous amount of hardware redundancy.”
For you space buffs, this triple redundancy scenario will be familiar: this is exactly how the systems that control the Space Shuttle work. Having redundant components is good, but it is possible for two machines to have the same error, causing them to execute the wrong maneuver.
So the Space Shuttle systems have three computers, which execute instructions and then vote to compare their answers: when all three agree, they execute, and when only two agree, they ignore the third one because it is erroneous. Soltis was not, of course, more specific about how this triple redundancy might be implemented. It is hard to imagine that he literally meant that IBM had put six Power cores on the chip, but this is quite possible even if it does sound like overkill. With the changes in the pipeline and the doubling of clock speed, IBM could double the performance of the Power core jumping from Power5 to Power6 just on clock speed alone. (Soltis confirmed that IBM has 6 GHz chips running in the labs in Poughkeepsie, New York, where the chip is being designed, and that 4 GHz to 5 GHz was the target speed for production chips.) That would mean that all of those extra transistors could, in fact, be extra cores and extra components, but ones that are used to provide error checking, not scalable performance. No IBMer will just come out and say directly and clearly that the Power6 chip has triple redundant dual-cores that deliver two cores of raw performance but virtually error-free processing, but this seems to be what Soltis was hinting at.
Soltis also said something else that was very interesting. Interpreted languages like Java and C# require lots of clock speed to operate efficiently, and what is clear from current processor designs is that clock speed is no longer a possible driver of performance. (And, to be honest, a 4 GHz or 5 GHz Power6 processor seems to be pushing the thermal limits a bit hard.) In any event, Soltis thinks that given this and the need to get more parallelism into applications, the pendulum could swing back toward compiled languages.
“Maybe to get more performance, we should go back to the compiled languages like C, C++ and–are you sitting down?–COBOL and RPG,” he said to a roar of laughter among the OS/400 faithful at the COMMON user group meeting, who mostly program in RPG with a smattering of COBOL and Java. “Now, don’t walk out of here and say ‘Frank Soltis says Java is dead and we should all be using RPG.’ But I think that more sophisticated optimizing compilers are going to better fit the hardware we are developing in the future.”
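For readers who want to see the voting idea in concrete terms, the two-out-of-three scheme described earlier (three units compute an answer, and any two that agree outvote the third) can be sketched in Python. This is only an illustration of the general technique: Soltis gave no details of how Power6 implements its redundancy, and the names `Vote` and `vote` are hypothetical, invented here for the example.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Any, List, Sequence


@dataclass
class Vote:
    agreed: bool           # True when at least two of the three units match
    value: Any             # the majority answer (None when there is no majority)
    dissenters: List[int]  # indices of units whose answer was outvoted


def vote(answers: Sequence[Any]) -> Vote:
    """Two-out-of-three majority vote over the answers of three redundant units."""
    if len(answers) != 3:
        raise ValueError("expected exactly three answers")
    # most_common(1) yields the single most frequent answer and its count.
    value, count = Counter(answers).most_common(1)[0]
    if count < 2:
        # All three disagree: no answer can be trusted.
        return Vote(agreed=False, value=None, dissenters=[0, 1, 2])
    dissenters = [i for i, a in enumerate(answers) if a != value]
    return Vote(agreed=True, value=value, dissenters=dissenters)


unanimous = vote([42, 42, 42])    # all three units agree
one_faulty = vote([42, 7, 42])    # unit 1 is outvoted by units 0 and 2
no_majority = vote([1, 2, 3])     # no two units agree
```

The sketch also shows the limit the article points out: if two units share the same error, the vote goes their way and the one correct unit is the one that gets ignored.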