RPG Code Assist Is The Killer App For AI-Enhanced Power Systems
October 23, 2024 Timothy Prickett Morgan
“I love it when a plan comes together.” – Colonel John “Hannibal” Smith, The A-Team.
Look, we take our wisdom and our joy where we can find it here at The Four Hundred, and it does indeed look like Big Blue has a plan that is coming together with regards to generative AI and the Power Systems hardware platform as it relates to the IBM i software platform.
We told you in Monday’s issue that IBM was up to something with regard to the Power Systems line, which was revealed to partners and presumably key customers at the TechXchange 2024 event in Las Vegas on Tuesday. It is beginning to look like IBM will be adding its “Spyre” AI accelerator, which was created by IBM Research and which is being productized, to the Power Systems line. And for IBM i shops at least, a commercial variant of the code assist for RPG – a large language model that understands and speaks good and fluent free-form RPG – looks like it will be a killer application.
In our guessing game story that ran on Monday, we said that it would be interesting to see how the on-chip matrix math and vector accelerators in the Power10 chip could be used to do retrieval augmented generation (RAG) processing against corporate data using Watsonx models and without the need for GPU processing. There could be enough oomph for this to be useful for production applications.
We also said that IBM could create hybrid CPU-GPU machines to augment the AI processing of Power iron – even using inference accelerators in outboard servers linked through memory coherent fabrics. And added that it would be further interesting to see a mix of Power10 processors, AMD Instinct MI300X accelerators, the ROCm library and driver stack from AMD, the Python language, and the PyTorch framework and Llama 3.2 models from Meta Platforms as a counter-balance to Nvidia’s hegemony with GPU accelerators running its AI Enterprise stack.
That’s pretty close, and the latter might happen eventually. But as we were writing late at night on Friday, we completely spaced out on IBM’s homegrown “Spyre” AI Acceleration Unit, which we wrote about over at The Next Platform back in August when it was revealed alongside the “Telum II” z17 mainframe processor. We took a look at the initial AIU back in October 2022, and admonished IBM that this device had better not be a science project and that it had better be a scientific product. And then forgot that this was the obvious AI accelerator choice for Power Systems.
The AIU has a different kind of matrix math unit than is embedded in the Power10 chip, but it is derivative of the matrix math unit in the “Telum” z16 mainframe processor, which is itself derivative of the original AIU developed by IBM Research several years ago. The Spyre card has already been promised for the System z mainframe as an accelerator, and now in announcement letter AD24-2186, IBM released this two sentence statement of direction:
“IBM intends to incorporate the IBM Spyre accelerator in future Power offerings to provide additional AI compute capabilities. Working together, IBM Power processors and IBM Spyre accelerator will enable the next generation infrastructure to scale demanding AI workloads for businesses.”
So that is another piece of the future Power Systems announcement coming before the end of the year, it looks like.
The Spyre chip has 34 cores, and 32 of them are exposed for use, with the other two assumed to be duds due to the normal yield issues any chip maker has. Take a look:
To be precise, the Spyre accelerator chip uses Samsung’s 5LPE low-power 5 nanometer process and crams 26 billion transistors in 330 mm² of area.
Here is a block diagram showing the Spyre chip’s functional units, which are called corelets:
It looks like there are 78 threads in the Spyre corelet, plus eight FP16 accumulation and activation units and eight additional FP16/FP32 accumulation and activation units. Each corelet has a 2 MB scratchpad memory. There is a 32-byte bidirectional ring connecting the cores to each other on the Spyre chip and a 128-byte bidirectional ring that connects the 2 MB scratchpads to each other, for a total of 64 MB of usable memory against those 32 usable Spyre cores.
IBM said at Hot Chips 2024 two months ago that the device uses low-power LPDDR5 memory that delivers 200 GB/sec of bandwidth into and out of the Spyre chip. That is not a lot in a world where GPUs have bandwidths ranging from 3.4 TB/sec to 8 TB/sec, depending on the number of HBM memory stacks and their generation. But it is enough to make the Spyre chip useful.
Here is what the Spyre PCI-Express accelerator card looks like:
As you can see, there are eight banks of LPDDR5 memory on the Spyre card, which has a capacity of 128 GB. That is on par with the memory capacity of a GPU accelerator from Nvidia or AMD these days, but the memory is obviously a lot slower. The Spyre card delivers more than 300 teraops of AI performance (we presume this is at FP16 resolution) and slides into a single PCI-Express 5.0 x16 slot. It burns only 75 watts, which is a tenth or less of that of a GPU.
Here is where the comparisons get interesting. With eight Spyre cards in a drawer, the cards deliver 1 TB of memory for AI models and 1.6 TB/sec of aggregate memory bandwidth. Provided the models are not too dense – and we think for most enterprises, at least for a while, they won’t be – then a Power Systems expansion drawer with eight of these cards will be a perfect AI sidecar for a lot of generative AI use cases at IBM i shops.
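For those who like to check the math, here is a quick back-of-the-envelope sketch of those drawer aggregates. This is our own arithmetic, written in free-form RPG for fun since that is the language of the hour, not anything IBM has published:

**free
// Back-of-the-envelope arithmetic for an eight-card Spyre drawer,
// using the per-card figures IBM has disclosed so far.
dcl-c CARDS_PER_DRAWER 8;
dcl-c MEM_PER_CARD_GB 128;     // 128 GB of LPDDR5 per card
dcl-c BW_PER_CARD_GBS 200;     // 200 GB/sec per card

dcl-s totalMemGB packed(7:0);
dcl-s totalBwGBS packed(7:0);
dcl-s msg varchar(52);

totalMemGB = CARDS_PER_DRAWER * MEM_PER_CARD_GB;   // 1,024 GB, call it 1 TB
totalBwGBS = CARDS_PER_DRAWER * BW_PER_CARD_GBS;   // 1,600 GB/sec, or 1.6 TB/sec

msg = 'Drawer memory: ' + %char(totalMemGB) + ' GB';
dsply msg;
msg = 'Drawer bandwidth: ' + %char(totalBwGBS) + ' GB/sec';
dsply msg;

*inlr = *on;
return;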
Which leads us to the next statement of direction that IBM put out this week, in announcement letter AD24-2179:
“IBM intends to deliver a code assistant for RPG – a generative AI tool which helps developers of IBM i software understand existing RPG code, create new RPG function using natural language description, and automatically generate test cases for RPG code.”
And there is the reason IBM i shops that need to modernize code will want to buy a bank of Spyre cards. And so long as IBM makes it so that bank of cards doesn’t cost too much and it doesn’t charge too much for this AI-based RPG code assist, given the amount of code that needs to be modernized and the shrinking number of RPG developers, this could be a killer app for the Power Systems business.
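To make the idea a little more concrete, here is the sort of thing such a tool might spit out. To be clear, this is purely our own illustrative sketch – IBM has not shown sample output yet – of the free-form RPG function an assistant might generate from a prompt along the lines of “write a function that returns the net amount of an invoice line given quantity, unit price, and a discount percentage,” with all of the names invented for the example:

**free
ctl-opt nomain;

// Hypothetical illustration only: the kind of free-form RPG a code
// assistant might generate from a natural language prompt such as
// "return the net amount of an invoice line given quantity, unit
// price, and a discount percentage." All names here are invented.
dcl-proc lineAmount export;
  dcl-pi *n packed(11:2);
    quantity    packed(7:0) const;
    unitPrice   packed(9:2) const;
    discountPct packed(5:2) const;
  end-pi;

  dcl-s gross packed(11:2);

  gross = quantity * unitPrice;

  // Apply the discount and half-adjust back to two decimal places
  return %dech(gross - (gross * discountPct / 100) : 11 : 2);
end-proc;

Whether the commercial tool generates a leaf function like this, a whole service program, or test cases alongside it, the point is the same: free-form RPG that a human can read and a large language model can write.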
We look forward to learning more and seeing this all productized. And it looks like we will find out more in a matter of weeks, if the rumors are right.
RELATED STORIES
IBM Previews New Power Tech At TechXchange Event
GenAI Interest ‘Exploding’ for Modernization on IBM i and Z, Kyndryl Says
IBM Shows Off Next-Gen AI Acceleration, On Chip DPU For Big Iron (The Next Platform)
IBM’s AI Accelerator: This Had Better Not Be Just A Science Project (The Next Platform)
Some Thoughts On Big Blue’s GenAI Strategy For IBM i
How To Contribute To IBM’s GenAI Code Assistant For RPG
IBM Developing AI Coding Assistant for IBM i
The Time Is Now To Get A GenAI Strategy
Top Priorities in 2024: Security and AI
Thoroughly Modern: Proceed With Caution With AI In The Landscape Of Cybersecurity
IBM i Shops Are Still Getting Their Generative AI Acts Together
IBM To Add Generative AI To QRadar
How Long Before Big Blue Brings Code Assist To IBM i?
Generative AI Is Part Of Application Modernization Now
Sticking To The Backroads On This Journey
With Fresche’s New CEO, There Are No Problems, Just Solutions
Enterprises Are Not Going To Miss The Fourth Wave Of AI (The Next Platform)
IBM Introduces watsonx For Governed Analytics, AI
Technology Always Replaces People While Augmenting Others
Thanks to Timothy Prickett Morgan for this incredibly exciting article.
“Which leads us to the next statement of direction that IBM put out this week, in announcement letter AD24-2179:”
“IBM intends to deliver a code assistant for RPG – a generative AI tool which helps developers of IBM i software understand existing RPG code, create new RPG function using natural language description, and automatically generate test cases for RPG code.”
An IBM i code assistant for RPG capability would transform IBM i into a clear long-term winner in so many ways, including becoming the go-to solution for both IBM i and System Z customers and perhaps millions of new IBM customers.
Fully free RPG is essentially C++, which makes fully free RPG the natural winner over existing Java, Python, PHP, COBOL, native C++, and other languages.
The IBM code assistant for RPG would also imply/require IBM providing and packaging:
– A modern GUI replacement for IBM Source Entry Utility (SEU) green screens
– An RPG fixed-format to RPG free-form converter (like the ARCAD Transformer product)
– A COBOL to RPG converter to convert IBM i and IBM System Z COBOL source programs to fully free RPG source programs
– An IBM converter to transform native C++ source programs into fully free RPG source programs
– An IBM i programmer productivity tool to triple or better IBM i programmer productivity and understanding, like the Real-Time Program Audit (RTPA) software (another killer application)
– IBM-supplied education and training for the millions of current Java, C++, Python, PHP, and COBOL programmers to quickly use this fantastic new IBM capability on IBM i.
Could IBM actually be doing this incredibly great thing?
Thanks for the excellent news coverage.
Having such a tool, trained on millions of lines of code (fixed-format included), could provide incredible value: very useful automatic code description and explanation, template building, generating quality RPG code to call APIs and CRUD operations, best practices on cryptography API calls…
As long as they don’t force users to build an OpenStack thing just to run it 😛
But even without the card, if a company allows it, a plugin for RDi or VS Code that simply queries a big model trained on a lot of RPG, hosted in the IBM cloud, tuned for RPG, and maintained by IBM would be sufficient IMHO, and would provide a rapid start (download and play). Ideally, if allowed, I could open a source member (or point it at a GitHub repo) and the model would integrate user code into the “big model”… Those wanting to protect their local source could just use it in query mode for generation, best practices, and code graph explanations…