On Your iRadar: Super Smart Chatbots
April 28, 2021 Alex Woodie
Don’t look now, but thanks to breakthroughs in natural language processing (NLP), we’re on the cusp of having chatbots that can provide human-like question-answering capabilities. Within a matter of years, according to experts, basic communication between companies and their customers will be handled by AI, leaving humans to handle tougher cases and VIPs.
The key breakthrough driving advances in NLP is the large language model, specifically a class of model called the transformer network. Transformer networks use deep learning methods to essentially learn which sequences of words belong together. Word order has been one of the toughest challenges for NLP researchers to tackle, but thanks to the advent of transformer networks with billions of parameters and massive training data sets, the AI is starting to figure it out.
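To get a feel for what "learning which words follow which" means, here is a deliberately tiny sketch. This is not a transformer — transformers learn these statistics across billions of parameters and enormous corpora — but a toy bigram counter illustrates the core idea of predicting the next word from what came before. The corpus and word choices here are invented for illustration.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for each word, how often each next word follows it."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows

def next_word(follows, word):
    """Pick the most likely word to follow `word` (None if unseen)."""
    candidates = follows.get(word)
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]

# A toy "training corpus" -- a real model would ingest trillions of words.
corpus = "the customer called the help desk and the help desk answered"
model = train_bigrams(corpus)
print(next_word(model, "help"))   # -> desk
```

A transformer replaces these raw counts with learned attention over the entire preceding context, which is what lets it handle long-range word order rather than just adjacent pairs.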
These models are being developed by the likes of OpenAI, Google, and others. In 2020, OpenAI, a company Elon Musk helped found, unveiled GPT-3, which was the world’s largest transformer network at the time, with 175 billion parameters.
You may have heard something in the news about how GPT-3 could generate fairly comprehensible sentences, or even poems, given a few sample words to go by. Provided access to the pre-trained GPT-3 model via an API, some folks even got the model to generate a working mockup of the Google homepage, proving that it can learn computer programming languages, too (think about the ramifications of THAT one for a second).
Not to be outdone, in early 2021, Google delivered Switch Transformer, which boasted a generative language model with 1.6 trillion parameters, nearly 10 times bigger than GPT-3. Like GPT-3, Switch Transformer “learned” language by ingesting huge amounts of training data, in this case the Colossal Clean Crawled Corpus (C4), a collection of trillions of words drawn from the world’s websites, weighing in at nearly 1TB. (What will the models learn when we set them loose on the entire Internet?)
The amount of computing horsepower and raw data needed to train these massive AI models is tremendous. According to one estimate, GPT-3 required 190,000 kWh of energy to power the Nvidia GPUs that were used to build the word associations, which is equivalent to the energy draw of several hundred homes. The training was conducted in a Microsoft Azure data center; the Redmond, Washington, software and services giant owns exclusive rights to productize GPT-3.
The size and computing requirements of these language models mostly preclude their use by private enterprises (although the federal government undoubtedly has one of its own). But that doesn’t mean that private enterprises won’t be using them.
Because these language models are already trained to understand words and generate text-based answers to questions, private companies soon should be able to leverage the AI breakthroughs that these models represent. A little fine-tuning of the pre-trained models with the words and terms used by an individual organization, it is believed, will be enough to make these massive models useful.
So what is the connection to IBM i? To be clear, these models won’t run on IBM i — they’ll live in the cloud, on the hyperscalers’ systems. They probably won’t even run on IBM Power Systems servers, given Google, Microsoft, and Amazon Web Services’ predilection for x86 — at least for the general computing tasks that aren’t handled by high-end GPUs or Google’s own Tensor Processing Units (TPUs), anyway.
But when access to these language models becomes generally available, they will almost certainly touch the data and business logic stored on IBM i systems. Just as human workers in call centers work with IBM i data and programs today, the intelligent chatbots of the future will need access to information residing in IBM i to successfully complete customer transactions. These chatbot-IBM i interactions will almost certainly occur using APIs, with the REST variety being the most popular flavor these days.
All of this is good news for efficiency-loving IBM i shops, if not for folks who hate chatting with robots (a population that includes your friendly neighborhood IBM i newsletter editor, by the way). Because these pre-trained networks are highly generalizable, and because they’re already developed, they should significantly reduce the amount of technical expertise required to take advantage of very sophisticated NLP and natural language generation (NLG) capabilities.
In short, we’re on the cusp of having extremely intelligent chatbots that can not only understand what your customers are saying, but generate a coherent response to them — without the need to put a very expensive and hard-to-find data scientist on your staff. These conversations can take place entirely in text online, or over the phone or video chats, thanks to recent accuracy advances in speech-to-text algorithms that render the written and spoken word nearly interchangeable.
We are not yet to the point where computers can consistently trick a human, which is what the so-called Turing Test measures. Humans still hold the edge, particularly when it comes to complex interactions. But when combined with other advances in AI, including reinforcement learning, federated learning, and the use of synthetic data, the breakthroughs that are occurring in transformer networks bring us one significant step closer to having an artificial general intelligence (AGI) that may some day pass that test, Forrester analyst Kjell Carlsson said.
This AI technology and the business process improvements that it will unlock are not quite ready. But they’re also not as far away as you may think. COVID continues to unleash a torrent of changes in how we work and play. According to a recent study by KPMG, AI adoption is up 37 percentage points in the past year in financial services, 29 percentage points in retail, and 20 percentage points in tech.
The companies that can find cheaper and more efficient ways of servicing customers in a tumultuous environment will be the ones who succeed in business. There’s no reason why IBM i shops can’t be at the cutting edge of digital innovation and win in the coming AI heyday. If you can encapsulate your core business logic and data and expose it via APIs, then it doesn’t really matter where it runs. It doesn’t need to be in the cloud (although there might be benefits to running it there).
As super smart chatbots start to become the norm, there will be pressure to catch up, so familiarizing yourself with the technologies and their potential capabilities now will give you a leg up when your boss comes calling.