Editor’s note: Four months after the release of its ChatGPT chatbot, OpenAI unveiled its latest artificial intelligence technology, GPT-4, on Tuesday. Oren Etzioni, former CEO of the Allen Institute for AI, a technical director at the AI2 Incubator, and professor emeritus at the University of Washington, offers his thoughts.
GPT-4 has arrived.
It is substantially more comprehensive and powerful than ChatGPT, which has already taken the world by storm. GPT-4 can diagnose patients, write software, play chess, author articles, and much more.
Last month, OpenAI CEO Sam Altman tweeted: “a new version of Moore’s law that could start soon: the amount of intelligence in the universe doubles every 18 months.”
In the next few years, we will see GPT-4 and its ilk impact our information economy, jobs, education, politics, and even our understanding of what it means to be intelligent and creative. Referring to a GPT model as a blurry JPEG of the internet understates both its current capabilities and future potential.
However, it’s important to point out that the technology has some limitations inherent to its “family DNA.”
GPT-4 has some surface problems.
- It is constrained by an extensive set of human-crafted “guardrails” that seek to prevent it from being offensive or off the wall.
- It doesn’t update its knowledge in real time.
- Its command of languages other than English is limited.
- It doesn’t analyze audio or video.
- It still makes arithmetic errors that a calculator would avoid.
However, none of these problems are inherent to the approach. To those who fixate on them, I would say, “don’t bite my finger, look at where I’m pointing.” All of these problems will be overcome in GPT-5 or in a subsequent version from OpenAI or a competitor.
More challenging is the fact that GPT-4 is still not trustworthy. Like ChatGPT, it “hallucinates,” making up facts and even backing those facts up with made-up sources. Worse, it does so with the blithe confidence of a habitual liar.
Like ChatGPT, it can be inconsistent in its responses when probed with multiple questions on the same matter. That happens because it doesn’t have a set of underlying beliefs and values — instead it responds to human inputs based on an obscure combination of its training data and its internal, mathematically formulated objective.
For these reasons, it also exhibits pervasive biases — you would be foolhardy to trust its responses without careful verification. Human programmers who use GitHub Copilot, a GPT-style tool, to produce snippets of software code carefully review and test that code before incorporating it into their hand-written programs. Nevertheless, every generation of the technology makes fewer errors, and we can expect this trend to continue.
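To make the review-and-test workflow concrete, here is a minimal, entirely hypothetical sketch. The buggy function below is invented for illustration — it is not actual Copilot output — but it shows the kind of plausible-looking error that a quick human-written test catches before the snippet ships:

```python
# Hypothetical example: a snippet as a generative tool might plausibly produce it,
# with a subtle bug, alongside the human-reviewed correction.

def median_generated(xs):
    # Looks reasonable, but silently mishandles even-length lists.
    return sorted(xs)[len(xs) // 2]

def median_reviewed(xs):
    # Corrected after review: average the two middle values when the length is even.
    s = sorted(xs)
    n = len(s)
    mid = n // 2
    if n % 2 == 1:
        return s[mid]
    return (s[mid - 1] + s[mid]) / 2

# A few hand-written checks expose the difference.
assert median_reviewed([3, 1, 2]) == 2
assert median_reviewed([4, 1, 3, 2]) == 2.5
assert median_generated([4, 1, 3, 2]) == 3  # wrong: the correct median is 2.5
```

The point is not this particular bug, but the habit: generated code is treated as a draft, and simple tests like these are the gate it must pass.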
Because of this rapid progress and unprecedented success, it is important to highlight that GPT-4 and the full gamut of similar AI technologies (sometimes called “foundation models” or “generative AI”) have fundamental limitations that will not be overcome in the foreseeable future.
Unlike humans, GPT models don’t have a body. The models rely on secondhand information in their input, which can be distorted or incomplete. Unlike humans, GPT models can simulate empathy but don’t feel it. While simulated empathy has its uses (think of a teenager who needs a shoulder to cry on at 2 a.m. in rural Kansas), it is not the real thing.
While GPT models may seem infinitely creative and surprising in their responses, they cannot design complex artifacts. Perhaps the easiest way to see this is to pose the question: What components of GPT-4 were designed by a generative model? The state of the art in AI teaches us that GPT-4 was built by scaling and tinkering with human-designed models and methods that include Google’s BERT and AI2’s ELMo. Stephen Wolfram has provided an accessible overview of the technology.
Regardless of the details, it’s clear that the technology is light-years from being able to design itself. Moreover, to design a chatbot, you have to start by formulating the objective, the underlying training data, the technical approach, particular subgoals, and more. These are places where experimentation and iteration are required.
You also have to acquire the relevant resources, hire the appropriate people, and more. Of course, all this was done by the talented humans at OpenAI. As I argued in the MIT Technology Review, successfully formulating and executing such endeavors remains a distinctly human capability.
Most important, GPT models are tools that operate at our behest. While remarkably powerful, they are not autonomous. They respond to our commands.
Consider the analogy to self-driving cars. Over the coming years, self-driving cars will become more versatile and increasingly safe, but the cars will not determine the destination we drive to — that decision belongs to the human. Likewise, it is up to us to decide how to use GPT models — for edification or for misinformation.
The great Pablo Picasso famously said, “Computers are useless. They only give you answers.”
While GPT models are far from useless, we are still the ones formulating the fundamental questions and assessing the answers. That will not change anytime soon.