Microsoft researchers have published a paper exploring OpenAI's newest language model, GPT-4, and according to those researchers, it may be a very first step toward artificial general intelligence (AGI).
A new research paper published on the arXiv preprint server by Microsoft AI researchers explores the capabilities of OpenAI's newest language model, GPT-4. The team behind the paper analyzed an early iteration of the model and, based on the results, believes that GPT-4 could be viewed as an early and incomplete version of AGI. Microsoft's testers write that GPT-4 shows a clear leap over OpenAI's previous model, GPT-3.5, in several areas.
The team found that GPT-4 achieved close to human-level performance in a range of categories where its previous generation fell short, including mathematics, coding, vision, medicine, law, and psychology. According to the paper, GPT-4 also performed exceptionally well on several exams, scoring in the 90th, 88th, and 86th percentiles on the Bar exam, the LSAT, and the Certified Sommelier theory test, respectively.
"Given the breadth and depth of GPT-4's capabilities, we believe that it could reasonably be viewed as an early (yet still incomplete) version of an artificial general intelligence (AGI) system," the researchers write.
To illustrate the performance increase, OpenAI's GPT-3.5 model scored in the bottom 10% of Bar exam takers. Moreover, GPT-3.5 was only released late last year, which shows how rapidly these capabilities have advanced in such a short amount of time. While GPT-4 is extremely impressive and is likely the closest any AI system has come to AGI, or human-level intelligence, it still has its shortcomings.
"We demonstrate that, beyond its mastery of language, GPT-4 can solve novel and difficult tasks that span mathematics, coding, vision, medicine, law, psychology and more, without needing any special prompting," reads the paper. "Moreover, in all of these tasks, GPT-4's performance is strikingly close to human-level performance, and often vastly surpasses prior models such as ChatGPT."
GPT-4 still suffers from many of the same issues as its previous iterations, such as hallucinations and errors on some math queries. Notably, the frequency of these problems has been dramatically reduced compared to GPT-3.5 and earlier versions. The researchers also report that GPT-4 has shown large improvements in areas where it was previously lacking, such as common-sense reasoning.
So, do all these improvements mean AGI exists? In short, no. And this comes down to the definition of AGI and, by extension, of intelligence itself. Researchers have yet to agree on a concise definition of either AGI or intelligence. Generally, however, they agree that AGI has been reached when an AI system is conscious and thinks like a human does. While GPT-4 outperforms humans on several tasks, it's worth noting that the AI isn't overcoming those obstacles the way a human would.
"Our claim that GPT-4 represents progress towards AGI does not mean that it is perfect at what it does, or that it comes close to being able to do anything that a human can do (which is one of the usual definitions of AGI), or that it has inner motivation and goals (another key aspect in some definitions of AGI)," reads the paper.
The researchers note that while GPT-4 is "at or beyond human-level for many tasks," its overall "patterns of intelligence are decidedly not human-like," which they recognize as a major limitation to the argument that a version of AGI has already been achieved. It's also worth noting that Microsoft has invested billions of dollars in OpenAI for access to its GPT models, which some could view as an incentive for its researchers to publish positive reports about OpenAI's software.