Large language models are getting bigger and better
Can they keep improving forever?
In AI-land, technologies move from remarkable to old hat at the speed of light. Only 18 months ago the release of ChatGPT, OpenAI’s chatbot, launched an AI frenzy. Today its powers have become commonplace. Several firms (such as Anthropic, Google and Meta) have since unveiled versions of their own models (Claude, Gemini and Llama), improving upon ChatGPT in a variety of ways.
That hunger for the new has only accelerated. In March Anthropic launched Claude 3, which bested the previous top models from OpenAI and Google on various leaderboards. On April 9th OpenAI reclaimed the crown (on some measures) by tweaking its model. On April 18th Meta released Llama 3, which early results suggest is the most capable open model to date. OpenAI is likely to make a splash sometime this year when it releases GPT-5, which may have capabilities beyond any current large language model (LLM). If the rumours are to be believed, the next generation of models will be even more remarkable—able to perform multi-step tasks, for instance, rather than merely responding to prompts, or analysing complex questions carefully instead of blurting out the first algorithmically available answer.
This article appeared in the Science & technology section of the print edition under the headline "AI’s next top model"