Ilya Sutskever, co-founder of OpenAI, thinks existing approaches to scaling up large language models have plateaued. For significant future progress, AI labs will need to train smarter, not just bigger, and LLMs will need to think a little bit longer.
Speaking in a recent interview, Sutskever explained that the pre-training phase of scaling up large language models, such as ChatGPT, is reaching its limits. Pre-training is the initial phase that processes huge quantities of uncategorized data to build language patterns and structures within the model.
“The 2010s were the age of scaling, now we're back in the age of wonder and discovery once again. Everyone is looking for the next thing,” Sutskever reckons. “Scaling the right thing matters more now than ever.”
“It turned out that having a bot think for just 20 seconds in a hand of poker got the same performance boost as scaling up the model by 100,000x and training it for 100,000 times longer,” says Noam Brown, an OpenAI researcher.
In other words, having bots think longer rather than just spew out the first thing that comes to mind can deliver better results. If thinking longer proves a productive approach, the AI hardware industry could shift away from massive training clusters towards banks of GPUs focused on improved inferencing.
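One common way to spend extra inference-time compute is best-of-N sampling: generate several candidate answers and keep the one a scorer likes most. The toy sketch below is purely illustrative; the `generate_candidate` and `score` functions are stand-ins for a real model and verifier, not any lab's actual system:

```python
import random

def generate_candidate(rng):
    # Stand-in for an LLM sampling one answer; returns (answer, quality).
    quality = rng.random()
    return f"answer-{quality:.3f}", quality

def score(candidate):
    # Stand-in for a verifier or reward model scoring an answer.
    _, quality = candidate
    return quality

def best_of_n(n, seed=0):
    # Spending more inference compute (larger n) buys a better pick.
    rng = random.Random(seed)
    candidates = [generate_candidate(rng) for _ in range(n)]
    return max(candidates, key=score)

# With the same seed, sampling 32 candidates can never do worse
# than sampling 1, since the larger pool contains the smaller one.
print(best_of_n(1)[1] <= best_of_n(32)[1])
```

The trade-off is exactly the one described above: each extra candidate costs inference compute rather than training compute, which is why a shift towards this style of scaling favours inference-focused GPU deployments.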
Of course, either way, Nvidia is likely to be ready to take everyone's money. The increase in demand for AI GPUs for inferencing is indeed something Nvidia CEO Jensen Huang recently noted.
"We've now discovered a second scaling law, and this is the scaling law at a time of inference. All of these factors have led to the demand [for our GPUs] being incredibly high," Huang said recently.
How long it will take for a generation of cleverer bots to appear thanks to these methods isn't clear. But the effort will probably show up in Nvidia's bank balance soon enough.