How test-time scaling unlocks hidden reasoning abilities in small language models (and allows them to outperform LLMs)
Summary
In a new study, Shanghai AI Laboratory has demonstrated that very small language models can outperform leading large language models on reasoning tasks.
This is achieved using test-time scaling (TTS) techniques, which allow models to use extra compute cycles at inference to improve performance.
TTS can be applied in two ways: through "internal TTS," where models are trained to "think" slowly and generate longer chain-of-thought (CoT) sequences, or through "external TTS," where a separate reward model guides or verifies the model's responses at inference time (a minimal sketch follows below).
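To make the external TTS idea concrete, the sketch below shows one common form of it, best-of-N sampling, in which the policy model spends extra inference compute producing several candidate answers and a reward model picks the best one. The function names (`generate_candidates`, `score_with_reward_model`) are hypothetical placeholders for a small policy model and a process reward model, not code from the study.

```python
# Minimal sketch of "external TTS" via best-of-N sampling.
# `generate_candidates` and `score_with_reward_model` are hypothetical
# stand-ins for a small policy model and a process reward model (PRM).

import random
from typing import Callable, List


def best_of_n(
    prompt: str,
    generate_candidates: Callable[[str, int], List[str]],
    score_with_reward_model: Callable[[str, str], float],
    n: int = 8,
) -> str:
    """Sample N candidate answers and return the one the reward model scores highest."""
    candidates = generate_candidates(prompt, n)  # extra compute spent at inference time
    scores = [score_with_reward_model(prompt, c) for c in candidates]
    best_index = max(range(len(candidates)), key=lambda i: scores[i])
    return candidates[best_index]


# Toy usage with stub implementations (illustrative only):
if __name__ == "__main__":
    def fake_generate(prompt: str, n: int) -> List[str]:
        return [f"candidate answer {i}" for i in range(n)]

    def fake_reward(prompt: str, answer: str) -> float:
        return random.random()  # a real PRM would score the quality of each reasoning step

    print(best_of_n("What is 17 * 24?", fake_generate, fake_reward, n=4))
```

More elaborate external TTS strategies described in this line of work, such as beam search or verifier-guided tree search, follow the same pattern of generating multiple candidates and letting a reward model allocate the extra inference compute.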
The study found that the effectiveness of TTS depends directly on the reasoning ability of the underlying model: weak reasoning models see substantial improvements from TTS, while the gains are more limited for models that already reason well.
This approach could have significant implications for the enterprise AI market, where the need for more efficient yet still effective models is acute.