SWiRL: The business case for AI that thinks like your best problem-solvers
Summary
Researchers from Stanford University and Google DeepMind have developed Step-Wise Reinforcement Learning (SWiRL), a training technique they say helps large language models (LLMs) handle complex, multi-step tasks more effectively.
The team behind SWiRL said LLMs trained with current methods tend to struggle with complex planning and tool integration, making them unsuitable for real-world applications that often require several steps to complete.
By generating synthetic data and using a specialised reinforcement learning algorithm, SWiRL can train LLMs to break complex problems down into a series of more manageable subtasks.
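To make the step-wise idea concrete, here is a minimal Python sketch of how one multi-step attempt at a problem might be split into smaller training examples, one per step. It is an illustration only, not the researchers' implementation: the `Step` class, the `split_into_substeps` function and the toy trajectory are all hypothetical names and made-up data.

```python
from dataclasses import dataclass


@dataclass
class Step:
    action: str       # hypothetical: a tool call or intermediate reasoning step
    observation: str  # hypothetical: what the tool or environment returned


def split_into_substeps(question: str, steps: list[Step]) -> list[tuple[str, str]]:
    """Turn one multi-step trajectory into (context, next_action) pairs.

    Each pair contains everything seen so far plus the single action taken
    next, so every step can be judged and trained on individually.
    """
    examples = []
    context = f"Question: {question}"
    for step in steps:
        examples.append((context, step.action))
        # Extend the context with this step before moving to the next one.
        context += f"\nAction: {step.action}\nObservation: {step.observation}"
    return examples


# Toy, made-up trajectory for illustration only.
trajectory = [
    Step("search('capital of France')", "Paris"),
    Step("search('population of Paris')", "about 2.1 million"),
    Step("answer('about 2.1 million')", ""),
]
examples = split_into_substeps(
    "How many people live in the capital of France?", trajectory
)
```

In this sketch a three-step trajectory yields three training examples, each pairing the history so far with the next action. How the steps are scored and optimised is outside its scope.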
The team tested SWiRL on question-answering and mathematical reasoning tasks and reported relative accuracy gains of 11% to 21% over baseline approaches across a range of datasets, including HotPotQA, GSM8K, MuSiQue and BeerQA.
The researchers said SWiRL could offer benefits to enterprises looking to integrate reasoning models into their applications and workflows.