SWiRL: The business case for AI that thinks like your best problem-solvers

Researchers from Stanford University and Google DeepMind have developed Step-Wise Reinforcement Learning (SWiRL), a training technique they say helps large language models (LLMs) handle complex, multi-step tasks more effectively.
The team behind SWiRL said current methods for training LLMs tend to struggle with complex planning and tool integration, leaving the resulting models poorly suited to real-world applications that often require several steps to complete.
By generating synthetic data and applying a specialised reinforcement learning algorithm, SWiRL trains LLMs to break complex problems down into a series of more manageable subtasks.
The team tested SWiRL on question-answering and mathematical reasoning tasks and reported accuracy improvements of 11% to 21% across datasets including HotPotQA, GSM8K, MuSiQue and BeerQA.
The researchers said SWiRL could offer benefits to enterprises looking to integrate reasoning models into their applications and workflows.
