Summary

  • Simulated reasoning (SR) models have been trained to output a step-by-step ‘thinking’ process to solve problems, unlike traditional large language models.
  • One research paper found that SR models handled routine maths problems well but failed when asked to produce deeper mathematical proofs.
  • This finding reveals the mathematical limitations of SR models, despite the marketing claims of AI vendors.
  • When presented with problems from the 2025 US Math Olympiad, most SR models scored below 5% correct on average when generating mathematical proofs.
  • The difference between answering a maths problem and writing a maths proof is that a proof must explain the reasoning and show that a statement is true, rather than just give an answer.

By Benj Edwards
