Summary

  • As large language models demand considerable computational power and resources, researchers are turning to smaller versions that use only a fraction of the parameters.
  • IBM, Google, Microsoft and OpenAI have all developed small language models (SLMs) that use around 10 billion parameters, compared with hundreds of billions for large language models (LLMs).
  • SLMs are trained on high-quality datasets generated by larger models, a process known as knowledge distillation (sketched in code after this list), so they rely less on messy raw data scraped from the internet.
  • SLMs excel at specific, narrowly defined tasks, such as gathering data on smart devices or powering healthcare chatbots, and they are cheaper and more efficient to run.
  • To optimize the training process, researchers use pruning, which removes unneeded parts of a network (see the second sketch after this list), along with other methods to fine-tune the models for particular tasks and environments.
  • The large, expensive LLMs will remain useful for applications such as drug discovery, image generation and general-purpose chatbots, but for many users a smaller model is the more efficient choice.
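
The distillation idea mentioned above can be illustrated with a minimal sketch. The code below, which assumes PyTorch and uses tiny stand-in networks rather than real language models, trains a small "student" to match the softened output distribution of a larger "teacher"; the layer sizes, temperature and training loop are illustrative assumptions, not details from the article.

```python
# Minimal knowledge-distillation sketch (illustrative only): a small
# "student" network learns to match the softened output distribution
# of a larger, frozen "teacher" network.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Stand-in models; real teacher/student language models are far larger.
teacher = nn.Sequential(nn.Linear(128, 512), nn.ReLU(), nn.Linear(512, 10))
student = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))

optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
temperature = 2.0  # softens the teacher's probabilities

for step in range(100):
    x = torch.randn(32, 128)            # placeholder input batch
    with torch.no_grad():
        teacher_logits = teacher(x)     # the teacher supplies the targets
    student_logits = student(x)

    # KL divergence between the softened distributions: the standard
    # distillation loss, scaled by temperature**2.
    loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature**2

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```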
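
Pruning, mentioned in the last bullet, can be sketched just as briefly. The snippet below assumes PyTorch and uses its built-in pruning utilities to zero out the lowest-magnitude half of one layer's weights; the layer size, the 50% fraction and the choice of L1 (magnitude) pruning are assumptions for illustration, since the article does not specify a particular method.

```python
# Minimal magnitude-pruning sketch (illustrative only): remove the 50%
# of weights with the smallest magnitude from a single linear layer.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

layer = nn.Linear(512, 512)

prune.l1_unstructured(layer, name="weight", amount=0.5)  # zero smallest weights
prune.remove(layer, "weight")                            # make the change permanent

sparsity = (layer.weight == 0).float().mean().item()
print(f"fraction of weights pruned: {sparsity:.2f}")     # ~0.50
```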

By Stephen Ornes
