Why Do Researchers Care About Small Language Models?
Large language models work well because they’re so large. The latest models from OpenAI, Meta and DeepSeek use hundreds of billions of “parameters” — the adjustable knobs that determine connections among data and get tweaked during the training process. With more parameters, the models are better able to identify patterns and connections, which in turn makes them more powerful and accurate.
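To make "parameters" concrete, here is a minimal sketch in Python using PyTorch (a library choice of ours, not something named in the article) of a toy two-layer network. The layer sizes are arbitrary; the point is only that the parameters are the adjustable weights, that they can be counted, and that a training step nudges every one of them.

```python
# A minimal sketch (PyTorch, toy layer sizes -- not any production LLM's code)
# of what "parameters" are: the weights of a network, counted and nudged
# slightly on every training step.
import torch
import torch.nn as nn

# A toy two-layer network; real LLMs stack many such layers with far wider matrices.
model = nn.Sequential(
    nn.Linear(512, 2048),  # weight matrix + bias: 512*2048 + 2048 parameters
    nn.ReLU(),
    nn.Linear(2048, 512),  # 2048*512 + 512 parameters
)

num_params = sum(p.numel() for p in model.parameters())
print(f"Toy model has {num_params:,} parameters")  # about 2.1 million

# Training "tweaks the knobs": one gradient step adjusts every parameter a little.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
x, target = torch.randn(8, 512), torch.randn(8, 512)
loss = nn.functional.mse_loss(model(x), target)
loss.backward()
optimizer.step()
```

Scaling the same idea from millions of weights to hundreds of billions is what separates this toy from the frontier models described above.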
But this power comes at a cost. Training a model with hundreds of billions of parameters takes huge computational resources. To train its Gemini 1.0 Ultra model, for example, Google reportedly spent $191 million. Large language models (LLMs) also require considerable computational power each time they answer a request, which makes them notorious energy hogs. A single query to ChatGPT consumes about 10 times as much energy as a single Google search, according to the Electric Power Research Institute.
In response, some researchers are now thinking small. IBM, Google, Microsoft and OpenAI have all recently released small language models (SLMs) that use a few billion parameters — a fraction of their LLM counterparts.
Small models are not used as general-purpose tools like their larger cousins. But they can excel on specific, more narrowly defined tasks, such as summarizing conversations, answering patient questions as a health care chatbot and gathering data in smart devices. “For a lot of tasks, an 8 billion parameter model is actually pretty good,” said Zico Kolter, a computer scientist at Carnegie Mellon University. They can also run on a laptop or cellphone, instead of a huge data center. (There’s no consensus on the exact definition of “small,” but the new models all max out around 10 billion parameters.)
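A rough back-of-envelope calculation suggests why size determines where a model can run. The sketch below is illustrative only: the 400-billion-parameter figure stands in for "hundreds of billions," and the bytes-per-parameter values are common storage conventions, not numbers from the article.

```python
# Rough, illustrative arithmetic: memory needed just to hold a model's weights.
# Figures are assumptions (16-bit and 4-bit storage; 400B as a stand-in for
# "hundreds of billions of parameters"), not reported specifications.
def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate gigabytes required to store the model's weights."""
    return num_params * bytes_per_param / 1e9

for name, params in [("8B small model", 8e9), ("400B large model", 400e9)]:
    for precision, nbytes in [("16-bit", 2), ("4-bit quantized", 0.5)]:
        print(f"{name}, {precision}: ~{weight_memory_gb(params, nbytes):.0f} GB")

# An 8B model compressed to 4 bits needs roughly 4 GB, within reach of a laptop
# or phone, while a 400B model needs hundreds of gigabytes and a data center.
```

By this estimate, a small model's weights fit comfortably in the memory of an ordinary consumer device, which is what makes the on-device uses described above possible.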