A Survey of Small Language Models

Ashish Singh; Chien Van Nguyen; Franck Dernoncourt; Hanieh Deilamsalehy; Huanrui Yang; Jian Chen; Jiuxiang Gu; Joe Barrow; Junda Wu; Mihir Parmar

arxiv: 2410.20011 · v1 · pith:N4JWG25Lnew · submitted 2024-10-25 · 💻 cs.CL

Chien Van Nguyen, Xuan Shen, Ryan Aponte, Yu Xia, Samyadeep Basu, Zhengmian Hu, Jian Chen, Mihir Parmar, Sasidhar Kunapuli, Joe Barrow, Junda Wu, Ashish Singh, Yu Wang, Jiuxiang Gu, Franck Dernoncourt, Nesreen K. Ahmed, Nedim Lipka, Ruiyi Zhang, Xiang Chen, Tong Yu, Sungchul Kim, Hanieh Deilamsalehy, Namyong Park, Mike Rimer, Zhehao Zhang, Huanrui Yang, Ryan A. Rossi, Thien Huu Nguyen

classification 💻 cs.CL

keywords language · slms · models · small · survey · techniques · compression · including

Small Language Models (SLMs) have become increasingly important because they perform a wide range of language tasks efficiently and with minimal computational resources, making them well suited to on-device, mobile, and edge settings, among many others. In this article, we present a comprehensive survey of SLMs, focusing on their architectures, training techniques, and model compression techniques. We propose a novel taxonomy for categorizing the methods used to optimize SLMs, including model compression, pruning, and quantization techniques. We summarize the benchmark datasets useful for evaluating SLMs, along with the commonly used evaluation metrics. Additionally, we highlight key open challenges that remain to be addressed. Our survey aims to serve as a valuable resource for researchers and practitioners interested in developing and deploying small yet efficient language models.

Forward citations

Cited by 4 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

VocabTailor: Dynamic Vocabulary Selection for Downstream Tasks in Small Language Models
cs.CL · 2025-08 · unverdicted · novelty 7.0

VocabTailor introduces a decoupled dynamic vocabulary selection framework that reduces vocabulary-related memory in SLMs by up to 99% with minimal task performance loss.

FedDetox: Robust Federated SLM Alignment via On-Device Data Sanitization
cs.CR · 2026-04 · unverdicted · novelty 5.0

FedDetox uses on-device knowledge-distilled classifiers to sanitize toxic data in federated SLM training, preserving safety alignment comparable to centralized baselines.

Evaluating Small Language Models for Front-Door Routing: A Harmonized Benchmark and Synthetic-Traffic Experiment
cs.NI · 2026-03 · unverdicted · novelty 5.0

Qwen-2.5-3B achieves 0.793 accuracy and 988 ms median latency on six-class task routing but misses the pre-registered viability bar of 0.85 accuracy and 2000 ms P95 latency.

Will LLMs Scaling Hit the Wall? Breaking Barriers via Distributed Resources on Massive Edge Devices
cs.DC · 2025-03 · unverdicted · novelty 2.0

Position paper claiming that distributed training across massive edge devices can overcome data depletion and centralized compute monopolies in LLM scaling.