A survey of small language models

Chien Van Nguyen, Xuan Shen, Ryan Aponte, Yu Xia, Samyadeep Basu, Zhengmian Hu, Jian Chen, Mihir Parmar, Sasidhar Kunapuli, Joe Barrow, et al. · 2024 · arXiv 2410.20011

4 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

VocabTailor: Dynamic Vocabulary Selection for Downstream Tasks in Small Language Models

cs.CL · 2025-08-21 · unverdicted · novelty 7.0

VocabTailor introduces a decoupled dynamic vocabulary selection framework that reduces vocabulary-related memory in SLMs by up to 99% with minimal task performance loss.

FedDetox: Robust Federated SLM Alignment via On-Device Data Sanitization

cs.CR · 2026-04-08 · unverdicted · novelty 5.0

FedDetox uses on-device knowledge-distilled classifiers to sanitize toxic data in federated SLM training, preserving safety alignment comparable to centralized baselines.

Evaluating Small Language Models for Front-Door Routing: A Harmonized Benchmark and Synthetic-Traffic Experiment

cs.NI · 2026-03-26 · unverdicted · novelty 5.0

Qwen-2.5-3B achieves 0.793 accuracy and 988 ms median latency on six-class task routing but misses the pre-registered viability bar of 0.85 accuracy and 2000 ms P95 latency.

Will LLMs Scaling Hit the Wall? Breaking Barriers via Distributed Resources on Massive Edge Devices

cs.DC · 2025-03-11 · unverdicted · novelty 2.0

Position paper claiming that distributed training across massive edge devices can overcome data depletion and centralized compute monopolies in LLM scaling.

citing papers explorer

VocabTailor: Dynamic Vocabulary Selection for Downstream Tasks in Small Language Models

cs.CL · 2025-08-21 · unverdicted · none · ref 18

FedDetox: Robust Federated SLM Alignment via On-Device Data Sanitization

cs.CR · 2026-04-08 · unverdicted · none · ref 3

Evaluating Small Language Models for Front-Door Routing: A Harmonized Benchmark and Synthetic-Traffic Experiment

cs.NI · 2026-03-26 · unverdicted · none · ref 4

Will LLMs Scaling Hit the Wall? Breaking Barriers via Distributed Resources on Massive Edge Devices

cs.DC · 2025-03-11 · unverdicted · none · ref 59
