pith. machine review for the scientific record.

arxiv: 2604.21637 · v1 · submitted 2026-04-23 · 💻 cs.CL · cs.CY

Recognition: unknown

Multilinguality at the Edge: Developing Language Models for the Global South

Authors on Pith: no claims yet

Pith reviewed 2026-05-09 21:28 UTC · model grok-4.3

classification 💻 cs.CL cs.CY
keywords multilinguality · edge deployment · language models · Global South · last mile · NLP pipeline · inclusive technologies · hardware constraints

The pith

The last mile, the intersection of multilinguality and edge deployment, limits language model access in the Global South.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper claims that effective language model deployment requires jointly addressing multilingual support and edge hardware constraints, as these areas have aligned goals but often competing technical demands. Linguistically diverse communities in the Global South typically face the harshest infrastructure limits, yet multilingual NLP and edge computing research have stayed largely separate. The authors survey 232 papers spanning the full language modeling pipeline to map the current state, surface specific challenges, and outline open questions with recommendations for different stakeholders. A sympathetic reader would care because resolving this intersection determines whether language technologies can reach the communities that need them most. The work positions the combined study as both a practical necessity and a research opportunity.

Core claim

The paper establishes that the intersection of multilinguality and edge deployment, called the last mile, is both a need and an opportunity for language models. Linguistically diverse communities often face the most severe infrastructure constraints, but edge and multilingual NLP research remain siloed. A survey of 232 papers across data collection, development, and deployment reveals the state of the art and the challenges of combining the areas. The authors discuss open questions and give actionable recommendations for stakeholders to advance inclusive and equitable language technologies.

What carries the argument

The last mile: the intersection of multilinguality and edge deployment, where the goals align but the technical requirements for supporting many languages compete with those for running on constrained hardware.

If this is right

  • The survey identifies specific gaps in each pipeline stage that can guide targeted research to reconcile multilingual and efficiency goals.
  • Stakeholders including researchers, developers, and policymakers receive concrete recommendations for prioritizing inclusive design.
  • Addressing the open questions can lead to language models that better serve hardware-constrained, linguistically diverse regions.
  • The combined lens highlights where separate research fields must collaborate to avoid excluding large populations.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Hybrid model designs that jointly optimize language coverage and low-power inference could emerge as a direct next step from the identified tensions.
  • The last mile framing may extend to other AI modalities where diversity and efficiency trade off, such as multimodal models for varied cultural contexts.
  • Pilot deployments in Global South settings using the survey recommendations would provide empirical tests of whether the challenges are resolvable in practice.
  • Policy discussions on digital equity could incorporate the technical last mile barriers to better target infrastructure and training investments.

Load-bearing premise

A survey of 232 papers across the language modeling pipeline sufficiently captures the state of the art, and the identified challenges can be addressed through existing technical approaches without fundamental incompatibilities.

What would settle it

A follow-up study or real-world deployment that demonstrates inherent incompatibility between multilingual coverage and edge hardware efficiency, such that no current techniques allow usable language models for diverse languages on typical constrained devices.

read the original abstract

Where and how language models (LMs) are deployed determines who can benefit from them. However, there are several challenges that prevent effective deployment of LMs in non-English-speaking and hardware constrained communities in the Global South. We call this challenge the last mile: the intersection of multilinguality and edge deployment, where the goals are aligned but the technical requirements often compete. Studying these two fields together is both a need, as linguistically diverse communities often face the most severe infrastructure constraints, and an opportunity, as edge and multilingual NLP research remain largely siloed. To understand the state of the art and the challenges of combining the two areas, we survey 232 papers that tackle this problem across the language modelling pipeline, from data collection to development and deployment. We also discuss open questions and provide actionable recommendations for different stakeholders in the NLP ecosystem. Finally, we hope that this work contributes to the development of inclusive and equitable language technologies.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

1 major / 3 minor

Summary. The paper claims that the intersection of multilinguality and edge deployment for language models—termed the 'last mile'—represents both a pressing need and an opportunity for NLP research, particularly for linguistically diverse and hardware-constrained communities in the Global South. It supports this by surveying 232 papers across the full language modeling pipeline (data collection through development and deployment), identifying challenges where goals align but technical requirements compete, discussing open questions, and providing actionable recommendations for stakeholders to advance inclusive language technologies.

Significance. If the survey is representative, the work is significant for synthesizing two largely siloed areas of NLP and highlighting how multilingual and edge constraints intersect in ways that affect equitable access. The broad coverage of 232 papers across the pipeline is a clear strength, providing a useful map of existing efforts and a basis for future joint research on inclusive models.

major comments (1)
  1. Introduction (paragraph describing the survey): The paper asserts that it surveys 232 papers to capture the state of the art at the multilinguality-edge intersection but provides no information on search terms, databases, time bounds, inclusion/exclusion criteria, or bias mitigation. This is load-bearing for the central claim, because the asserted siloing of the fields and the specific challenges identified across the pipeline depend on the survey being systematic and complete; without these details, the representativeness of the findings cannot be evaluated.
minor comments (3)
  1. Abstract: The abstract states the survey size but omits any reference to selection methodology, which would help readers immediately gauge scope and rigor.
  2. Pipeline overview section: Summaries of the 232 papers would benefit from a summary table or figure that explicitly maps papers to pipeline stages and challenge categories for easier navigation.
  3. Recommendations: The stakeholder recommendations could be more tightly cross-referenced to the specific challenges identified in the survey results to strengthen their actionability.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive feedback and for recognizing the significance of synthesizing multilingual and edge deployment research. We address the major comment below and will revise the manuscript to incorporate the requested details.

read point-by-point responses
  1. Referee: Introduction (paragraph describing the survey): The paper asserts that it surveys 232 papers to capture the state of the art at the multilinguality-edge intersection but provides no information on search terms, databases, time bounds, inclusion/exclusion criteria, or bias mitigation. This is load-bearing for the central claim, because the asserted siloing of the fields and the specific challenges identified across the pipeline depend on the survey being systematic and complete; without these details, the representativeness of the findings cannot be evaluated.

    Authors: We agree that the absence of survey methodology details is a significant omission that undermines the ability to assess the representativeness of the 232 papers and the claims about field siloing and pipeline challenges. In the revised manuscript, we will add a new subsection titled 'Survey Methodology' immediately following the introduction. This section will specify: (1) search terms and Boolean combinations (e.g., 'multilingual language model' AND ('edge computing' OR 'on-device' OR 'mobile deployment' OR 'resource-constrained'), plus variants for low-resource languages); (2) databases and sources queried (ACL Anthology, arXiv, Google Scholar, Semantic Scholar, and selected workshop proceedings); (3) time bounds (papers from 2018 to 2024, with key earlier foundational works included via citation chaining); (4) inclusion/exclusion criteria (papers addressing multilingual LM development or deployment, or edge constraints in LM contexts, excluding purely theoretical work without application relevance); and (5) bias mitigation steps (independent screening by two authors with disagreement resolution, use of snowball sampling from seed papers, and explicit documentation of any geographic or language biases in the retrieved set). These additions will directly support evaluation of the survey's scope and the identified 'last mile' challenges. revision: yes
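The Boolean inclusion query sketched in point (1) amounts to a simple keyword filter over titles and abstracts. A minimal illustration, with hypothetical term lists and paper records (the query terms come from the rebuttal; the screening code itself is not the authors'):

```python
# Sketch of the rebuttal's inclusion query:
# "multilingual language model" AND ("edge computing" OR "on-device" OR
#  "mobile deployment" OR "resource-constrained"), plus a low-resource variant.
MULTILINGUAL_TERMS = ("multilingual language model", "low-resource language")
EDGE_TERMS = ("edge computing", "on-device", "mobile deployment", "resource-constrained")

def matches_query(title: str, abstract: str) -> bool:
    """True if the paper pairs a multilingual-LM term with an edge-deployment term."""
    text = f"{title} {abstract}".lower()
    return (any(t in text for t in MULTILINGUAL_TERMS)
            and any(t in text for t in EDGE_TERMS))

# Hypothetical candidate papers for illustration only.
papers = [
    {"title": "On-device multilingual language model compression",
     "abstract": "We compress a multilingual language model for edge computing."},
    {"title": "A theory of attention",
     "abstract": "Purely theoretical analysis of transformer attention."},
]
included = [p for p in papers if matches_query(p["title"], p["abstract"])]
print(len(included))  # 1
```

In a systematic review this keyword pass would only be the first screen; the rebuttal's independent dual screening and snowball sampling would then operate on its output.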

Circularity Check

0 steps flagged

No circularity: literature survey without derivations or self-referential loops

full rationale

This is a literature review surveying 232 papers across the LM pipeline to identify challenges at the multilinguality-edge intersection. It contains no equations, no fitted parameters, no predictions derived from internal data, and no load-bearing self-citations that reduce the central claim to prior author work. The 'last mile' framing is introduced as an organizing lens supported by the external survey results rather than defined circularly or forced by construction. The work is self-contained as an analysis of existing literature.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central framing rests on the domain assumption that multilinguality and edge constraints form a distinct intersecting challenge requiring joint study; no free parameters or invented entities are introduced.

axioms (1)
  • domain assumption The intersection of multilinguality and edge deployment represents a distinct 'last mile' challenge where goals align but technical requirements compete.
    Explicitly stated in the abstract as the core problem definition.

pith-pipeline@v0.9.0 · 5467 in / 1144 out tokens · 39638 ms · 2026-05-09T21:28:40.777441+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

37 extracted references · 14 canonical work pages · 4 internal anchors


[Figure 10: Replication: LM Annotation Prompt. Using GPT 4.1 Mini (gpt-4.1-mini-2025-04-14), the authors annotate papers by title and abstract across several dimensions: primary pipeline stage (Data Collection, Pretraining, Post-training, Inference, Evaluation, or Full-Stack if spanning 4+ stages), primary topics or techniques, ACL 2025 subject area, modality (text, speech, multimodal), languages studied (ISO 639-1 codes, or "multilingual" if >10), released models and parameter sizes in billions, whether the paper is primarily about efficiency, multilinguality, both, or neither, contribution type (Method, Technique, Evaluation, Survey, Resource, Analysis), a 1-5 relevance score for multilingual NLP on edge devices with a stated reason, and free-form keywords. Structured outputs are generated with the outlines library.]
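The annotation dimensions in the Figure 10 prompt can be sketched as a structured record. A minimal illustration with plain dataclasses (field names are paraphrased from the prompt; this is not the authors' actual outlines schema):

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class PaperAnnotation:
    """One structured annotation per paper, mirroring the Figure 10 prompt fields."""
    pipeline_stage: str               # Data Collection, Pretraining, Post-training,
                                      # Inference, Evaluation, or Full-Stack (4+ stages)
    topics: List[str]                 # primary topics or techniques
    subject_area: str                 # ACL 2025 subject area
    modality: str                     # text, speech, multimodal
    languages: List[str]              # ISO 639-1 codes, or ["multilingual"] if >10
    released_models: List[str] = field(default_factory=list)
    model_sizes_b: List[float] = field(default_factory=list)  # parameters, billions
    focus: str = "neither"            # efficiency, multilinguality, both, neither
    contribution: str = "Analysis"    # Method, Technique, Evaluation, Survey,
                                      # Resource, Analysis
    relevance: int = 1                # 1-5, relevance to multilingual edge NLP
    relevance_reason: str = ""
    keywords: List[str] = field(default_factory=list)

# Hypothetical annotation of the surveyed paper itself, for illustration.
ann = PaperAnnotation(
    pipeline_stage="Full-Stack",
    topics=["survey"],
    subject_area="Multilinguality and Language Diversity",
    modality="text",
    languages=["multilingual"],
    contribution="Survey",
    relevance=5,
    relevance_reason="Directly surveys the multilinguality-edge intersection.",
)
print(ann.pipeline_stage)  # Full-Stack
```

Constraining an LM to emit exactly this record (e.g. via JSON-schema-guided decoding, as outlines provides) is what makes the 232-paper annotation pass machine-checkable.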
