MuRIL: Multilingual Representations for Indian Languages

Simran Khanuja, Diksha Bansal, Sarvesh Mehtani, Savya Khosla, Atreyee Dey, Balaji Gopalan, Dilip Kumar Margam, Pooja Aggarwal, Rajiv Teja Nagipogu, Shachi Dave, et al. · 2021 · arXiv:2103.10730

6 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

BhashaSutra: A Task-Centric Unified Survey of Indian NLP Datasets, Corpora, and Resources

cs.CL · 2026-04-20 · unverdicted · novelty 7.0

A unified survey that consolidates Indian NLP resources by task, language, domain, and modality while identifying gaps in coverage and generalization.

Human-Centered Supervision for Sentiment Analysis in Telugu: A Systematic Inquiry Beyond Accuracy

cs.CL · 2025-08-02 · unverdicted · novelty 7.0

Human rationales in supervision for Telugu sentiment analysis improve model alignment with human reasoning and often produce gains in predictive performance.

ks-pret-5m: a 5 million word, 12 million token kashmiri pretraining dataset

cs.CL · 2026-04-13 · accept · novelty 6.0

KS-PRET-5M is a newly released 5.09 million word Kashmiri pretraining dataset containing 12.13 million subword tokens after MuRIL tokenization, made available as a continuous text stream under CC BY 4.0.

Mitigating Extrinsic Gender Bias for Bangla Classification Tasks

cs.CL · 2024-11-16 · unverdicted · novelty 5.0

Constructs gender-perturbed Bangla classification benchmarks and proposes RandSymKL debiasing that reduces extrinsic gender bias in pretrained models.

MKJ at SemEval-2026 Task 9: A Comparative Study of Generalist, Specialist, and Ensemble Strategies for Multilingual Polarization

cs.CL · 2026-04-23 · unverdicted · novelty 4.0

A language-adaptive combination of generalist, specialist, and ensemble transformer models achieves 0.796 macro F1 and 0.826 accuracy on multilingual polarization detection across 22 languages.

Scripts Through Time: A Survey of the Evolving Role of Transliteration in NLP

cs.CL · 2026-04-20 · unverdicted · novelty 3.0

A survey that taxonomizes motivations for transliteration in cross-lingual NLP, reviews incorporation approaches and their evolution, analyzes trade-offs in settings like code-mixing and language families, and offers implementation recommendations.

citing papers explorer

BhashaSutra: A Task-Centric Unified Survey of Indian NLP Datasets, Corpora, and Resources
cs.CL · 2026-04-20 · unverdicted · none · ref 27

Human-Centered Supervision for Sentiment Analysis in Telugu: A Systematic Inquiry Beyond Accuracy
cs.CL · 2025-08-02 · unverdicted · none · ref 28

ks-pret-5m: a 5 million word, 12 million token kashmiri pretraining dataset
cs.CL · 2026-04-13 · accept · none · ref 6

Mitigating Extrinsic Gender Bias for Bangla Classification Tasks
cs.CL · 2024-11-16 · unverdicted · none · ref 16

MKJ at SemEval-2026 Task 9: A Comparative Study of Generalist, Specialist, and Ensemble Strategies for Multilingual Polarization
cs.CL · 2026-04-23 · unverdicted · none · ref 17

Scripts Through Time: A Survey of the Evolving Role of Transliteration in NLP
cs.CL · 2026-04-20 · unverdicted · none · ref 35
