A Pāṇinian Foundation for Indic Language Processing
Pith reviewed 2026-06-26 00:37 UTC · model grok-4.3
The pith
Indic languages share a Pāṇinian morphosyntactic architecture that can unify their NLP systems across genealogical lines.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Through more than two millennia of convergence around Sanskrit, Indic languages came to share a morphosyntactic architecture formalized in Pāṇini's grammar, the Aṣṭādhyāyī. This cuts across genealogical lines, uniting languages through a common framework. We argue that this Pāṇinian framework supplies a unifying computational architecture the field has lacked, and that benchmarks grounded explicitly in it would make Indic language systems more accurate, more data-efficient, and more transferable, effectively merging many apparently disparate and sparse Indic language resources into a single high-resource metalanguage bedrock.
What carries the argument
The morphosyntactic architecture formalized in Pāṇini's Aṣṭādhyāyī, which supplies the shared computational structure across Indic languages.
If this is right
- Indic NLP systems would become more accurate by exploiting the shared structure.
- Systems would require less training data because resources from many languages could be pooled.
- Transfer across languages would improve because the architecture is common rather than language-specific.
- Disparate and sparse datasets would merge into a single high-resource metalanguage bedrock.
- A four-part benchmark suite would render the architecture explicit and ready for practical use.
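The pooling idea in the list above can be made concrete with a small sketch. It assumes a shared label set of the six traditional kāraka relations and a hypothetical `Argument` record; the Hindi and Tamil annotations are hand-written toy examples in romanization, not drawn from any treebank or from the paper.

```python
from collections import defaultdict
from dataclasses import dataclass

# The six kāraka relations of the Pāṇinian tradition (ASCII spellings).
KARAKAS = ("karta", "karman", "karana", "sampradana", "apadana", "adhikarana")


@dataclass(frozen=True)
class Argument:
    """Hypothetical record for one verb argument (illustration only)."""
    lang: str    # ISO 639-1 code, e.g. "hi" (Hindi) or "ta" (Tamil)
    form: str    # romanized surface form
    karaka: str  # one of KARAKAS


# Toy annotations, assumed for illustration.
# Hindi: rām ne sītā ko kitāb dī           "Ram gave Sita a book"
# Tamil: rāmaṉ sītaikku puttakam koṭuttāṉ  "Raman gave Sita a book"
TOY_CORPUS = [
    Argument("hi", "rām ne", "karta"),
    Argument("hi", "sītā ko", "sampradana"),
    Argument("hi", "kitāb", "karman"),
    Argument("ta", "rāmaṉ", "karta"),
    Argument("ta", "sītaikku", "sampradana"),
    Argument("ta", "puttakam", "karman"),
]


def pool_by_karaka(corpus):
    """Group arguments from all languages under their shared kāraka label."""
    pooled = defaultdict(list)
    for arg in corpus:
        if arg.karaka not in KARAKAS:
            raise ValueError(f"unknown kāraka label: {arg.karaka}")
        pooled[arg.karaka].append(arg)
    return dict(pooled)


pooled = pool_by_karaka(TOY_CORPUS)
for label, args in pooled.items():
    # Each label now collects examples from both language families.
    print(label, sorted({a.lang for a in args}), len(args))
```

In this sketch the label, not the language, is the unit under which examples accumulate. Whether such pooling actually improves accuracy or data efficiency is exactly what the paper leaves to be tested.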
Where Pith is reading between the lines
- Interpretability techniques could be used to test whether trained models have independently discovered Pāṇini's categories.
- The same logic might apply to other language groups that share an ancient grammatical tradition, although the paper does not explore them.
- If the benchmarks succeed, they could serve as a template for unifying resources in other fragmented NLP domains.
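The interpretability point above is usually approached with a probing classifier: freeze a model, collect its hidden states, and check whether a simple classifier can recover the category of interest. The following is a minimal sketch under loud assumptions. The vectors are synthetic Gaussian clusters standing in for hidden states, the two labels are chosen for illustration, and a nearest-centroid classifier is swapped in for the more common logistic-regression probe to keep the example dependency-free. All function names are hypothetical.

```python
import random


def make_synthetic_states(n_per_class, dim, noise, rng):
    """Synthetic stand-ins for hidden states: one Gaussian cluster per label."""
    labels = ("karta", "karman")
    data = []
    for k, label in enumerate(labels):
        # Each label gets its own mean vector (alternating 1.0 / 0.0 pattern).
        mean = [1.0 if i % len(labels) == k else 0.0 for i in range(dim)]
        for _ in range(n_per_class):
            vec = [m + rng.gauss(0.0, noise) for m in mean]
            data.append((vec, label))
    rng.shuffle(data)
    return data


def fit_centroids(train):
    """Compute the mean vector of each label in the training split."""
    sums, counts = {}, {}
    for vec, label in train:
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, x in enumerate(vec):
            acc[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}


def predict(centroids, vec):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(vec, center))
    return min(centroids, key=lambda lab: sq_dist(centroids[lab]))


def probe_accuracy(data, train_fraction=0.8):
    """Fit the probe on a training split and score it on the held-out rest."""
    split = int(len(data) * train_fraction)
    train, test = data[:split], data[split:]
    centroids = fit_centroids(train)
    correct = sum(predict(centroids, vec) == label for vec, label in test)
    return correct / len(test)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
states = make_synthetic_states(n_per_class=50, dim=8, noise=0.1, rng=rng)
accuracy = probe_accuracy(states)
print(f"held-out probe accuracy: {accuracy:.2f}")
```

High probe accuracy on real hidden states would only be suggestive. A careful study would also compare against a control task or shuffled labels, since an expressive probe can learn labels the model itself does not use.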
Load-bearing premise
The morphosyntactic architecture formalized in Pāṇini's Aṣṭādhyāyī actually cuts across genealogical lines in a manner that is directly actionable and beneficial for contemporary neural NLP models and benchmarks.
What would settle it
Empirical results showing that benchmarks and models built on Pāṇinian categories produce no gains in accuracy, data efficiency, or cross-language transfer compared with existing language-specific or family-specific approaches would falsify the central claim.
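The falsification condition above amounts to a simple decision rule: the claim fails only if none of the three metrics improves. The sketch below encodes that rule. The `Scores` record, the `min_gain` threshold, and every number are placeholders assumed for illustration; they are not results from the paper.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Scores:
    """Hypothetical summary of one system; higher is better on every field."""
    accuracy: float         # task accuracy
    data_efficiency: float  # e.g. score at a fixed small training budget
    transfer: float         # e.g. zero-shot score on a held-out language


def gains(baseline, paninian):
    """Per-metric difference: Pāṇinian-grounded system minus baseline."""
    return {
        f.name: getattr(paninian, f.name) - getattr(baseline, f.name)
        for f in fields(Scores)
    }


def is_falsified(baseline, paninian, min_gain=0.0):
    """True only if no metric improves by more than min_gain."""
    return all(g <= min_gain for g in gains(baseline, paninian).values())


# Placeholder numbers only, chosen to exercise both outcomes.
baseline = Scores(accuracy=0.70, data_efficiency=0.50, transfer=0.40)
no_gain = Scores(accuracy=0.70, data_efficiency=0.49, transfer=0.40)
some_gain = Scores(accuracy=0.70, data_efficiency=0.55, transfer=0.40)

print(is_falsified(baseline, no_gain))
print(is_falsified(baseline, some_gain))
```

A real comparison would need matched training budgets and a significance test before treating a small difference as a gain; the threshold here is only a stand-in for that step.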
Original abstract
More than a billion people communicate in Indic languages, yet the natural language processing infrastructure serving them remains fragmented and underdeveloped. The cause is structural: the field organizes its tools and benchmarks around individual languages or small subsets of genealogical language families, building separate analyzers, parsers, and datasets for each language and starting over for the next. This overlooks a deep regularity. Through more than two millennia of convergence around Sanskrit, Indic languages came to share a morphosyntactic architecture formalized in Pāṇini's grammar, the Aṣṭādhyāyī. This cuts across genealogical lines, uniting languages through a common framework. We argue that this Pāṇinian framework supplies a unifying computational architecture the field has lacked, and that benchmarks grounded explicitly in it would make Indic language systems more accurate, more data-efficient, and more transferable, effectively merging many apparently disparate and sparse Indic language resources into a single high-resource metalanguage bedrock. We propose a four-part benchmark suite to render this shared architecture explicit, measurable, and ready to be leveraged for practical applications. Moreover, we underscore the question it raises for interpretability research: whether neural models trained on these languages come to represent Pāṇini's categories on their own.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper argues that Pāṇini's Aṣṭādhyāyī formalizes a morphosyntactic architecture shared across Indic languages through historical convergence around Sanskrit, supplying a unifying computational framework that the field has lacked. It claims that benchmarks explicitly grounded in this framework would yield more accurate, data-efficient, and transferable Indic NLP systems by merging disparate resources into a single high-resource metalanguage bedrock, and proposes a four-part benchmark suite to render the architecture explicit and measurable while raising questions about neural model interpretability of Pāṇinian categories.
Significance. If the proposed framework can be operationalized with concrete mappings and benchmarks that deliver measurable gains, the work could address fragmentation in Indic NLP by providing a cross-lingual unifying architecture, potentially improving transfer and efficiency for over a billion speakers. The emphasis on historical grammatical convergence as a computational resource is a novel angle for low-resource language processing.
major comments (2)
- [Abstract] The claim that 'benchmarks grounded explicitly in it would make Indic language systems more accurate, more data-efficient, and more transferable' is load-bearing but unsupported, as the manuscript provides neither a formal mapping of any sūtra (e.g., a kāraka or samāsa rule) to a differentiable loss term, embedding objective, or probing task, nor even a schematic definition of the four-part benchmark suite.
- [Abstract] The assertion that the Pāṇinian architecture 'cuts across genealogical lines' in a manner 'directly actionable' for neural models is stated as a deep regularity but lacks any concrete examples from distinct language families (e.g., Indo-Aryan vs. Dravidian) showing how specific grammatical categories translate into shared neural objectives or evaluation metrics.
minor comments (1)
- [Abstract] The manuscript refers to 'a four-part benchmark suite' without naming or outlining the parts, which reduces clarity of the proposal even at a high level.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on the abstract. The manuscript is a position paper that identifies a unifying Pāṇinian architecture and proposes a benchmark direction rather than delivering implemented mappings or results. We respond to each major comment below.
Point-by-point responses
- Referee: [Abstract] The claim that 'benchmarks grounded explicitly in it would make Indic language systems more accurate, more data-efficient, and more transferable' is load-bearing but unsupported, as the manuscript provides neither a formal mapping of any sūtra (e.g., a kāraka or samāsa rule) to a differentiable loss term, embedding objective, or probing task, nor even a schematic definition of the four-part benchmark suite.
  Authors: We agree the abstract states a forward-looking claim without implementation details. This paper introduces the conceptual framework and the existence of the four-part benchmark suite as a proposal for the community; it does not claim to have derived differentiable objectives or executed the mappings. To make the proposal more actionable, we will add a schematic outline of the four-part benchmark suite in the revised manuscript.
  Revision: partial
- Referee: [Abstract] The assertion that the Pāṇinian architecture 'cuts across genealogical lines' in a manner 'directly actionable' for neural models is stated as a deep regularity but lacks any concrete examples from distinct language families (e.g., Indo-Aryan vs. Dravidian) showing how specific grammatical categories translate into shared neural objectives or evaluation metrics.
  Authors: The manuscript grounds the cross-family claim in the historical convergence around Sanskrit that produced shared morphosyntactic categories. While the abstract is concise, the body discusses these shared features. We will strengthen the abstract and add explicit cross-family examples (e.g., kāraka alignment in Indo-Aryan and Dravidian languages) to illustrate actionability for shared objectives.
  Revision: yes
Circularity Check
No significant circularity: the proposal is a conceptual argument without self-referential derivations.
Full rationale
The manuscript advances a conceptual proposal that Pāṇinian morphosyntax supplies a shared architecture across Indic languages and that benchmarks grounded in it would yield accuracy and transfer gains. No equations, fitted parameters, predictions, or derivations appear in the provided text. The central claim is an argument from historical convergence rather than a reduction of any output quantity to an input defined by the paper itself. No self-citations are invoked as load-bearing uniqueness theorems, and no ansatz or renaming of known results is presented as a derivation. The proposal therefore remains self-contained against external benchmarks and does not exhibit any of the enumerated circularity patterns.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Indic languages share a morphosyntactic architecture formalized in Pāṇini's Aṣṭādhyāyī that cuts across genealogical lines.
invented entities (1)
- Pāṇinian framework as unifying computational architecture and metalanguage bedrock (no independent evidence)