Dependency Parsing Across the Resource Spectrum: Evaluating Architectures on High and Low-Resource Languages
Pith reviewed 2026-05-08 19:33 UTC · model grok-4.3
The pith
Biaffine LSTM outperforms transformers for dependency parsing in low-resource regimes until data volume reaches a moderate threshold.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The Biaffine LSTM consistently outperforms transformer models in low-resource regimes, with transformers recovering their advantage as training data increases; the crossover falls within a resource range typical of treebanks for under-resourced languages, and morphological complexity measured via MATTR emerges as a significant secondary predictor of transformers' relative disadvantage after controlling for corpus size.
What carries the argument
Direct head-to-head evaluation of the Biaffine LSTM and Stack-Pointer Network against pre-trained transformers (AfroXLMR-large and RemBERT) on controlled subsets of training data, with MATTR serving as the measure of morphological complexity.
If this is right
- The Biaffine LSTM is the better choice for building syntactic tools when annotated data is scarce.
- Transformers become preferable once treebank size exceeds the typical range for under-resourced languages.
- Morphological complexity remains an independent factor that favors simpler LSTM parsers.
- Resource-aware parser selection can improve parsing accuracy for languages with small treebanks.
Where Pith is reading between the lines
- The same data-size crossover may appear in other structured prediction tasks such as semantic role labeling or named-entity recognition.
- Targeted data collection for morphologically complex languages could accelerate the point at which transformers become viable.
- Hybrid systems that switch between LSTM and transformer backbones based on available data volume are worth testing.
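The switching idea can be made concrete. A minimal sketch, with a hypothetical `crossover` default (the reported crossovers span roughly 830 to 1,390 training sentences, varying by transformer):

```python
def choose_backbone(n_train_sentences: int, crossover: int = 1000) -> str:
    """Pick a parser backbone from annotated-data volume.

    The default threshold is illustrative, not from the paper; reported
    crossovers range from ~830 (AfroXLMR-large) to ~1,390 (RemBERT)
    training sentences.
    """
    return "biaffine_lstm" if n_train_sentences < crossover else "transformer"
```

A practical system would refine this with per-language crossover estimates rather than a single global threshold.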
Load-bearing premise
The ten chosen languages, especially the low-resource African ones, represent broader low-resource conditions, and the MATTR metric isolates morphological complexity independently of data quality or annotation consistency.
What would settle it
Repeating the experiments on a fresh set of low-resource languages while systematically varying training-set size and morphological complexity to check whether the same performance crossover and MATTR correlation appear.
Original abstract
Transformer-based models achieve state-of-the-art dependency parsing for high-resource languages, yet their advantage over simpler architectures in low-resource settings remains poorly understood. We evaluate four parsers -- the Biaffine LSTM, Stack-Pointer Network, AfroXLMR-large, and RemBERT -- across ten typologically diverse languages, with a focus on low-resource African languages. We find that the Biaffine LSTM consistently outperforms transformer models in low-resource regimes, with transformers recovering their advantage as training data increases. The crossover falls within a resource range typical of treebanks for under-resourced languages. Morphological complexity (measured via MATTR) emerges as a significant secondary predictor of transformers' relative disadvantage after controlling for corpus size. These results indicate that the Biaffine LSTM may be better suited for syntactic tool development in low-resource regimes until sufficient annotated data is available to leverage the representational capacity of pre-trained transformers.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript evaluates four dependency parsing architectures—the Biaffine LSTM, Stack-Pointer Network, AfroXLMR-large, and RemBERT—across ten typologically diverse languages, emphasizing low-resource African languages. It reports that the Biaffine LSTM outperforms the transformer models in low-resource regimes, that transformers recover their advantage as training data increases, that the performance crossover occurs within a resource range typical of under-resourced treebanks, and that morphological complexity (measured via MATTR) is a significant secondary predictor of transformers' relative disadvantage after controlling for corpus size. These findings are used to recommend the Biaffine LSTM for syntactic tool development in low-resource settings until sufficient data is available.
Significance. If the primary performance curves and crossover point are reproducible, the work provides practical guidance for architecture selection in low-resource dependency parsing, a topic of direct relevance to NLP tool development for under-resourced languages. The empirical focus on a resource spectrum and typologically diverse set (including African languages) adds value beyond single-language studies. The secondary MATTR-based predictor, however, requires stronger justification to support the mechanistic interpretation offered in the abstract.
major comments (1)
- [Results and discussion of secondary predictors] The claim that MATTR measures morphological complexity and serves as a significant secondary predictor (abstract and results/discussion) is not adequately supported. MATTR is a standard lexical-diversity metric (moving-average type-token ratio), while morphological complexity is conventionally quantified by inflectional entropy, paradigm size, or average morphemes per word. The manuscript provides no validation that the regression isolates morphological effects from lexical diversity, annotation consistency, or data quality. Because this predictor is presented as explanatory support for the observed crossover and the recommendation for Biaffine LSTM, the mechanistic interpretation is insecure.
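For reference, MATTR itself is simple to compute: the type-token ratio averaged over a sliding window of fixed length. A sketch (a window of 500 tokens is a common choice, not necessarily the paper's):

```python
from collections import Counter

def mattr(tokens, window=500):
    """Moving-average type-token ratio (Covington & McFall, 2010):
    the mean TTR over all contiguous windows of `window` tokens."""
    if len(tokens) <= window:
        return len(set(tokens)) / len(tokens)  # fall back to plain TTR
    counts = Counter(tokens[:window])          # type counts in first window
    ttrs = [len(counts) / window]
    for i in range(window, len(tokens)):       # slide the window by one token
        counts[tokens[i - window]] -= 1
        if counts[tokens[i - window]] == 0:
            del counts[tokens[i - window]]
        counts[tokens[i]] += 1
        ttrs.append(len(counts) / window)
    return sum(ttrs) / len(ttrs)
```

Because MATTR tracks wordform diversity, a morphologically rich language inflates it through inflectional variety, which is precisely why the referee asks for validation against dedicated morphological metrics.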
minor comments (2)
- [Methods] Provide explicit details on data splits, hyperparameter search procedures, statistical significance tests for performance differences, and any controls for model size or pretraining corpus overlap to allow full reproducibility of the primary comparisons.
- [Results] Clarify the exact regression model, included covariates, and reported coefficients or p-values for the MATTR analysis so readers can assess the strength of the secondary finding independently.
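The requested regression can be spelled out in a few lines. A synthetic, noise-free illustration (coefficients invented for the demo, not taken from the paper) of regressing the LSTM-minus-transformer LAS gap on log corpus size plus MATTR:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
log_size = rng.uniform(5.0, 9.0, n)   # log of training-sentence count
mattr = rng.uniform(0.4, 0.9, n)      # moving-average type-token ratio
# Hypothetical data-generating process for the LAS gap (no noise):
gap = 10.0 - 1.5 * log_size + 4.0 * mattr

# Ordinary least squares: gap ~ 1 + log_size + mattr
X = np.column_stack([np.ones(n), log_size, mattr])
beta, *_ = np.linalg.lstsq(X, gap, rcond=None)
# beta recovers the generating coefficients [10.0, -1.5, 4.0]
```

Reporting the fitted coefficients together with standard errors and p-values is exactly what the second minor comment asks of the authors.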
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. We address the major comment regarding the interpretation and justification of MATTR below. The primary empirical findings on parser performance across resource levels and the recommended use of the Biaffine LSTM in low-resource settings are unaffected by this revision.
Point-by-point responses
-
Referee: The claim that MATTR measures morphological complexity and serves as a significant secondary predictor (abstract and results/discussion) is not adequately supported. MATTR is a standard lexical-diversity metric (moving-average type-token ratio), while morphological complexity is conventionally quantified by inflectional entropy, paradigm size, or average morphemes per word. The manuscript provides no validation that the regression isolates morphological effects from lexical diversity, annotation consistency, or data quality. Because this predictor is presented as explanatory support for the observed crossover and the recommendation for Biaffine LSTM, the mechanistic interpretation is insecure.
Authors: We acknowledge the referee's point that MATTR is conventionally a lexical-diversity metric rather than a direct measure of morphological complexity (such as inflectional entropy or paradigm size). The manuscript's phrasing in the abstract and discussion does overstate the direct link. In the revised version we will (1) replace the parenthetical claim with language describing MATTR as a lexical-diversity proxy that correlates with morphological richness in the languages studied, (2) add explicit discussion of the regression controls (corpus size already included as a covariate) and the limitations of this proxy, and (3) tone down the mechanistic interpretation to note that MATTR captures a secondary signal whose precise causal contribution requires further validation with dedicated morphological metrics. These changes will be made in the abstract, results, and discussion sections.
Revision: yes
Circularity Check
No circularity: direct empirical measurements on held-out data
Full rationale
The paper reports performance comparisons of four parsers across ten languages using standard train/dev/test splits and regression to identify predictors. All reported advantages, crossovers, and secondary correlations (including MATTR) are measured outcomes from observed data rather than quantities defined by the analysis itself or reduced to fitted inputs by construction. No derivations, uniqueness theorems, ansatzes, or self-citations are invoked as load-bearing steps in any claimed chain.
Lean theorems connected to this paper
-
Cost.FunctionalEquation.washburn_uniqueness_aczel (tagged: unclear)
The relation between the paper passage and the cited Recognition theorem is unclear.
We define: RER_LAS(ℓ) = (LAS_BiaffineLSTM(ℓ) − LAS_TF(ℓ)) / (100 − LAS_BiaffineLSTM(ℓ))
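Under that definition, the relative error reduction is a one-liner; a sketch (the function name is ours):

```python
def rer_las(las_lstm: float, las_tf: float) -> float:
    """Relative error reduction of the Biaffine LSTM over a transformer:
    (LAS_LSTM - LAS_TF) / (100 - LAS_LSTM), with LAS in percent."""
    return (las_lstm - las_tf) / (100.0 - las_lstm)
```

For example, 80 vs. 75 LAS gives an RER of 0.25; a negative value means the transformer is ahead.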
-
Foundation.AlphaCoordinateFixation.alpha_pin_under_high_calibration (tagged: unclear)
The relation between the paper passage and the cited Recognition theorem is unclear.
AfroXLMR-large reaches the crossover at around 830 sentences, while RemBERT requires approximately 1,340–1,390.
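Such crossover sizes are presumably read off by interpolating where the performance gap changes sign between evaluated subset sizes. A sketch of that interpolation (assuming the gap is measured on a grid of training sizes):

```python
def crossover_size(sizes, gaps):
    """Training-set size where the LSTM-minus-transformer gap first
    crosses zero, by linear interpolation between measured points."""
    for (s0, g0), (s1, g1) in zip(zip(sizes, gaps), zip(sizes[1:], gaps[1:])):
        if g0 > 0 >= g1:
            return s0 + (s1 - s0) * g0 / (g0 - g1)
    return None  # no crossover within the measured range
```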
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.