Arabic Sentence Segmentation Across Genres and Punctuation Conditions

Bashar Alhafni; Khalid N. Elmadani; Mohammed Elkholy; Nizar Habash

arxiv: 2606.08025 · v1 · pith:FYJRYPBSnew · submitted 2026-06-06 · 💻 cs.CL

Mohammed Elkholy, Khalid N. Elmadani, Nizar Habash, Bashar Alhafni

Pith reviewed 2026-06-27 19:49 UTC · model grok-4.3

classification 💻 cs.CL

keywords Arabic sentence segmentation · AraSEG corpus · punctuation ambiguity · lightweight encoders · LLM evaluation · dependency parsing

The pith

Lightweight encoder models outperform large language models on Arabic sentence segmentation when punctuation is ambiguous or missing.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper creates AraSEG, a new corpus covering eight Arabic genres under varied punctuation and document conditions, to test sentence boundary detection beyond clean text. Experiments compare LLMs against smaller encoder models and dependency-parser approaches, revealing that the lighter models maintain higher accuracy as segmentation difficulty increases. The work also shows that better sentence splits boost later dependency parsing accuracy, while additional training data yields diminishing returns and cross-genre transfer stays limited.

Core claim

On the AraSEG corpus, lightweight encoder models and dependency parser-based systems achieve higher sentence segmentation accuracy than LLMs under the most challenging punctuation conditions; performance saturates with more training data, cross-genre generalization is difficult, and accurate segmentation measurably improves downstream dependency parsing.

What carries the argument

AraSEG, a multi-genre corpus of Arabic texts annotated for sentence boundaries across eight genres and wide ranges of punctuation presence and document structure.
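One way to picture a "no punctuation" condition is to take punctuated text and delete every punctuation mark, so a segmenter can no longer rely on those cues. The sketch below is only an illustration of that idea: the source does not describe how AraSEG's punctuation conditions were produced, and the function name `strip_punctuation` is hypothetical.

```python
import unicodedata


def strip_punctuation(text: str) -> str:
    """Replace every Unicode punctuation character with a space, then
    collapse runs of whitespace.

    Assumption: "punctuation" means any character whose Unicode general
    category starts with "P". That covers Latin marks as well as the
    Arabic comma (U+060C), semicolon (U+061B), and question mark (U+061F).
    """
    spaced = "".join(
        " " if unicodedata.category(ch).startswith("P") else ch
        for ch in text
    )
    return " ".join(spaced.split())


if __name__ == "__main__":
    print(strip_punctuation("ذهب الولد، ثم عاد."))
```

Replacing marks with a space rather than deleting them outright keeps adjacent words from fusing, at the cost of splitting tokens that contain an internal hyphen or apostrophe.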

If this is right

Accurate sentence segmentation substantially improves performance on downstream dependency parsing.
Model performance on sentence segmentation eventually saturates as training data size grows.
Cross-genre generalization for Arabic sentence segmentation remains challenging even with diverse training data.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The pattern suggests that for noisy or low-punctuation text, task-specific small models may remain preferable to general-purpose LLMs even as the latter scale.
Similar genre-diverse evaluation setups could be applied to sentence segmentation in other languages that share punctuation ambiguity problems.
Integrating dependency parsing signals directly into segmentation training may offer a reusable strategy for other sequence-labeling tasks with weak boundary cues.

Load-bearing premise

The eight genres and punctuation conditions collected in AraSEG are representative of the range of real-world Arabic text that downstream applications encounter.

What would settle it

A new Arabic test set drawn from additional genres or real documents with punctuation patterns outside those in AraSEG on which LLMs achieve higher segmentation accuracy than the lightweight encoders.
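Running such a test requires a score that compares predicted and gold boundaries for each system. The sketch below computes precision, recall, and F1 over sentence-final token positions. This metric is an assumption chosen for illustration, since the source does not state which measure the paper reports, and both function names are hypothetical.

```python
from typing import Iterable, List, Sequence, Tuple


def boundaries_from_sentences(sentences: Sequence[Sequence[str]]) -> List[int]:
    """Return the document-level index of the last token of each sentence."""
    ends: List[int] = []
    position = -1
    for sentence in sentences:
        position += len(sentence)
        ends.append(position)
    return ends


def boundary_prf(gold: Iterable[int], pred: Iterable[int]) -> Tuple[float, float, float]:
    """Precision, recall, and F1 over sets of boundary positions."""
    gold_set, pred_set = set(gold), set(pred)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Toy token sequences, not data from the paper.
    gold = boundaries_from_sentences([["a", "b", "c"], ["d", "e"]])
    pred = boundaries_from_sentences([["a", "b"], ["c", "d", "e"]])
    print(boundary_prf(gold, pred))
```

Both segmentations must cover the same token sequence for the indices to be comparable. The final boundary of a document is always trivially correct, so an evaluation may prefer to exclude it.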

Figures

Figures reproduced from arXiv: 2606.08025 by Bashar Alhafni, Khalid N. Elmadani, Mohammed Elkholy, Nizar Habash.

**Figure 1.** Example Arabic paragraph annotated with sentence segmentation boundaries $.

**Figure 2.** CAMeLBERT test set performance versus training data size. Dashed lines indicate full-data performance; shaded regions show ±1 standard deviation.
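The mean line and ±1 standard deviation band in a learning curve like Figure 2 can be derived from repeated runs at each training size. The sketch below shows one way to do that. It assumes several runs per size and uses the sample standard deviation; the source specifies neither, and the name `summarize_runs` is hypothetical.

```python
from statistics import mean, stdev
from typing import Dict, List, Tuple


def summarize_runs(scores_by_size: Dict[int, List[float]]) -> List[Tuple[int, float, float]]:
    """Return (training size, mean score, standard deviation) per size,
    ordered by increasing training size.

    A size with a single run gets a spread of 0.0, because the sample
    standard deviation is undefined for one observation.
    """
    summary: List[Tuple[int, float, float]] = []
    for size in sorted(scores_by_size):
        runs = scores_by_size[size]
        spread = stdev(runs) if len(runs) > 1 else 0.0
        summary.append((size, mean(runs), spread))
    return summary


if __name__ == "__main__":
    # Made-up toy scores for illustration only, not results from the paper.
    for size, avg, spread in summarize_runs({100: [0.4], 200: [0.5, 0.7]}):
        print(size, round(avg, 3), round(spread, 3))
```

Saturation would then appear as means that stop rising once the training size passes some point, with the band overlapping the full-data line.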

Abstract

Sentence segmentation in Arabic is challenging due to ambiguous and inconsistent punctuation, with many texts lacking reliable sentence boundary markers. Existing approaches rely heavily on punctuation cues and are typically evaluated on well-formed text, limiting their robustness in realistic Arabic settings. To address this, we introduce AraSEG, a genre-diverse sentence segmentation corpus spanning eight genres and a wide range of punctuation and document structure conditions. Using AraSEG, we evaluate LLMs, lightweight encoder models, and dependency parser-based models under increasingly challenging segmentation settings. Our experiments show that lightweight encoders, and even dependency parser-based models, outperform LLMs in the most challenging settings. We further investigate the effects of training data size and genre diversity, finding that performance eventually saturates and cross-genre generalization remains challenging. We also demonstrate that accurate sentence segmentation substantially improves downstream dependency parsing. We make our code, data, and models publicly available.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

AraSEG is a practical new corpus for Arabic sentence segmentation under messy punctuation, with the model ranking as a secondary result that hinges on the data's coverage.

The main thing here is the release of AraSEG, a corpus spanning eight genres with deliberate variation in punctuation and document structure. The experiments compare LLMs, lightweight encoders, and dependency-parser models, and find the simpler ones hold up better in the toughest settings. They also show that better segmentation helps downstream parsing and release the data, code, and models.

The paper does a clean job on the controlled conditions and the public artifacts. That setup lets them test model families head-to-head on the same data, which is straightforward and addresses a real pain point in Arabic NLP where punctuation is often missing or inconsistent.

The soft spot is representativeness. The abstract already notes that cross-genre generalization stays difficult and that performance saturates with more data, so the claim that lightweight models are preferable in challenging conditions rests on whether these eight genres plus the punctuation tweaks actually sample the Arabic text that shows up in applications. If certain informal registers or document lengths are under-covered, the ranking could be tied to this benchmark. The abstract also gives no numbers, tests, or error analysis, which makes it hard to judge how large or stable the differences are.

This is for people working on Arabic preprocessing or robustness in low-resource settings. A reader outside that niche might borrow the data-construction approach but is unlikely to cite the specific results. It has enough new public data and a concrete question to deserve a serious referee, even with the scope limits.

Referee Report

1 major / 1 minor

Summary. The paper introduces AraSEG, a new genre-diverse corpus for Arabic sentence segmentation spanning eight genres and a range of punctuation and document-structure conditions. It evaluates LLMs, lightweight encoder models, and dependency parser-based models under increasingly challenging settings, reporting that lightweight encoders and dependency parsers outperform LLMs in the hardest cases. Additional experiments examine training data size and genre diversity, finding performance saturation and persistent cross-genre generalization difficulties; the work also shows that accurate segmentation improves downstream dependency parsing. Code, data, and models are released publicly.

Significance. If the empirical results hold under broader conditions, the paper supplies a useful public benchmark that questions the default preference for LLMs on Arabic segmentation tasks and demonstrates practical downstream gains. The explicit release of resources is a clear strength that supports reproducibility and follow-on work. The saturation and cross-genre observations usefully temper expectations for scaling-based solutions.

major comments (1)

[Abstract and §4 (experiments)] The central claim that lightweight encoders and dependency-parser models outperform LLMs in the most challenging settings is evaluated exclusively on AraSEG. Because the abstract itself notes that cross-genre generalization remains challenging, the representativeness of the eight selected genres plus punctuation conditions is load-bearing for any practical recommendation; no quantitative justification or external validation of genre coverage is provided.

minor comments (1)

[Abstract] The abstract states that performance saturates with more data; the corresponding learning curves or data-size ablation table should be referenced explicitly in the main text.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the detailed review and the opportunity to clarify the scope of our claims regarding AraSEG. We address the concern about genre representativeness and the load-bearing nature of the eight genres below.

Referee: [Abstract and §4 (experiments)] The central claim that lightweight encoders and dependency-parser models outperform LLMs in the most challenging settings is evaluated exclusively on AraSEG. Because the abstract itself notes that cross-genre generalization remains challenging, the representativeness of the eight selected genres plus punctuation conditions is load-bearing for any practical recommendation; no quantitative justification or external validation of genre coverage is provided.

Authors: We agree that all reported results are specific to AraSEG and that the abstract explicitly flags cross-genre generalization difficulties. AraSEG was constructed precisely to fill the absence of any prior genre-diverse Arabic sentence segmentation resource; the eight genres were chosen to span formal/informal registers, literary, journalistic, legal, religious, and social-media styles that exhibit systematically different punctuation reliability and document structures (detailed in §3). Performance variation across these genres in our experiments provides internal evidence of diversity. Because no comparable annotated Arabic corpora exist, quantitative external validation (e.g., genre-overlap metrics against other resources) is not feasible; the benchmark itself constitutes the first such testbed. The central claim is therefore scoped to “the most challenging settings represented in AraSEG” rather than universal superiority. We are happy to expand the genre-selection rationale and limitations discussion in a revision if the editor deems it useful.

Revision: partial

Circularity Check

0 steps flagged

Empirical benchmark study with new dataset; no circularity in performance claims or derivations

The paper introduces the AraSEG corpus spanning eight genres and punctuation conditions, then reports direct experimental comparisons of LLMs, lightweight encoders, and dependency parsers on segmentation tasks under varying difficulty settings. The central performance claims (lightweight models outperforming LLMs in challenging conditions) are obtained by training and evaluating models on the newly collected data splits; no equations, fitted parameters, or self-citations are used to derive or redefine these metrics. The study contains no derivation chain that reduces outputs to inputs by construction, and the representativeness of the genres is presented as an empirical assumption rather than a self-referential result. This is a standard self-contained benchmark paper.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Empirical NLP evaluation paper; relies on standard assumptions about benchmark validity rather than new theoretical constructs.

axioms (1)

Domain assumption: The collected genres and punctuation conditions in AraSEG constitute a valid proxy for real-world Arabic text variability. This is central to the claim that results generalize beyond the corpus.

[1] [1]

Ahmed Abdelali, Kareem Darwish, Nadir Durrani, and Hamdy Mubarak. 2016. Farasa: A fast and furious segmenter for A rabic. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), pages 11--16, San Diego, California

2016

[2] [2]

Abbas Mahmoud Al-Akkad. 1938. https://www.hindawi.org/books/72707304/ Sarah . Hindawi

arXiv 1938

[3] [3]

Imam Muhammad al Bukhari. 846. Sahih al-Bukhari. Dar Ibn Khathir

[4] [4]

Bayan Al-Safadi. 2005. Al-Kashkoul: selection of poetry and prose for children ( \<الكشكول: مختارات من الشعر والنثر للأطفال> ) . Al-Sa'ih Library ( \<مكتبة السائح> )

2005

[5] [5]

Mai Alammar, Khalil El Hindi, and Hend Al-Khalifa. 2025. https://doi.org/10.3390/computation13060151 English-arabic hybrid semantic text chunking based on fine-tuning bert . Computation, 13(6)

work page doi:10.3390/computation13060151 2025

[6] [6]

A. Alfaifi. 2015. https://doi.org/10.13140/RG.2.2.32081.53608 Building the Arabic Learner Corpus and a System for Arabic Error Annotation . Ph.D. thesis, University of Leeds

work page doi:10.13140/rg.2.2.32081.53608 2015

[7] [7]

Bashar Alhafni, Go Inoue, Christian Khairallah, and Nizar Habash. 2023. https://doi.org/10.18653/v1/2023.emnlp-main.396 Advancements in A rabic grammatical error detection and correction: An empirical investigation . In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 6430--6448, Singapore. Association for Comp...

work page doi:10.18653/v1/2023.emnlp-main.396 2023

[8] [8]

Alshanqiti, Sami Albouq, Ahmad B

Abdullah M. Alshanqiti, Sami Albouq, Ahmad B. Alkhodre, Abdallah Namoun, and Emad Nabil. 2022. https://doi.org/10.3390/app122010559 Employing a multilingual transformer model for segmenting unpunctuated arabic text . Applied Sciences, 12(20)

work page doi:10.3390/app122010559 2022

[9] [9]

Shatha Altammami, Eric Atwell, and Ammar Alsalka. 2019. The arabic--english parallel corpus of authentic hadith. International Journal on Islamic Applications in Computer Science And Technology-IJASAT

[10]

Touir Ameur, Mathkour Hassan, and Al-Sanea Waleed. 2008. https://doi.org/10.3923/itj.2008.1009.1015 Semantic-based segmentation of Arabic texts. Information Technology Journal, 7

[11]

Mohamed Anwar, Abdelhakim Freihat, George Ibrahim, Mostafa Awad, Abdelrahman Atef Mohamed Ali Sadallah, Gurpreet Gosal, Gokul Ramakrishnan, Sarath Chandran, Biswajit Mishra, Rituraj Joshi, Ahmed Frikha, Etienne Goffinet, Abhishek Maiti, Ali El Filali, Sarah Al Barri, Samujjwal Ghosh, Rahul Pal, Parvez Mullah, Awantika Shukla, and 41 others. 2025. Jais 2: ...

[12]

Doug Beeferman, Adam Berger, and John Lafferty. 1999. https://doi.org/10.1023/A:1007506220214 Statistical models for text segmentation. Machine Learning, 34(1):177–210

[13]

Guy Bilitski, Lev Shechter, Sonam Jamtsho, Nir Marciano, Nicola Bajetta, Rebecca Sunden, Omri Drori, Kai Golan Hashiloni, Orr Zwebner, Asaf Shina, Orna Almogi, Dorji Wangchuk, and Kfir Bar. 2026. https://doi.org/10.63317/2iyfjjv9boc6 Automatic segmentation of Classical Tibetan texts into autochthonous and allochthonous regions. In Proceedings of the Fift...

[14]

Houda Bouamor, Nizar Habash, Mohammad Salameh, Wajdi Zaghouani, Owen Rambow, Dana Abdulrahim, Ossama Obeid, Salam Khalifa, Fadhl Eryani, Alexander Erdmann, and Kemal Oflazer. 2018. https://aclanthology.org/L18-1535/ The MADAR Arabic dialect corpus and lexicon. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (...

[15]

Kirill Chirkunov, Younes Samih, Abed Alhakim Freihat, and Hanan Aldarmaki. 2026. https://arxiv.org/abs/2605.06276 Linear semantic segmentation for low-resource spoken dialects. Preprint, arXiv:2605.06276

[16]

Gheorghe Comanici, Eric Bieber, Mike Schaekermann, Ice Pasupat, Noveen Sachdeva, Inderjit Dhillon, Marcel Blistein, Ori Ram, Dan Zhang, Evan Rosen, Luke Marris, Sam Petulla, Colin Gaffney, Asaf Aharoni, Nathan Lintz, Tiago Cardal Pais, Henrik Jacobsson, Idan Szpektor, Nan-Jiang Jiang, and 3416 others. 2025. https://arxiv.org/abs/2507.06261 Gemini 2.5: Pus...

[17]

Timothy Dozat and Christopher D. Manning. 2016. https://arxiv.org/abs/1611.01734 Deep biaffine attention for neural dependency parsing. CoRR, abs/1611.01734

[18]

Kais Dukes, Eric Atwell, and Nizar Habash. 2013. Supervised collaboration for syntactic annotation of Quranic Arabic. Language Resources and Evaluation, 47(1):33–62

[19]

Matthias Eck and Chiori Hori. 2005. https://aclanthology.org/2005.iwslt-1.1/ Overview of the IWSLT 2005 evaluation campaign. In Proceedings of the Second International Workshop on Spoken Language Translation, Pittsburgh, Pennsylvania, USA

[20]

Mo El-Haj and Saad Ezzini. 2024. https://aclanthology.org/2024.osact-1.7/ The multilingual corpus of world's constitutions (MCWC). In Proceedings of the 6th Workshop on Open-Source Arabic Corpora and Processing Tools (OSACT) with Shared Tasks on Arabic LLMs Hallucination and Dialect to MSA Machine Translation @ LREC-COLING 2024, pages 57–66, Torino,...

[21]

Khalid N. Elmadani, Nizar Habash, and Hanada Taha-Thomure. 2025. https://doi.org/10.18653/v1/2025.findings-acl.842 A large and balanced corpus for fine-grained Arabic readability assessment. In Findings of the Association for Computational Linguistics: ACL 2025, pages 16376–16400, Vienna, Austria. Association for Computational Linguistics

[22]

Ahmed Elshabrawy, Muhammed AbuOdeh, Go Inoue, and Nizar Habash. 2023. https://doi.org/10.18653/v1/2023.arabicnlp-1.15 CamelParser2.0: A state-of-the-art dependency parser for Arabic. In Proceedings of ArabicNLP 2023, pages 170–180, Singapore (Hybrid). Association for Computational Linguistics

[23]

Markus Frohmann, Igor Sterner, Ivan Vulić, Benjamin Minixhofer, and Markus Schedl. 2024. https://doi.org/10.18653/v1/2024.emnlp-main.665 Segment any text: A universal approach for robust, efficient and adaptable sentence segmentation. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 11908–11941, Miami,...

[24]

Nizar Habash, Muhammed AbuOdeh, Dima Taji, Reem Faraj, Jamila El Gizuli, and Omar Kallas. 2022. https://aclanthology.org/2022.lrec-1.286/ Camel treebank: An open multi-genre Arabic dependency treebank. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 2672–2681, Marseille, France. European Language Resources Association

[25]

Nizar Habash and David Palfreyman. 2022. https://aclanthology.org/2022.lrec-1.9/ ZAEBUC: An annotated Arabic-English bilingual writer corpus. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 79–88, Marseille, France. European Language Resources Association

[26]

Nizar Habash, Abdelhadi Soudi, and Tim Buckwalter. 2007. On Arabic Transliteration. In A. van den Bosch and A. Soudi, editors, Arabic Computational Morphology: Knowledge-based and Empirical Methods, pages 15–22. Springer, Netherlands

[27]

Lamia Hadrich Belguith, Leila Baccour, and Mourad Ghassan. 2005. https://aclanthology.org/2005.jeptalnrecital-court.12/ Segmentation de textes arabes basée sur l'analyse contextuelle des signes de ponctuations et de certaines particules. In Actes de la 12ème conférence sur le Traitement Automatique des Langues Naturelles. Articles courts, ...

[28]

Matthew Honnibal, Ines Montani, Sofie Van Landeghem, and Adriane Boyd. 2020. https://doi.org/10.5281/zenodo.1212303 spaCy: Industrial-strength natural language processing in Python

[29]

Go Inoue, Bashar Alhafni, Nurpeiis Baimukan, Houda Bouamor, and Nizar Habash. 2021. https://aclanthology.org/2021.wanlp-1.10/ The interplay of variant, size, and task type in Arabic pre-trained language models. In Proceedings of the Sixth Arabic Natural Language Processing Workshop, pages 92–104, Kyiv, Ukraine (Virtual). Association for Computational L...

[30]

Iskandar Keskes, Farah Benamara, and Lamia Hadrich Belguith. 2012. https://aclanthology.org/L12-1559/ Clause-based discourse segmentation of Arabic texts. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC '12), pages 2826–2832, Istanbul, Turkey. European Language Resources Association (ELRA)

[31]

Muhamed Al Khalil, Hind Saddiki, Nizar Habash, and Latifa Alfalasi. 2018. A Leveled Reading Corpus of Modern Standard Arabic. In Proceedings of the Language Resources and Evaluation Conference (LREC), Miyazaki, Japan

[32]

Tibor Kiss and Jan Strunk. 2006. https://doi.org/10.1162/coli.2006.32.4.485 Unsupervised multilingual sentence boundary detection. Computational Linguistics, 32(4):485–525

[33]

Fajri Koto, Haonan Li, Sara Shatnawi, Jad Doughman, Abdelrahman Sadallah, Aisha Alraeesi, Khalid Almubarak, Zaid Alyafeai, Neha Sengupta, Shady Shehata, Nizar Habash, Preslav Nakov, and Timothy Baldwin. 2024. https://doi.org/10.18653/v1/2024.findings-acl.334 ArabicMMLU: Assessing massive multitask language understanding in Arabic. In Findings of the ...

[34]

Pierre Lison and Jörg Tiedemann. 2016. OpenSubtitles2016: Extracting Large Parallel Corpora from Movie and TV Subtitles. In Proceedings of the Language Resources and Evaluation Conference (LREC), Portorož, Slovenia

[35]

Zhichen Liu, Yongyuan Li, and Yang Xu. 2026. https://arxiv.org/abs/2604.10135 Think in sentences: Explicit sentence boundaries enhance language model's capabilities. Preprint, arXiv:2604.10135

[36]

Asma Mekki, Inès Zribi, Mariem Ellouze, and Lamia Hadrich Belguith. 2022. https://doi.org/10.1007/s10579-021-09538-4 Sentence boundary detection of various forms of Tunisian Arabic. Language Resources and Evaluation, 56(1):357–385

[37]

Benjamin Minixhofer, Jonas Pfeiffer, and Ivan Vulić. 2023. https://doi.org/10.18653/v1/2023.acl-long.398 Where's the point? Self-supervised multilingual punctuation-agnostic sentence segmentation. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 7215–7235, Toronto, Canada. As...

[38]

Behrang Mohit, Alla Rozovskaya, Nizar Habash, Wajdi Zaghouani, and Ossama Obeid. 2014. https://doi.org/10.3115/v1/W14-3605 The first QALB shared task on automatic text correction for Arabic. In Proceedings of the EMNLP 2014 Workshop on Arabic Natural Language Processing (ANLP), pages 39–47, Doha, Qatar. Association for Computational Linguistics

[39]

Joakim Nivre and Chiao-Ting Fang. 2017. https://aclanthology.org/W17-0411/ Universal Dependency evaluation. In Proceedings of the NoDaLiDa 2017 Workshop on Universal Dependencies (UDW 2017), pages 86–95, Gothenburg, Sweden. Association for Computational Linguistics

[40]

Ossama Obeid, Nasser Zalmout, Salam Khalifa, Dima Taji, Mai Oudah, Bashar Alhafni, Go Inoue, Fadhl Eryani, Alexander Erdmann, and Nizar Habash. 2020. https://aclanthology.org/2020.lrec-1.868/ CAMeL Tools: An open source Python toolkit for Arabic natural language processing. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pa...

[41]

OpenAI, Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, Red Avila, Igor Babuschkin, Suchir Balaji, Valerie Balcom, Paul Baltescu, Haiming Bao, Mohammad Bavarian, Jeff Belgum, and 262 others. 2024. https://arxiv.org/abs/2303.08774 Gpt-4 technical r...

[42]

Lev Pevzner and Marti A. Hearst. 2002. https://doi.org/10.1162/089120102317341756 A critique and improvement of an evaluation metric for text segmentation. Computational Linguistics, 28(1):19–36

[43]

Wenjie Qiu, Yi-Chen Li, Xuqin Zhang, Tianyi Zhang, Yihang Zhang, Zongzhang Zhang, and Yang Yu. 2025. https://arxiv.org/abs/2503.04793 Sentence-level reward model can generalize better for aligning LLM from human preference. Preprint, arXiv:2503.04793

[44]

Fabian Retkowski and Alexander Waibel. 2026. https://doi.org/10.63317/3eczsids4mek Paragraph segmentation revisited: Towards a standard task for structuring speech. In Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026), pages 747–759, Palma, Mallorca, Spain. European Language Resources Association (ELRA)

[45]

Nipun Sadvilkar and Mark Neumann. 2020. https://doi.org/10.18653/v1/2020.nlposs-1.15 PySBD: Pragmatic sentence boundary disambiguation. In Proceedings of Second Workshop for NLP Open Source Software (NLP-OSS), pages 110–114, Online. Association for Computational Linguistics

[46]

Aaditya Singh, Adam Fry, Adam Perelman, Adam Tart, Adi Ganesh, Ahmed El-Kishky, Aidan McLaughlin, Aiden Low, AJ Ostrow, Akhila Ananthram, Akshay Nathan, Alan Luo, Alec Helyar, Aleksander Madry, Aleksandr Efremov, Aleksandra Spyra, Alex Baker-Whitcomb, Alex Beutel, Alex Karpenko, and 467 others. 2026. https://arxiv.org/abs/2601.03267 Openai gpt-5 system ca...

[47]

Eli Smith and Cornelius Van Dyck. 1860. New Testament (Arabic Translation)

[48]

Eli Smith and Cornelius Van Dyck. 1865. Old Testament (Arabic Translation)

[49]

Hanada Taha-Thomure. 2007. Poems and News (أشعار وأخبار). Educational Book House (دار الكتاب التربوي للنشر والتوزيع)

[50]

Toshiyuki Takezawa, Genichiro Kikui, Masahide Mizushima, and Eiichiro Sumita. 2007. https://aclanthology.org/O07-5005/ Multilingual spoken language corpus development for communication research. In International Journal of Computational Linguistics & Chinese Language Processing, Volume 12, Number 3, September 2007: Special Issue on Invited Papers from I...

[51]

FANAR TEAM, Ummar Abbas, Mohammad Shahmeer Ahmad, Minhaj Ahmad, Abdulaziz Al-Homaid, Anas Al-Nuaimi, Enes Altinisik, Ehsaneddin Asgari, Sanjay Chawla, Shammur Chowdhury, Fahim Dalvi, Kareem Darwish, Nadir Durrani, Mohamed Elfeky, Ahmed Elmagarmid, Mohamed Eltabakh, Asim Ersoy, Masoomali Fatehkia, Mohammed Qusay Hashim, and 18 others. 2026. https://arxiv.o...

[52]

Ibn Tufail. 1150. https://www.hindawi.org/books/90463596/ Hayy ibn Yaqdhan. Hindawi

[53]

Unknown. 12th century. One Thousand and One Nights

[54]

Rachel Wicks and Matt Post. 2021. https://doi.org/10.18653/v1/2021.acl-long.309 A unified approach to sentence segmentation of punctuated text in many languages. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pag...

[55]

Nan Xu and Xuezhe Ma. 2025. https://doi.org/10.18653/v1/2025.naacl-long.172 LLM the genius paradox: A linguistic and math expert's struggle with simple word-based counting problems. In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long P...

[56]

Sane Yagi, Shehdeh Fareh, Ashraf Elnagar, Mariam Balajeed, Abdalla El-mneizel, and Mohammad Al-Badawi. 2024. https://doi.org/10.1080/23311983.2024.2303818 Is Arabic punctuation rule-governed? Cogent Arts & Humanities, 11(1):2303818

[57]

Wajdi Zaghouani and Dana Awad. 2016. Toward an Arabic punctuated corpus: Annotation guidelines and evaluation. In The 2nd Workshop on Arabic Corpora and Processing Tools 2016 Theme: Social Media, page 22

[58]

Wajdi Zaghouani, Behrang Mohit, Nizar Habash, Ossama Obeid, Nadi Tomeh, Alla Rozovskaya, Noura Farra, Sarah Alkuhlani, and Kemal Oflazer. 2014. Large Scale Arabic Error Annotation: Guidelines and Framework. In Proceedings of the Language Resources and Evaluation Conference (LREC), Reykjavik, Iceland

[59]

Chujie Zheng, Shixuan Liu, Mingze Li, Xiong-Hui Chen, Bowen Yu, Chang Gao, Kai Dang, Yuqiong Liu, Rui Men, An Yang, Jingren Zhou, and Junyang Lin. 2025. https://arxiv.org/abs/2507.18071 Group sequence policy optimization. Preprint, arXiv:2507.18071
