AraSEG is a genre-diverse Arabic sentence segmentation corpus showing lightweight encoders and dependency parsers outperform LLMs under challenging punctuation while improving downstream parsing.
Elmadani, Nizar Habash, and Hanada Taha-Thomure
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.CL (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
citing papers explorer
- Arabic Sentence Segmentation Across Genres and Punctuation Conditions
AraSEG is a genre-diverse Arabic sentence segmentation corpus showing lightweight encoders and dependency parsers outperform LLMs under challenging punctuation while improving downstream parsing.
- Can LLMs Control Readability? A Multi-Dimensional Evaluation Framework for CEFR-Controlled Arabic Generation
Structured prompting with lexical constraints lets LLMs produce Arabic text that closely matches target CEFR readability levels according to an automatic predictor.