David Ifeoluwa Adelani, Hannah Liu, Xiaoyu Shen, Nikita Vassilyev, Jesujoba O. Alabi, Yanke Mao, Haonan Gao, and Annie En-Shiun Lee · 2023 · arXiv 2309.07445

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

FLEXITOKENS: Flexible Tokenization for Evolving Language Models

cs.CL · 2025-07-17 · unverdicted · novelty 7.0

FLEXITOKENS replaces rigid subword tokenizers and fixed-compression auxiliary losses with a simplified boundary-prediction objective in byte-level models, yielding lower over-fragmentation and up to 10-point gains on multilingual and domain-adaptation tasks.

MOSAIC: A Multilingual, Taxonomy-Agnostic, and Computationally Efficient Approach for Radiological Report Classification

cs.CL · 2025-08-29 · unverdicted · novelty 6.0

MOSAIC achieves mean macro F1 of 88 on chest X-ray report classification across five datasets in four languages using a 4B-parameter open model with low GPU memory and few-shot or light fine-tuning options.

ANGOFA: Leveraging OFA Embedding Initialization and Synthetic Data for Angolan Language Model

cs.CL · 2024-04-03 · unverdicted · novelty 4.0

Four MAFT-based PLMs for Angolan languages report 12.3-point gains over AfroXLMR-base and 3.8-point gains over OFA baselines on downstream tasks.

citing papers explorer

Showing 3 of 3 citing papers.

FLEXITOKENS: Flexible Tokenization for Evolving Language Models · cs.CL · 2025-07-17 · unverdicted · none · ref 24
MOSAIC: A Multilingual, Taxonomy-Agnostic, and Computationally Efficient Approach for Radiological Report Classification · cs.CL · 2025-08-29 · unverdicted · none · ref 1
ANGOFA: Leveraging OFA Embedding Initialization and Synthetic Data for Angolan Language Model · cs.CL · 2024-04-03 · unverdicted · none · ref 4