Robust Multilingual Text-to-Pictogram Mapping for Scalable Reading Rehabilitation

Martina Galletti; Soufiane Jhilal

arxiv: 2603.24536 · v2 · submitted 2026-03-25 · 💻 cs.CL · cs.HC

Pith reviewed 2026-05-15 00:27 UTC · model grok-4.3

keywords text-to-pictogram mapping, multilingual AI, reading rehabilitation, special educational needs, visual scaffolding, semantic appropriateness, SEND support, neurodiverse learners

The pith

An automated system maps text to pictograms across five languages, with experts rating its choices correct or acceptable in roughly 90 percent of cases or more.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops an AI interface that identifies key concepts in text and maps them to contextually relevant pictograms to help children with special educational needs understand reading material. This visual scaffolding is applied to English, French, Italian, Spanish, and Arabic texts to reduce reliance on constant one-on-one therapist support. Expert clinical reviews rated the selected pictograms correct or acceptable in over 95 percent of cases for the four European languages and in about 90 percent of cases for Arabic, despite reduced pictogram repository coverage for Arabic. The system's latency also stayed within thresholds suitable for real-time educational use.

Core claim

The authors created a multilingual text-to-pictogram mapping system that dynamically identifies concepts and selects matching pictograms to provide visual scaffolding. Across five typologically diverse languages, coverage analysis, expert audits by speech therapists and special educators, and latency tests showed high pictogram density, combined correct and acceptable ratings above 95 percent for European languages and 90 percent for Arabic, and response times suitable for interactive educational applications.
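The two headline quantities in this claim, pictogram coverage and the combined correct-plus-acceptable rating, can be sketched as simple proportions. This is a minimal illustration, not the authors' evaluation code: the function names, the three rating labels as strings, and the definition of coverage as "share of candidate concepts that received a pictogram" are assumptions, since the summary does not give the paper's exact definitions.

```python
from collections import Counter
from typing import List, Optional


def coverage(mapped: List[Optional[str]]) -> float:
    """Share of candidate concepts that received a pictogram.

    `mapped` holds one entry per concept: a pictogram id, or None
    when the repository had nothing to offer (assumed definition).
    """
    if not mapped:
        return 0.0
    return sum(p is not None for p in mapped) / len(mapped)


def combined_rate(ratings: List[str]) -> float:
    """Share of expert ratings labelled 'correct' or 'acceptable'.

    The label strings are hypothetical; the paper's rating scale is
    only described here as correct / acceptable / otherwise.
    """
    if not ratings:
        return 0.0
    counts = Counter(ratings)
    return (counts["correct"] + counts["acceptable"]) / len(ratings)


if __name__ == "__main__":
    # Made-up inputs for illustration only, not data from the paper.
    print(coverage(["pic_dog", None, "pic_ball", "pic_run"]))
    print(combined_rate(["correct"] * 8 + ["acceptable", "incorrect"]))
```

Reporting the "correct" and "acceptable" shares separately, alongside the combined figure, would show how much of the headline number rests on the weaker label.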

What carries the argument

The dynamic concept identification and contextually relevant pictogram selection algorithm that maps text elements to images from a multilingual repository.
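To make the shape of such a mapping concrete, here is a deliberately simplified sketch, assuming a dictionary lookup per language. Everything in it is hypothetical: the repository contents, the pictogram ids, the stop-word lists, and the tiny inflection table are placeholders, and the lookup does no context-sensitive disambiguation, which the paper's algorithm is described as performing.

```python
import re
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical toy repository: language -> lemma -> pictogram id.
PICTOGRAMS: Dict[str, Dict[str, str]] = {
    "en": {"dog": "pic_dog", "ball": "pic_ball", "run": "pic_run"},
    "es": {"perro": "pic_dog", "pelota": "pic_ball"},
}

# Hypothetical stop words, standing in for real key-concept detection.
STOP_WORDS: Dict[str, Set[str]] = {
    "en": {"the", "a", "with", "after"},
    "es": {"el", "la", "un", "con"},
}

# Hypothetical inflection table, standing in for a real lemmatizer.
LEMMAS: Dict[str, Dict[str, str]] = {
    "en": {"runs": "run", "dogs": "dog"},
    "es": {},
}


def map_text(text: str, lang: str) -> List[Tuple[str, Optional[str]]]:
    """Pair each content word with a pictogram id, or None if uncovered."""
    repo = PICTOGRAMS.get(lang, {})
    stops = STOP_WORDS.get(lang, set())
    lemmas = LEMMAS.get(lang, {})
    pairs: List[Tuple[str, Optional[str]]] = []
    for token in re.findall(r"\w+", text.lower()):
        if token in stops:
            continue  # function words get no pictogram in this sketch
        lemma = lemmas.get(token, token)
        pairs.append((token, repo.get(lemma)))
    return pairs


if __name__ == "__main__":
    print(map_text("The dog runs after the ball", "en"))
    print(map_text("El perro corre con la pelota", "es"))
```

The Spanish example leaves "corre" unmapped because the toy repository has no entry for it, which is the kind of gap that thinner repositories would produce more often.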

If this is right

The approach enables scaling of visual reading support without a matching increase in therapist hours.
Semantic appropriateness holds across languages with differing structures, including Arabic.
The system operates fast enough for live use in educational settings.
High visual scaffolding density can be achieved automatically in enhanced texts.
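The latency point can be checked with a small timing harness. The sketch below is an assumption-laden illustration: the summary does not state the paper's interactive threshold or how latency was sampled, so the budget is a caller-supplied parameter, the 95th percentile is an arbitrary choice, and all function names are hypothetical.

```python
import math
import time
from typing import Callable, List


def measure_latencies_ms(fn: Callable[[str], object], texts: List[str]) -> List[float]:
    """Time one call of `fn` per text and return wall-clock milliseconds."""
    latencies = []
    for text in texts:
        start = time.perf_counter()
        fn(text)
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def percentile(values: List[float], q: float) -> float:
    """Nearest-rank percentile for q in (0, 100]."""
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100.0))
    return ordered[rank - 1]


def within_budget(values: List[float], budget_ms: float, q: float = 95.0) -> bool:
    """True if the q-th percentile latency does not exceed the budget."""
    return percentile(values, q) <= budget_ms


if __name__ == "__main__":
    # str.upper stands in for the mapping pipeline; the budget is made up.
    samples = measure_latencies_ms(str.upper, ["a short sentence"] * 50)
    print(within_budget(samples, budget_ms=200.0))
```

Judging a tail percentile rather than the mean reflects that occasional slow responses are what a learner would notice in live use.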

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If real-learner trials confirm benefits, the mapping could integrate into apps or classroom software for wider access.
Expanding the pictogram set for languages with lower coverage could raise accuracy further.
The same concept-mapping logic might extend to other supports such as simplified summaries or audio cues.

Load-bearing premise

That expert ratings of semantic appropriateness will correspond to actual gains in reading comprehension and engagement when children with special needs use the system.

What would settle it

A study that measures reading comprehension or engagement scores in children with SEND using the pictogram-enhanced texts versus plain texts and finds no measurable improvement.

Original abstract

Reading comprehension presents a significant challenge for children with Special Educational Needs and Disabilities (SEND), often requiring intensive one-on-one reading support. To assist therapists in scaling this support, we developed a multilingual, AI-powered interface that automatically enhances text with visual scaffolding. This system dynamically identifies key concepts and maps them to contextually relevant pictograms, supporting learners across languages. We evaluated the system across five typologically diverse languages (English, French, Italian, Spanish, and Arabic), through multilingual coverage analysis, expert clinical review by speech therapists and special education professionals, and latency assessment. Evaluation results indicate high pictogram coverage and visual scaffolding density across the five languages. Expert audits suggested that automatically selected pictograms were semantically appropriate, with combined correct and acceptable ratings exceeding 95% for the four European languages and approximately 90% for Arabic despite reduced pictogram repository coverage. System latency remained within interactive thresholds suitable for real-time educational use. These findings support the technical viability, semantic safety, and acceptability of automated multimodal scaffolding to improve accessibility for neurodiverse learners.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The system maps text to pictograms across five languages with solid expert approval and low latency, but offers no evidence it actually helps SEND children read better.

The core contribution is a working multilingual pipeline that pulls key concepts from text and assigns pictograms, tested on English, French, Italian, Spanish, and Arabic. Coverage stays high, latency stays interactive, and speech therapists plus special educators rate the outputs as correct or acceptable more than 95 percent of the time for the European languages and around 90 percent for Arabic even with thinner pictogram sets. That combination of language breadth and clinical sign-off is the genuinely new piece relative to earlier single-language tools. The implementation looks straightforward and reproducible from the reported metrics.

The soft spot is the missing link to real outcomes. Expert ratings measure whether the chosen pictograms make sense to adults; they do not show whether children with SEND comprehend the enhanced text faster, stay engaged longer, or improve on any reading measure. No learner trials, no pre/post scores, no eye-tracking, and no baseline comparison appear. The abstract jumps from “experts like the pictures” to “supports scalable reading rehabilitation” without the data that would justify the second claim. Methodology details are also thin: no inter-rater stats, no error analysis by language or concept type.

This paper is mainly useful to people already building assistive NLP systems who need a multilingual starting point and some clinical sanity checks. A reader working on actual SEND interventions will find the technical feasibility interesting but the effectiveness claim unsupported. It deserves peer review because the multilingual coverage and latency numbers are concrete and the expert validation is a step beyond pure automation papers, but referees will need to press for learner data and clearer evaluation design before any stronger claims can stand.

Referee Report

1 major / 2 minor

Summary. The paper presents a multilingual AI-powered system that automatically identifies key concepts in text and maps them to contextually relevant pictograms to provide visual scaffolding for reading rehabilitation in children with SEND. The system is evaluated on five typologically diverse languages (English, French, Italian, Spanish, Arabic) via coverage analysis, expert clinical review by speech therapists and special educators, and latency assessment. Reported results include high pictogram coverage, combined correct/acceptable semantic ratings exceeding 95% for the four European languages and ~90% for Arabic, and latency suitable for real-time use, supporting claims of technical viability and semantic safety.

Significance. If the results hold, the work offers a technically viable approach to scalable multimodal text enhancement across languages, addressing a real need in assistive technology for neurodiverse learners. The multilingual scope and low-latency design are practical strengths that could reduce reliance on one-on-one support; however, the significance for actual rehabilitation outcomes remains provisional given the evaluation design.

major comments (1)

[Evaluation section, expert audit results] The central claim that the system supports scalable reading rehabilitation rests on expert ratings of semantic appropriateness (>95% combined correct/acceptable for European languages). These ratings measure rater agreement with algorithm output but do not include pre/post comprehension assessments, controlled learner trials, or outcome measures with actual SEND children, leaving the leap from expert approval to educational effectiveness unbridged and load-bearing for the rehabilitation claim.

minor comments (2)

[Abstract and Evaluation] The abstract and evaluation description would benefit from explicit reporting of inter-rater reliability metrics, exact number of raters per language, and how 'acceptable' ratings were defined and distinguished from 'correct'.
[Results] Figure or table presenting coverage statistics per language should include confidence intervals or variance measures to allow assessment of robustness.
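The statistics requested in these two comments are standard and short to compute. As an illustration, the sketch below uses Cohen's kappa for agreement between two raters and a Wilson score interval for a reported proportion. Both choices are assumptions rather than anything the paper specifies: the number of raters per language is not given here (more than two would call for a different statistic, such as Fleiss' kappa), and the label strings are hypothetical.

```python
import math
from collections import Counter
from typing import List, Tuple


def cohen_kappa(rater_a: List[str], rater_b: List[str]) -> float:
    """Chance-corrected agreement between two raters on the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equally long, non-empty rating lists")
    n = len(rater_a)
    observed = sum(x == y for x, y in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    labels = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[k] * counts_b[k] for k in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 for about 95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need 0 <= successes <= n and n > 0")
    p = successes / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


if __name__ == "__main__":
    # Made-up ratings for illustration only, not data from the paper.
    a = ["correct", "correct", "acceptable", "incorrect"]
    b = ["correct", "correct", "acceptable", "acceptable"]
    print(cohen_kappa(a, b))
    print(wilson_interval(90, 100))
```

The Wilson interval is used instead of the simpler normal approximation because it behaves better when proportions sit near 100 percent, which is where the reported ratings lie.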

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive feedback. We agree that the evaluation does not include direct learner outcome measures and will revise the manuscript to clarify the scope of our claims regarding rehabilitation support.

Referee: [Evaluation section, expert audit results] The central claim that the system supports scalable reading rehabilitation rests on expert ratings of semantic appropriateness (>95% combined correct/acceptable for European languages). These ratings measure rater agreement with algorithm output but do not include pre/post comprehension assessments, controlled learner trials, or outcome measures with actual SEND children, leaving the leap from expert approval to educational effectiveness unbridged and load-bearing for the rehabilitation claim.

Authors: We agree with the referee that our evaluation establishes technical viability, pictogram coverage, and expert-rated semantic appropriateness (via speech therapists and special educators) but does not include pre/post comprehension tests or controlled trials with SEND children. The study scope was limited to assessing the automated mapping system's reliability and safety as a scalable foundation, given ethical and logistical constraints on direct child trials at this stage. In the revised version we will update the abstract, introduction, and discussion sections to explicitly state that the results support feasibility and semantic safety of the multimodal scaffolding, while noting that claims of direct rehabilitation benefits require future empirical validation through learner outcome studies. This revision removes any overstatement and addresses the unbridged leap.
Revision: yes

Circularity Check

0 steps flagged

No circularity: empirical system evaluation without derivations or self-referential fits

The paper reports the design and evaluation of a text-to-pictogram system across five languages, relying on coverage metrics, expert audits by speech therapists, and latency measurements. No equations, parameters, or derivations appear in the provided text. Central claims rest on independent expert ratings of semantic fit rather than any self-definition, fitted-input prediction, or self-citation chain that reduces the result to its own inputs. This is a standard empirical systems paper whose evidence is external to any internal modeling loop.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No mathematical derivations or theoretical constructs are introduced; the paper describes an applied system and its empirical evaluation.

